- Subject: Re: [ANN] Lua 5.2.1 (work1) now available
- From: KHMan <keinhong@...>
- Date: Fri, 23 Mar 2012 17:37:59 +0800
On 3/22/2012 2:37 AM, Luiz Henrique de Figueiredo wrote:
> Lua 5.2.1 (work1) is now available at
> http://www.lua.org/work/lua-5.2.1-work1.tar.gz
> [snip]
> Lua 5.2.1 introduces better handling of string collisions based on a
> random seed. This work version is meant to let the community assess
> the usefulness and the effectiveness of this experimental feature.
> [snip]
A few more cheap data points: comparing md5-like hex hash strings
to see whether files have changed between two datasets. The input
data is the output of md5deep -rl on the linux-3.2 and linux-3.3
trees, run on Cygwin.
32 byte - MD5 hex strings
40 byte - SHA1 hex strings
64 byte - SHA256 hex strings
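For reference, a minimal sketch of how one input line splits into a hash and a path, using the same pattern as the script further down. The sample line is made up, and the "<hash>  <path>" layout is an assumption about what md5deep -rl prints; parse_line is a hypothetical helper name.

```lua
-- Hypothetical sample line in the assumed "<hash>  <path>" layout.
-- The hash below is the MD5 of empty input; the path is made up.
local line = "d41d8cd98f00b204e9800998ecf8427e  linux-3.2/.gitignore"

-- Split a line into its first whitespace-free token (the hash)
-- and the rest of the line (the path). Returns nil on no match.
local function parse_line(l)
  return string.match(l, "^(%S+)%s+(.+)$")
end

local hash, path = parse_line(line)
print(#hash, path)
```

The hash length (32, 40 or 64) is what decides whether a given LUA_MAXSHORTLEN setting treats the string as short or long.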
The main loop is repeated 10 times to artificially run more
string-string compares. The non-scientific results are:
same files (timing in sec, lower of 2 runs)
===========================================
dataset ->        md5     sha1    sha256
-------------------------------------------
lua-5.1.5         0.658   0.670   0.715
lua-5.2.0         0.624   0.635   0.692
lua-5.2.1wk1
  shortlen=16     0.621   0.651   0.709
  shortlen=32     0.605   0.639   0.688
  shortlen=48     0.594   0.589   0.673
  shortlen=64     0.596   0.593   0.653
  shortlen=128    0.614   0.622   0.683
lua-5.2.1wk1 is fastest for sha256 compares when the sha256
strings are interned, i.e. with LUA_MAXSHORTLEN=64. In most rows
lua-5.2.1wk1 is also slightly faster than lua-5.2.0, probably
because of a difference in the processing load for the initial
lines of data from io.lines(). Of course, the string compares here
are limited to strings of 32/40/64 bytes... If
interning-at-first-compare is easy to implement, I'll try it and
add a data point.
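A "lower of N runs" figure can also be taken from inside Lua. The sketch below is only one possible way to do it, not necessarily how the numbers above were collected. best_of and workload are hypothetical names, the workload is a stand-in for the real compare loop, and os.clock() reports CPU time rather than wall-clock time.

```lua
-- Run fn() `runs` times and return the lowest CPU time seen, in seconds.
local function best_of(runs, fn)
  local best = math.huge
  for _ = 1, runs do
    local t0 = os.clock()
    fn()
    local dt = os.clock() - t0
    if dt < best then best = dt end
  end
  return best
end

-- Stand-in workload: compare two equal 64-byte strings many times.
-- The second string is built by concatenation at run time.
local function workload()
  local a = string.rep("a", 64)
  local b = string.rep("a", 63) .. "a"
  local n = 0
  for _ = 1, 100000 do
    if a == b then n = n + 1 end
  end
  return n
end

print(string.format("%.3f", best_of(2, workload)))
```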
test-same-files.lua
===================
local io = require "io"
local string = require "string"

-- Dataset A: map file path -> hash.
local setA = {}
for l in io.lines(arg[1]) do
  local hashA, fpathA = string.match(l, "^(%S+)%s+(.+)$")
  setA[fpathA] = hashA
end

-- Dataset B: parallel arrays of hashes and file paths.
local hashesB, fpathsB = {}, {}
for l in io.lines(arg[2]) do
  local h, fp = string.match(l, "^(%S+)%s+(.+)$")
  hashesB[#hashesB + 1] = h
  fpathsB[#fpathsB + 1] = fp
end

local identicaln, changedn = 0, 0
for i = 1, 10 do -- repeated 10 times
  local identical, changed = {}, {}
  for j = 1, #fpathsB do
    local hashB = hashesB[j]
    local fpathB = fpathsB[j]
    local hashA = setA[fpathB]
    if hashA then -- files present only in B are skipped
      if hashA == hashB then -- the string-string compare being timed
        identical[#identical + 1] = fpathB
      else
        changed[#changed + 1] = fpathB
      end
    end
  end
  identicaln = identicaln + #identical
  changedn = changedn + #changed
end

-- Totals are summed over all 10 repetitions.
print("Files identical: "..identicaln)
print("Files changed: "..changedn)
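To sanity-check the classification logic without md5deep output, the inner loop can be pulled into a function and fed small in-memory tables. This is a sketch only: classify is a hypothetical name, and the paths and hashes are made-up placeholders.

```lua
-- One pass of the compare loop, taking the tables as arguments.
-- Returns the lists of identical and changed paths; paths that
-- appear only in dataset B are ignored, as in the script above.
local function classify(setA, hashesB, fpathsB)
  local identical, changed = {}, {}
  for j = 1, #fpathsB do
    local fpathB = fpathsB[j]
    local hashA = setA[fpathB]
    if hashA then
      if hashA == hashesB[j] then
        identical[#identical + 1] = fpathB
      else
        changed[#changed + 1] = fpathB
      end
    end
  end
  return identical, changed
end

-- Made-up data: a.c unchanged, b.c changed, new.c only in B.
local setA = { ["a.c"] = "1111", ["b.c"] = "2222" }
local hashesB = { "1111", "ffff", "3333" }
local fpathsB = { "a.c", "b.c", "new.c" }

local identical, changed = classify(setA, hashesB, fpathsB)
print(#identical, #changed)
```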
--
Cheers,
Kein-Hong Man (esq.)
Kuala Lumpur, Malaysia