- Subject: Re: [ANN] Lua 5.2.1 (work1) now available
- From: KHMan <keinhong@...>
- Date: Fri, 23 Mar 2012 02:43:46 +0800
On 3/22/2012 2:37 AM, Luiz Henrique de Figueiredo wrote:
Lua 5.2.1 (work1) is now available at
http://www.lua.org/work/lua-5.2.1-work1.tar.gz
[snip]
Lua 5.2.1 introduces better handling of string collisions based on a
random seed. This work version is meant to let the community assess
the usefulness and the effectiveness of this experimental feature.
The complete diffs from Lua 5.2.0 to 5.2.1 are available at
http://www.lua.org/work/diffs-lua-5.2.0-lua-5.2.1-work1.txt
[snip]
It doesn't seem any slower... Here are a few cheap data points :-)
Ran a few very simple scripts processing md5-like data. Input data
is output from md5deep -rl on a linux-3.3 tree on Cygwin:
linux33-md5.dat 2.39MB (32 byte MD5 hex strings)
linux33-sha1.dat 2.68MB (40 byte SHA1 hex strings)
linux33-sha256.dat 3.55MB (64 byte SHA256 hex strings)
Each file has 38,069 lines in the form of:
<hash> <relative-path>
Tried two simple scripts:
(a) load data, parse, dump into two arrays
(b) load data, parse, use table lookup to find
duplicate hash strings
The non-scientific results are:
load list (timing in sec, lower of 2 runs)
==========================================
dataset ->       md5     sha1    sha256
lua-5.1.5        0.259   0.258   0.286
lua-5.2.0        0.261   0.274   0.285
lua-5.2.1wk1
  shortlen=16    0.180   0.188   0.204
  shortlen=32    0.224   0.201   0.220
  shortlen=48    0.240   0.247   0.235
  shortlen=64    0.249   0.248   0.263
  shortlen=128   0.259   0.256   0.279
hash dupe (timing in sec, lower of 2 runs)
==========================================
dataset ->       md5     sha1    sha256
lua-5.1.5        0.236   0.247   0.265
lua-5.2.0        0.252   0.264   0.304
lua-5.2.1wk1
  shortlen=16    0.188   0.200   0.228
  shortlen=32    0.215   0.211   0.234
  shortlen=48    0.236   0.240   0.241
  shortlen=64    0.242   0.244   0.264
  shortlen=128   0.253   0.263   0.297
With increasing LUA_MAXSHORTLEN, the 5.2.1 (work1) times tend to
approach the 5.2.0 times, but 5.2.1 is not significantly slower at
any of the settings tried.
System is an AMD Athlon 64 X2 5000+ (64-byte cache lines),
WinXP SP3 32-bit, Cygwin gcc 4.5.3, Lua built with "make generic".
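
For anyone who wants to repeat the "lower of 2 runs" idea from inside Lua, here is a minimal sketch. It is not how the numbers above were taken; the helper names best_of and count_dupes are made up for illustration, and os.clock reports CPU time for the work only, so interpreter startup is not included.

```lua
-- Hypothetical helper: run fn n times and return the lowest os.clock time.
local function best_of(n, fn, ...)
  local best = math.huge
  for _ = 1, n do
    local t0 = os.clock()
    fn(...)
    local dt = os.clock() - t0
    if dt < best then best = dt end
  end
  return best
end

-- Example workload (an assumption, not one of the posted scripts):
-- count repeated keys with a table lookup, as the hash dupe test does.
local function count_dupes(keys)
  local seen, dupe = {}, 0
  for i = 1, #keys do
    local k = keys[i]
    if seen[k] then
      dupe = dupe + 1
    else
      seen[k] = true
    end
  end
  return dupe
end

-- 1000 32-character hex keys, 900 of them distinct.
local keys = {}
for i = 1, 1000 do
  keys[i] = string.format("%032x", i % 900)
end

print(string.format("best of 2: %.3f sec", best_of(2, count_dupes, keys)))
```

Taking the lower of the runs discards one-off disturbances such as a cold disk cache, at the cost of hiding run-to-run variance.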
test-load-list.lua
==================
local io = require "io"
local string = require "string"
local hash, fpath = {}, {}
for l in io.lines(arg[1]) do
  local h, fp = string.match(l, "^(%S+)%s+(.+)$")
  hash[#hash + 1] = h
  fpath[#fpath + 1] = fp
end
print("Items loaded: "..#fpath)
test-hash-dupe.lua
==================
local io = require "io"
local string = require "string"
local hash = {}
local count = 0  -- lines loaded; #hash would be 0 because the keys are strings
local dupe = 0
for l in io.lines(arg[1]) do
  local h, fp = string.match(l, "^(%S+)%s+(.+)$")
  count = count + 1
  if hash[h] then
    hash[h] = hash[h] + 1
    dupe = dupe + 1
  else
    hash[h] = 1
  end
end
print("Items loaded: "..count)
print("Hash duplicates: "..dupe)
A bigger dataset would be better, but at least in the above all
disk I/O gets cached in memory. I have not tried extreme string
comparisons yet; I failed to dream up a short script that needs
mind-boggling amounts of them...
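
One possible way to get a bigger dataset without another source tree is to generate it. The sketch below is untested against 5.2.1 (work1) and everything in it is an assumption: the function name make_line, the fake paths, the 90% distinct ratio, and the idea of giving every key the same long prefix so that a byte-wise comparison has to look at most of the string. The keys are not real digests. Lines are written in the same "<hash> <relative-path>" form the two test scripts parse.

```lua
-- Hypothetical generator: build one line "<hex-string> <relative-path>".
-- hexlen should be at least 8; the last 8 characters carry the key,
-- the rest is a shared prefix of "a" characters.
local function make_line(i, hexlen, ndistinct)
  local suffix = string.format("%08x", i % ndistinct)
  local hex = string.rep("a", hexlen - 8) .. suffix
  return hex .. " dir" .. (i % 100) .. "/file" .. i .. ".c"
end

-- Arguments (both optional): number of lines, key length in characters.
local n      = tonumber(arg and arg[1]) or 5
local hexlen = tonumber(arg and arg[2]) or 64

-- Keep about 90% of the keys distinct so the dupe test finds something.
local ndistinct = math.max(1, math.floor(n * 0.9))

for i = 1, n do
  io.stdout:write(make_line(i, hexlen, ndistinct), "\n")
end
```

Saved under a name such as gen-data.lua (hypothetical), it could be run as "lua gen-data.lua 1000000 64 > big.dat" and the output fed to either test script.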
--
Cheers,
Kein-Hong Man (esq.)
Kuala Lumpur, Malaysia