[Date Prev][Date Next][Thread Prev][Thread Next]
[Date Index]
[Thread Index]
- Subject: Re: [ANN] LuaJIT-2.0.0-beta8
- From: Mike Pall <mikelu-1106@...>
- Date: Thu, 23 Jun 2011 21:00:42 +0200
Josh Haberman wrote:
> I noticed that your varint encoding uses continuation bits
> (protocol buffers do this also). You can achieve the same
> bit density by putting all the continuation bits in the first
> byte instead. For example, instead of:
>
> 1aaa aaaa 1bbb bbbb 0ccc cccc
>
> do:
>
> 110a aaaa aabb bbbb bccc cccc
>
> The latter encoding has the following advantages:
>
> 1. you don't have to branch on every byte.
> 2. you don't have to transform the value with shifts,
> you're just adding or masking away high bits.
For the bytecode I'm already delta-encoding most values and use a
couple of other tricks to reduce their range. So the majority of
values for the ULEB128 encoding fit into 0-127 and only need a
single byte. Then it _is_ faster to do this:
v = *p++;
if (LJ_UNLIKELY(v >= 0x80)) { ... decode the remaining bytes ... }
return v
Since the byte framing and the skewed distribution of values is
still intact, the saved bytecode is still amenable to compression
with standard data compression algorithms.
[I've even considered to integrate a minimal decompressor into the
bytecode reader. However it would only save around 35%. The more
complex algorithms can save up to 60%, but these are way too big
for integration.]
BTW: Source code vs. bytecode load/save benchmark script attached.
[us = microseconds, lower is better.]
$ lua-5.1.4-x64 loadbench.lua scimark.lua
556 us 11855 bytes parse source code
108 us 20004 bytes save bytecode
98 us load bytecode
$ luajit-2.0.0-beta8-x64 loadbench.lua scimark.lua
338 us 11855 bytes parse source code
25 us 8731 bytes save bytecode
14 us 5270 bytes save bytecode, no debug info
11 us load bytecode
8 us load bytecode, no debug info
--Mike
-- Source code parser vs. bytecode load/save benchmark.
-- Written by Mike Pall. Public domain.
local function bench(msg, f)
local n = 100
local s = f()
repeat
local start = os.clock()
for i=1,n do f() end
local diff = os.clock() - start
if diff > 0.5 then
if s then
io.write(string.format("%5.0f us %5d bytes %s\n", diff*1000000/n, s, msg))
else
io.write(string.format("%5.0f us %s\n", diff*1000000/n, msg))
end
break
end
n = n+n
until false
collectgarbage()
end
if not arg[1] then io.write("usage: ", arg[0], " file.lua"); os.exit(1) end
local str = assert(io.open(arg[1])):read("*a")
local func = assert(loadstring(str, ""))
if jit then jit.off() end
collectgarbage()
bench("parse source code", function() loadstring(str, ""); return #str end)
bench("save bytecode", function() return #string.dump(func) end)
if jit and jit.version_num >= 20000 then
bench("save bytecode, no debug info",
function() return #string.dump(func, true) end)
end
local dump1 = string.dump(func)
bench("load bytecode", function() loadstring(dump1, "") end)
if jit and jit.version_num >= 20000 then
local dump2 = string.dump(func, true)
bench("load bytecode, no debug info", function() loadstring(dump2, "") end)
end