- Subject: Re: Native Complex numbers for LuaJIT-2 [was Re: Benchmark shootout shows LuaJIT 2.0]
- From: Mike Pall <mikelu-0911@...>
- Date: Mon, 2 Nov 2009 15:43:28 +0100
Leo Razoumov wrote:
> I would like to give it a try. I implemented complex numbers as
> userdata for the Lua interpreter. But userdata is allocated on the
> heap and, thus, is too slow for tight loops commonly found in
> numerics.
> Bringing down box/unbox overhead could save the day.
Yes, the JIT compiler ought to be able to remove the overhead in
tight loops. It might not work too well for branchy loops, though.
> Also I am a bit worried about function dispatch. Adding two
> doubles is a native Lua opcode and it does not go through the
> trouble of metamethods. Dispatching through the __add, __mul, etc.
> metamethods for complex numbers is slow. Could it be avoided?
That's an issue for the interpreter, yes. But the JIT compiler
treats metamethod dispatch like any other table lookup. It's
usually able to disambiguate it, to hoist it and so on.
[If the complex data type were to be defined as a special kind of
userdata, the JIT compiler could shortcut the dispatch even under
more difficult circumstances.]
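For reference, here is a minimal sketch of the kind of operator dispatch in question. It assumes a plain table-based complex type rather than the userdata discussed above, and the names Complex and new are hypothetical, chosen only for illustration:

```lua
-- Hypothetical table-based complex type (an assumption for illustration;
-- the thread discusses userdata, which needs C code to define).
local Complex = {}
Complex.__index = Complex

local function new(re, im)
  return setmetatable({ re = re, im = im }, Complex)
end

-- a + b is dispatched here through the __add metamethod.
Complex.__add = function(a, b)
  return new(a.re + b.re, a.im + b.im)
end

-- a * b is dispatched here through the __mul metamethod.
Complex.__mul = function(a, b)
  return new(a.re * b.re - a.im * b.im, a.re * b.im + a.im * b.re)
end

local z = new(0, 0)
local c = new(1, 2)
for i = 1, 100 do
  z = z + c * c   -- each operator involves a metatable lookup
end
print(z.re, z.im)
```

Every `+` and `*` in the loop looks up the same metamethod in the same metatable, which is the kind of lookup described above as a candidate for disambiguation and hoisting.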
To see the metamethod dispatch hoisting, try this program:
  local t = {}
  for i=1,100 do t[i] = tostring(i) end
  local x = 0
  for i=1,100 do x = x + t[i]:len() end
  print(x)
The dispatch in the second loop first involves a lookup of the
"__index" table in the string metatable. Then "len" is looked up
in this table and the resulting function (string.len) is called.
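The lookup steps just described can be spelled out by hand in plain Lua. This is only a sketch of what t[i]:len() resolves to; the local names (s, mt, index, len) are illustrative:

```lua
local s = "hello"
local mt = getmetatable(s)   -- the metatable shared by all strings
local index = mt.__index     -- step 1: fetch the "__index" table
local len = index.len        -- step 2: look up "len" in that table

-- In a stock Lua 5.1 or LuaJIT setup, the "__index" table is the
-- string library itself, so "len" resolves to string.len.
print(index == string, len == string.len)

-- Step 3: call the resulting function; s:len() does the same thing.
print(len(s), s:len())
```

This assumes nobody has replaced the string metatable or its "__index" field, which is the default state.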
Ok, so run it with:
  luajit -jdump=im test.lua
Here's the loop part of the second trace:
->LOOP:
f7f21e20 cmp edi, ecx // Array bounds check
f7f21e22 jnb 0xf7f1a010 ->2
f7f21e28 cmp dword [eax+edi*8+0x4], -0x05 // Type check for array load
f7f21e2d jnz 0xf7f1a010 ->2
f7f21e33 mov esi, [eax+edi*8] // temp1 = t[i]
f7f21e36 xorps xmm6, xmm6
f7f21e39 cvtsi2sd xmm6, [esi+0xc] // temp2 = #temp1
f7f21e3e addsd xmm7, xmm6 // x = x + temp2
f7f21e42 add edi, +0x01
f7f21e45 cmp edi, +0x64
f7f21e48 jle 0xf7f21e20 ->LOOP
f7f21e4a jmp 0xf7f1a014 ->3
Pretty short, eh? As you can see, all dispatch has been hoisted.
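As a rough source-level analogue (a sketch, not what the compiler literally produces), hoisting the dispatch by hand would look something like this, with the local name len being illustrative:

```lua
local t = {}
for i = 1, 100 do t[i] = tostring(i) end

-- Resolve the method once, before the loop, instead of on every
-- iteration. This assumes the default string metatable.
local len = getmetatable("").__index.len

local x = 0
for i = 1, 100 do x = x + len(t[i]) end
print(x)
```

The trace above goes further than this: the call itself is gone too, and the length is read directly from the string object (the load from [esi+0xc]).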
--Mike