- Subject: proposal: a specialized VM for numeric computation, called as a Lua library
- From: David Manura <dm.lua@...>
- Date: Thu, 11 Mar 2010 23:50:10 -0500
One thing that struck me about LPeg and regex engines in an
interpreted language like Lua is that they are specialized virtual
machines for string data, and they are called as a library from
within the general purpose Lua virtual machine, which in turn may be
called as a library from another environment like C. I was wondering
how much benefit there would be to likewise have a specialized virtual
machine, callable from Lua as a library, for numeric processing, sort
of analogous to an FPU or GPU but for the Lua VM. Sometimes I have a
section of performance critical code, perhaps similar to the inner for
loop of mandelbrot.lua [1] or some cryptographic code. This code uses
only locals, arithmetic operations on numbers, and control structures.
It doesn't need strings and metamethods and all that extra stuff.
I've sometimes found it worthwhile for increased performance to
convert code like this into a C function and call it from Lua.
Although the code can be translated fairly mechanically (even
automatically [2]), do we really need to convert to C and bring in the
complexities of a C compiler (even if it is inlined via tcc [4])?
Obviously, there's LuaJIT, but what I have in mind is something much
simpler, smaller, and more portable but still offering a good speedup
over regular Lua and maintaining much of the Lua syntax. Optimizing
mandelbrot.lua might look something like this:

  local f = compile [[
    local x,M,Ci = ...
    local Cr = x*M-1.5
    local Zr, Zi, Zrq, Ziq = Cr, Ci, Cr*Cr, Ci*Ci
    local badd = 0
    for i=1,49 do
      Zi = Zr*Zi*2 + Ci
      Zr = Zrq-Ziq + Cr
      Ziq = Zi*Zi
      Zrq = Zr*Zr
      if Zrq+Ziq > 4.0 then badd = 1; break; end
    end
    return badd
  ]]
  .....
  for x=0,N-1 do
    local badd = f(x,M,Ci)
    .....
  end

Here, "compile" would convert the given Lua-like code snippet to a
function object, under the assumption that all variables in the
snippet are local and of type number. These restrictions are made so
that the code can be compiled to a more specialized bytecode format
and interpreted under a specialized VM optimized for numeric code.
The language recognized by "compile" might also include extensions to
the Lua language, such as intrinsic operators for trigonometric and
bitwise operations, as you would see in a CPU and FPU. These extensions
are made only within the secondary interpreter loaded as a library; no patches
are made to the main Lua interpreter. Hints may even be added to
support parallel computation, but it would remain compilable under
ANSI C by simply serializing ops. SSE-like ops may be added as well,
which can give some speedups even in ANSI C [3].
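
To make the calling convention concrete, here is a rough stand-in for "compile" written in plain Lua. It is only a sketch of the API shape, not the proposed VM: it loads the snippet with the stock Lua compiler, gives it an empty environment so that any global access fails, and checks that all arguments are numbers. The chunk name "=numeric" and the error message are my own choices, and it offers no speedup at all.

```lua
-- Hypothetical stand-in for "compile": API shape only, no specialized VM.
local function compile(src)
  local chunk, err
  if setfenv then
    -- Lua 5.1 / LuaJIT: load, then swap in an empty environment.
    chunk, err = loadstring(src, "=numeric")
    if chunk then setfenv(chunk, {}) end
  else
    -- Lua 5.2 and later: pass the empty environment to load directly.
    chunk, err = load(src, "=numeric", "t", {})
  end
  if not chunk then error(err, 2) end
  return function(...)
    -- Enforce the "numbers only" assumption on the way in.
    for i = 1, select("#", ...) do
      if type((select(i, ...))) ~= "number" then
        error("argument #" .. i .. " must be a number", 2)
      end
    end
    return chunk(...)
  end
end

local f = compile [[
  local x, M, Ci = ...
  local Cr = x*M - 1.5
  local Zr, Zi, Zrq, Ziq = Cr, Ci, Cr*Cr, Ci*Ci
  local badd = 0
  for i = 1, 49 do
    Zi = Zr*Zi*2 + Ci
    Zr = Zrq - Ziq + Cr
    Ziq = Zi*Zi
    Zrq = Zr*Zr
    if Zrq + Ziq > 4.0 then badd = 1; break end
  end
  return badd
]]
```

With this stand-in, f(4, 1, 0) tests the point Cr = 2.5, which escapes on the first iteration, while f(0, 0.01, 0) tests Cr = -1.5, which stays bounded. A real implementation would keep the same interface and replace the body of "compile" with a translator to the specialized bytecode.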
As an initial test, I eliminated metamethods by converting things like
"arith_op(luai_numadd, TM_ADD)" in lvm.c to "setnvalue(ra,
luai_numadd(nvalue(RKB(i)), nvalue(RKC(i))))" and ripped out the debug
hooks, but the speedup was quite small. One could go further and
eliminate the type (tt) field from TValue, since all values will be
numbers, and rip out the garbage collection and basically anything
else not strictly necessary, but that requires more extensive code
reductions, and I don't intend to spend serious time on it without
an imminent need for it.
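
To illustrate what a numbers-only interpreter loop could look like once tags and metamethods are gone, here is a toy register VM written in Lua itself. Everything in it is hypothetical: the opcode names, the {op, a, b, c} instruction layout, and the hand-assembled program are made up for illustration. Written in Lua it will be slower than the code it replaces; the point is only the structure, which a real version would express as a C switch over untagged doubles.

```lua
-- Toy numbers-only register VM (illustration; opcode names are hypothetical).
-- Each instruction is {op, a, b, c}; registers in R hold plain numbers only,
-- so there are no type checks and no metamethod lookups in the loop.
local function run(code, R)
  local pc = 1
  while true do
    local ins = code[pc]
    local op, a, b, c = ins[1], ins[2], ins[3], ins[4]
    pc = pc + 1
    if     op == "ADD"   then R[a] = R[b] + R[c]
    elseif op == "SUB"   then R[a] = R[b] - R[c]
    elseif op == "MUL"   then R[a] = R[b] * R[c]
    elseif op == "LOADK" then R[a] = b                 -- b is an immediate constant
    elseif op == "JGT"   then if R[a] > R[b] then pc = c end
    elseif op == "JMP"   then pc = a
    elseif op == "RET"   then return R[a]
    else error("unknown opcode " .. tostring(op)) end
  end
end

-- Hand-assembled Mandelbrot inner loop.
-- Registers: 1=Cr 2=Ci 3=Zr 4=Zi 5=Zrq 6=Ziq 7=i 8=tmp
--            9=2 10=4 11=1 12=49 13=badd
local mandel = {
  {"LOADK", 9, 2}, {"LOADK", 10, 4}, {"LOADK", 11, 1},   -- pc 1-3: constants
  {"LOADK", 12, 49}, {"LOADK", 13, 0}, {"LOADK", 7, 1},  -- pc 4-6: limit, badd, i
  {"MUL", 5, 3, 3},      -- pc 7:  Zrq = Zr*Zr
  {"MUL", 6, 4, 4},      -- pc 8:  Ziq = Zi*Zi
  {"JGT", 7, 12, 22},    -- pc 9:  if i > 49 then goto 22
  {"MUL", 8, 3, 4},      -- pc 10: tmp = Zr*Zi
  {"MUL", 8, 8, 9},      -- pc 11: tmp = tmp*2
  {"ADD", 4, 8, 2},      -- pc 12: Zi = tmp + Ci
  {"SUB", 8, 5, 6},      -- pc 13: tmp = Zrq - Ziq
  {"ADD", 3, 8, 1},      -- pc 14: Zr = tmp + Cr
  {"MUL", 6, 4, 4},      -- pc 15: Ziq = Zi*Zi
  {"MUL", 5, 3, 3},      -- pc 16: Zrq = Zr*Zr
  {"ADD", 8, 5, 6},      -- pc 17: tmp = Zrq + Ziq
  {"JGT", 8, 10, 21},    -- pc 18: if tmp > 4 then goto 21
  {"ADD", 7, 7, 11},     -- pc 19: i = i + 1
  {"JMP", 9},            -- pc 20: back to the loop test
  {"LOADK", 13, 1},      -- pc 21: badd = 1
  {"RET", 13},           -- pc 22: return badd
}

-- Returns 1 if the point (Cr, Ci) escapes within 49 iterations, else 0.
local function escapes(Cr, Ci)
  return run(mandel, {Cr, Ci, Cr, Ci})
end
```

Here escapes(2.5, 0) takes the early exit at pc 18, and escapes(-1.5, 0) runs all 49 iterations and returns 0. In a C version the register file would be a plain array of double, which is where dropping the tt field would come in.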
Has anyone done anything like this before?
[1] http://shootout.alioth.debian.org/u64/program.php?test=mandelbrot&lang=lua&id=1
[2] http://lua-users.org/wiki/LuaToCee
[3] http://lua-users.org/wiki/SimdExperiment
[4] http://lua-users.org/wiki/InlineCee