Optimisation Coding Tips
t[#t+1] = 0 is faster than table.insert(t, 0).
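A rough way to check this claim on a particular interpreter is to time both forms. The sketch below assumes Lua 5.1 or later (for the # operator). The helper name time_append and the iteration count are illustrative only, and the closures add the same call overhead to both variants, so treat the numbers as relative rather than absolute.

```lua
-- Hypothetical helper: fills a fresh table by calling append(t) n times
-- and returns the table together with the elapsed CPU time in seconds.
local function time_append(append, n)
  local t = {}
  local start = os.clock()
  for _ = 1, n do
    append(t)
  end
  return t, os.clock() - start
end

local N = 100000  -- illustrative iteration count

local by_index, index_time = time_append(function(t) t[#t + 1] = 0 end, N)
local by_insert, insert_time = time_append(function(t) table.insert(t, 0) end, N)

print(string.format("t[#t+1] = 0        : %.4f s", index_time))
print(string.format("table.insert(t, 0) : %.4f s", insert_time))
```

Both variants build the same table, so the only difference being measured is the cost of the append itself.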
x * (1/3) is just as fast as x * 0.33333333333333333 and is generally faster than x/3 on most CPUs (see multiplication note below). 1 + 2 + x, which is the same as (1+2) + x, should be just as fast as 3 + x or x + (1 + 2), but faster than x + 1 + 2, which is the same as (x + 1) + 2 and is not necessarily equivalent to the former. Note that addition of numbers on computers is generally not associative when overflow occurs, and the compiler doesn't even know whether x is a number or some other type with a non-associative __add metamethod. - LuaList:2006-03/msg00363.html. It's been reported that Roberto is seriously thinking about removing constant folding from Lua 5.2, since constant folding has been a source of bugs in Lua (though some of us really like constant folding -- DavidManura).
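The reason the compiler cannot regroup x + 1 + 2 into x + (1 + 2) can be seen in a small experiment. This is a minimal sketch assuming Lua 5.1 or later. The counting metatable is made up for illustration, and the first part uses floating-point rounding (a related effect, not overflow) to show that regrouping can change a result.

```lua
-- Floating-point addition is not associative in general: regrouping the
-- same three constants changes the rounded result.
local left  = (0.1 + 0.2) + 0.3
local right = 0.1 + (0.2 + 0.3)
print(left == right)  --> false

-- A made-up type whose __add metamethod counts how often it is invoked.
local add_calls = 0
local mt = {}
mt.__add = function(a, b)
  add_calls = add_calls + 1
  return setmetatable({}, mt)
end
local x = setmetatable({}, mt)

add_calls = 0
local _ = x + 1 + 2        -- parsed as (x + 1) + 2: __add runs twice
local calls_left = add_calls

add_calls = 0
_ = x + (1 + 2)            -- 1 + 2 is plain number addition: __add runs once
local calls_right = add_calls

print(calls_left, calls_right)  --> 2	1
```

Because the two groupings are observably different for such a type, only the constant-only subexpression (1 + 2) is a candidate for folding.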
x*0.5 is faster than division x/2.
x*x is faster than x^2.
x*y+x*z+y*z --> x*(y+z) + y*z. Lua will not do this for you, particularly since it can't assume distributive and other common algebraic properties hold during numerical overflow.
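Before applying these rewrites by hand, it can help to confirm that both forms agree on representative inputs, and to see a case where they do not. The following is a sketch with arbitrary sample values, assuming standard double-precision Lua numbers.

```lua
-- Small values: each rewritten form gives the same result as the original.
local x, y, z = 2, 3, 4
assert(x * 0.5 == x / 2)
assert(x * x == x ^ 2)
assert(x*y + x*z + y*z == x*(y + z) + y*z)

-- Values near the top of the double range: the intermediate products
-- overflow to infinity, and the two forms no longer agree.
local a, b, c = 1e308, 1e308, -1e308
local unfactored = a*b + a*c + b*c    -- inf + (-inf) + (-inf) yields NaN
local factored   = a*(b + c) + b*c    -- a*0 + (-inf) yields -inf

print(unfactored ~= unfactored)       -- true only for NaN
print(factored == -math.huge)
```

The second half is the kind of case the overflow remark refers to: the factored form is cheaper, but it is a different computation, so the choice has to be made by the programmer.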
Note that Roberto Ierusalimschy's article Lua Performance Tips from the excellent [Lua Programming Gems] book is [available online].
The following information concerns optimization of Lua 4 and is kept here for historical reference.
If a variable, such as GameState, needs global scope for access from C, make a secondary variable such as 'local GSLocal = GameState' and use GSLocal within the module. This technique can also be used for functions that are called repeatedly (see OptimisingUsingLocalVariables).
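The local-alias technique can be sketched as follows. The example is written for Lua 5.1 or later rather than Lua 4; the fields of GameState and the add_points function are invented for illustration.

```lua
-- GameState stands in for a table that must stay global (for example so
-- that host C code can reach it). Its fields here are made up.
GameState = { score = 0, level = 1 }

local GSLocal = GameState   -- one global lookup, done once at load time
local floor = math.floor    -- same idea for a frequently called function

-- Hypothetical hot function: touches only locals, never the global table.
local function add_points(points)
  GSLocal.score = GSLocal.score + floor(points)
end

for _ = 1, 1000 do
  add_points(1.5)
end

print(GameState.score)  --> 1000
```

Note that the alias refers to the same table, not a copy, so updates are visible through the global name. If the global is later reassigned to a different table, the local alias keeps pointing at the old one.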
Within a callback, use lua_rawcall() to call other functions. The overhead of a setjmp() call for exceptions (and a few other things) is avoided. I would not recommend using lua_rawcall() outside of a callback in case something goes wrong during execution. Without the setjmp() call, the error handler that exits the application is called.
Use lua_rawget() and lua_rawgeti() for table access, since they avoid the tag method checks. Be sure to use lua_rawgeti() for indexed access. It's still a hash lookup, but it's probably the fastest way to get there by index.
Use lua_ref() wherever possible. lua_ref() behaves similarly to a local variable in terms of speed.
Strings passed to API functions (such as lua_getglobal()) from C are translated to a Lua string on entry. If a string is to be reused across multiple frames of the game, do a lua_ref() operation on it, too.
This information was written for Lua, pre v4.0 -- Nick Trout
assert(x <= x_max, "exceeded maximum ("..x_max..")")
function fast_assert(condition, ...)
    if not condition then
        if getn(arg) > 0 then
            assert(condition, call(format, arg))
        else
            assert(condition)
        end
    end
end
fast_assert(x <= x_max, "exceeded maximum (%d)", x_max)
This is the VM code generated:
assert(x <= x_max, "exceeded maximum ("..x_max..")")
    GETGLOBAL 0 ; assert
    GETGLOBAL 1 ; x
    GETGLOBAL 2 ; x_max
    JMPLE 1 ; to 6
    PUSHNILJMP
    PUSHINT 1
    PUSHSTRING 3 ; "exceeded maximum ("
    GETGLOBAL 2 ; x_max
    PUSHSTRING 4 ; ")"
    CONCAT 3
    CALL 0 0
fast_assert(x <= x_max, "exceeded maximum (%d)", x_max)
    GETGLOBAL 5 ; fast_assert
    GETGLOBAL 1 ; x
    GETGLOBAL 2 ; x_max
    JMPLE 1 ; to 17
    PUSHNILJMP
    PUSHINT 1
    PUSHSTRING 6 ; "exceeded maximum (%d)"
    GETGLOBAL 2 ; x_max
    CALL 0 0
Edit: April 23, 2012, by Sirmabus. The code above will not actually work with 5.1. Also added some enhancements, such as pointing back to the actual assert line number and a fall-through in case the assertion message arguments are wrong (using pcall()).
function fast_assert(condition, ...)
    if not condition then
        if next({...}) then
            local s, r = pcall(function (...) return (string.format(...)) end, ...)
            if s then
                error("assertion failed!: " .. r, 2)
            end
        end
        error("assertion failed!", 2)
    end
end
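The behaviour of the 5.1 version can be exercised with a short sketch. The function is repeated (as a local) so that the example runs on its own; the values of x and x_max and the deliberately wrong format argument are illustrative.

```lua
-- Copy of the Lua 5.1 fast_assert, repeated so this example is standalone.
local function fast_assert(condition, ...)
  if not condition then
    if next({...}) then
      local s, r = pcall(function (...) return (string.format(...)) end, ...)
      if s then
        error("assertion failed!: " .. r, 2)
      end
    end
    error("assertion failed!", 2)
  end
end

local x, x_max = 12, 10

-- Passing case: string.format is never called, so even a mismatched
-- format argument is harmless and no message string is built.
fast_assert(true, "exceeded maximum (%d)", "not a number")

-- Failing case with valid arguments: the message is formatted on demand.
local ok, err = pcall(function()
  fast_assert(x <= x_max, "exceeded maximum (%d)", x_max)
end)

-- Failing case with bad arguments: falls through to the plain message.
local ok2, err2 = pcall(function()
  fast_assert(false, "exceeded maximum (%d)", "not a number")
end)

print(err)
print(err2)
```

Because error() is called with level 2, each message is prefixed with the position of the fast_assert call rather than a position inside fast_assert itself.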
table = { "harold", "victoria", "margaret", "guthrie" }
The "proper" way to iterate over this table is as follows:
for i=1, getn(table) do
    -- do something with table[i]
end
However if we aren't concerned about element order, the above iteration is slow. The first problem is that it calls getn(), which has order O(n) assuming as above that the "n" field has not been set. The second problem is that bytecode must be executed and a table lookup performed to access each element (that is, "table[i]").
A solution is to use a table iterator instead:
for x, element in pairs(table) do
    -- do something with element
end
The getn() call is eliminated as is the table lookup. The "x" is a dummy variable as the element index is normally not used in this case.
There is a caveat with this solution. If library functions tinsert() or tremove() are used on the table they will set the "n" field which would show up in our iteration.
An alternative is to employ the list iteration patch listed in LuaPowerPatches.
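For comparison, the two iteration styles and the "n" caveat can be reproduced in current Lua. This is a sketch assuming Lua 5.1 or later, where the # operator stands in for getn(); the table is named names to avoid shadowing the table library, and the n field is set by hand purely to mimic the situation described above.

```lua
local names = { "harold", "victoria", "margaret", "guthrie" }

-- Ordered iteration: numeric for loop with an explicit index lookup.
local ordered = {}
for i = 1, #names do
  ordered[#ordered + 1] = names[i]
end

-- Unordered iteration: the iterator hands over each element directly.
local seen = 0
for _, element in pairs(names) do
  seen = seen + 1
end

-- Caveat: any extra field, such as a hand-set "n", is also visited.
names.n = 4
local seen_with_n = 0
for _ in pairs(names) do
  seen_with_n = seen_with_n + 1
end

print(#ordered, seen, seen_with_n)  --> 4	4	5
```

If the loop body must only see list elements, the extra key has to be filtered out (for example by checking the key type) or the numeric loop used instead.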
(lhf) Tables are the central data structure in Lua. You shouldn't have to worry about table performance. A lot of effort is spent trying to make tables fast. For instance, there is a special opcode for a.x. See the difference between a.x and a[x] ... but, like you said, the difference here is essentially an extra GETGLOBAL.
a,c = {},"x"
    CREATETABLE 0
    PUSHSTRING 2 ; "x"
    SETGLOBAL 1 ; c
    SETGLOBAL 0 ; a
b=a.x
    GETGLOBAL 0 ; a
    GETDOTTED 2 ; x
    SETGLOBAL 3 ; b
b=a["x"]
    GETGLOBAL 0 ; a
    GETDOTTED 2 ; x
    SETGLOBAL 3 ; b
b=a[c]
    GETGLOBAL 0 ; a
    GETGLOBAL 1 ; c
    GETTABLE
    SETGLOBAL 3 ; b
END