- Subject: Re: inline Lua?
- From: David Manura <dm.lua@...>
- Date: Sun, 26 Feb 2006 05:13:45 +0000 (UTC)
David Given <dg <at> cowlark.com> writes:
>
> On Saturday 25 February 2006 12:41, Chris wrote:
> [...]
> > Probably the easiest way to test this would be with the Microsoft compiler
> > because it can inline function calls from object code (maybe ICC can do
> > this as well; ?).
> ...
> I have a little script here:
> http://svn.sourceforge.net/viewcvs.cgi/primemover/pm/build-tools/collapse.lua?view=markup&rev=6
> ...which I use to collapse the Lua 5.0 VM into a single C file.
David, I did two simple tests comparing the non-collapsed build with the
collapsed build generated by your script. test1 is based on sieve.lua and test2
is based on sort.lua (on 10000 random numbers), both with I/O removed. This is
mingw/gcc-3.4/winxp. It's an interesting idea, but "global optimization" didn't
seem to make much of a difference, as shown:
time ./lua -e N=100000 test1.lua
separate: 6.380s, 6.477s, 6.393s
combined: 6.394s, 6.472s, 6.406s
time ./lua ./test2.lua
separate: 16.374s, 16.413s, 16.414s
combined: 16.874s, 16.855s, 16.869s
What does sometimes make a difference is upping the optimization level from -O2
(Lua's default) to -O3. -O3 includes -finline-functions (inlining). Here are the
results (non-collapsed) using various optimization levels:
$ time ./lua.exe -e N=100000 ../../test1.lua
O0: 11.279s, 11.279s
O1: 13.843s, 13.817s
O2: 11.684s, 11.679s
O3: 5.217s, 5.178s **
$ time ./lua.exe ../../test2.lua
O0: 29.497s, 29.484s
O1: 16.987s, 16.991s
O2: 17.347s, 17.334s
O3: 17.115s, 17.123s **
Using -O2 with -finline-functions gives similar results to -O3. Maybe the
improvement shows up only on test1 because test2 is memory bound, but that's
just a guess.
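
For anyone who wants to put rough percentages on the timings above, here is a
small shell/awk sketch. The helper names mean and pct_change are made up for
this illustration (they are not part of Lua or of the collapse script), and the
inputs are just the numbers quoted in this post, so treat it as a quick
sanity check rather than a proper benchmark analysis.

```shell
# mean: average of the numbers given as arguments (hypothetical helper).
mean() {
    printf '%s\n' "$@" | awk '{ s += $1 } END { printf "%.4f\n", s / NR }'
}

# pct_change BASE NEW: percentage change going from BASE to NEW
# (hypothetical helper; negative means NEW took less time).
pct_change() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (b - a) / a * 100 }'
}

# test2 (sort), separate vs. combined build, timings from this post
sep=$(mean 16.374 16.413 16.414)
comb=$(mean 16.874 16.855 16.869)
echo "test2 combined vs separate: $(pct_change "$sep" "$comb")%"

# test1 (sieve), -O2 vs. -O3, non-collapsed build, timings from this post
o2=$(mean 11.684 11.679)
o3=$(mean 5.217 5.178)
echo "test1 -O3 vs -O2: $(pct_change "$o2" "$o3")%"
```

With the figures above, the combined build comes out a few percent slower on
test2, while -O3 cuts the test1 time by a bit more than half. With only two or
three runs per configuration, small differences should not be over-interpreted.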