David Given <dg <at> cowlark.com> writes:
> 
> On Saturday 25 February 2006 12:41, Chris wrote:
> [...]
> > Probably the easiest way to test this would be with the Microsoft compiler
> > because it can inline function calls from object code (maybe ICC can do
> > this as well; ?).
> ...
> I have a little script here:
> http://svn.sourceforge.net/viewcvs.cgi/primemover/pm/build-tools/collapse.lua?view=markup&rev=6
> ...which I use to collapse the Lua 5.0 VM into a single C file.

David, I ran two simple tests comparing the non-collapsed build with the
collapsed build generated by your script.  test1 is based on sieve.lua and
test2 on sort.lua (sorting 10000 random numbers), both with I/O removed.  This
is mingw/gcc-3.4/winxp.  It's an interesting idea, but "global optimization"
didn't make much of a difference: test1 is unchanged within run-to-run noise,
and test2 is about 3% slower when combined, as shown:

  time ./lua -e N=100000 test1.lua
  separate: 6.380s, 6.477s, 6.393s
  combined: 6.394s, 6.472s, 6.406s

  time ./lua ./test2.lua
  separate: 16.374s, 16.413s, 16.414s
  combined: 16.874s, 16.855s, 16.869s
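
As a rough check on the timings above, the following C sketch averages the three runs of each build and reports the relative change. The function and array names (`mean_seconds`, `percent_slower`, `test1_separate`, and so on) are hypothetical and chosen only for illustration; the numbers are copied from the runs above.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n timing samples, in seconds. */
double mean_seconds(const double *samples, size_t n)
{
    double sum = 0.0;
    size_t i;

    assert(n > 0);                /* a mean of zero samples is undefined */
    for (i = 0; i < n; i++)
        sum += samples[i];
    return sum / (double)n;
}

/* Percent change from baseline to candidate; positive means slower. */
double percent_slower(double baseline, double candidate)
{
    return (candidate - baseline) / baseline * 100.0;
}

/* Wall-clock times in seconds, copied from the runs above. */
const double test1_separate[] = { 6.380, 6.477, 6.393 };
const double test1_combined[] = { 6.394, 6.472, 6.406 };
const double test2_separate[] = { 16.374, 16.413, 16.414 };
const double test2_combined[] = { 16.874, 16.855, 16.869 };
```

Calling `percent_slower(mean_seconds(test2_separate, 3), mean_seconds(test2_combined, 3))` gives roughly 2.8, while the same call on the test1 arrays gives roughly 0.1, which is smaller than the spread between individual test1 runs.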

What does sometimes make a difference is raising the optimization level from
-O2 (Lua's default) to -O3, which adds -finline-functions (inlining).  Here are
the results (non-collapsed) at various optimization levels:

  $ time ./lua.exe -e N=100000 ../../test1.lua
  O0:  11.279s, 11.279s
  O1:  13.843s, 13.817s
  O2:  11.684s, 11.679s
  O3:   5.217s, 5.178s  **

  $ time ./lua.exe ../../test2.lua
  O0:  29.497s, 29.484s
  O1:  16.987s, 16.991s
  O2:  17.347s, 17.334s
  O3:  17.115s, 17.123s  **

Using -O2 with -finline-functions gives results similar to -O3.  That only
test1 improves may be because test2 is memory bound, but that's just a guess.
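
To make the -finline-functions point concrete, here is a toy C sieve (not Lua's VM code, and not the test1 script) whose inner work goes through two small helper functions that are not declared `inline`. The helper names are hypothetical. Functions like these are the kind of candidates -finline-functions is meant to consider; whether GCC actually inlines them depends on the compiler version and its heuristics.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical small helper: a one-line accessor, not marked inline. */
static int is_composite(const unsigned char *flags, size_t i)
{
    assert(flags != NULL);
    return flags[i] != 0;
}

/* Hypothetical small helper: mark every multiple of p from p*p up to n.
   The caller guarantees p*p <= n. */
static void mark_multiples(unsigned char *flags, size_t p, size_t n)
{
    size_t k;

    for (k = p * p; k <= n; k += p)
        flags[k] = 1;
}

/* Count the primes <= n with a sieve of Eratosthenes.
   Returns -1 if the flag array cannot be allocated. */
long count_primes(size_t n)
{
    unsigned char *flags;
    size_t i;
    long count = 0;

    if (n < 2)
        return 0;
    flags = calloc(n + 1, 1);     /* zero-filled: nothing marked yet */
    if (flags == NULL)
        return -1;
    for (i = 2; i <= n; i++) {
        if (!is_composite(flags, i)) {
            count++;
            if (i <= n / i)       /* same as i*i <= n, without overflow */
                mark_multiples(flags, i, n);
        }
    }
    free(flags);
    return count;
}
```

One way to see what the compiler did is to build this file with `gcc -O2 -S` and again with `gcc -O2 -finline-functions -S`, then check whether calls to the two helpers still appear in the generated assembly.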