I thought I'd share some memory usage numbers from my tests of 5.3.0 (rc4) compared with 5.2.3. This is running on an ARMv7-M CPU (Thumb-2 instruction set, 96KB RAM) in a highly custom environment. Lua is configured with 32-bit integers and no floating point (on 5.2, lua_Number is a 32-bit 'long'; on 5.3, both number types are 32-bit 'long'). All other code and config is identical apart from the Lua version being used. I wasn't using any legacy 5.1 stuff, so the upgrade was pretty smooth: mainly a case of cooking up a new luaconf.h with the nonstandard number sizes, plus some patches to fudge the floating-point API calls.
The first figure on each line of the stats below is the memory used according to the allocator, and the second (in brackets) is the figure returned by lua_gc. The allocator has no per-allocation overhead (except for 8-byte alignment padding), and in this runtime there are no allocations that don't go through Lua, so in theory the two figures should match pretty closely. lua_gc seems to underestimate quite heavily, however. I assume that either it is not counting some of its own state in the figures it returns, or the underestimate is simply the accumulated padding on all the strings whose lengths aren't a multiple of 8 (so Lua thinks they're smaller than the allocator does).
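To illustrate the padding hypothesis, here is a minimal sketch of a counting allocator with the same signature as lua_Alloc. It is not the allocator used in these tests: mem_stats, pad8 and counting_alloc are hypothetical names, and it sits on top of realloc/free purely so the bookkeeping can be shown. It tracks both the sizes Lua asks for and the same sizes rounded up to 8 bytes, on the assumption that Lua's own count is based on the requested sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bookkeeping struct, passed to the allocator as its `ud`. */
typedef struct {
    size_t requested;  /* sum of the sizes the caller asked for */
    size_t padded;     /* same blocks, each rounded up to 8 bytes */
} mem_stats;

/* Round n up to the next multiple of 8. */
static size_t pad8(size_t n) {
    return (n + 7u) & ~(size_t)7u;
}

/* Same signature as lua_Alloc, so it could be handed to lua_newstate. */
static void *counting_alloc(void *ud, void *ptr, size_t osize, size_t nsize) {
    mem_stats *st = (mem_stats *)ud;
    /* When ptr is NULL, osize encodes the kind of object being
       allocated (Lua 5.2 and later), not a size. */
    if (ptr == NULL) osize = 0;
    if (nsize == 0) {
        free(ptr);
        st->requested -= osize;
        st->padded -= pad8(osize);
        return NULL;
    }
    void *np = realloc(ptr, nsize);
    if (np == NULL) return NULL;  /* old block is left untouched */
    st->requested = st->requested - osize + nsize;
    st->padded = st->padded - pad8(osize) + pad8(nsize);
    return np;
}
```

With an allocator like this, the gap between `padded` and `requested` after loading a module shows how much of the discrepancy is alignment padding alone.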
The test is written in C so I can guarantee no memory is allocated by the function that's printing the results. I assume lua_gc(LUA_GCCOUNT[B]) won't allocate either.
The modules mentioned are a selection from my codebase, and the numbers listed are the memory used to 'require' them, minus the baseline figure (which is the memory usage of setting up a new runtime and requiring an empty module). The runtime is reset and all memory freed between each module require. All modules were precompiled and stripped with luac, and loaded from XIP ROM. These numbers won't really be comparable to any other system - as mentioned these are custom modules, and the Lua runtime itself is pretty stripped down, with no io or math modules, for instance. lua_gc(L, LUA_GCCOLLECT, 0) is run before taking each of the module stats.
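The bookkeeping described above can be sketched roughly as follows. This is not the actual test code; gc_bytes and module_cost are hypothetical helper names. The only Lua facts relied on are that lua_gc(L, LUA_GCCOUNT, 0) returns the memory in use in KB and lua_gc(L, LUA_GCCOUNTB, 0) returns the remainder of the byte count divided by 1024.

```c
#include <assert.h>
#include <stddef.h>

/* Combine the two values reported by lua_gc into a byte count.
   In a real harness this would be called as:
     gc_bytes(lua_gc(L, LUA_GCCOUNT, 0), lua_gc(L, LUA_GCCOUNTB, 0)) */
static size_t gc_bytes(int count_kb, int count_rem) {
    return (size_t)count_kb * 1024u + (size_t)count_rem;
}

/* Cost of one module: usage after 'require' (and a full collection)
   minus the baseline of a fresh runtime that required an empty module. */
static size_t module_cost(size_t after_require, size_t baseline) {
    assert(after_require >= baseline);
    return after_require - baseline;
}
```

For example, a lua_gc reading of 1 KB plus 884 bytes corresponds to the 1908 shown for lua_newstate on 5.2.3.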
So, on to the figures:
Lua 5.2.3
=========
lua_newstate: 2032 (1908)
luaL_openlibs: 6400 (6016)
Baseline mem usage: 10120
Module membuf: 1744 (1217)
Module membuf.types: 7880 (7168)
Module membuf.print: 4256 (3649)
Module int64: 1176 (688)
Module oo: 2360 (1842)
Module misc: 1928 (1434)
Module runloop: 5736 (5102)
Module interpreter: 13736 (12808)
Module input.input: 11904 (11012)
Module bitmap.bitmap: 4064 (3461)
Module bitmap.transform: 4096 (3524)
Module tetris.tetris: 41144 (39387)
Lua 5.3.0
=========
lua_newstate: 2616 (2483)
luaL_openlibs: 8224 (7793)
Baseline mem usage: 12968
Module membuf: 1872 (1301)
Module membuf.types: 8096 (7357)
Module membuf.print: 4352 (3705)
Module int64: 1240 (708)
Module oo: 2384 (1830)
Module misc: 1960 (1426)
Module runloop: 5904 (5226)
(Test crashed here, not Lua's fault)
Lua 5.3.0 no utf8, bit32, package.path
======================================
lua_newstate: 2616 (2483)
luaL_openlibs: 6776 (6385)
Baseline mem usage: 11392
Module membuf: 1872 (1334)
Module membuf.types: 8096 (7396)
Module membuf.print: 4352 (3738)
Module int64: 1240 (741)
Module oo: 2384 (1863)
Module misc: 2024 (1518)
Module runloop: 5968 (5318)
Module interpreter: 14312 (13439)
Module input.input: 13632 (12752)
Module bitmap.bitmap: 4368 (3758)
Module bitmap.transform: 4272 (3689)
Module tetris.tetris: 42696 (41051)
The last set of stats above was taken after trimming out a few things I didn't need from 5.3: e.g. I rewrote all the bit32 operations using the new operators, disabled LUA_COMPAT_BITLIB, and stopped loading utf8.
Based on the allocator stats (because that is how much memory actually gets used), and comparing 5.2.3 against the trimmed 5.3.0 build: lua_newstate uses 584 bytes more (+29%), luaL_openlibs uses an extra 376 bytes (+5.9%), and modules increase by between 1% and 15%. Loading the biggest module in my test (tetris, which is 461 LOC as measured by the 'cloc' tool) used 2824 bytes more overall once the baseline increase is included: (42696 + 11392) - (41144 + 10120) = 2824.
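For anyone who wants to check the arithmetic, here is a small sketch that recomputes the deltas from the allocator figures in the 5.2.3 and trimmed 5.3.0 tables. The helper names pct_increase and total_delta are made up for illustration.

```c
#include <assert.h>

/* Percentage increase going from old_bytes to new_bytes. */
static double pct_increase(long old_bytes, long new_bytes) {
    assert(old_bytes > 0);
    return 100.0 * (double)(new_bytes - old_bytes) / (double)old_bytes;
}

/* Extra RAM for a module once the change in baseline is included. */
static long total_delta(long old_module, long old_base,
                        long new_module, long new_base) {
    return (new_module + new_base) - (old_module + old_base);
}
```

Called with the figures above, pct_increase(2032, 2616) gives about 28.7 for lua_newstate, pct_increase(11904, 13632) gives about 14.5 for the input module, and total_delta(41144, 10120, 42696, 11392) gives 2824 for tetris.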
I've no idea why the input module took 15% more RAM; the size and shape of its code isn't much different from, say, the interpreter module.
Not huge increases, but on this board 2824 bytes is unfortunately over 3% of the available memory, so every byte makes a big difference, and I was already at the limit of what would work with 5.2. I hope these figures are of interest (to someone!). If anyone has any suggestions for ways of saving RAM, I'd love to hear them. I'm not using coroutine or most of the debug table, so I can stop loading them fairly easily for a quick win, and I'm already stripping all the modules.
Cheers,
Tom