- Subject: Re: Memory usage stats for 5.2 vs 5.3
- From: Roberto Ierusalimschy <roberto@...>
- Date: Tue, 13 Jan 2015 16:09:53 -0200
> I meant that lua_newstate seemed to be creating a lot of strings,
> before luaL_openlibs. Thinking about it some more, that'll be for the
> reserved words in the lexer, and now of course there are a few more of
> those with the new bitops.
Besides reserved words, Lua also creates the metamethod names (__add,
etc.). In all, there are about 50 strings (22 reserved words + 24
metamethods + _ENV + "not enough memory") in bare Lua.
> Taking into account the alignment of a conventional 32-bit CPU (like
> the ARM), it increases the overhead of TStrings from 16 bytes to
> 24, which is a quite considerable increase if you have a lot of
> short strings like variable names. Generally I'm all in favour of
> simplification (setfenv being replaced by _ENV was supremely elegant
> and is my go-to example of good language evolution) but the cost in
> memory here might be a bit too high. In the test case I mentioned
> where it requires the tetris module, that one change uses over 2%
> of available RAM before the module has even set up most of its data
> structures.
>
> Out of interest, why does string data need to be strongly aligned? I
> can't think of a situation offhand where it would ever be needed.
Because you can store binary data in strings. E.g., you could copy
an entire struct into a string and then type-cast the string back
to the struct. But of course the idea is not to waste space
on that.
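
To illustrate the alignment point, here is a minimal sketch in C11. It
is not the actual Lua code: Header, Point, new_blob, and free_blob are
hypothetical names made up for this example. The header is padded to
the maximum alignment so that the bytes stored right after it can be
cast back to a struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical binary record an application might store in a string. */
typedef struct Point {
    double x;
    double y;
    int32_t tag;
} Point;

/* Hypothetical string header. The max_align_t member pads the header
   so that the body following it is aligned for any basic type. */
typedef union Header {
    max_align_t align_;
    struct {
        size_t len;
        unsigned int hash;
    } h;
} Header;

/* Allocate header + body in one block and copy the bytes into the body.
   Returns a pointer to the body, or NULL if allocation fails. */
static char *new_blob(const void *data, size_t len) {
    Header *hd = malloc(sizeof(Header) + len);
    if (hd == NULL) return NULL;
    hd->h.len = len;
    hd->h.hash = 0;  /* hashing is omitted in this sketch */
    char *body = (char *)(hd + 1);
    memcpy(body, data, len);
    return body;
}

/* Release a body pointer previously returned by new_blob. */
static void free_blob(char *body) {
    if (body != NULL) free((Header *)body - 1);
}
```

With this layout, a caller can copy a Point in with new_blob(&p, sizeof p)
and read it back through (const Point *)body. If the header were not
padded, the body could start at an address that is misaligned for the
double members.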
> I was wondering if there was an optimisation that could be applied
> for when Lua strings are constructed from string literals? Since Lua
> strings are immutable anyway, and C string literals cannot ever mutate
> or go out of scope (in any implementation I can think of) copying them
> into RAM seems a bit wasteful. There's even a lua_pushliteral API
> already, isn't there, although it is a simple macro wrapper at present?
For small strings, we want to intern them so that we can use
simple pointer equality when comparing strings (good for fast
hashing). Moreover, the internal string would need 4 bytes anyway (for a
pointer to the external string), so the gains would not be too big. For
large strings, however, we are considering a good API to allow the use
of external strings.
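
As a rough illustration of why interning gives pointer equality, here
is a small sketch in C. It is not Lua's string table: Interned,
InternTable, intern, and intern_free are hypothetical names, the
bucket count is arbitrary, and the hash is plain FNV-1a.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define INTERN_BUCKETS 64  /* arbitrary size for this sketch */

typedef struct Interned {
    struct Interned *next;  /* chain of entries in the same bucket */
    size_t len;
    char data[];            /* string bytes plus a trailing '\0' */
} Interned;

typedef struct InternTable {
    Interned *buckets[INTERN_BUCKETS];
} InternTable;

/* FNV-1a hash over the string bytes. */
static unsigned int hash_bytes(const char *s, size_t len) {
    unsigned int h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)s[i];
        h *= 16777619u;
    }
    return h;
}

/* Return the unique entry for these bytes, creating it on first use.
   Returns NULL only if allocation fails. */
static const Interned *intern(InternTable *t, const char *s, size_t len) {
    Interned **slot = &t->buckets[hash_bytes(s, len) % INTERN_BUCKETS];
    for (Interned *e = *slot; e != NULL; e = e->next) {
        if (e->len == len && memcmp(e->data, s, len) == 0)
            return e;  /* already interned: reuse the same object */
    }
    Interned *e = malloc(sizeof *e + len + 1);
    if (e == NULL) return NULL;
    e->len = len;
    memcpy(e->data, s, len);
    e->data[len] = '\0';
    e->next = *slot;
    *slot = e;
    return e;
}

/* Free every entry and reset the table. */
static void intern_free(InternTable *t) {
    for (size_t i = 0; i < INTERN_BUCKETS; i++) {
        Interned *e = t->buckets[i];
        while (e != NULL) {
            Interned *next = e->next;
            free(e);
            e = next;
        }
        t->buckets[i] = NULL;
    }
}
```

Two calls to intern with equal bytes return the same pointer, so later
comparisons are a single a == b test instead of a memcmp. The price is
that the bytes are always copied into the table, which is the cost
being discussed for string literals.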
> Back of an envelope calculation suggests there are about 100 reserved
> words and variable names in the standard libraries,
More like 200, for variable names alone.
-- Roberto