|
The typical answer in Lua is to iterate until you hit nil, thusOn Thu, Aug 18, 2016 at 5:48 PM, ThePhD <jm3689@columbia.edu> wrote:
> Well, I looked into this deeper after everyone gave me the conventional
> wisdom to NOT depend on `lua_next`'s iteration order and pray for no
> breaking changes.
>
> It turns out that the correct wisdom -- iterate using an index value --
> seems to also be the most performant wisdom from the C++ or C side. [1]
> compares the performance of calling `lua_next()` in a while loop and
> assuming it iterates in the right order, pushing a table and calling
> `lua_geti` with the indices (and knowing the size and indices before hand),
> and the other measures the doing a lua_pushinteger + lua_gettable. Measured
> in a computer with an i7 Intel chip, lots of ram, Windows 10, Lua 5.3.3
> compiled as a DLL.
>
> Interestingly, the pushinteger + gettable method is about equal to the
> lua_next implementation. So, for Lua 5.2 and lower, since there's no
> lua_geti intrinsically built into the API, the correct way is equal in
> performance to the UB way.
>
> lua_geti was introduced in Lua 5.3 I believe, and it correctly adds a very
> heavy performance gain for the lookup (sorry LuaJIT and Lua 5.2 users,
> you're not gonna get the benefits here).
>
> My only problem is: how do I know how big the array part is? I could use
> `#t` or fight with its equivalent in the Lua 5.1/2/3 code, which is either
> lua_objlen or lua_rawlength (I can't remember which). Right now I have a
> fixed size at compile-time... I could just keep iterating integer keys from
> 1 to infinity until I hit a snag, but that seems less than useful (and
> doesn't allow me to pre-allocate the size to save performance).
>
> Any suggestions for getting the length, or is lua_objlen/rawlength (or
> whatever it is for each version of Lua) good enough?
>
> [1] - https://dl.dropboxusercontent.com/u/17644642/lua% 20iteration.html
>
> On Thu, Aug 18, 2016 at 2:41 PM, nobody <nobody+lua-list@afra-berlin.de >
> wrote:
>>
>> On 2016-08-18 19:20, Ross Berteig wrote:
>> > Honestly, the cost of writing a loop over integers from 1 to #t is so
>> > low (even in C) that you should just do that rather than assume that an
>> > undocumented behavior will work under all conditions.
>>
>> That reminds me of the "dual" situation that I once found myself in:
>>
>> Question: How do you iterate just the "non-array part" (i.e. all entries
>> excluding those with integer keys 1,2,... up to the first 'nil')?
>> (Assume that the "array part" contains millions of values (already
>> traversed in order by numeric for loop), and the "non-array part"
>> contains optional metadata, i.e. 0 or just a few values, that you now
>> want to inspect.)
>>
>> Answer: You don't. You use two tables, one for the array-like part,
>> one for the metadata, and place one inside the other at a "well-known"
>> key (i.e. one that you can get without calling `next`). The 56 byte
>> cost of the extra table is negligible. (If the array part may be
>> smaller so that the memory matters, put the metadata in an optional
>> field in the array and check whether it exists or is nil.)
>>
>> Non-Answers:
>>
>> * Trying to jump over part of the next()-traversal by swapping the
>> current key passed to next from 1 to #t. Usually works, but can
>> break horribly, for the same reasons previously mentioned in this
>> thread.
>>
>> * Making the keys well-known by arithmetic coding tricks, e.g. store
>> t[k]=v as t[i+0.25] = k, t[i+0.75] = v or just keep a list of keys
>> at t[i+0.5] = k and store t[k] = v normally, for i=1,2,...
>> Really cool but useless hack: costs 64 instead of 32 bytes per
>> entry (=costs more starting at 2 entries), needs own iterator, ...
>>
>
signifying the end of the sequence. Using the length-acquisition
function is either O(n) or O(log n) (I forget which version switched
to O(log n)). If you need nil as a valid element in your array, heaven
help you, because this list will argue until eternity over what the
right way to handle that is.
/s/ Adam