Hi list,
I noticed that OP_CONCAT and LoadString (in lundump.c) both call memcpy() twice if the resulting string is not already interned. Since the result length is known early, the long-string case can quite easily be optimized to copy the data just once. This seems to cut the time to concatenate two 25-byte strings by around 12%, and probably more for longer strings. I see no reproducible regression in the short-string case.
$ cat testcase.lua
local a = string.rep('A', 25)
local b = string.rep('B', 25)
for i = 1, 1e8 do local c = a..b end
$ time ./lua-5.3.1 testcase.lua
real 0m8.262s
user 0m8.105s
sys 0m0.006s
$ time ./lua-patched testcase.lua
real 0m7.282s
user 0m7.205s
sys 0m0.011s
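To make the idea concrete, here is a minimal standalone sketch of the two strategies. It is not the patch itself and does not use Lua's internals: the Str type and the str_alloc, concat_two_copies and concat_one_copy functions are hypothetical names made up for illustration, and interning, GC and error handling are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a long-string object: a length plus inline bytes. */
typedef struct {
    size_t len;
    char data[];              /* holds len bytes followed by a NUL */
} Str;

/* Allocate a string object for `len` bytes; the payload is left unfilled. */
static Str *str_alloc(size_t len) {
    assert(len < SIZE_MAX - sizeof(Str));   /* guard against size overflow */
    Str *s = malloc(sizeof(Str) + len + 1);
    if (s == NULL) return NULL;
    s->len = len;
    s->data[len] = '\0';
    return s;
}

/* Two-copy path: gather the pieces in a scratch buffer first,
   then copy the whole result into the freshly allocated object. */
static Str *concat_two_copies(const char *a, size_t la,
                              const char *b, size_t lb) {
    char *scratch = malloc(la + lb + 1);
    if (scratch == NULL) return NULL;
    memcpy(scratch, a, la);                 /* first copy of each byte */
    memcpy(scratch + la, b, lb);
    Str *s = str_alloc(la + lb);
    if (s != NULL)
        memcpy(s->data, scratch, la + lb);  /* second copy of each byte */
    free(scratch);
    return s;
}

/* One-copy path: the total length is known up front, so allocate the
   object first and write each piece straight into its final place. */
static Str *concat_one_copy(const char *a, size_t la,
                            const char *b, size_t lb) {
    Str *s = str_alloc(la + lb);
    if (s == NULL) return NULL;
    memcpy(s->data, a, la);                 /* the only copy of each byte */
    memcpy(s->data + la, b, lb);
    return s;
}
```

Both functions return the same bytes; the second one just skips the intermediate buffer. This only applies when the result does not have to be looked up in the string table first, which is why the short-string path is left alone.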
Could anyone please verify the correctness of the patch or run it against a real test suite? Thanks!