


Note that for octuple-precision (256-bit) IEEE 754 floating point, the minimum buffer size is 82 bytes (including the null terminator, but excluding any optional group separator). With alignment set to 8 bytes (64-bit architectures) or 16 bytes (128-bit architectures), a suitable stack buffer size to display the full precision would be 96 bytes. You could still reduce it to 64 bytes if you accept not displaying the full precision, which would typically be used only for intermediate calculations of aggregates on large, high-precision datasets, such as collections of many measurements. That is still enough for nuclear research today given the existing precision of units (currently around 20 decimal digits for some physical scales; this may evolve soon with the reform of the meter in the SI). The high precision is only needed to collect many quantum measurements at very high frequencies before processing and rounding them, because they have large margins of randomness where the wanted precision is hidden in a lot of quantum noise.
256-bit floating point is already attracting the interest of researchers in AI and in processing "big data" sets that for now are still seen as very chaotic (e.g. in automated financial/trading applications, in meteorological or fluid-mechanics simulations, or in massively parallel applications with lots of users, such as multiplayer online games on commercial game servers, where many users could feed in their own scripts: many small concurrent Lua scripts, changeable and loadable in real time without stopping the server).

Quad precision (128-bit) is already used in high-precision 3D manufacturing to control bots. I think it is already used in radio astronomy for controlling the shape of mirrors, and in the nuclear research industry in accelerators, for simulations, or for research on dark matter and dark energy (I've read an article suggesting its use for large arrays of telescopes); it may already have applications in cryptography, to speed up and secure key generation with more challenging algorithms. Given that it has already been available on consumer markets for several decades, and that 3D rendering of light effects and ray tracing in popular games provide an incentive, the existing 64-bit architectures will support it as part of their native vector instruction extensions.
Cloud computing with giant servers may already use it, but would also benefit from hardware implementations instead of relying on slow, energy-inefficient emulation in software libraries.

If you don't want to extend the stack size (and still don't plan to support octuple precision at its full precision, except by tweaking luaconf.h for these corner-case experimental architectures), using 48 instead of 50 would be just fine.


On Wed, 19 Aug 2020 at 00:50, Philippe Verdy <verdyp@gmail.com> wrote:
The default of 50 would be fine only on 16-bit machines, which have long been out of use. But even in that case, those machines had more limited floating-point precision, and the number type would most probably be compiled only as a 32-bit float (often still requiring software emulation), so a maximum string length of 50 is also overkill. Only on those machines would you want to save some bytes on the stack, so you would reduce 50 to just 32, and it would still be more than large enough for 32-bit floats. "long double" might be supported by (slow) libraries, but the compiler would compute with at most 64 bits, as for "double", for which a 32-byte buffer (including the null terminator) is still enough. As these specialized kinds of small machines have a lot of other limitations, "luaconf.h" would need to be tweaked specially to compile Lua on them (e.g. for small microcontrollers driving simple hardware bots or tools not requiring high-speed control).
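The claim that 32 bytes suffice for a 64-bit double can be checked with a small sketch. It assumes the "%.17g" format (17 significant digits, the most needed to round-trip a double); fmt_double32 and longest_double_len are hypothetical helpers for illustration, not Lua functions.

```c
#include <assert.h>
#include <float.h>
#include <stdio.h>

/* Hypothetical helper: formats x with 17 significant digits into a
   32-byte buffer and returns the number of characters written,
   excluding the null terminator. */
static int fmt_double32(double x, char out[32]) {
    int n = snprintf(out, 32, "%.17g", x);
    assert(n > 0 && n < 32);   /* would fire if 32 bytes were too small */
    return n;
}

/* Length of a worst-case finite double in this format: -DBL_MAX has a
   sign, all 17 digits, and a three-digit exponent. */
static int longest_double_len(void) {
    char buf[32];
    return fmt_double32(-DBL_MAX, buf);
}
```

With this format, -DBL_MAX prints as -1.7976931348623157e+308, which is 24 characters, so 25 bytes with the terminator.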
50 bytes is wrong in all cases. 48 bytes would be better and would still work (the extra 16 bytes on the stack are for the one or two parameters of the function call, and possibly one for the return address or a save slot on the stack for a register that will be used by inlined functions). There would then be no alignment problem on today's 32-bit or 64-bit machines, or on newer 128-bit machines (or existing 64-bit machines like x64 or ARM64 with 128-bit vector extensions that could have hardware support integrated in their ISA).
The value "50" must have been just a rough estimate made for humans and makes no sense; it just adds complications for compilers and possible causes of bugs (or could prevent some optimizations from occurring). Let's remember that Lua very frequently performs conversions between numbers and strings: this should be efficient without extra complication, and all optimizer hints should be usable (notably, compilers could use inlinable intrinsics with more aggressive register allocation and serialized or vector instructions, rather than using the stack and function calls, where they can). This can boost a Lua program quite a lot in frequent hotspots: these number<->string conversions are such hotspots in Lua.


On Wed, 19 Aug 2020 at 00:07, Ranier Vilela <ranier.vf@gmail.com> wrote:


On Tue, 18 Aug 2020 at 18:38, Philippe Verdy <verdyp@gmail.com> wrote:
could this be related to
/* maximum length of the conversion of a number to a string */
#define MAXNUMBER2STR   50
where the string is allocated on the stack as an array of bytes whose size (including the null terminator) is not a multiple of the word size? Could that be causing some internal bug in the stack slot allocator in GCC 10.1?
Note that "void luaO_tostring" is the only function where it is allocated this way. This may cause issues when this function is inlined (probably alignment problems).

Maybe this is solved by just making this a multiple of 8 bytes (64-bit architectures) or 16 bytes (128-bit architectures).
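As a minimal sketch of the rounding suggested here, a size can be rounded up to the next multiple of the alignment. The names ROUND_UP and aligned_buffer_size are hypothetical and are not part of Lua's sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro: round n up to the next multiple of a (a > 0). */
#define ROUND_UP(n, a)  ((((n) + (a) - 1) / (a)) * (a))

/* Hypothetical helper: smallest multiple of `alignment` that can hold
   `needed` bytes. */
static size_t aligned_buffer_size(size_t needed, size_t alignment) {
    assert(alignment > 0);
    return ROUND_UP(needed, alignment);
}
```

For the current value of 50 this gives 56 with 8-byte alignment and 64 with 16-byte alignment; a value that is already a multiple, such as 48, is left unchanged.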
However, how could this generate, even on a 64-bit architecture, a numeric string that is 49 bytes long plus a null?
Maybe the bit size of the number type could be asserted in order to define the length needed for the mantissa, the exponent, the signs and the dot. If this is too complicated, why not just align 50 to the next multiple of 8 or 16, i.e. set it to 56 or 64?
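One way to read the first suggestion is to derive the bound from the parameters of the floating-point format rather than hard-coding it. The following is a minimal sketch, assuming numbers are printed in C's "%.<n>g" style; the helper names are hypothetical, and this is not how Lua's sources compute the constant.

```c
#include <assert.h>

/* Hypothetical helper: number of decimal digits in a positive int. */
static int ndigits(int v) {
    int n = 1;
    assert(v > 0);
    while (v >= 10) { v /= 10; n++; }
    return n;
}

/* Worst-case size in bytes, including the null terminator, of a number
   printed with sig_digits significant digits in scientific form:
   sign, digits, '.', 'e', exponent sign, exponent digits, '\0'.
   max_10_exp is the largest decimal exponent magnitude of the format
   and is assumed to be at least 10 (two exponent digits). */
static int num2str_bound(int sig_digits, int max_10_exp) {
    return 1 + sig_digits + 1 + 1 + 1 + ndigits(max_10_exp) + 1;
}
```

Under these assumptions, a 64-bit double with 17 significant digits and a three-digit exponent needs 25 bytes, and a 128-bit quad format with 36 significant digits and a four-digit exponent needs 45, so both fit in 48 bytes.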
#define MAXNUMBER2STR   56
I agree. Could be 64.
This can avoid mistakes in address calculations, mainly with aggressive optimizations.
Everything is a power of two; it makes no sense to use decimal values, which are for humans, not machines.

regards,
Ranier Vilela