I think this trap makes it a real burden to write a correct JSON parser
in Lua. The JSON parser I wrote (lunajson) fails in this corner case,
and I expect most other parsers fail as well.

I will update lunajson to avoid the issue by checking the number of
digits in the number token and by taking special care with large
integers.
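
A minimal sketch of what I have in mind, assuming Lua 5.3 or later built
with 64-bit integers and a token that has already been validated as a
JSON number. The name json_number and the two limit strings are
illustrative only; this is not lunajson's actual code.

```lua
-- Hypothetical helper, not taken from lunajson. Assumes 64-bit integers.
local MAX_DIGITS = "9223372036854775807"  -- digits of math.maxinteger
local MIN_DIGITS = "9223372036854775808"  -- magnitude of math.mininteger

local function json_number(token)
  -- Integer syntax only: an optional minus followed by digits.
  local sign, digits = token:match("^(%-?)(%d+)$")
  if not digits then
    -- A fraction or exponent is present, so the result is a float anyway.
    return tonumber(token)
  end

  -- The sign is handled together with the digits, so the most negative
  -- integer is accepted even though its magnitude alone does not fit.
  local limit = (sign == "-") and MIN_DIGITS or MAX_DIGITS

  -- Fewer digits than the limit always fits. With the same number of
  -- digits, comparing the strings is a numeric comparison.
  if #digits < #limit or (#digits == #limit and digits <= limit) then
    return tonumber(token)        -- fits: stays an integer
  end

  -- Out of range: force a float conversion explicitly.
  return tonumber(token .. ".0")
end

print(math.type(json_number("-9223372036854775808")))
print(math.type(json_number("9223372036854775808")))
```

Comparing digit strings avoids relying on what tonumber does with an
out-of-range integer string, and appending ".0" makes the float
conversion explicit.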

P.S. I'd also like to update lunajson to use Travis CI for testing. I
have been busy writing conference papers and grant applications...

On 05/19/2016 02:28 AM, Roberto Ierusalimschy wrote:
>> It was thus said that the Great Roberto Ierusalimschy once stated:
>>>
>>> One problem with all proposed solutions is math.mininteger. The
>>> lexer reads numbers without a sign. (Several languages do the
>>> same, e.g. C).  This means that -9223372036854775808 is read as
>>> -(9223372036854775808) (a unary minus applied over a positive
>>> constant). But 9223372036854775808 is not a valid integer, so it would
>>> result in a float (or nil, or inf, or error). Whatever the choice, the
>>> final value of -9223372036854775808 would not be an integer, despite
>>> it being a valid integer.
>>
>>   I haven't looked at the code, but couldn't the lexer read the minus sign
>> and set a flag, then read in the number as an unsigned quantity and, as long
>> as it's not a float (no decimal point, no 'e' or 'E', etc.), check
>> against LLONG_MIN and LLONG_MAX? 9223372036854775808 *is* representable as
>> an unsigned quantity.
> 
> The lexer cannot distinguish between a "-9223372036854775808" in
> "x = -9223372036854775808" and in "x = x-9223372036854775808". It
> handles both the same way, so the only way is a minus and then
> a constant. (As I said, this is a common thing in programming
> languages; C and Java, for instance, have this same rule.) There
> are workarounds, but they are workarounds... Java, for instance,
> has an explicit provision for that case:
> 
>   It is a compile-time error if a decimal literal of type int is larger
>   than 2147483648 (2^31), or if the decimal literal 2147483648 appears
>   anywhere other than as the operand of the unary minus operator
> 
> Moreover, I don't think unsigned quantities solve the problem that
> people are trying to solve here (unexpected results from long
> integers). People who don't understand what is going on don't
> understand unsigned quantities, either.
> 
> (I also think that people who don't understand what is going on do not
> write numbers like 10000000000000000000 in configuration files, so the
> real problem seems more nit-picking than anything else to me, but I
> may be wrong.)
> 
> -- Roberto
>
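
For reference, the behaviour described in the quoted message can be
checked with a few lines. This sketch assumes Lua 5.4 semantics (a
decimal integer numeral that overflows denotes a float) and 64-bit
integers; older 5.3 releases may behave differently. The variable names
are mine.

```lua
-- The lexer sees a unary minus applied to 9223372036854775808. That
-- constant does not fit in a 64-bit integer, so under the stated
-- assumption it is read as a float before the minus is applied.
local literal = -9223372036854775808

-- Workaround in the style of C's usual LLONG_MIN definition: every
-- constant stays in range, and the subtraction yields the minimum.
local workaround = -9223372036854775807 - 1

print(math.type(literal))                          -- subtype of the literal
print(math.type(workaround))                       -- integer
print(workaround == math.mininteger)               -- true
print(literal == math.mininteger)                  -- equal in value
print(math.tointeger(literal) == math.mininteger)  -- exact conversion back
```

The value itself survives, since -2^63 is exactly representable as a
float; what is lost is the integer subtype, which matters to anything
that inspects math.type.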