- Subject: Re: utf8.codes ignores spurious continuation bytes
- From: Christian Ludwig <cl@...>
- Date: Mon, 19 Sep 2022 20:32:54 +0200
Hello Roberto Ierusalimschy,
> [...]
> >
> > Any spurious continuation bytes are ignored in utf8.codes.
> > https://www.lua.org/manual/5.4/manual.html#pdf-utf8.codes
> > says: "It raises an error if it meets any invalid byte sequence."
> >
> [...]
> Thanks for the feedback. This is a bug.
>
> Note that the fix you propose doesn't work, because then the iterator
> will not return to the program the position of the character being
> traversed, but the position of the next one. The loop is there to
> go to the next character after each iteration. However, as you mentioned,
> it can skip more than intended.
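To illustrate the point in the quoted reply, here is a minimal, language-neutral sketch in Python. It is not the Lua C source and not the actual fix. The names `decode_at`, `codes_lenient` and `codes_strict` are hypothetical. `codes_lenient` is only a rough model of the behaviour described above (advance past the current byte, then skip every continuation byte that follows), while `codes_strict` advances by the decoded length, so it still reports the position of the character being traversed but rejects a stray continuation byte.

```python
from typing import Iterator, Tuple


def decode_at(data: bytes, i: int) -> Tuple[int, int]:
    """Decode one UTF-8 sequence starting at index i; return (code point, length)."""
    first = data[i]
    if first < 0x80:
        return first, 1
    if 0xC0 <= first < 0xE0:
        length, code, minimum = 2, first & 0x1F, 0x80
    elif 0xE0 <= first < 0xF0:
        length, code, minimum = 3, first & 0x0F, 0x800
    elif 0xF0 <= first < 0xF8:
        length, code, minimum = 4, first & 0x07, 0x10000
    else:
        # 0x80..0xBF is a continuation byte without a lead byte;
        # 0xF8 and above never start a valid sequence.
        raise ValueError(f"invalid UTF-8 byte at position {i + 1}")
    if i + length > len(data):
        raise ValueError(f"truncated UTF-8 sequence at position {i + 1}")
    for b in data[i + 1:i + length]:
        if b & 0xC0 != 0x80:
            raise ValueError(f"invalid UTF-8 sequence at position {i + 1}")
        code = (code << 6) | (b & 0x3F)
    # Reject overlong encodings, surrogates and out-of-range values.
    if code < minimum or code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
        raise ValueError(f"invalid UTF-8 sequence at position {i + 1}")
    return code, length


def codes_lenient(data: bytes) -> Iterator[Tuple[int, int]]:
    """Rough model of the reported behaviour: extra continuation bytes get skipped."""
    i = 0
    while i < len(data):
        code, _ = decode_at(data, i)
        yield i + 1, code  # 1-based position of the current character
        i += 1  # skip the current byte ...
        while i < len(data) and data[i] & 0xC0 == 0x80:
            i += 1  # ... and every continuation byte after it


def codes_strict(data: bytes) -> Iterator[Tuple[int, int]]:
    """Advance by the decoded length, so a stray continuation byte is an error."""
    i = 0
    while i < len(data):
        code, length = decode_at(data, i)
        yield i + 1, code  # still the position of the current character
        i += length


if __name__ == "__main__":
    sample = b"a\x80b"  # 0x80 is a continuation byte with no lead byte
    print(list(codes_lenient(sample)))  # the stray byte is silently skipped
    try:
        print(list(codes_strict(sample)))
    except ValueError as err:
        print("error:", err)
```

Both iterators yield the 1-based byte position of the character just decoded. They differ only in how they move on to the next character.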
Thank you for taking a look at this, and thank you to everyone who
made UTF-8 support in Lua possible.
Bye
C. Ludwig