Re: utf8 library may cause heap corruption


Subject: Re: utf8 library may cause heap corruption
From: Kim Alvefur <zash@...>
Date: Thu, 9 Feb 2017 15:08:03 +0100

On Thu, Feb 09, 2017 at 02:30:39PM +0200, Dirk Laurie wrote:
> 2017-02-09 14:05 GMT+02:00 云风 Cloud Wu <cloudwu@gmail.com>:
> 
> > But there is another problem.
> >
> > local s = "\xE4\xBA"
> > assert(utf8.len(s, 1, 2) == utf8.len(s .. "\x91",1,2)) -- failed
> 
> Why is this a problem? It should fail. s is not a valid UTF8 codepoint
> ("\xE4" promises three bytes, but there are only two). When you
> supply the extra byte, there is one valid codepoint, starting between
> characters 1 and 2.

The manual says:
> Returns the number of UTF-8 characters in string s that **start**
> between positions i and j (both inclusive).

Extra emphasis on **start**. The 3-byte sequence does start within the
range given, so it is counted even though it ends after position j.
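
A minimal sketch of the two cases, assuming Lua 5.3 or later with the
standard utf8 library (the variable names are only for illustration):

```lua
-- "\xE4\xBA\x91" is one complete 3-byte UTF-8 sequence.
local truncated = "\xE4\xBA"          -- lead byte promises 3 bytes, only 2 present
local complete  = truncated .. "\x91" -- now a full 3-byte character

-- The character starts at byte 1, which lies inside [1, 2], so it is
-- counted even though its last byte is at position 3.
local n_complete = utf8.len(complete, 1, 2)

-- The truncated string is invalid, so utf8.len reports failure plus the
-- position of the first invalid byte instead of a count.
local n_truncated, bad_pos = utf8.len(truncated, 1, 2)

print(n_complete, n_truncated, bad_pos)
```

So the two calls in the assert are expected to differ: one returns a
count, the other reports an invalid sequence.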


-- 
Zash


