Re: Of Unicode in the next Lua version

lua-l archive

Subject: Re: Of Unicode in the next Lua version
From: Dirk Laurie <dirk.laurie@...>
Date: Wed, 12 Jun 2013 17:02:09 +0200

2013/6/12 Pierre-Yves Gérardy <pygy79@gmail.com>:
> I just read Roberto's slides from the 2012 Lua workshop, and I have a
> suggestion for the UTF-8 library.
>
> It is efficient, and often practical, to deal with byte indices, even
> in Unicode strings. It is the approach taken by Julia, and I use it in
> LuLPeg. The API is simple:
>
>     char, next_pos = getchar(subject, position)
>
>     S = "∂ƒ"
>     getchar(S, 1) --> '∂', 4
>     getchar(S, 4) --> 'ƒ', 6
>     getchar(S, 6) --> nil, nil
>
> A similar function could return code points instead of strings.
>
> What do you think about this?

If the position is returned before the character, one can write an
iterator on the model of `ipairs`:

    for pos,char in utf8(str) do ...
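
A rough sketch of how the two ideas could fit together, in plain Lua. The
names `seqlen` and `utf8_chars` are made up for illustration, and `getchar`
is only my guess at the behaviour described in the quoted message, not
LuLPeg's actual code. It assumes well-formed UTF-8 and does not validate
continuation bytes. Unlike `ipairs`, it keeps its state in a closure rather
than in the control variable.

```lua
-- Hypothetical helper: byte length of the UTF-8 sequence that starts
-- with lead byte b, or nil if b cannot start a sequence.
local function seqlen(b)
  if b < 0x80 then return 1
  elseif b >= 0xC2 and b < 0xE0 then return 2
  elseif b >= 0xE0 and b < 0xF0 then return 3
  elseif b >= 0xF0 and b < 0xF5 then return 4
  end
  return nil
end

-- getchar as in the quoted proposal: returns the character starting at
-- byte index pos and the byte index of the next character, or nil, nil
-- once pos is past the end of the string.
local function getchar(s, pos)
  local b = s:byte(pos)
  if not b then return nil, nil end
  local n = seqlen(b)
  if not n or pos + n - 1 > #s then
    error("invalid UTF-8 at byte " .. pos)
  end
  return s:sub(pos, pos + n - 1), pos + n
end

-- Iterator that yields the position first, then the character.
local function utf8_chars(s)
  local nextpos = 1
  return function()
    local pos = nextpos
    local char, after = getchar(s, pos)
    if not char then return nil end
    nextpos = after
    return pos, char
  end
end

-- "\226\136\130\198\146" is the byte sequence of "∂ƒ" from the example above.
for pos, char in utf8_chars("\226\136\130\198\146") do
  print(pos, #char)  -- byte position and byte length of each character
end
```

With the position first, the loop variable that the `for` statement tests
against nil is never a legitimate false value, which is the same
arrangement `ipairs` uses with its index.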
