- Subject: Re: unicode support in lua
- From: Klaus Ripke <paul-lua@...>
- Date: Thu, 26 Apr 2007 14:17:17 +0200
hi
On Thu, Apr 26, 2007 at 01:47:02PM +0200, David Kastrup wrote:
> The only documentation I have been able to find is in "unitest", and
> it is very, very sketchy.
or let's say terse ;)
"UTF-8 operates on UTF-8 sequences as of RFC 3629".
Even "format ... uses character counts for precision in %s".
The grapheme module counts grapheme clusters.
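To make the byte count vs. character count distinction concrete, here is a small sketch in plain Lua. It does not use the unicode module at all: it only shows that the stock string.format counts bytes for the %s precision, and what a character-counting cut looks like instead. The helper name utf8_prefix is made up for this example and is not part of the module; it also assumes well-formed UTF-8 input.

```lua
-- "grün": the u-umlaut is encoded as the two bytes C3 BC (195 188)
local s = "gr\195\188n"

-- Stock string.format counts BYTES for the precision, so a precision
-- of 3 cuts the two-byte character in half.
local cut = string.format("%.3s", s)   -- "gr\195", an incomplete sequence

-- Hypothetical helper: keep the first nchars characters by counting
-- only bytes that start a UTF-8 sequence (anything except 128..191).
local function utf8_prefix(str, nchars)
  local count = 0
  for i = 1, #str do
    local b = string.byte(str, i)
    if b < 128 or b >= 192 then
      count = count + 1
      if count > nchars then
        return string.sub(str, 1, i - 1)
      end
    end
  end
  return str
end

local three_chars = utf8_prefix(s, 3)  -- "gr\195\188", i.e. "grü"
```

A character-counting precision, as quoted above for the module's format, would give the second result rather than the first.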
> -- NOTE: find positions are in bytes for all ctypes!
> -- use ascii.sub to cut found ranges!
Right: the fact that utf8.find _returns_ byte positions gets a special note
precisely because utf8.sub does NOT work with byte counts, so found ranges
have to be cut with ascii.sub.
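The same pitfall can be sketched with the standard string library alone, assuming nothing about the unicode module beyond what the note says. string.find reports byte positions, a byte-based sub cuts the found range correctly, and a byte position has to be converted before it can be used as a character index. The helper char_index is hypothetical and only meant as an illustration; it assumes well-formed UTF-8.

```lua
-- "grün blau": the u-umlaut occupies bytes 3 and 4
local s = "gr\195\188n blau"

-- Plain find, byte positions: "blau" starts at byte 7, ends at byte 10.
local first, last = string.find(s, "blau", 1, true)

-- Cutting the found range with a byte-based sub gives the match back.
local by_bytes = string.sub(s, first, last)   -- "blau"

-- Hypothetical helper: turn a byte position into a character index by
-- counting the sequence-start bytes up to and including that position.
local function char_index(str, byte_pos)
  local n = 0
  for i = 1, byte_pos do
    local b = string.byte(str, i)
    if b < 128 or b >= 192 then
      n = n + 1
    end
  end
  return n
end

local first_char = char_index(s, first)       -- 6: "blau" is the 6th character
```

Feeding the byte positions 7 and 10 into a character-based sub would be off by one here, and by more with every further multi-byte character before the match.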
> It does not exactly sound like character-based indexing to me.
Sorry if this is confusing.
It would be great if somebody wrote some serious documentation.
Until then, a quick look at the test cases reveals not only what
the module is supposed to do, but what it actually does.
cheers
Klaus