On Thu, Jun 12, 2003 at 10:47:55PM +0100, Abigail Brady wrote:
> I'm confused though, by the suggestion that a "sensible subset" of
> Unicode excludes Cyrillic and Greek.  Perhaps you meant something else?

I meant "Normalization Form KC" (NFKC) or something like that; see the
text from
<http://www.cl.cam.ac.uk/~mgk25/unicode.html#ucsutf> onward.
Alternatively, perhaps NFKD could be used. But, again, to keep string
processing as simple as possible, IMHO combining characters should
have been encoded in a standard way into the character numbers
themselves instead of being separate "semi-characters". I don't care
if that doesn't fit in 16 bits; UTF-8 is more sensible than UCS*
anyway.
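
To illustrate the normalization forms mentioned above, here is a
minimal sketch, assuming Python's standard unicodedata module. The
variable names are made up for illustration and nothing here comes
from the thread itself. It shows NFKC folding a base letter plus a
combining mark into one code point, NFKD splitting it again, and that
plain Greek and Cyrillic letters are left alone.

```python
import unicodedata

# 'e' followed by U+0301 COMBINING ACUTE ACCENT: two code points, one visible character.
decomposed = "e\u0301"

# NFKC composes the pair into the single precomposed code point U+00E9.
composed = unicodedata.normalize("NFKC", decomposed)

# NFKD goes the other way and splits U+00E9 back into base + combining mark.
redecomposed = unicodedata.normalize("NFKD", composed)

# Greek and Cyrillic letters without marks pass through NFKC unchanged.
plain = "\u03b1\u0436"  # GREEK SMALL LETTER ALPHA, CYRILLIC SMALL LETTER ZHE
plain_nfkc = unicodedata.normalize("NFKC", plain)

# Not every combination has a precomposed form: 'q' + acute stays two code points.
leftover = unicodedata.normalize("NFKC", "q\u0301")

# A code point beyond 16 bits (U+1D11E MUSICAL SYMBOL G CLEF) takes four bytes in UTF-8.
clef_utf8 = "\U0001D11E".encode("utf-8")

print(len(decomposed), len(composed), len(redecomposed))
print(plain_nfkc == plain, len(leftover), len(clef_utf8))
```

The 'q' case is the catch: since Unicode has no precomposed code point
for every base-plus-mark combination, even NFKC text can still contain
combining characters, so "one code point per character" only holds
for a subset of strings.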

-- 
Tuomo