David Kastrup wrote:
> Apropos: could someone clue me in what the proposed way of operation is for dealing with utf-8 strings? When one is using lua as an embedded interpreter, having efficient strings with a natural Unicode character type (internally represented with utf-8) would save a lot of headaches.

When working on games that were localized into Far Eastern languages, LuaPlus offered us a 16-bit wide character string type that met our needs nicely. I do not claim it is UCS- or UTF-compatible, but it was sufficient for our localization work. It let us deal with strings in the following forms:

    str = L"\x30A0\x30A1\x30A2" -- Katakana letters
    str2 = L"\x30A3\x30A4\x30A5" -- Katakana letters
    str = str .. str2 -- Concat works
    print(str) -- Works

The entire Lua string library was also available as a 16-bit wstring library. File reading understands the 16-bit format, too. We'd use it to read "Unicode" .csv files from Excel.
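For comparison with the UTF-8 question, here is a rough sketch of how the same code points could be carried in ordinary 8-bit Lua strings without the LuaPlus wide string type. The helpers `utf8_encode` and `from_codepoints` are hypothetical names made up for this illustration; they are not part of LuaPlus or stock Lua. The sketch only covers code points up to 0xFFFF and does no validation.

```lua
-- Hypothetical helper (not part of LuaPlus or stock Lua): encode one
-- code point in the range 0..0xFFFF as a UTF-8 byte string.
local function utf8_encode(cp)
  if cp < 0x80 then
    return string.char(cp)
  elseif cp < 0x800 then
    return string.char(0xC0 + math.floor(cp / 0x40),
                       0x80 + cp % 0x40)
  else
    return string.char(0xE0 + math.floor(cp / 0x1000),
                       0x80 + math.floor(cp / 0x40) % 0x40,
                       0x80 + cp % 0x40)
  end
end

-- Hypothetical helper: build a plain Lua string from a list of code points.
local function from_codepoints(...)
  local parts = {}
  for i = 1, select("#", ...) do
    parts[i] = utf8_encode((select(i, ...)))
  end
  return table.concat(parts)
end

local str  = from_codepoints(0x30A0, 0x30A1, 0x30A2)
local str2 = from_codepoints(0x30A3, 0x30A4, 0x30A5)
str = str .. str2   -- concatenation is plain byte concatenation
print(str)          -- readable only if the terminal expects UTF-8
print(#str)         -- length in bytes, not in characters
```

Concatenation and printing work unchanged on such strings, but `#str` and the rest of the string library count bytes rather than characters, which is the main difference from the 16-bit approach.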
In addition, LuaPlus could read its input files as 16-bit entities. It didn't allow 16-bit identifiers in Lua, but it did let you put the Katakana letters (or whatever) directly in strings rather than mapping them through a hexadecimal equivalent.
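To give an idea of what reading 16-bit data involves, here is a minimal sketch in stock Lua. It is not how LuaPlus does it internally. It assumes the data is little-endian with an optional FF FE byte-order mark, and `utf16le_units` is a hypothetical helper name used only for this example.

```lua
-- Hypothetical helper: split a little-endian 16-bit byte string into a
-- table of code units, skipping a leading byte-order mark (FF FE) if present.
local function utf16le_units(bytes)
  local units = {}
  local i = 1
  if bytes:sub(1, 2) == "\255\254" then
    i = 3
  end
  while i + 1 <= #bytes do
    local lo, hi = bytes:byte(i, i + 1)
    units[#units + 1] = lo + hi * 256
    i = i + 2
  end
  return units
end

-- In practice the bytes would come from a file opened in binary mode;
-- a literal is used here so the sketch runs without one.
local sample = "\255\254" .. "\160\48" .. "\161\48" .. "\44\0"
local units = utf16le_units(sample)
for _, u in ipairs(units) do
  print(string.format("U+%04X", u))
end
```

The result is a table of numbers rather than a string, so anything like concatenation or pattern matching has to be built on top, which is the work the wstring library saved us.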
Josh