- Subject: RE: lua for unicode
- From: james@...
- Date: Mon, 2 Dec 2002 18:53:39 -0600
It would take a bit more work than just replacing the narrow-character (char) string calls with wide-character calls. There are some places where strings are iterated across one char at a time. If you search for strcoll, strncpy, strcpy, strcmp, etc., you'll find most of these. Some string constants would also need the L"" prefix.
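To make the point above concrete, here is a minimal sketch in C of what such a conversion touches. The function names (count_char, count_wchar, is_keyword_end, is_keyword_end_w) are made up for illustration and are not taken from the Lua source; only strcmp and wcscmp are real library calls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Narrow version: walks the string one char (one byte) at a time. */
static size_t count_char(const char *s, char c) {
    size_t n = 0;
    assert(s != NULL);
    for (; *s != '\0'; s++)
        if (*s == c) n++;
    return n;
}

/* Wide version: the loop body is the same, but the element type,
   the terminator, and every literal passed in have to change. */
static size_t count_wchar(const wchar_t *s, wchar_t c) {
    size_t n = 0;
    assert(s != NULL);
    for (; *s != L'\0'; s++)
        if (*s == c) n++;
    return n;
}

/* Hypothetical keyword check: swapping strcmp for wcscmp is not
   enough, the string constant also needs the L prefix. */
static int is_keyword_end(const char *s) {
    assert(s != NULL);
    return strcmp(s, "end") == 0;
}

static int is_keyword_end_w(const wchar_t *s) {
    assert(s != NULL);
    return wcscmp(s, L"end") == 0;
}
```

A search for the str* calls would find is_keyword_end, but a hand-written loop like count_char contains no library call at all, so it has to be found and converted by reading the code.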
Another issue is the parser: it's not designed to handle Unicode files, so your source files would have to be in ASCII. This could present some issues on systems that don't support ASCII files, or on operating systems that save multilingual files in Unicode. It also doesn't seem to support UTF-8 files. I tried a simple test (this won't show up right unless you have an HTML mail reader with a Japanese font installed):
-- 建築
print( "建築" )
建築 = {}
print( 建築 )
Which is essentially the same as:
-- test
print( "11" )
aa = { }
print( aa )
But with Japanese text stored as UTF-8 in the file. The Lua parser dies on the assignment statement.
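One plausible explanation, which I have not confirmed against the Lua source, is that a byte-oriented lexer classifies identifier characters with isalpha/isalnum. In the default "C" locale those return true only for ASCII letters and digits, while every byte of a UTF-8 encoded Japanese character is 0x80 or above. The sketch below uses hypothetical helpers (is_ident_start, first_bad_ident_byte); it is not Lua's actual lexer.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical identifier-start test, in the style of a
   byte-oriented lexer. Assumes the default "C" locale. */
static int is_ident_start(unsigned char c) {
    return isalpha(c) || c == '_';
}

/* Returns the index of the first byte that cannot be part of an
   identifier, or -1 if the whole string is a valid identifier.
   An empty string is reported as bad at index 0. */
static int first_bad_ident_byte(const char *s) {
    size_t i;
    assert(s != NULL);
    if (s[0] == '\0') return 0;
    for (i = 0; s[i] != '\0'; i++) {
        unsigned char c = (unsigned char)s[i];
        int ok = (i == 0) ? is_ident_start(c)
                          : (isalnum(c) || c == '_');
        if (!ok) return (int)i;
    }
    return -1;
}

/* The two characters from the test (U+5EFA U+7BC9) encoded as
   UTF-8: E5 BB BA E7 AF 89. */
static const char kenchiku_utf8[] = "\xE5\xBB\xBA\xE7\xAF\x89";
```

Under these assumptions first_bad_ident_byte("aa") accepts the name, while the UTF-8 name is rejected at its very first byte. That would also be consistent with the print( "建築" ) line getting through, since bytes inside a quoted string are not classified this way.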
UTF-8 also makes things tough on the developer. Say, for example, a Japanese user wants to enter Japanese characters in their print statements, function names, etc. When these show up or are referenced by name, both sides have to be in the same encoding, or things won't work right.
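As a small illustration of the encoding mismatch, the sketch below compares the same two characters stored as UTF-8 and as UTF-16LE. The helper same_bytes and the array names are made up for this example. The point is only that a byte-wise lookup, which is roughly what a name comparison comes down to, sees two unrelated strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The same name (U+5EFA U+7BC9) in two encodings. */
static const unsigned char name_utf8[]    = { 0xE5, 0xBB, 0xBA,
                                              0xE7, 0xAF, 0x89 };
static const unsigned char name_utf16le[] = { 0xFA, 0x5E,
                                              0xC9, 0x7B };

/* Byte-wise equality with explicit lengths. strcmp is avoided
   because UTF-16 data can contain zero bytes. */
static int same_bytes(const unsigned char *a, size_t alen,
                      const unsigned char *b, size_t blen) {
    assert(a != NULL && b != NULL);
    return alen == blen && memcmp(a, b, alen) == 0;
}
```

Comparing name_utf8 with itself succeeds, but comparing it with name_utf16le fails, even though both would display as the same text. A script saved in one encoding that refers to a name registered in the other would simply not find it.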
All in all, not an easy problem to solve.
Regards,
Jim
> -----Original Message-----
> From: owner-lua-l@tecgraf.puc-rio.br
> [mailto:owner-lua-l@tecgraf.puc-rio.br] On Behalf Of Scott Morgan
> Sent: Monday, December 02, 2002 3:43 PM
> To: Multiple recipients of list
> Subject: Re: lua for unicode
>
> lua+Steven.Murdoch@cl.cam.ac.uk wrote:
> > Initially Unicode was limited to 2^16 positions (65,536), but this was found
> > to be inadequate. The first 2^16 characters of Unicode are known as the Basic
> > Multilingual Plane (BMP) and is intended be enough to represent all living
> > languages, however as other messages have suggested it does not contain
> > historical characters. This space is not yet full so there may be further
> > characters added in the future.
>
> What issues would be involved getting lua to natively use the MS Windows
> 16-bit unicode effort? I know its a little (well quite a lot :) )
> against the spirit of lua, but it seems like a good quick fix just to go
> through the lua source and replace all the str* function calls with
> win32 wstr* calls. Of course all scripts would have to be saved into the
> same 16-bit text files in this situation.
>
> Just to make clear (and avoid flames) I wouldn't consider this a proper
> fix but just a quick way to get win32 unicode support into lua.
>
> Scott