- Subject: Re: UTF-8 [was Re: LuaSocket http ftp smpt...]
- From: Philippe Lhoste <PhiLho@...>
- Date: Mon, 18 Sep 2006 16:21:51 +0200
Robert Raschke wrote:
You can find most of the Plan 9 tools (including the UTF-8 support and
many, many, many uses of it) ported to unix at
http://swtch.com/plan9port
Robby
------------------------------------------------------------------------
Subject: Re: UTF-8 [was Re: LuaSocket http ftp smpt...]
From: Klaus Ripke <paul-lua@malete.org>
Date: Thu, 3 Feb 2005 13:01:44 +0100
To: Lua list <lua@bazar2.conectiva.com.br>
On Thursday 03 February 2005 12:29, David Burgess wrote:
Philip Hazel's PCRE implementation handles UTF-8 rather well. If you
are looking for a UTF-8 base, it may be worth a look.
Thanks, that's pretty much the right thing, but it is also rather big.
Their character property table amounts to 88K,
while the Plan 9 one is about 12K
(not 2K; I dropped the 1 in my previous post).
FYI, from the PCRE 6.7 ChangeLog:
Version 6.5 01-Feb-06
[...]
18. Changes to the handling of Unicode character properties:
(a) Updated the table to Unicode 4.1.0.
(b) Recognize characters that are not in the table as "Cn" (undefined).
(c) I revised the way the table is implemented to a much improved format
which includes recognition of ranges. It now supports the ranges that
are defined in UnicodeData.txt, and it also amalgamates other
characters into ranges. This has reduced the number of entries in the
table from around 16,000 to around 3,000, thus reducing its size
considerably. I realized I did not need to use a tree structure after
all - a binary chop search is just as efficient. Having reduced the
number of entries, I extended their size from 6 bytes to 8 bytes to
allow for more data.
(d) Added support for Unicode script names via properties such as
\p{Han}.
PCRE 6.4's ucptable.c: 443KB
PCRE 6.7's ucptable.c: 87KB
That should slightly reduce the compiled size.
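
To illustrate item (c) of the ChangeLog, here is a minimal sketch of a range table searched by binary chop. It is written in Python for brevity and is not PCRE's code: the names RANGES, STARTS and category are hypothetical, and the table holds only a few sample ranges rather than the real data from UnicodeData.txt.

```python
from bisect import bisect_right

# Hypothetical sample table: (first, last, general_category), sorted by
# `first`, with no overlapping ranges. Illustrative only, not PCRE's data.
RANGES = [
    (0x0030, 0x0039, "Nd"),  # ASCII digits
    (0x0041, 0x005A, "Lu"),  # ASCII uppercase letters
    (0x0061, 0x007A, "Ll"),  # ASCII lowercase letters
    (0x4E00, 0x9FA5, "Lo"),  # part of the CJK unified ideographs
]

# Start points only, so bisect can binary-search them directly.
STARTS = [first for first, _, _ in RANGES]


def category(code_point):
    """Return the category of code_point, or "Cn" if no range covers it."""
    # Index of the last range whose start is <= code_point (-1 if none).
    i = bisect_right(STARTS, code_point) - 1
    if i >= 0:
        first, last, cat = RANGES[i]
        if code_point <= last:
            return cat
    # Not in the table: report it as undefined, as in item (b).
    return "Cn"
```

For example, category(ord("A")) gives "Lu" and category(0x4E2D) gives "Lo". Because the sample table is so small, many assigned characters also come back as "Cn" here. The point is only that one entry per range plus a logarithmic search can replace one entry per character, which is where the reduction from around 16,000 entries to around 3,000 comes from.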
--
Philippe Lhoste
-- (near) Paris -- France
-- http://Phi.Lho.free.fr
-- -- -- -- -- -- -- -- -- -- -- -- -- --