- Subject: Re: Lua 5.1 and UTF-8 ?
- From: whisper@...
- Date: Sun, 22 May 2005 15:04:08 -0700
Asko Kauppi wrote:
I've been thinking about UTF-8 and Lua lately, and wonder how much
work it would be to actually support that in Lua "out of the box".
There are some programming languages (such as Tcl) that already claim to
do that, and I feel the concept would match Lua's targets and
philosophy rather nicely.
-ak
UTF-8 in a nutshell:
http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8
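To make the "out of the box" question concrete: Lua strings are plain byte sequences, so the length operator reports bytes, not characters. Below is a minimal sketch in Python (not Lua) of the simplest piece of UTF-8 awareness, counting code points by skipping continuation bytes. The function name utf8_codepoint_count and the sample string are made up for illustration, and the sketch assumes the input is already valid UTF-8; it does no validation.

```python
def utf8_codepoint_count(data: bytes) -> int:
    """Count code points in UTF-8 data (assumed valid, not checked).

    Every code point starts with exactly one lead byte; all following
    bytes of a multi-byte sequence have the bit pattern 10xxxxxx.
    Counting the bytes that are NOT continuation bytes therefore
    counts the code points.
    """
    return sum(1 for b in data if (b & 0xC0) != 0x80)


# Hypothetical sample: 'ä' takes 2 bytes and '€' takes 3 bytes in UTF-8.
text = "Käse €5"
raw = text.encode("utf-8")

byte_length = len(raw)                        # what a byte-oriented length sees
codepoint_length = utf8_codepoint_count(raw)  # what a reader would call "length"

print(byte_length, codepoint_length)
```

Counting is the easy part. Case mapping, collation, normalization and the like need character data tables, which is where a library comes in.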
I would like to see a language that really supports a "universal
language", and why not Lua?
Consider ICU as a base:
"
ICU is a mature, widely used set of C/C++ and Java libraries for Unicode
support, software internationalization and globalization (i18n/g11n). It
grew out of the JDK 1.1 internationalization APIs, which the ICU team
contributed, and the project continues to be developed for the most
advanced Unicode/i18n support. ICU is widely portable and gives
applications the same results on all platforms and between C/C++ and
Java software.
*ICU Features*
As computing environments become more heterogeneous, software
portability becomes more important. The International Components for
Unicode (ICU) libraries provide robust and full-featured Unicode
services on a wide variety of platforms, without sacrificing
performance. It supports the most current version of the Unicode
standard, and provides support for supplementary Unicode characters
(needed for support of the repertoires of GB 18030, HKSCS, and JIS X
0213). It offers great flexibility to extend and customize the supplied
services, which include:
* Text: Unicode text handling, full character properties and
  character set conversions (500+ codepages)
* Analysis: Unicode regular expressions; full Unicode sets;
  character, word and line boundaries
* Comparison: Language sensitive collation and searching
* Transformations: normalization, upper/lowercase, script
  transliterations (50+ pairs)
* Locales: Comprehensive locale data (230+) and resource bundle
  architecture
* Complex Text Layout: Arabic, Hebrew, Indic and Thai
* Time: Multi-calendar and time zone
* Formatting and Parsing: dates, times, numbers, currencies,
  messages and rule based
ICU is an open source development project sponsored, supported, and used
by IBM. It is dedicated to providing robust, full-featured, commercial
quality, freely available Unicode-based technologies. The ICU project is
licensed under the X License
<http://www-306.ibm.com/software/globalization/icu/license.jsp> (see
also the x.org original <http://www.x.org/Downloads_terms.html>), which
is compatible with GPL
<http://www.gnu.org/philosophy/license-list.html#GPLCompatibleLicenses>
but with fewer restrictions on commercial use of the software. The ICU
library supports multi-threading environments, and is available in C,
C++ and Java.
"
It actually brings a lot more to the table than just Unicode!
Dave LeBlanc
Seattle, WA USA