Re: Could Lua itself become UTF8-aware?

lua-l archive


Subject: Re: Could Lua itself become UTF8-aware?
From: "Soni L." <fakedme@...>
Date: Sun, 30 Apr 2017 00:24:03 -0300



On 2017-04-30 12:05 AM, Patrick Donnelly wrote:

On Sat, Apr 29, 2017 at 9:41 AM, Dirk Laurie <dirk.laurie@gmail.com> wrote:

2017-04-29 15:21 GMT+02:00 Roberto Ierusalimschy <roberto@inf.puc-rio.br>:

At present all the entries from 0x80 to 0xFF in the constant array
luai_ctype in lctype.c are zero: no bit set.

There are three unused bits. Couldn't two of them be used to mean
UTF8_FIRST and UTF8_CONT?

This is only the first step, but if the idea is shot down here already,
the others need not be mentioned.
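
To make the proposal concrete, here is a minimal sketch of how bytes 0x80 to 0xFF might be classified. The flag names come from the message above, but the bit positions (5 and 6) and the function name utf8_byte_class are assumptions for illustration, not code from lctype.c. The byte ranges follow the UTF-8 definition: 0x80-0xBF are continuation bytes, 0xC2-0xF4 are lead bytes, and 0xC0, 0xC1 and 0xF5-0xFF never occur in well-formed UTF-8.

```c
/* Hypothetical flag bits; the positions are picked arbitrarily here
   and are not taken from lctype.h. */
enum {
  UTF8_FIRST = 1 << 5,  /* byte can start a multibyte sequence */
  UTF8_CONT  = 1 << 6   /* byte can continue a multibyte sequence */
};

/* Return the proposed flag bits for one byte. ASCII bytes and bytes
   that never appear in well-formed UTF-8 get no flag at all. */
static int utf8_byte_class(unsigned char c) {
  if (c >= 0x80 && c <= 0xBF) return UTF8_CONT;
  if (c >= 0xC2 && c <= 0xF4) return UTF8_FIRST;
  return 0;
}
```

A table-driven version would simply precompute these values for the upper half of the array, which is why the cost of the idea is so low.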

This particular idea has very low cost, so I don't see why we should shoot it
down before knowing the rest of the story. What does it mean for Lua
to be "UTF-8 aware"?

-- Roberto

The next step would be a compiler option under which the lexer
accepts a UTF-8 first character followed by the correct number
of UTF-8 continuation characters as being alphabetic for the
purpose of being an identifier or part of one.
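
A rough sketch of the check described here: a lead byte followed by the number of continuation bytes it announces. The function names utf8_expected_cont and utf8_seq_len are hypothetical, and this is not how the Lua lexer is written. It only looks at byte classes, so it does not reject overlong encodings or surrogate code points.

```c
#include <stddef.h>

/* Number of continuation bytes announced by a lead byte,
   or -1 if c is not a UTF-8 lead byte. */
static int utf8_expected_cont(unsigned char c) {
  if (c >= 0xC2 && c <= 0xDF) return 1;
  if (c >= 0xE0 && c <= 0xEF) return 2;
  if (c >= 0xF0 && c <= 0xF4) return 3;
  return -1;
}

/* Length in bytes of the multibyte sequence starting at s (with n
   bytes available), or 0 if s does not start with a lead byte
   followed by the right number of continuation bytes. */
static size_t utf8_seq_len(const unsigned char *s, size_t n) {
  int need, i;
  if (n == 0) return 0;
  need = utf8_expected_cont(s[0]);
  if (need < 0 || (size_t)need >= n) return 0;  /* not a lead byte, or truncated */
  for (i = 1; i <= need; i++)
    if ((s[i] & 0xC0) != 0x80) return 0;        /* not a continuation byte */
  return (size_t)need + 1;
}
```

A lexer built on this would treat every nonzero result as one alphabetic unit and advance by that many bytes while scanning an identifier.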

I'm very against even inching towards this destination. Lua is a
*language*. As soon as we start allowing identifiers outside of ASCII,
we begin to cultivate "dialects". Only with full support would
anyone's Lua be able to load scripts written with identifiers from
another language. And, of course, programmers not fluent in that
language would be at great disadvantage.

Air traffic control standardized on English so any pilot
can communicate with any flight controller. In the same way, I think
it makes a lot of sense for programmers to accept that English is the
lingua franca for programming, including comments, documentation, and
identifiers. There really is no upside to allowing non-English (ASCII)
identifiers.

Maybe that's self-serving as an American, for which I apologize.

Just allow all 0x80-0xFF to be used in identifiers. Solves most of our issues.
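
For illustration, a minimal sketch of what that rule could look like as identifier predicates. The names is_ident_start and is_ident_char are made up for this example and are not Lua's own macros. Note the trade-off: every byte from 0x80 up is accepted without any validation, so malformed UTF-8 and other 8-bit encodings pass as well.

```c
#include <ctype.h>

/* Can this byte start an identifier? ASCII letters, underscore,
   and (the proposed rule) any byte in 0x80-0xFF. */
static int is_ident_start(unsigned char c) {
  return c == '_' || c >= 0x80 || isalpha(c);
}

/* Can this byte appear after the first one? Same as above, plus
   ASCII digits. */
static int is_ident_char(unsigned char c) {
  return is_ident_start(c) || isdigit(c);
}
```

Because no sequence structure is checked, the lexer needs no lookahead and no knowledge of UTF-8 at all, which is what makes this variant so cheap.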


Also, y'know...

emoji.

(I'd argue there's a HUGE upside to allowing emoji. I think everything should just allow emoji. Actually why don't we get rid of ASCII while at it, make things emoji-only?)


--
Disclaimer: these emails may be made public at any given time, with or without reason. If you don't agree with this, DO NOT REPLY.

