Unicode Identifiers


A platform-independent approach to allowing UTF-8 encoded Unicode characters in Lua identifiers.

Situation without this change:

Add the following to the "Local configuration" section of luaconf.h:

#ifdef LUA_CORE
// Treat every byte with the high bit set (0x80 and above, i.e. any byte of a
// multi-byte UTF-8 sequence) as alphabetic. End of stream (-1) also has that
// bit set, so it is excluded explicitly.
#define isalpha(zeich) ((((zeich) & 0x80) || isalpha(zeich)) && (zeich) != -1)
// Same rule for alphanumeric characters: high-bit bytes count, -1 does not.
#define isalnum(zeich) ((((zeich) & 0x80) || isalnum(zeich)) && (zeich) != -1)
// High-bit bytes are never digits; end of stream (-1) is not valid.
#define isdigit(zeich) ((!((zeich) & 0x80) && isdigit(zeich)) && (zeich) != -1)
// High-bit bytes are never whitespace; end of stream (-1) is not valid.
#define isspace(zeich) ((!((zeich) & 0x80) && isspace(zeich)) && (zeich) != -1)
#endif
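
To see what these rules do byte by byte, the following is a minimal standalone sketch in C. It restates the four predicates as ordinary functions instead of overriding the <ctype.h> names. The names u_isalpha, u_isalnum, u_isdigit, u_isspace, in_range and END_OF_STREAM are hypothetical and are not part of Lua or of the patch above; the sketch assumes the default "C" locale.

```c
#include <assert.h>
#include <ctype.h>

#define END_OF_STREAM (-1)  /* hypothetical name; same value the macros test for */

/* Precondition shared by all helpers: c is a byte value or END_OF_STREAM. */
static int in_range(int c) { return c == END_OF_STREAM || (c >= 0 && c <= 255); }

/* Bytes with the high bit set (all bytes of multi-byte UTF-8 sequences)
   count as letters; end of stream never does. */
int u_isalpha(int c) {
    assert(in_range(c));
    return c != END_OF_STREAM && ((c & 0x80) != 0 || isalpha(c) != 0);
}

/* Same rule, but ASCII digits are accepted as well. */
int u_isalnum(int c) {
    assert(in_range(c));
    return c != END_OF_STREAM && ((c & 0x80) != 0 || isalnum(c) != 0);
}

/* High-bit bytes are never digits. */
int u_isdigit(int c) {
    assert(in_range(c));
    return c != END_OF_STREAM && (c & 0x80) == 0 && isdigit(c) != 0;
}

/* High-bit bytes are never whitespace. */
int u_isspace(int c) {
    assert(in_range(c));
    return c != END_OF_STREAM && (c & 0x80) == 0 && isspace(c) != 0;
}
```

With these helpers, both bytes of "ü" (0xC3 0xBC) are reported as alphabetic, '7' is a digit but not a letter, and -1 fails every test.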

Please note that every non-ASCII character is accepted in identifiers. This can be a problem, because some Unicode characters look like Lua keywords, operators, or whitespace.
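
The whitespace case can be illustrated with a small C sketch. It is loosely modeled on how a lexer reads a name under the rules above (a letter, underscore or high-bit byte, followed by any number of letters, digits, underscores or high-bit bytes). The function ident_length is a hypothetical helper for illustration, not part of Lua.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: number of leading bytes of s that would be read as
   a single identifier when every high-bit byte counts as a letter. */
size_t ident_length(const char *s) {
    size_t n = 0;
    unsigned char c;
    assert(s != NULL);
    c = (unsigned char)s[0];
    /* First byte: ASCII letter, underscore, or any high-bit byte. */
    if (!((c & 0x80) || isalpha(c) || c == '_'))
        return 0;
    /* Following bytes: ASCII letters or digits, underscore, or high-bit bytes.
       The terminating '\0' matches none of these, so the loop stops there. */
    do {
        n++;
        c = (unsigned char)s[n];
    } while ((c & 0x80) || isalnum(c) || c == '_');
    return n;
}
```

For the input `a b = 1` with an ordinary space, this returns 1, so `a` and `b` are separate names. If the space is replaced by a NO-BREAK SPACE (U+00A0, encoded as the bytes 0xC2 0xA0), it returns 4: the two names and the invisible character between them are read as one identifier.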

Then recompile Lua and try these samples:

local function Grüße(message)
    print(message)
end

-- A global variable whose name contains UTF-8 characters
GrüßeAusDeutschland = "Hallo Welt äöüß"
Grüße(GrüßeAusDeutschland) -- call to the local function with UTF-8 characters in its name

-- Please add samples in other languages here ;)

-- To prove the point, some Google Translate gibberish:

日本からのご挨拶 = "ハローワールド" -- Japanese
Grüße(日本からのご挨拶)

HendrikPolczynski 2009-11-20 - First Revision

For Unicode String support and validation please see:

For general information about Unicode in Lua see:


Last edited February 22, 2018 5:49 am GMT