Lua Locales
Lua is heavily influenced by Standard C, so much of the C treatment of locales carries over into Lua.
The current locale can be queried or set using the [os.setlocale] function.
Warning: os.setlocale internally calls the C [setlocale] function, which sets the locale globally, for all Lua states and all OS threads in the current process. However, some C implementations provide non-standard functions (e.g. Microsoft's [_configthreadlocale]) so that only the current OS thread is affected.
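As a minimal sketch of querying and setting the locale: passing nil as the first argument only reports the current locale, and the optional second argument selects a category ("all", "collate", "ctype", "monetary", "numeric" or "time"). The locale name 'fr_FR.UTF-8' is an assumption; names are platform specific, and os.setlocale returns nil when the request cannot be honored.

```lua
-- Query the current locale for all categories (nil means "do not change").
local saved = os.setlocale(nil)
print("current locale:", saved)

-- Try to change only the numeric category.
-- 'fr_FR.UTF-8' is an assumed name and may not exist on this system.
local name = os.setlocale("fr_FR.UTF-8", "numeric")
if name then
  print("numeric locale is now:", name)
else
  print("requested locale is not available")
end

-- Restore the previous setting. Remember that this is process-wide.
os.setlocale(saved)
```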
A brief background on locales is given in [PiL 22.2] (warning: see Historical notes below).
In Lua 5.1, identifiers in the Lua language were locale dependent [1]. In 5.2, "Lua identifiers cannot use locale dependent characters" [2].
Numbers in Lua code must be written in the C locale style:
    print(1.23) -- always valid
    print(1,23) -- always interpreted as two numbers: "1" and "23".

    > assert(os.setlocale('fr_FR'))
    > = 1,23 , 1.23
    1       23      1,23
    > =loadstring("return 1,23 , 1.23")()
    1       23      1,23
    > print(1.23, tostring(1.23), string.format("%0.2f",1.23))
    1,23    1,23    1,23
    >
    > ="1.23" + 0
    stdin:1: attempt to perform arithmetic on a string value
    stack traceback:
            stdin:1: in main chunk
            [C]: in ?
    > ="1,23" + 0
    1,23
    > =tonumber("1.23"), tonumber("1,23")
    nil     1,23
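One way to get C-style number conversion regardless of the user's locale is to switch the numeric category to "C" for the duration of a call. The helper below is a sketch; with_c_numeric and finish are hypothetical names, not part of Lua. Note that it changes the process-wide locale while the call runs, so the warning above about other Lua states and threads applies.

```lua
-- Restore the saved numeric locale, then return results or re-raise the error.
local function finish(saved, ok, ...)
  os.setlocale(saved, "numeric")
  if not ok then
    error((...), 0)
  end
  return ...
end

-- Hypothetical helper: run fn(...) with the numeric category set to "C".
local function with_c_numeric(fn, ...)
  local saved = os.setlocale(nil, "numeric")
  assert(os.setlocale("C", "numeric"), "cannot switch to the C locale")
  return finish(saved, pcall(fn, ...))
end

-- "1.23" is parsed with '.' as the decimal mark whatever the outer locale is.
assert(with_c_numeric(tonumber, "1.23") == 1.23)
```

Going through pcall means the previous locale is restored even when fn raises an error.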
The above depends on luaconf.h settings. In 5.2.0rc2, luaconf.h contains the following:

    /*
    @@ lua_str2number converts a decimal numeric string to a number.
    @@ lua_strx2number converts an hexadecimal numeric string to a number.
    ** In C99, 'strtod' do both conversions. C89, however, has no function
    ** to convert floating hexadecimal strings to numbers. For these
    ** systems, you can leave 'lua_strx2number' undefined and Lua will
    ** provide its own implementation.
    */
    #define lua_str2number(s,p)     strtod((s), (p))

    #if defined(LUA_USE_STRTODHEX)
    #define lua_strx2number(s,p)    strtod((s), (p))
    #endif
If you disable LUA_USE_STRTODHEX, then Lua's own lua_strx2number implementation is used. This relies on locale-independent functions (e.g. lisspace from lctype.c). Observe the curious behavior:

    Lua 5.2.0  Copyright (C) 1994-2011 Lua.org, PUC-Rio
    > assert(os.setlocale'fr_FR')
    > return tonumber'1.5', tonumber'1,5'
    nil     1,5
    > return tonumber'0x1.5', tonumber'0x1,5'
    1,3125  nil
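Since the accepted decimal mark can differ as shown, it may help to ask the current locale which mark it uses and to adapt C-style input before conversion, without touching the global locale. This is only a sketch for decimal strings; decimal_point and parse_c_number are hypothetical names, and hexadecimal strings may still behave as in the session above.

```lua
-- Return the decimal mark of the current numeric locale by formatting 0.5.
local function decimal_point()
  return (string.format("%.1f", 0.5):match("^0(.-)5$"))
end

-- Hypothetical helper: parse a C-style decimal string such as "1.23" by
-- replacing '.' with the locale's decimal mark before calling tonumber.
local function parse_c_number(s)
  local dp = decimal_point()
  local localized = s:gsub("%.", function() return dp end)
  return tonumber(localized)
end

assert(parse_c_number("1.23") == 1.23)
```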
    function findrange(pat)
      for i=0,255 do
        if string.char(i):match(pat) then
          print(i, string.char(i))
        end
      end
    end

    assert(os.setlocale'C')
    findrange'%a' --> A-Z,a-z
    assert(os.setlocale'en_US.ISO-8859-1')
    findrange'%a' --> A-Z,a-z,\170,\181,\186,\192-\255
    findrange'%l' --> a-z,\181,\223-\255
Other classes like isspace (%s) and isdigit (%d) may match more characters than they do in the C locale [3].
string.lower and string.upper are also locale dependent.
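The findrange output above is easier to compare across locales when consecutive byte values are collapsed into ranges. The following variant is a sketch of that idea; class_ranges is a hypothetical name, and it reports byte values rather than characters.

```lua
-- Hypothetical helper: list the byte values matched by a character class
-- as comma-separated ranges, e.g. "65-90,97-122".
local function class_ranges(pat)
  local ranges, start = {}, nil
  -- Go one step past 255 so that a range ending at 255 is also closed.
  for i = 0, 256 do
    local hit = i < 256 and string.char(i):match(pat) ~= nil
    if hit and not start then
      start = i
    elseif not hit and start then
      local last = i - 1
      ranges[#ranges + 1] = (start == last) and tostring(start)
                                             or (start .. "-" .. last)
      start = nil
    end
  end
  return table.concat(ranges, ",")
end

assert(os.setlocale("C"))
print(class_ranges("%a")) --> 65-90,97-122
print(class_ranges("%s")) --> 9-13,32
```

Running it again after switching to another single-byte locale shows which extra bytes that locale adds to each class.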
    > assert(os.setlocale'C')
    > return "é" < "e"
    false
    > assert(os.setlocale('fr_FR'))
    > return "é" < "e"
    true
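Lua compares strings using the current locale's collation (via the C strcoll function), which is why the result above changes. When a stable, locale-independent order is wanted, for example for sorting keys in a file format, a byte-wise comparison can be written by hand. This is a sketch; bytewise_less is a hypothetical name.

```lua
-- Hypothetical helper: compare two strings byte by byte, ignoring the locale.
local function bytewise_less(a, b)
  local n = math.min(#a, #b)
  for i = 1, n do
    local x, y = a:byte(i), b:byte(i)
    if x ~= y then
      return x < y
    end
  end
  -- All shared bytes are equal: the shorter string sorts first.
  return #a < #b
end

local words = { "b", "ab", "a" }
table.sort(words, bytewise_less)
print(table.concat(words, " ")) --> a ab b
```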
    > assert(os.setlocale('C'))
    > =os.date()
    Sat Nov 26 13:22:56 2011
    > assert(os.setlocale('fr_FR'))
    > =os.date()
    sam. 26 nov. 2011 13:22:56 EST
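The default os.date format uses day and month names, which is where the locale shows up. A format built only from numeric fields avoids those names and should look the same in most locales. The sketch below produces a UTC timestamp; iso_timestamp is a hypothetical name, and the leading "!" in the format asks os.date for UTC.

```lua
-- Hypothetical helper: format a time as a numeric-only UTC timestamp.
-- With no argument, os.date uses the current time.
local function iso_timestamp(t)
  return os.date("!%Y-%m-%dT%H:%M:%SZ", t)
end

print(iso_timestamp(0)) --> 1970-01-01T00:00:00Z
print(iso_timestamp())
```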
    -- This error occurred in 5.0 (but not 5.1 or above)
    -- http://www.lua.org/pil/22.2.html
    print(os.setlocale('pt_BR'))   --> pt_BR (Portuguese-Brazil)
    print(3,4)                     --> 3    4
    print(3.4)                     --> stdin:1: malformed number near `3.4'