Adrian Perez wrote:
(Yes, I checked this even on recent Linux/BSD and older Solaris
systems, and it still works; the same goes for most text utils... but
don't expect character ranges in regexps like '[あ-う]' to work,
because most apps assume one byte per glyph).
I expect that apps making correct use of good regex libraries like PCRE
(compiled with UTF-8 support) should be OK with the above. I have never
tried this, though...
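
To illustrate the difference between byte-oriented and Unicode-aware
matching, here is a minimal sketch. It uses Python's standard re module
as a stand-in (not PCRE, which I have not tested), and the sample string
is my own. Matching on raw UTF-8 bytes treats '[あ-う]' as a class of
single bytes, so it hits every byte of the text; matching on decoded
strings treats it as a range of code points.

```python
import re

text = "あいうえお"          # sample text, chosen for illustration
pattern = "[あ-う]"          # the range from the message above

# Unicode-aware: pattern and subject are decoded strings, so the class
# is a range of code points (U+3042 through U+3046).
unicode_hits = re.findall(pattern, text)

# Byte-oriented: the same pattern and subject as raw UTF-8 bytes. The
# class now lists individual bytes plus a byte range (0x82-0xE3), so it
# matches single bytes, including those belonging to え and お.
byte_hits = re.findall(pattern.encode("utf-8"), text.encode("utf-8"))

print(unicode_hits)      # whole characters
print(len(byte_hits))    # count of one-byte matches
```

The byte-level result is what a tool assuming one byte per glyph would
effectively see.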
Just my two cents. I would really appreciate Unicode support in Lua. I
vote for enforcing UTF-8 as the encoding for source files. Python is
somewhat hackish: it tries to detect the encoding by looking for a special
comment on the first two lines of code, like '# -*- encoding: utf-8 -*-'.
It works, but I think it's quite awkward...
No more awkward than a shebang line or the XML declaration...
I am not a fan of enforcing an encoding scheme.
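
For reference, the Python detection discussed above can be observed with
the standard tokenize.detect_encoding function. This is only a sketch:
the helper name declared_encoding and the sample sources are mine, made
up for illustration.

```python
import io
import tokenize

def declared_encoding(source_bytes):
    """Return the encoding Python would pick for this source (hypothetical helper)."""
    readline = io.BytesIO(source_bytes).readline
    # detect_encoding inspects a BOM and at most the first two lines.
    encoding, _lines_read = tokenize.detect_encoding(readline)
    return encoding

# Cookie on line 2, after a shebang line: honoured.
with_cookie = b"#!/usr/bin/env python\n# -*- encoding: utf-8 -*-\nprint('hi')\n"
# Cookie on line 1 naming another encoding: honoured (name is normalized).
latin_cookie = b"# -*- coding: latin-1 -*-\nprint('hi')\n"
# Cookie on line 3: too late, so the default applies.
late_cookie = b"x = 1\ny = 2\n# -*- coding: latin-1 -*-\n"

for sample in (with_cookie, latin_cookie, late_cookie):
    print(declared_encoding(sample))
```

Note that the comment has to appear on the first or second line; a
declaration further down is ignored and the default (UTF-8 in Python 3)
is used.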