Hi Adam,
Yes, I am aware of the utf8 library; I was puzzled by the appearance of SYN.
The SYN character is being generated in the call to print, apparently to
represent unprintable characters. Obvious now.
Writing the strings to binary files and opening them in a hex editor reveals that
string.sub is doing exactly as expected. The world makes sense again.
local space = string.byte(' ')
local text = utf8.char(0x92e,0x947,0x930,0x93e, space, 0x928, 0x93e, 0x92e, space, 0x932,0x942,0x905, space, 0x939,0x948, 0x964)
-- write str to filename in binary mode, failing loudly if the file cannot be opened
function wb(filename, str)
  local fh = assert(io.open(filename, 'wb'))
  fh:write(str)
  fh:close()
end
local split = 1
local lh = string.sub(text, 1, split)
local rh = string.sub(text, split+1)
wb('text', text)
wb('lh', lh)
wb('rh', rh)
--text: E0 A4 AE E0 A5 87 E0 A4 B0 E0 A4 BE 20 E0 A4 A8 E0 A4 BE E0 A4 AE 20 E0 A4 B2 E0 A5 82 E0 A4 85 20 E0 A4 B9 E0 A5 88 E0 A5 A4
-- lh: E0
-- rh: A4 AE E0 A5 87 E0 A4 B0 E0 A4 BE 20 E0 A4 A8 E0 A4 BE E0 A4 AE 20 E0 A4 B2 E0 A5 82 E0 A4 85 20 E0 A4 B9 E0 A5 88 E0 A5 A4
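
The same check can probably be done without a hex editor. Below is a minimal sketch, assuming Lua 5.3 or later (for the utf8 library). The hexdump helper is a name made up for illustration and is not part of any library. It repeats the byte-oriented split on a shorter string and, for comparison, uses utf8.offset to cut on a character boundary instead.

```lua
-- Hypothetical helper: format each byte of a string as two hex digits.
local function hexdump(s)
  local out = {}
  for i = 1, #s do
    out[#out + 1] = string.format('%02X', string.byte(s, i))
  end
  return table.concat(out, ' ')
end

-- First four code points of the original text, to keep the output short.
local text = utf8.char(0x92e, 0x947, 0x930, 0x93e)

-- Byte-oriented split: string.sub counts bytes, so this cuts inside
-- the first three-byte character.
local lh, rh = string.sub(text, 1, 1), string.sub(text, 2)
print(hexdump(lh))   -- E0
print(hexdump(rh))   -- A4 AE E0 A5 87 E0 A4 B0 E0 A4 BE

-- Character-oriented split: utf8.offset(text, 2) is the byte position
-- where the second character starts, so cut just before it.
local cut = utf8.offset(text, 2) - 1
local lh2, rh2 = string.sub(text, 1, cut), string.sub(text, cut + 1)
print(hexdump(lh2))  -- E0 A4 AE
print(hexdump(rh2))  -- E0 A5 87 E0 A4 B0 E0 A4 BE
```

Printing hex digits rather than the raw strings avoids depending on how the terminal renders an incomplete UTF-8 sequence.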
0x92e           100100101110
0xE0 0xA4 0xAE  1110 0000  10 100100  10 101110  (payload bits: 0000 100100 101110)
0xE0            11100000
0xA4            10100100
0xAE            10101110
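
To make the bit layout concrete, here is a small sketch that encodes 0x92e by hand, assuming Lua 5.3 or later (for the bitwise operators and utf8.char). It only covers the three-byte form 1110xxxx 10yyyyyy 10zzzzzz, which applies to code points from 0x800 to 0xFFFF; the variable names are mine.

```lua
local cp = 0x92e

-- Three-byte UTF-8 layout: 1110xxxx 10yyyyyy 10zzzzzz
local b1 = 0xE0 | (cp >> 12)           -- lead byte carries the top 4 bits
local b2 = 0x80 | ((cp >> 6) & 0x3F)   -- continuation byte, middle 6 bits
local b3 = 0x80 | (cp & 0x3F)          -- continuation byte, low 6 bits

print(string.format('%02X %02X %02X', b1, b2, b3))  -- E0 A4 AE

-- Cross-check against the library encoder.
assert(string.char(b1, b2, b3) == utf8.char(cp))
```

So string.sub(text, 1, 1) returns only the lead byte 0xE0, which is not a valid character on its own.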