- Subject: Lua 5.3.0-work2: When does utf8.offset work?
- From: Dirk Laurie <dirk.laurie@...>
- Date: Fri, 11 Apr 2014 08:25:02 +0200
The manual says:
---
utf8.offset (s, n [, i])
Returns the byte index where the encoding of the n-th character of s starts,
counting from position i. A negative n gets characters before position i.
The default for i is 1. Returns nil if the subject does not have such character.
As a special case, when n is 0 the function returns the start of the encoding
of the character that contains the i-th byte of s.
This function assumes that s is a valid UTF-8 string.
---
Actually, the routine always seems to return something, even when s is not valid UTF-8.
For n>0 the result seems to be correct as long as the first n-1 characters of s are valid UTF-8.
> s='voilà'
> #s
6
> utf8.offset(s,6)
7
> s=s:sub(1,-2).."\xFC"
> s
voil�
> utf8.offset(s,5)
5
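
Since utf8.offset assumes valid input rather than checking it, one way to guard a call is to validate the string first. The following is a minimal sketch, not part of the library: checked_offset is a hypothetical helper name, and it assumes that utf8.len returns nil plus the position of the first invalid byte when s is not valid UTF-8, as documented for the Lua 5.3 release (the work2 build may differ).

```lua
-- Hypothetical helper (not in the utf8 library): validate s before
-- asking for an offset, so invalid input is reported rather than ignored.
-- Assumes utf8.len returns nil plus the position of the first invalid
-- byte when s is not valid UTF-8.
local function checked_offset(s, n, i)
  local len, badpos = utf8.len(s)
  if not len then
    -- s is not valid UTF-8; hand back the offending byte position
    return nil, badpos
  end
  return utf8.offset(s, n, i)
end

-- "voilà" written with explicit bytes: the à occupies bytes 5 and 6
print(checked_offset("voil\xC3\xA0", 6))
-- the same prefix followed by a lone \xFC, as in the session above
print(checked_offset("voil\xFC", 5))
```

The cost is a full scan of s on every call, so this only makes sense where the string's validity is genuinely unknown.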