Re: question about Unicode

lua-l archive

Subject: Re: question about Unicode
From: David Given <dg@...>
Date: Tue, 05 Dec 2006 15:10:43 +0000

Glenn Maynard wrote:
[...]
> Out of curiosity, what use is that?  In particular, if a function
> returns a character offset, and you want to use it to address the string,
> you have to convert it to a byte offset--which is an expensive operation.
> I've used UTF-8 for years, and I can't remember the last time I wanted
> a character offset.  (Even if you use wide strings, you still don't
> get those directly, due to combining characters.)

I want to write a text editor, and so there'll be lots of nasty
fetch-the-character-from-column-Z issues. Assuming each grapheme cluster
renders into a single character cell --- which I know is not strictly valid,
as some clusters will occupy multiple cells --- then dealing with character
offsets instead of byte offsets will make life much easier.
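
To make the column lookup concrete, here is a rough Python sketch of what fetching the character at a given column costs when the line is stored as UTF-8 bytes. It is not from the original message: the function names are made up for illustration, and it counts code points rather than grapheme clusters, which is an even cruder simplification than the one-cluster-per-cell assumption above. The scan is linear in the length of the line, which is the conversion cost Glenn mentions.

```python
def byte_offset_of_column(line: bytes, column: int) -> int:
    """Return the byte offset where the code point at `column` (0-based) starts.

    Hypothetical helper: assumes `line` is valid UTF-8 and treats every
    code point as one column (no combining marks, no wide characters).
    """
    if column < 0:
        raise IndexError("column must be non-negative")
    seen = 0
    for offset, byte in enumerate(line):
        # Continuation bytes look like 10xxxxxx; anything else starts a code point.
        if (byte & 0xC0) != 0x80:
            if seen == column:
                return offset
            seen += 1
    if seen == column:
        return len(line)  # position just past the last character
    raise IndexError("column past end of line")


def char_at_column(line: bytes, column: int) -> str:
    """Return the single code point displayed at `column`."""
    start = byte_offset_of_column(line, column)
    end = byte_offset_of_column(line, column + 1)
    return line[start:end].decode("utf-8")
```

With the line `"naïve ☃!".encode("utf-8")`, column 6 maps to byte offset 7, because the `ï` before it occupies two bytes. An editor that works in character offsets throughout would skip this scan, at the price of a wider or indexed string representation.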

-- 
╭─┈David Given┈──McQ─╮ "There are two major products that come out of
│┈┈dg@cowlark.com┈┈┈┈│ Berkeley: LSD and Unix. We don't believe this to be
│┈(dg@tao-group.com)┈│ a coincidence." --- Jeremy S. Anderson
╰─┈www.cowlark.com┈──╯
