- Subject: Re: question about Unicode
- From: David Given <dg@...>
- Date: Tue, 05 Dec 2006 15:10:43 +0000
Glenn Maynard wrote:
[...]
> Out of curiosity, what use is that? In particular, if a function
> returns a character offset, and you want to use it to address the string,
> you have to convert it to a byte offset--which is an expensive operation.
> I've used UTF-8 for years, and I can't remember the last time I wanted
> a character offset. (Even if you use wide strings, you still don't
> get those directly, due to combining characters.)
I want to write a text editor, and so there'll be lots of nasty
fetch-the-character-from-column-Z issues. Assuming each grapheme cluster
renders into a single character cell --- which I know is not strictly valid,
as some clusters will occupy multiple cells --- then dealing with character
offsets instead of byte offsets will make life much easier.
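To make the cost Glenn mentions concrete, here is a minimal sketch of the character-offset-to-byte-offset conversion for a UTF-8 line. The function name char_to_byte_offset is hypothetical, and it counts code points rather than grapheme clusters, so combining characters and double-width cells would still need extra handling on top of this. It assumes the input is valid UTF-8.

```python
def char_to_byte_offset(data: bytes, char_index: int) -> int:
    """Return the byte offset of the char_index-th code point in UTF-8 data.

    Hypothetical helper for illustration: counts code points, not grapheme
    clusters, and assumes `data` is well-formed UTF-8.
    """
    if char_index < 0:
        raise IndexError("negative character index")
    seen = 0
    for offset, byte in enumerate(data):
        # Continuation bytes have the form 10xxxxxx; any other byte
        # begins a new code point.
        if byte & 0xC0 != 0x80:
            if seen == char_index:
                return offset
            seen += 1
    if seen == char_index:
        # One past the last code point, useful for an end-of-line cursor.
        return len(data)
    raise IndexError("character index out of range")


line = "naïve ╭─┈".encode("utf-8")
offset = char_to_byte_offset(line, 3)  # byte position of "v"
```

The scan is linear in the length of the line, which is the expense in question. An editor that stores text line by line only pays it per line, and could cache offsets for the line under the cursor if that turns out to matter.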
--
╭─┈David Given┈──McQ─╮ "There are two major products that come out of
│┈┈dg@cowlark.com┈┈┈┈│ Berkeley: LSD and Unix. We don't believe this to be
│┈(dg@tao-group.com)┈│ a coincidence." --- Jeremy S. Anderson
╰─┈www.cowlark.com┈──╯