- Subject: Re: Matching multibyte alphabetical characters with LPeG
- From: Miles Bader <miles@...>
- Date: Mon, 18 Jun 2012 15:54:50 +0900
William Ahern <william@25thandclement.com> writes:
>> Hardly; it very much depends on what you're doing with it -- and note
>> that in many cases (Apple, I'm looking at you...) normalization is
>> downright harmful.
>
> I think my MTA truncated this message. Care to resend the explanation? =)
Go read the Git mailing list for the painful, painful details.
Basically, "normalization" makes a change, and if that change is
persistent (i.e., not made temporarily during comparison or whatever),
then parts of your system which weren't expecting a change (because
nothing "real" changed) may get confused (or simply do excess work
because of the change).
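
To make that concrete, here is a minimal sketch in Python (not Lua,
since the standard library there has no normalization routine). The
file name and the `tracked` set are made up for illustration; the
only real API used is `unicodedata.normalize`. It shows how a layer
that persistently rewrites a string to NFD can leave another part of
the system unable to find a name it recorded earlier, even though
nothing "real" changed:

```python
import unicodedata

# The same visible name in two canonically equivalent encodings.
composed = "\u00e9.txt"       # NFC: single code point U+00E9
decomposed = "e\u0301.txt"    # NFD: "e" followed by combining acute U+0301

# Hypothetical index kept by a tool that stored the name as it was given.
tracked = {composed}

# Some other layer normalizes persistently (to NFD in this sketch)
# and hands the rewritten string back.
stored = unicodedata.normalize("NFD", composed)

print(stored == decomposed)   # True
print(stored in tracked)      # False: the tool no longer recognizes the name
```

The two strings render identically, but a plain comparison or hash
lookup treats them as different.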
[You might say "well then _always_ keep everything in normalized form,
then no problem!" ... but one doesn't always have control over every
part of the system and every tool.]
"Temporary normalization" (only when sorting strings or whatever, and
not saving the result) is safer of course.
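
A rough sketch of what "temporary" could look like, again in Python
with `unicodedata.normalize`. The helper names `nfc_key` and
`same_text` are hypothetical, and choosing NFC is an assumption; the
point is only that the normalized form is used as a comparison key
and then discarded, so the stored strings stay as they arrived:

```python
import unicodedata

def nfc_key(s: str) -> str:
    """Return a normalized copy used only as a comparison key."""
    return unicodedata.normalize("NFC", s)

def same_text(a: str, b: str) -> bool:
    """Compare two strings modulo canonical equivalence."""
    return nfc_key(a) == nfc_key(b)

names = ["e\u0301clair", "zebra", "\u00e9clair", "apple"]

# Sort by the normalized key; the list elements themselves are untouched.
ordered = sorted(names, key=nfc_key)

print(same_text(names[0], names[2]))   # True
print(names[0] == names[2])            # False: originals are still distinct
```

Here the two spellings of "éclair" sort next to each other, yet each
keeps its original code points for any tool that cares about them.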
-miles
--
"Though they may have different meanings, the cries of 'Yeeeee-haw!' and
'Allahu akbar!' are, in spirit, not actually all that different."