- Subject: Re: Sucks being in the huge minority...
- From: RLake@...
- Date: Thu, 7 Aug 2003 12:11:55 -0500
>>
>> Perhaps you're using a file created on a different system and not
>> translated as a text file during transfer.
> This is text typed in by the user directly from TextEdit (I know, don't
> cringe, Mac users out there, that I'm still using TextEdit; it's on my
> list to change! :)
> TE always has had \r at new lines...
You should change text editors. :) But I don't think that will solve your
problems.
I'm assuming you're using Mac OSX -- my new iMac showed up last week so
now I'm learning how to deal with its quirks, too.
The problem is that OSX is schizophrenic about line endings; the
underlying BSD environment uses line-feed and the compatibility Mac
environment uses carriage-return. So either could show up in any file.
Furthermore, TextEdit is liberal about what it will accept as a
line ending: line-feed, carriage-return, or carriage-return/line-feed
(DOS style), and I think also the Unicode line end character, which no
OS that I know of uses. But it does not force all line endings in a
given file to be consistent, so if you edit a file with TextEdit you
can end up with inconsistent line endings and no way of knowing.
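One way to find out whether a file has ended up with mixed line endings is to read it as raw bytes and count each style. The following is only a minimal sketch in C: the names eol_counts, count_eols and eols_are_mixed are made up for illustration, and it assumes the buffer was read in binary mode ("rb") so that the C library has not translated anything. It compares against the byte values 0x0A and 0x0D rather than '\n' and '\r', since the latter depend on the compiler.

```c
#include <stddef.h>

/* Tally of each line-ending style found in a buffer. */
typedef struct {
    size_t lf;    /* lone 0x0A (Unix style) */
    size_t cr;    /* lone 0x0D (classic Mac style) */
    size_t crlf;  /* 0x0D 0x0A pair (DOS style) */
} eol_counts;

/* Hypothetical helper: scan raw bytes and count line endings by style. */
static eol_counts count_eols(const unsigned char *buf, size_t len)
{
    eol_counts c = {0, 0, 0};
    size_t i = 0;
    while (i < len) {
        if (buf[i] == 0x0D) {
            if (i + 1 < len && buf[i + 1] == 0x0A) {
                /* CR followed by LF counts once, as a DOS line end. */
                c.crlf++;
                i += 2;
                continue;
            }
            c.cr++;
        } else if (buf[i] == 0x0A) {
            c.lf++;
        }
        i++;
    }
    return c;
}

/* Nonzero if more than one line-ending style appears in the tally. */
static int eols_are_mixed(eol_counts c)
{
    return (c.lf != 0) + (c.cr != 0) + (c.crlf != 0) > 1;
}
```

A file for which eols_are_mixed returns nonzero is one that a line-oriented reader may split differently depending on which convention it expects.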
Lua uses standard C; it opens files in text mode (i.e. without the "b"
attribute) and assumes that the standard C library will convert line
endings to whatever byte the compiler translates "\n" into, which is
what the standard says it should do. Unfortunately, MacOSX doesn't have a
consistent line ending character. So the included C library does no
translation, but different development tools available on the Mac might
compile "\n" in different ways. Irritating, isn't it?
<Extremely technical detail>
There is a widespread myth that \n in a C string means hex 0A. It doesn't.
(Although lots of C compilers use hex 0A to replace \n.) The ANSI/ISO
standard simply says that there is some "new line" character which fits
into a single char, and which is represented in a string or character
literal by \n. The standard also says that the standard IO library must
translate text files on input and output so that they appear to use \n
(whatever it is) as a line ending character. Since the traditional
Apple/Mac operating systems actually used hex 0D as a line ending
character, Mac-oriented development environments tended to translate \n
into hex 0D; this is the representation of \r on most DOS/Unix development
environments. Note that in DOS/Windows environments, the external
representation of the file uses the two-octet sequence 0x0D0A to represent
a line end; however, the C library distributed in those environments
replaces this with 0x0A, which the C compiler uses for \n.
Actually, neither Unix nor DOS really has a line ending character, but the
C behaviour effectively creates a protocol. Unix and DOS file systems
simply represent text files as octet-sequences; the OS itself doesn't have
a concept of lines. There are other operating systems which actually
represent text files as a list of lines, often of fixed lengths; the C
standard allows the standard library to delete trailing blanks in a line
(and then later pad the line back out with blanks) in order to maintain
the fiction that such a file is actually a Unix-like stream. (The C
standard does not guarantee that you can even write a line longer than 254
characters into a file, by the way.)
</Extremely technical detail>
None of this helps you much, Ando, sorry. I think that to properly support
the MacOSX, we would need at least to write a filter function for
loadfile. I'm sure I'll be forced to do that soon if no one else gets
around to it. (Perhaps CFStringGetLineBounds is the appropriate MacOSX API
function.) Meanwhile, your best bet is probably to download BBEdit Light
or buy the full version, or use the Project Builder text editor, or one of
the utilities kicking around to normalise line endings.
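As a rough illustration of what such a filter (or a normalising utility) has to do, here is a minimal sketch in C. It is not Lua's loadfile code: normalize_eols is a hypothetical name, and it assumes the file contents were read in binary mode ("rb") into a buffer and that the compiler in use represents \n as 0x0A.

```c
#include <stddef.h>

/*
 * Hypothetical helper: rewrite every CR (0x0D) and every CR LF pair
 * (0x0D 0x0A) in buf as a single LF (0x0A), in place.
 * Returns the new length, which is never larger than len.
 * Assumption: the consumer expects 0x0A as its line end; on a compiler
 * where '\n' is some other byte, emit that byte instead.
 */
static size_t normalize_eols(unsigned char *buf, size_t len)
{
    size_t in = 0, out = 0;
    while (in < len) {
        unsigned char ch = buf[in++];
        if (ch == 0x0D) {
            /* Swallow the LF of a CR LF pair so it yields one line end. */
            if (in < len && buf[in] == 0x0A)
                in++;
            ch = 0x0A;
        }
        buf[out++] = ch;
    }
    return out;
}
```

Because the output never outgrows the input, the rewrite can be done in the same buffer, and the returned length is what should be handed on to whatever parses the chunk.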
R.