Re: Skipping leading "shebangs" in a file

lua-l archive

Subject: Re: Skipping leading "shebangs" in a file
From: Coda Highland <chighland@...>
Date: Tue, 9 Aug 2016 14:58:23 -0700

On Tue, Aug 9, 2016 at 2:51 PM, Roberto Ierusalimschy
<roberto@inf.puc-rio.br> wrote:
>> As such, since text files on disk in most current filesystems do not
>> carry encoding metadata, it isn't a bug to put a BOM at the beginning
>> of a UTF-8 text file.
>
> It is a bug in the spec, in the sense that it breaks the main raison
> d'être of utf-8 (being compatible with ascii).

I disagree: a BOM is neither required nor recommended, and the spec is
playing "be liberal in what you accept" -- that is, the spec is saying
"because this can happen, a compliant implementation should be prepared
to handle it."
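
As a rough illustration of what "prepared to handle it" can look like
for the case this thread started with, here is a minimal Python sketch
that drops a leading UTF-8 BOM and then a leading "#!" line before the
rest of the file is processed. The function name skip_bom_and_shebang
is hypothetical, and the sketch is not meant to mirror any particular
loader's implementation.

```python
import codecs


def skip_bom_and_shebang(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM and without a leading '#!' line.

    Hypothetical helper for illustration only.
    """
    # codecs.BOM_UTF8 is the three bytes EF BB BF.
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]

    # Only after the BOM is gone can the shebang be seen at offset 0.
    if data.startswith(b"#!"):
        newline = data.find(b"\n")
        # Keep the newline itself so later line numbers stay unchanged.
        data = data[newline:] if newline != -1 else b""

    return data
```

The order matters: a tool that checks for "#!" before checking for the
BOM will not recognize the shebang in a file that starts with one.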

It's also a non-goal to be forwards-compatible with ASCII. It's
backwards-compatible with it (all valid 7-bit ASCII documents are
valid UTF-8) but the converse is obviously false (not all valid UTF-8
documents are valid 7-bit ASCII documents, independent of the BOM
issue). You can't expect tools that process ASCII data to correctly
handle arbitrary UTF-8 documents.
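
The asymmetry can be checked directly. The following Python sketch
uses two hypothetical helpers, is_valid_ascii and is_valid_utf8, which
simply try to decode a byte string with the corresponding codec; the
sample byte strings are made up for illustration.

```python
def is_valid_ascii(data: bytes) -> bool:
    """True if every byte is 7-bit ASCII (hypothetical helper)."""
    try:
        data.decode("ascii")
        return True
    except UnicodeDecodeError:
        return False


def is_valid_utf8(data: bytes) -> bool:
    """True if data is well-formed UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


ascii_doc = b"print('hello')\n"
utf8_doc = "raison d'\u00eatre\n".encode("utf-8")   # contains a non-ASCII character
bom_doc = b"\xef\xbb\xbf" + ascii_doc                # ASCII text with a UTF-8 BOM in front

# Every ASCII document is also a UTF-8 document...
assert is_valid_ascii(ascii_doc) and is_valid_utf8(ascii_doc)
# ...but a UTF-8 document need not be ASCII, with or without a BOM.
assert is_valid_utf8(utf8_doc) and not is_valid_ascii(utf8_doc)
assert is_valid_utf8(bom_doc) and not is_valid_ascii(bom_doc)
```

The BOM case and the non-ASCII-character case fail the ASCII check for
the same reason: both contain bytes above 0x7F.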

/s/ Adam
