- Subject: Re: Skipping leading "shebangs" in a file
- From: Coda Highland <chighland@...>
- Date: Tue, 9 Aug 2016 14:58:23 -0700
On Tue, Aug 9, 2016 at 2:51 PM, Roberto Ierusalimschy
<roberto@inf.puc-rio.br> wrote:
>> As such, since text files on disk in most current filesystems do not
>> carry encoding metadata, it isn't a bug to put a BOM at the beginning
>> of a UTF-8 text file.
>
> It is a bug in the spec, in the sense that it breaks the main raison
> d'être of utf-8 (being compatible with ascii).
I disagree, on the grounds that a BOM in UTF-8 is neither required nor
recommended, and the spec is playing "be liberal in what you accept" --
that is, the spec is saying "because this can happen, a compliant
implementation should be prepared to handle it."
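
To make "be prepared to handle it" concrete, here is a minimal sketch in Python of a reader that tolerates a leading UTF-8 BOM. The function name strip_utf8_bom and the sample byte strings are hypothetical, made up for illustration; only codecs.BOM_UTF8 (the bytes EF BB BF) comes from the standard library.

```python
import codecs

def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

# Hypothetical sample inputs: the same script with and without a BOM.
without_bom = b"#!/usr/bin/lua\nprint('hi')\n"
with_bom = codecs.BOM_UTF8 + without_bom

# Both inputs normalize to the same bytes, so later processing
# (such as shebang detection) sees identical data.
normalized = [strip_utf8_bom(with_bom), strip_utf8_bom(without_bom)]
```

Only a BOM at the very start of the data is removed; the same three bytes appearing later are left alone.
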
It's also a non-goal for UTF-8 to be forwards-compatible with ASCII.
UTF-8 is backwards-compatible with it (all valid 7-bit ASCII documents
are valid UTF-8) but the converse is obviously false (not all valid
UTF-8 documents are valid 7-bit ASCII documents, independent of the BOM
issue). You can't expect tools that process ASCII data to correctly
handle arbitrary UTF-8 documents.
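
The asymmetry can be checked with a short sketch in Python. The helper names is_ascii and is_utf8 and the sample documents are hypothetical, chosen only for illustration.

```python
def is_ascii(data: bytes) -> bool:
    """True if every byte is in the 7-bit ASCII range."""
    return all(b < 0x80 for b in data)

def is_utf8(data: bytes) -> bool:
    """True if the bytes decode as well-formed UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

# Hypothetical sample documents.
ascii_doc = b"plain text\n"
utf8_doc = "raison d'\u00eatre\n".encode("utf-8")  # contains U+00EA, two bytes in UTF-8
bom_doc = b"\xef\xbb\xbf" + ascii_doc              # ASCII text preceded by a UTF-8 BOM

# Each pair is (is_ascii, is_utf8) for one document.
results = [(is_ascii(d), is_utf8(d)) for d in (ascii_doc, utf8_doc, bom_doc)]
```

The ASCII document passes both checks. The other two are valid UTF-8 but not ASCII, and the second of them has no BOM at all.
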
/s/ Adam