lua-users home
lua-l archive




On Feb 20, 2014, at 20:29 , Cezary H. Noweta wrote:

On 2014-02-20 12:13, René Rebe wrote:
On Feb 19, 2014, at 19:42 , Roberto Ierusalimschy wrote:

First of all:

C99 7.19.3[11]:
C11 7.21.3[11]: "...The byte input functions read characters from the
stream as if by successive calls to the fgetc function."

So, as long as ,,fgetc()'' returned { 0, 0, '\n', ... }, ,,fgets''
should have returned { 0, 0, '\n', 0, ... }; { 0, 0, EOF } => { 0, 0, 0,
... }; and so on.
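
A minimal sketch of that consequence, assuming a hosted C library where ,,tmpfile'' succeeds; the helper name ,,fgets_strlen'' is made up for illustration and is not from Lua or from this thread. It writes raw bytes to a temporary file, reads the first line back with ,,fgets'', and reports what ,,strlen'' sees:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of Lua): writes n raw bytes to a temporary
   file, reads the first line back with fgets, and returns the length that
   strlen reports for the result. Returns (size_t)-1 if tmpfile fails. */
static size_t fgets_strlen(const char *bytes, size_t n)
{
    char buf[64];
    size_t len = 0;
    FILE *f = tmpfile();

    if (f == NULL)
        return (size_t)-1;
    fwrite(bytes, 1, n, f);
    rewind(f);
    if (fgets(buf, sizeof buf, f) != NULL)
        len = strlen(buf);   /* stops at the first NUL, data or terminator */
    fclose(f);
    return len;
}
```

For the input { 0, 0, '\n' } the stream delivers three bytes, but ,,strlen'' stops at the first of them, so the two leading zeroes and the newline are silently lost to the caller.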

So for the special case of files that are not terminated by a newline, we
unfortunately need to pre-fill the whole buffer with \n.

There is no guarantee that the buffer is not trashed beyond a
terminating NUL. Combined with embedded '\0's, ,,fgets'' makes it
impossible to determine the number of chars read unambiguously. Even if
we could fill the buffer with some magical NaN (NaC?) value, which
,,fgets'' could never produce, we would still have an ambiguity:
,,fgets'' returns { 0, 0, 0, '\n', 0, #, #, ... }, where ,,#'' is an
untouched placeholder => such a buffer can come from the ,,fgetc'' sequence:
0, EOF; or 0, 0, EOF; or 0, 0, 0, '\n'. The trailing '\n', 0 could be remnants
of some strange binary=>text decoding, which can be done in the buffer.

The sole guarantees about the buffer content are: (1) it is untouched if
EOF occurs at the beginning; (2) it holds valid data up to an appended NUL,
which is hard to locate if the data itself contains NULs.
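
For illustration, the pre-fill heuristic under discussion could look roughly like the sketch below. The name ,,fgets_len_prefill'' is hypothetical, and the code leans on exactly the assumption questioned above: that ,,fgets'' leaves every byte beyond its terminating NUL untouched. The standard does not promise this, so treat it as a sketch and not as a portable solution. It also assumes a buffer of at least 2 bytes.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line with fgets and tries to recover the
   number of bytes read even when the data contains NULs.
   ASSUMPTION (not guaranteed by the C standard): fgets does not modify
   bytes beyond the terminating NUL. Requires size >= 2.
   Returns 0 when fgets reports EOF or an error. */
static size_t fgets_len_prefill(char *buf, size_t size, FILE *f)
{
    char *nl;
    size_t p;

    memset(buf, '\n', size);             /* sentinel fill */
    if (fgets(buf, (int)size, f) == NULL)
        return 0;
    nl = memchr(buf, '\n', size);
    if (nl == NULL)
        return size - 1;                 /* buffer full, no newline seen */
    p = (size_t)(nl - buf);
    if (p + 1 < size && buf[p + 1] == '\0')
        return p + 1;                    /* real newline, then terminator */
    return p - 1;                        /* sentinel: terminator sits before it */
}
```

If an implementation does scribble past the terminator, the first '\n' found may be neither a real newline nor a sentinel, and the returned length is then wrong without any indication.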

IMHO, a discussion about using ,,fgets'' to read zeroes is like a
discussion about using a microwave to boil an egg. Using ,,strlen'' on the
result of ,,fgets'' is perfectly fine, as ,,fgets'' is not meant for reading 0s.
Just my few cents.

The discussion is about lines(); that it uses fgets is just an implementation detail.

If Roberto had not more or less implied, with his Bible test case, that a performance loss is not acceptable, then an fgetc() loop without all these troubles would have been perfectly fine for me, too.
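
Such an fgetc() loop might look like the following sketch. The function name ,,read_line_fgetc'' is made up for illustration and this is not the code in Lua; it simply counts the bytes as they arrive, so embedded zeroes need no special handling. Note that it does not append a terminating NUL.

```c
#include <stdio.h>

/* Hypothetical helper: reads at most size bytes of one line with fgetc,
   including the newline if there is one, and returns the exact number of
   bytes stored. No NUL terminator is appended. Returns 0 at EOF. */
static size_t read_line_fgetc(char *buf, size_t size, FILE *f)
{
    size_t n = 0;
    int c;

    while (n < size && (c = fgetc(f)) != EOF) {
        buf[n++] = (char)c;
        if (c == '\n')
            break;                       /* end of line reached */
    }
    return n;
}
```

The cost is one library call per byte, which is the performance concern mentioned above.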

I can certainly give up improving vanilla Lua and convincing some that random data loss is usually considered a bug, and live very happily with the fix that works for me just fine.

Have fun parsing MIME, CGI data, or financial program exports that use \0 as a field delimiter. Or wherever else a zero comes along.

-- 
 ExactCODE GmbH, Jaegerstr. 67, DE-10117 Berlin
 http://exactcode.com | http://exactscan.com | http://ocrkit.com | http://t2-project.org | http://rene.rebe.de