Re: Lpeg and malformed input / Lpeg and subjects that do not fit into memory

lua-l archive

Subject: Re: Lpeg and malformed input / Lpeg and subjects that do not fit into memory
From: Roberto Ierusalimschy <roberto@...>
Date: Thu, 18 Sep 2008 10:07:41 -0300

> 1. Lpeg is great when the subject follows strictly a given
> grammar. But how to parse *malformed* CSV files, for instance? (and
> maybe generate "warnings" or "errors")

Usually we add other options to the appropriate parts of the grammar to
handle erroneous input. Something like this:

    CSV    <- (<record> (%nl <record>)*) -> {}
    record <- ( <field> (',' <field>)* ) -> {} (%nl / !.) / <ErrorCase>

ErrorCase should match the input up to some anchor point (e.g., a newline).
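
As a rough illustration of the idea above, here is a minimal sketch written
with the plain lpeg module rather than the re notation. The names good, bad,
and the "malformed"/"text" fields are hypothetical choices for this example,
not part of LPeg; the anchor point is assumed to be the next newline.

```lua
local lpeg = require "lpeg"
local P, S, C, Cs, Ct, Cg, Cc =
  lpeg.P, lpeg.S, lpeg.C, lpeg.Cs, lpeg.Ct, lpeg.Cg, lpeg.Cc

local eol    = P"\n" + -1                     -- newline or end of subject
local quoted = '"' * Cs(((P(1) - '"') + P'""' / '"')^0) * '"'
local plain  = C((1 - S',\n"')^0)
local field  = quoted + plain

-- A well-formed record: comma-separated fields up to the end of the line.
local good = Ct(field * (',' * field)^0) * eol

-- The error case (assumption: resynchronize at the next newline).
-- It keeps the raw text so the caller can emit a warning.
local bad = Ct(Cg(Cc(true), "malformed") * Cg(C((1 - P"\n")^0), "text")) * eol

-- #P(1) requires at least one remaining character, so the loop below
-- can never match the empty string.
local record = #P(1) * (good + bad)
local csv    = Ct(record^0) * -1

local input = 'a,b,c\n1,"x""y",3\nbad,"unterminated\nlast,row'
local rows = csv:match(input)

for i, row in ipairs(rows) do
  if row.malformed then
    print(("warning: record %d is malformed: %s"):format(i, row.text))
  else
    print(i, table.concat(row, " | "))
  end
end
```

Because the error case is tried only after the well-formed alternative
fails, valid records are unaffected, and each bad line costs one extra
scan up to the anchor.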


> What I would like to do with Lpeg is the following:
> subject = read a chunk of N=4096 bytes of my big CSV file; when |!.|
> matches in the defined grammar (i.e., 'end of subject'), use a Lpeg
> callback to see if:
> - more input is needed (because the record is not matched yet)
> - or if the match was successful
> - or if it is the real end of subject (ie. 'end of file')
> Is that possible with current version of Lpeg or is another way of
> solving this planned in a near future?

I don't see any problem in doing that, but you have to manage the buffer
yourself. That is, when matching the initial part of the buffer, use
a position capture to tell how far the match went. Then read another
piece, concatenate it with the unhandled part of the previous buffer,
and repeat.
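
The buffer management described above might look roughly like the sketch
below. It assumes a simplified record format (unquoted fields only) and a
hypothetical read_chunk function that returns the next piece of input or
nil at end of file; parse_chunks is likewise a name made up for this
example. The position capture lpeg.Cp() reports where the complete records
end, and everything after that point is carried over to the next round.

```lua
local lpeg = require "lpeg"
local S, C, Ct, Cp = lpeg.S, lpeg.C, lpeg.Ct, lpeg.Cp

-- Simplified record (assumption): unquoted, comma-separated fields.
local field = C((1 - S",\n")^0)
local line  = Ct(field * ("," * field)^0)

-- Match only records terminated by a newline, then report the position
-- just after the last one. This pattern always succeeds.
local complete = Ct((line * "\n")^0) * Cp()

-- The final record may lack a trailing newline at the real end of file.
local last = line * -1

local function parse_chunks(read_chunk)
  local records, buffer = {}, ""
  while true do
    local chunk = read_chunk()
    if chunk then buffer = buffer .. chunk end
    local recs, pos = complete:match(buffer)
    for _, r in ipairs(recs) do records[#records + 1] = r end
    buffer = buffer:sub(pos)          -- keep the unhandled tail
    if not chunk then break end       -- real end of input
  end
  if buffer ~= "" then
    records[#records + 1] = last:match(buffer)
  end
  return records
end

-- Usage with an in-memory reader that splits records across chunks.
local pieces, i = { "a,b\nc", ",d\ne,", "f" }, 0
local result = parse_chunks(function() i = i + 1; return pieces[i] end)
for _, r in ipairs(result) do print(table.concat(r, " | ")) end
```

With a real file, the reader could be something like
function() return f:read(4096) end, where f is a handle from io.open.
Rescanning the leftover tail on every round is wasteful if a single record
can span many chunks, but for line-sized records the tail stays short.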

-- Roberto

Follow-Ups:
- Re: Lpeg and malformed input / Lpeg and subjects that do not fit into memory, Mark Meijer

References:
- Lpeg and malformed input / Lpeg and subjects that do not fit into memory, Aladdin Lampé
