lua-l archive

On Thu, Feb 02, 2012 at 11:50:34AM +0900, Miles Bader wrote:
> Tony Finch <dot@dotat.at> writes:
> > Miles Bader <miles@gnu.org> wrote:
> >> [Hmm, that brings to mind another question:  How much of the input
> >> string does it have to accumulate in memory?  Can it start discarding
> >> earlier portions of it at some point?  If not, it wouldn't be so useful
> >> for the use I have in mind: parsing really big files without needing to
> >> have them entirely in memory.]
> >
> > lpeg doesn't bother trying to discard unneeded string prefixes. In theory
> > it can only discard the prefix of a string that is not covered by any
> > captures and which has no alternation backtracking points in it.
> 
> I suspect that in many cases, backtracking doesn't cover much of the
> file, or the grammar can be arranged so that this is the case...
> 
> [I don't know how easy the implementation makes _detecting_ when parts
> of the text are discardable though...]
> 
> As has been discussed elsewhere on this thread, the capture issue could
> probably be sorted out, e.g. by lazy conversion of capture contents into
> real Lua strings.
> 

Perhaps the engine could ask the buffer object for a window. If the object
can't provide the window (because it has already discarded that data), the
match bails. In addition, add a new match-time capture type, similar to
lpeg.Cmt, which immediately internalizes string captures and then runs a
function that could be used to discard data in the buffer.
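
To make the idea concrete, here is a rough sketch of the buffer side, written
in Python rather than against lpeg's actual C internals. Everything in it is
hypothetical: WindowBuffer, window(), discard() and commit_capture() are names
made up for illustration, not anything lpeg provides. commit_capture() only
stands in for the proposed match-time capture: it copies the captured text out
right away, hands it to a callback, and then lets the buffer drop the prefix.

```python
class WindowBuffer:
    """Hypothetical sliding buffer over an input that arrives in chunks."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)  # source of string chunks
        self._data = ""              # text currently retained
        self._base = 0               # absolute offset of self._data[0]

    def _fill(self, end):
        # Pull chunks until the buffer reaches absolute offset `end`,
        # or the input runs out.
        while self._base + len(self._data) < end:
            chunk = next(self._chunks, None)
            if chunk is None:
                break
            self._data += chunk

    def window(self, start, end):
        """Return the text in [start, end), or None if it was discarded."""
        if start < self._base:
            return None  # the engine would have to bail here
        self._fill(end)
        return self._data[start - self._base:end - self._base]

    def discard(self, upto):
        """Drop everything before absolute offset `upto`."""
        if upto > self._base:
            self._fill(upto)
            drop = min(upto - self._base, len(self._data))
            self._data = self._data[drop:]
            self._base += drop


def commit_capture(buf, start, end, callback):
    """Stand-in for the proposed match-time capture (not an lpeg API)."""
    text = buf.window(start, end)
    if text is None:
        raise LookupError("window already discarded")
    callback(text)    # the capture is a real string from here on
    buf.discard(end)  # nothing before `end` can be revisited now
    return end


buf = WindowBuffer(["hello ", "world ", "again"])
seen = []
pos = commit_capture(buf, 0, 5, seen.append)
```

After the call, the buffer holds only what follows offset 5, so a later
request for a window starting before that offset gets None back. That is the
"bail" case: the grammar would have to be arranged so that no backtracking
point reaches behind a committed capture.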