- Subject: Re: Questions about Lpeg (semantics of captures)
- From: roberto@... (Roberto Ierusalimschy)
- Date: Wed, 14 Mar 2007 16:46:36 -0300
> I still don't understand lpeg very well, and I have the (naive?)
> impression that patterns-with-captures are implemented on top of
> patterns-without-captures in a way that even allows "projecting" a
> pattern-with-captures into the lower level, by discarding all the
> information about captures... also, matching a pattern-with-captures
> involves some backtracking, and some operations on the captures - like
> "patt / function" - should only be performed after the (super)pattern
> succeeds; so, in a first moment lpeg.match keeps backtracking
> information and instructions for performing captures; at some point
> the pattern is "closed", the backtracking information is dropped, and
> the instructions for performing captures are executed...
>
> Is that mental model correct?
Yes, although I think it is not "healthy" to use captures with side
effects (that is, that depend on when they are executed).
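
A minimal sketch of the delayed evaluation described above, assuming the
lpeg module is installed. The names count, calls and patt are made up for
this illustration. The function attached with "/" runs only once the whole
match has succeeded, so a later failure means it is never called:

```lua
local lpeg = require("lpeg")

-- Side-effecting function used only to observe when the capture runs.
local calls = 0
local function count() calls = calls + 1 end

-- "a" carrying a function capture, followed by a mandatory "b".
local patt = (lpeg.P("a") / count) * lpeg.P("b")

-- "b" is missing, so the whole match fails and count is never called.
local failed = lpeg.match(patt, "ac")
local calls_after_failure = calls

-- The whole pattern succeeds, so the pending capture is evaluated once.
local ok = lpeg.match(patt, "ab")
local calls_after_success = calls

print(failed, calls_after_failure, calls_after_success)
```

This is why a capture whose effect depends on when it runs is fragile: the
call happens after matching is over, not at the point where "a" was read.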
> Is there a way to force a subpattern to be closed, and its captures
> performed?
Not now.
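
For readers on a later LPeg release: the reference manual there documents
a match-time capture, lpeg.Cmt(patt, function), which evaluates its nested
captures immediately and then calls the function with the subject, the
current position and the captured values. The sketch below is an assumption
about how the head-symbol test could look with it. It is not part of the
original exchange, and it assumes a symbol is one or more symbol characters:

```lua
local lpeg = require("lpeg")

local headsymbols = { info = true, man = true }

-- Assumption: a symbol is a run of one or more symbol characters.
local SSymbol = (lpeg.R("AZ", "az", "09") + lpeg.S("-+_"))^1

-- lpeg.C captures the symbol text; lpeg.Cmt forces that capture to be
-- evaluated right away and passes it to the function as `symbol`.
-- Returning true accepts the match at the current position (after the
-- symbol); returning false makes the pattern fail.
local SHeadSymbol = lpeg.Cmt(lpeg.C(SSymbol), function (subj, pos, symbol)
  return headsymbols[symbol] == true
end)

print(lpeg.match(SHeadSymbol, "info"))
print(lpeg.match(SHeadSymbol, "foo"))
```

No global "mark" variable is needed in this form, since the matched text
arrives as an argument.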
> Now let me show why I stumbled on that question, and why I was
> somewhat surprised when I discovered that the execution of the
> function in "patt / function" is delayed.
>
> [...]
>
> but then I discovered that the "/ setsymbol" part was being
> executed after the "lpeg.P(isheadsymbol)", not before...
>
> My current solution (which works!) is like this - again, I'm
> reconstructing this from memory; the real implementation is more
> complex:
>
> SSymbol = lpeg.R("AZ", "az", "09") + lpeg.S("-+_")
>
> headsymbols = { ["info"]=true, ["man"]=true }
>
> setmark = function (subj, pos)
>   mark = pos
>   return pos
> end
> isheadsymbol = function (subj, pos)
>   local symbol = string.sub(subj, mark, pos - 1)
>   return headsymbols[symbol] and pos
> end
>
> SHeadSymbol = lpeg.P(setmark) * SSymbol * lpeg.P(isheadsymbol)
This is one option. Another option is to write a function that matches
the whole SHeadSymbol inside it (by calling lpeg again to match
SSymbol). Something like this:
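SHeadSymbol = lpeg.P(function (subj, pos)
  -- e is the position just after the symbol, or nil if none matches
  local e = lpeg.match(SSymbol, subj, pos)
  if not e then return nil end
  local symbol = string.sub(subj, pos, e - 1)
  -- return the end position so that the symbol is consumed
  return headsymbols[symbol] and e
end)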
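
A self-contained way to try this approach, assuming the lpeg module is
installed. Two details here are assumptions rather than part of the message
above: SSymbol is repeated with ^1 so that it matches a whole word, and the
function returns the end position e so that the symbol is consumed, as in
the setmark/isheadsymbol version:

```lua
local lpeg = require("lpeg")

local headsymbols = { info = true, man = true }

-- Assumption: a symbol is a run of one or more symbol characters.
local SSymbol = (lpeg.R("AZ", "az", "09") + lpeg.S("-+_"))^1

local SHeadSymbol = lpeg.P(function (subj, pos)
  -- Nested call: match a symbol starting at the current position.
  local e = lpeg.match(SSymbol, subj, pos)
  if not e then return nil end
  local symbol = string.sub(subj, pos, e - 1)
  -- Succeed (consuming the symbol) only for known head symbols.
  return headsymbols[symbol] and e
end)

-- Wrap in lpeg.C to see which text was consumed.
print(lpeg.match(lpeg.C(SHeadSymbol), "man ls"))
print(lpeg.match(SHeadSymbol, "ls"))
```

Because everything happens inside one function call, nothing depends on the
order in which separate captures are evaluated.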
-- Roberto