- Subject: Re: LPEG and indentation based syntax
- From: "Francisco Sant'anna" <francisco.santanna@...>
- Date: Mon, 23 Apr 2007 00:33:09 -0300
What about this use, for nested HTML tags?
-- Bs/Br are the hypothetical functions proposed in the quoted message; m is lpeg
tag = {
  '<' * Bs(1 - m.P'>') * '>'                -- matches any <tag1>
  * ( 1                                     -- followed by any char
      - (#('<' * (1 - m.P'/')) * m.V(1))    -- unless opening another <tag2>
      - ('</' * Br(1) * '>')                -- or closing it with </tag1>
    )^0
}
On 4/16/07, Roberto Ierusalimschy <roberto@inf.puc-rio.br> wrote:
Conventional backreferences (indexed from the beginning of the pattern)
are not very meaningful in LPeg, but I guess that a backreference indexed
from the beginning of the rule could solve your problem.
Assume that lpeg.Bs(patt) stores a backreference to the substring that
matched 'patt', and that lpeg.Br(n) matches the 'n'-th stored substring
in the current rule. Then I think the next pattern would do your trick:
-- Not tested (as lpeg does not support Bs-Br...)
S = lpeg.P" "
line = -S * (1 - lpeg.P"\n")^0 * "\n"
block = {  -- defines an indented block
  lpeg.Bs(S^0) * line                   -- capture indentation of first line
  * ( lpeg.Br(1) * line                 -- block may contain lines with the same indentation
    + #(lpeg.Br(1) * S^1) * lpeg.V(1)   -- or indented blocks
    )^0
}
I am not much in favor of adding backreferences to LPeg. But, if that
code is correct, it would be the second example of a "useful" application
for them (the first being long strings in Lua).
-- Roberto
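
Since lpeg.Bs and lpeg.Br are only a proposal in the message above, the 'block' rule cannot be run as written. The following is a minimal sketch in plain Lua (no LPeg) that emulates what the rule appears to describe, assuming Bs stores the matched indentation in a local variable and Br compares the subject against it. The names match_line and match_block are made up for this illustration; like lpeg.match, each returns the position after the match, or nil on failure.

```lua
-- Hand-written emulation of the hypothetical rule
--   block <- Bs(S^0) line ( Br(1) line / &(Br(1) S^1) block )^0
-- This is an illustration of the proposed semantics, not LPeg code.

-- line = -S * (1 - "\n")^0 * "\n"
local function match_line(s, pos)
  if s:sub(pos, pos) == " " then return nil end   -- -S: must not start with a space
  local nl = s:find("\n", pos, true)              -- plain search for the newline
  if not nl then return nil end
  return nl + 1
end

local function match_block(s, pos)
  -- Bs(S^0): remember the indentation of the first line of the block
  local indent = s:match("^ *", pos)
  pos = match_line(s, pos + #indent)
  if not pos then return nil end
  while true do
    -- Br(1): the next line must start with the stored indentation
    if s:sub(pos, pos + #indent - 1) ~= indent then break end
    local after = pos + #indent
    local next_pos
    if s:sub(after, after) == " " then
      -- &(Br(1) S^1) block: deeper indentation starts a nested block,
      -- which stores its own (longer) indentation
      next_pos = match_block(s, pos)
    else
      -- Br(1) line: same indentation, one more line of this block
      next_pos = match_line(s, after)
    end
    if not next_pos then break end
    pos = next_pos
  end
  return pos
end

local src = "a\n  b\n  c\n    d\n  e\nf\n"
print(match_block(src, 1) == #src + 1)   -- the whole input is one block
print(match_block("  x\n  y\nz\n", 1))   -- stops at the dedented line "z"
```

The stored indentation lives in a local of each match_block call, which is one way to read "indexed from the beginning of the rule": a nested block gets its own stored substring and the outer one is visible again when the nested call returns.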