- Subject: Re: [LPeg] How can I make a recursive pattern?
- From: "Soni L." <fakedme@...>
- Date: Wed, 27 Jul 2016 14:26:20 -0300
On 27/07/16 02:15 PM, Sean Conner wrote:
It was thus said that the Great Soni L. once stated:
On 27/07/16 12:34 PM, Patrick Donnelly wrote:
Soni,
On Wed, Jul 27, 2016 at 10:27 AM, Soni L. <fakedme@gmail.com> wrote:
Sean Conner's incomplete solution uses lpeg.Carg(1) which doesn't work
recursively.
At the time, I wasn't sure of the output format and I had assumed (badly,
I see) that you wanted something like:
{
key = value
}
where both key and value are specified in the text you are parsing. Doing
that in LPeg isn't straightforward and thus, lpeg.Carg() was used to pass in
a resulting table. Given the example, it's both easy and hard. Easy,
because collecting results into an array is easy in LPeg:
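  local lpeg = require "lpeg"
  a = lpeg.P"a"
  r = lpeg.Ct(lpeg.C(a)^0)   -- collect every captured "a" into one array
  x = r:match "aaaaaaaa"     -- x holds one "a" string per character matched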
Even recursively:
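  nested = lpeg.P {
    "start",
    start = lpeg.Ct((lpeg.V"a" + lpeg.V"list")^0),
    a     = lpeg.C(lpeg.P"a"),
    list  = lpeg.P"("
          * lpeg.V"start"
          * lpeg.P")",
  }
  -- "start" already builds a table, so this outer Ct wraps it once more:
  -- the nested array ends up in x[1]
  r = lpeg.Ct(nested)
  x = r:match "aaa(aaaa)aaa(a)aaa"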
Hard, because it appears you want to start with index [0], which makes it
... an interesting problem (there doesn't appear to be an easy way to get to
the table that lpeg.Ct() creates, which is why this is an interesting
problem). Perhaps some combination of lpeg.Cg() with lpeg.P(true) divided
by a function that returns a new table and lpeg.Cb() but it's something
you'll have to experiment with.
Or you could relax the [0] index requirement ...
-spc (You're in luck ... work is slow today ... )
The [0] is only for the tag name.
Also I'm currently looking into something like this:
lpeg.Cg(tag, 0) * lpeg.Cg(lpeg.Ct(lpeg.Cg(lpeg.Cb(0), "tag")), 0)
This puts the tag name in a table like {tag=<tagname>}. I just need to
make it only do that if there's a namespace.
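As a minimal sketch of how that pattern behaves once it is wrapped in lpeg.Ct(): this assumes LPeg 1.0 or later (where a group name can be any non-nil Lua value, including 0) and a hypothetical `tag` pattern that matches "#" plus a name and captures only the name. The real tag syntax may well differ.

```lua
local lpeg = require "lpeg"

-- Hypothetical tag pattern (an assumption, not the real grammar):
-- "#" followed by letters, capturing only the letters.
local tag = lpeg.P"#" * lpeg.C(lpeg.R("az", "AZ")^1)

local element = lpeg.Ct(
    -- first, store the tag name under key 0
    lpeg.Cg(tag, 0)
    -- then build {tag = <tagname>} from the back reference and store that
    -- table under key 0 as well; the later group replaces the earlier one
  * lpeg.Cg(lpeg.Ct(lpeg.Cg(lpeg.Cb(0), "tag")), 0)
)

local t = element:match "#Tag"
print(type(t[0]), t[0].tag)
```

Both groups are named 0, so the table built by the second one is what ends up in t[0].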
Only issue is that the tag namespace could come right after the tag name
or after all attributes (or, at least in theory, even in the middle of
content...).
In other words, this:
#Tag
> Content
####TagNS
is equivalent to this:
#Tag
####TagNS
> Content
no matter the content. That is, if there are more tags in the content,
they cannot cause any interference with how this is parsed.
(*Technically*, doing that is undefined. For all you know, that document
could set your computer on fire. But, you know, w/e.)
No idea how to handle that.
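One possible direction, sketched under heavy assumptions: named groups inside lpeg.Ct() are stored by key, so the position of the namespace line does not change the resulting table. The line formats below ("#Name", "####Name", "> text") and every pattern name are guesses made for illustration only. Content is treated as flat text, and nested tags are not handled.

```lua
local lpeg = require "lpeg"
local P, R, C, Cg, Ct = lpeg.P, lpeg.R, lpeg.C, lpeg.Cg, lpeg.Ct

-- All of these line formats are assumptions for the sake of the example.
local nl      = P"\n"
local name    = C(R("az", "AZ")^1)
local tag     = P"#" * name * nl^-1            -- "#Tag"
local ns      = P"####" * name * nl^-1         -- "####TagNS"
local content = P"> " * C((1 - nl)^0) * nl^-1  -- "> Content"

-- The namespace may appear anywhere among the content lines. Because it is
-- a named group, it lands in the "ns" field wherever it was matched.
local element = Ct(Cg(tag, "tag") * (Cg(ns, "ns") + content)^0) * -1

local a = element:match "#Tag\n> Content\n####TagNS"
local b = element:match "#Tag\n####TagNS\n> Content"
print(a.tag, a.ns, a[1])
print(b.tag, b.ns, b[1])
```

Both inputs produce a table with the same tag, ns and content fields. Making this robust against tags nested inside the content would need a recursive grammar, which this sketch does not attempt.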
--
Disclaimer: these emails may be made public at any given time, with or without reason. If you don't agree with this, DO NOT REPLY.