- Subject: Proposal for new LPeg function: lpeg.Ctab()
- From: Sean Conner <sean@...>
- Date: Tue, 13 Sep 2016 14:59:13 -0400
Seeing how there's going to be a bug fix for LPeg Real Soon Now (TM), I
thought it might be a good time to float a proposal for a new lpeg function.
Some background: I parse a lot of Internet related messages and URLs
(email and SIP messages, sip:, tel:, http: and https: URLs, etc.) and it's
amazing how often name/value pairs keep popping up. Usually there are a
fixed number of defined name/value pairs but the grammars almost always
allow user defined pairs. Since I use LPeg for all of my parsing needs, I
like to parse the data into Lua tables and the most problematic part is
handling open-ended name/value pairs.
Let me give a simplified example: a file of name/value pairs (alpha
characters only; I want to keep things really simple), one per line, with
name and value separated by an '=' sign; order does not matter. Two fields
are defined, "foo" and "bar", and each gets a default value when it is not
provided. Two examples follow:
Example 1:
foo=de
bar=true
alpha=Sean
bravo=Conner
Example 2:
yankee=Sean
zulu=Conner
foo=se
For Example 1, I would prefer to return a table like:
{
  foo = "de",
  bar = "true",
  other =
  {
    alpha = "Sean",
    bravo = "Conner"
  }
}
It is not easy to get that. If I do (assume everything defined):
-- for foo and bar, assume more error checking than you see here
foo = P"foo" * EQ * Cg(value,"foo") * EOL
bar = P"bar" * EQ * Cg(value,"bar") * EOL
other = C(name) * EQ * C(value) * EOL
list = Ct( -- CAPTURE INTO A TABLE
Cg(Cc"en","foo") -- DEFAULT VALUE
* Cg(Cc"false","bar") -- DEFAULT VALUE
* (foo + bar + other)^0
)
I get:
{
[1] = "alpha",
[2] = "Sean",
[3] = "bravo",
[4] = "Conner",
bar = "true",
foo = "de"
}
Nothing at all like what I want. The next solution is to use a folding capture:
foo = Cg(C"foo" * EQ * C(value)) * EOL
bar = Cg(C"bar" * EQ * C(value)) * EOL
other = Cg(C(name) * EQ * C(value)) * EOL
list = Cf( -- FOLDING CAPTURE
Ct(Cc()) -- SEE [1]
* Cg(Cc "foo" * Cc "en")
* Cg(Cc "bar" * Cc "false")
* (foo + bar + other)^0,
function(t,n,v)
t[n] = v
return t
end
)
It's closer to what I want, and certainly usable:
{
bravo = "Conner",
bar = "true",
alpha = "Sean",
foo = "de"
}
and yes, I can complicate the folding function to stuff the non-standard
headers into a sub table, but honestly, I'd rather not do that.
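For reference, here is a minimal sketch of what that more complicated folding function might look like. The `known` lookup table and the single `pair` pattern are assumptions made for illustration (the per-field error checking mentioned earlier is left out), and the initial accumulator is built with Ct(Cg(Ct(Cc()),"other")) so that every match starts with its own empty `other` sub table.

```lua
local lpeg = require "lpeg"
local C, Cc, Cf, Cg, Ct, P, R =
      lpeg.C, lpeg.Cc, lpeg.Cf, lpeg.Cg, lpeg.Ct, lpeg.P, lpeg.R

local ALPHA = R("AZ","az")
local EQ    = P"="
local EOL   = P"\n"
local name  = ALPHA^1
local value = ALPHA^1

-- hypothetical lookup table: the names the format defines
local known = { foo = true, bar = true }

-- every line yields one anonymous group holding (name, value)
local pair = Cg(C(name) * EQ * C(value)) * EOL

local list = Cf(
    Ct(Cg(Ct(Cc()),"other"))     -- accumulator: { other = {} }, new per match
  * Cg(Cc "foo" * Cc "en")       -- default value
  * Cg(Cc "bar" * Cc "false")    -- default value
  * pair^0,
  function(t,n,v)
    if known[n] then
      t[n] = v                   -- defined field: store at the top level
    else
      t.other[n] = v             -- anything else: store in the sub table
    end
    return t
  end
)

local x = list:match("foo=de\nbar=true\nalpha=Sean\nbravo=Conner\n")
local y = list:match("yankee=Sean\nzulu=Conner\nfoo=se\n")
```

The cost is that the fold function now has to know which names are defined, which duplicates knowledge the grammar already has.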
I *can* get what I want with Carg():
function set(t,name,val) t[name] = val end
function set2(t,name,val) t.other[name] = val end
foo = Cg(Carg(1) * C"foo" * EQ * C(value)) / set * EOL
bar = Cg(Carg(1) * C"bar" * EQ * C(value)) / set * EOL
other = Cg(Carg(1) * C(name) * EQ * C(value)) / set2 * EOL
list = Cg(Carg(1) * Cc "foo" * Cc "en") / set
* Cg(Carg(1) * Cc "bar" * Cc "false") / set
* Cg(Carg(1) * Ct(Cc())) / function(t,h) t.other = h end
* (foo + bar + other)^0
* Carg(1)
but at the expense of a more complicated invocation:
x = list:match(data,1,{})
instead of nicer (to me):
x = list:match(data)
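One way to keep call sites short with the Carg() version is to hide the extra arguments behind a small wrapper. The sketch below repeats the grammar so that it runs on its own; `parse` is a hypothetical name, not something LPeg provides, and the wrapper only moves the awkward call into one place.

```lua
local lpeg = require "lpeg"
local C, Carg, Cc, Cg, Ct, P, R =
      lpeg.C, lpeg.Carg, lpeg.Cc, lpeg.Cg, lpeg.Ct, lpeg.P, lpeg.R

local ALPHA = R("AZ","az")
local EQ    = P"="
local EOL   = P"\n"
local name  = ALPHA^1
local value = ALPHA^1

local function set(t,n,v)  t[n] = v       end
local function set2(t,n,v) t.other[n] = v end

local foo   = Cg(Carg(1) * C"foo"  * EQ * C(value)) / set  * EOL
local bar   = Cg(Carg(1) * C"bar"  * EQ * C(value)) / set  * EOL
local other = Cg(Carg(1) * C(name) * EQ * C(value)) / set2 * EOL
local list  = Cg(Carg(1) * Cc "foo" * Cc "en")    / set
            * Cg(Carg(1) * Cc "bar" * Cc "false") / set
            * Cg(Carg(1) * Ct(Cc())) / function(t,h) t.other = h end
            * (foo + bar + other)^0
            * Carg(1)

-- hypothetical wrapper: supplies a new result table on every call
local function parse(data)
  return list:match(data,1,{})
end

local x = parse("yankee=Sean\nzulu=Conner\nfoo=se\n")
print(x.foo, x.bar, x.other.yankee, x.other.zulu)
```

This does nothing for the grammar itself, which still has to thread Carg(1) through every rule.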
Finally, we get to the proposal: an LPeg function that returns the table
created by the enclosing Ct(), which I'm calling Ctab() (but I'm not wedded
to that name). It would work like:
foo = P"foo" * EQ * Cg(value,"foo") * EOL
bar = P"bar" * EQ * Cg(value,"bar") * EOL
other = Cg(Ctab() * C(name) * EQ * C(value))
-- ^^^^^^ return table created by Ct()
/ function(t,n,v)
t.other[n] = v
end
* EOL
list = Ct(
Cg(Cc"en","foo")
* Cg(Cc"false","bar")
* Cg(Ct(Cc()),"other")
* (foo + bar + other)^0
)
While I could probably add the function myself, I'd rather not have to
rely upon a custom version of LPeg for parsing modules I write.
-spc (I'm including working examples of the above)
[1] Doing a Cc{} doesn't work here, as that always returns the *same*
table across different parses (the table is created when the pattern
is built and returned at match time). A bare Ct() fails as it expects
a pattern; thus, the call Ct(Cc()). Cc() with no arguments returns a
pattern (satisfies Ct()) that matches the empty string and produces no
capture values, so the new table starts out empty.
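The difference described in [1] can be checked directly. This is a small sketch; the names `shared` and `fresh` are made up for the illustration.

```lua
local lpeg = require "lpeg"
local Cc, Ct = lpeg.Cc, lpeg.Ct

local shared = Cc{}       -- the table is built once, when the pattern is built
local fresh  = Ct(Cc())   -- a table is built anew on every match

local a, b = shared:match("x"), shared:match("y")
local c, d = fresh:match("x"),  fresh:match("y")

print(a == b)   -- true:  both matches return the very same table
print(c == d)   -- false: each match returns its own table
```

With Cc{}, anything stored in the table during one parse would still be there on the next one.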
-- Working example 1: Ct() with named group captures
lpeg = require "lpeg"
Cg = lpeg.Cg
Ct = lpeg.Ct
Cc = lpeg.Cc
C = lpeg.C
P = lpeg.P
R = lpeg.R
function dump(name,t)
print(name)
for n,v in pairs(t) do
print("",n,v)
end
print()
end
test1 = [[
foo=de
bar=true
alpha=Sean
bravo=Conner
]]
test2 = [[
yankee=Sean
zulu=Conner
foo=se
]]
ALPHA = R("AZ","az")
EQ = P"="
EOL = P"\n"
name = ALPHA^1
value = ALPHA^1
foo = P"foo" * EQ * Cg(value,"foo") * EOL
bar = P"bar" * EQ * Cg(value,"bar") * EOL
other = C(name) * EQ * C(value) * EOL
list = Ct(
Cg(Cc"en","foo")
* Cg(Cc"false","bar")
* (foo + bar + other)^0
)
x = list:match(test1)
dump("x",x)
x = list:match(test2)
dump("x",x)
-- Working example 2: folding capture, Cf()
lpeg = require "lpeg"
Ct = lpeg.Ct
Cg = lpeg.Cg
Cf = lpeg.Cf
Cc = lpeg.Cc
C = lpeg.C
P = lpeg.P
R = lpeg.R
function dump(name,t)
print(name)
for n,v in pairs(t) do
print("",n,v)
end
print()
end
test1 = [[
foo=de
bar=true
alpha=Sean
bravo=Conner
]]
test2 = [[
yankee=Sean
zulu=Conner
foo=se
]]
ALPHA = R("AZ","az")
EQ = P"="
EOL = P"\n"
name = ALPHA^1
value = ALPHA^1
foo = Cg(C"foo" * EQ * C(value)) * EOL
bar = Cg(C"bar" * EQ * C(value)) * EOL
other = Cg(C(name) * EQ * C(value)) * EOL
list = Cf(
Ct(Cc())
* Cg(Cc "foo" * Cc "en")
* Cg(Cc "bar" * Cc "false")
* (foo + bar + other)^0,
function(t,n,v)
t[n] = v
return t
end
)
x = list:match(test1)
dump("x",x)
x = list:match(test2)
dump("x",x)
-- Working example 3: argument capture, Carg()
lpeg = require "lpeg"
Carg = lpeg.Carg
Cg = lpeg.Cg
Ct = lpeg.Ct
Cc = lpeg.Cc
C = lpeg.C
P = lpeg.P
R = lpeg.R
function dump(name,t)
print(name)
for n,v in pairs(t) do
print("",n,v)
end
print()
end
test1 = [[
foo=de
bar=true
alpha=Sean
bravo=Conner
]]
test2 = [[
yankee=Sean
zulu=Conner
foo=se
]]
ALPHA = R("AZ","az")
EQ = P"="
EOL = P"\n"
name = ALPHA^1
value = ALPHA^1
function set(t,name,val)
t[name] = val
end
function set2(t,name,val)
t.other[name] = val
end
foo = Cg(Carg(1) * C"foo" * EQ * C(value)) / set * EOL
bar = Cg(Carg(1) * C"bar" * EQ * C(value)) / set * EOL
other = Cg(Carg(1) * C(name) * EQ * C(value)) / set2 * EOL
list = Cg(Carg(1) * Cc "foo" * Cc "en") / set
* Cg(Carg(1) * Cc "bar" * Cc "false") / set
* Cg(Carg(1) * Ct(Cc())) / function(t,h) t.other = h end
* (foo + bar + other)^0
* Carg(1)
x = list:match(test1,1,{})
dump("x",x)
x = list:match(test2,1,{})
dump("x",x)