- Subject: Re: yet another pattern-matching library for Lua
- From: Philippe Lhoste <PhiLho@...>
- Date: Wed, 03 Jan 2007 11:26:45 +0100
Roberto Ierusalimschy wrote:
> I am releasing a prototype of (yet) another pattern-matching
> library for Lua, called LPeg:
Very interesting. I printed out your page and the Wikipedia entry and
read them twice to absorb the concepts...
Funnily enough, I was recently thinking it would be nice to have a real
pattern-matching language, one that is more flexible and more
readable, with a gentler learning curve than regular expressions.
I am not too sure about the readability of LPeg... But it is close to
what I had in mind.
I didn't know about this PEG technology; it looks very promising.
Now that I have mastered regular expressions, I have to learn a new syntax
and even more new concepts and ways of thinking...
Some remarks (based on the doc; I haven't tried the library yet):
- I understand that the LPeg syntax leverages metamethods, and thus is
limited by what they can do. It is a bit annoying that this forces a
departure from the "traditional" PEG syntax, although it is still logical.
Too bad that operator precedence rules prevent you from keeping / as
ordered choice and .. as sequence.
And by using #, you limit the library to Lua 5.1.
- I wondered how to emulate the {n,m} syntax of REs. It seems that patt^n
is close to that. Perhaps for newbies like me, you should provide more
examples, something with both bounds, if possible. A classical example in
REs is [a-z]{2,6} for TLDs (perhaps with a larger upper bound).
I understand that, per the doc, we have:
^n -> {n,}
^-n -> {0,n}
^0 -> {0,} -> *
^1 -> {1,} -> +
^-1 -> {0,1} -> ?
so, to have patt{n,m}, can we write
patt^n * patt^(n - m)
? Is it possible to express this without writing the pattern twice? (or
is it a non-issue?)
- I suppose you plan to allow names (keys) as grammar table indexes, or
is it really hard to implement? Or is it again a non-issue? Oh, I see
now the last example using variables as rule names, so it is indeed a
non-issue.
- Looking at the CSV example, is (lpeg.P(1) - '"') the same as the form
used more often elsewhere, (1 - lpeg.P'"')? I suppose the rule here is to
have at least one pattern in the expression to trigger the metamethods.
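
A minimal sketch of the three LPeg questions above (bounded repetition,
named grammar rules, and the "at least one pattern" rule). It assumes a
current LPeg release rather than the prototype discussed in this thread,
and the helper rep and all variable names are hypothetical. One caveat:
since patt^n means "n or more" and PEG repetition does not backtrack,
patt^n * patt^(n - m) does not seem to enforce the upper bound; chaining n
mandatory copies before the optional part does.

```lua
-- Sketch only: assumes a current LPeg release is installed; the 2007
-- prototype discussed in this thread may behave differently.
local lpeg = require "lpeg"
local P, R, S, V = lpeg.P, lpeg.R, lpeg.S, lpeg.V

-- 1. Bounded repetition, patt{n,m}.
-- rep is a hypothetical helper, not part of LPeg: it chains n mandatory
-- copies of patt, then allows at most (m - n) more.
local function rep(patt, n, m)
  patt = P(patt)
  local result = P(true)                 -- always succeeds, consumes nothing
  for _ = 1, n do result = result * patt end
  if m > n then
    result = result * patt^(n - m)       -- negative power: at most m - n
  end
  return result
end

local letter = R("az")
local tld    = rep(letter, 2, 6) * -1          -- -1 anchors at end of subject
local greedy = letter^2 * letter^(2 - 6) * -1  -- the form asked about

-- 2. Named grammar rules: the first table entry names the initial rule.
local balanced = P{
  "group",
  group = "(" * ((1 - S("()")) + V("group"))^0 * ")",
}

-- 3. Coercion: at least one operand must already be a pattern.
local notquote_a = P(1) - '"'
local notquote_b = 1 - P('"')
-- With no pattern involved, Lua attempts plain arithmetic and raises an error.
local plain_ok = pcall(function() return 1 - '"' end)
```

With these definitions, lpeg.match(tld, "museum") returns 7 and
lpeg.match(tld, "toolong") returns nil, whereas the greedy form still
accepts "toolong". Both notquote patterns behave identically, and
plain_ok is false.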
Typo alert!
- where the /math/ occurs.
- we can use /to/ following transformer:
- the use of a dot to /denote/ concatenation.
(Not sure about this one: the verb seems to be used in a strange way, per
the WordReference translation/definition, but that may just be a
limitation of my own understanding of English...)
Overall, it is an interesting technique. I am starting to fantasize about
using such a parser to describe flexible yet fast and powerful lexers for
Scintilla, making it possible to add (or change) lexers without
recompiling the component...
I suppose your library (parser) uses the Lua license, like most others.
It could be a good starting point; I have seen no reference to a packrat
parser written in C...
--
Philippe Lhoste
-- (near) Paris -- France
-- http://Phi.Lho.free.fr
-- -- -- -- -- -- -- -- -- -- -- -- -- --