- Subject: LPeg capturing too much?
- From: Peter Odding <xolox@...>
- Date: Sat, 07 Apr 2007 07:25:00 +0200
Hi all! I’m trying to write an LPeg parser for semi-structured text à la Textile
and Markdown. It’s coming along very nicely: I think I’m almost done! However,
I’ve run into a pattern with LPeg that I can’t seem to get right. Quoting from
my command prompt:
> = lpeg.match(lpeg.P'*' * lpeg.C((1 - lpeg.S'\r\n\f*')^1) * '*', '*emphasis*')
emphasis*
Did I misinterpret the manual, or is this a bug? Quoting part of the summary
table from the LPeg manual:
    patt1 - patt2    Matches patt1 if patt2 does not match.
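A minimal sketch of that reading of the manual, assuming LPeg is installed and loadable with require "lpeg". The local names not_delim and emphasis are made up for illustration. It shows what the difference operator should do on its own, and the result the manual's description would suggest for the full pattern:

```lua
local lpeg = require "lpeg"

-- Any single character that is not a line break, form feed or asterisk.
local not_delim = 1 - lpeg.S'\r\n\f*'

-- Without captures, lpeg.match returns the position just after the match.
print(lpeg.match(not_delim, 'e'))   -- 2
print(lpeg.match(not_delim, '*'))   -- nil, the asterisk is excluded

-- Match both asterisks, but capture only the text between them.
local emphasis = lpeg.P'*' * lpeg.C(not_delim^1) * '*'
print(lpeg.match(emphasis, '*emphasis*'))   -- expected: emphasis
```

If the last line prints the trailing asterisk as well, as in the transcript above, that would point at the library rather than at the pattern.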
As you probably guessed, I’m trying to match the asterisks and their content but
only capture what’s between the asterisks. If I misinterpreted the manual, do
any of you know the correct way to capture what I have in mind? At the moment
I’m using a post-processing pass over the parse tree:
token[2] = token[2]:match '^(.-)%*?$'
to trim trailing delimiters like the asterisk, double asterisk and several quote
variants. I would, however, much prefer to match the correct input to begin
with, because it saves a lot of repetition and reduces the possibility of bugs.
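For reference, the trimming step can be tried on its own with plain Lua string patterns. This is only a sketch: the helper name trim_trailing_asterisk is hypothetical, and it covers just the single-asterisk case shown above, not the other delimiter variants.

```lua
-- Hypothetical helper wrapping the Lua pattern from the workaround above.
-- '.-' is a lazy match, '%*?' is an optional literal asterisk, and '$'
-- anchors at the end, so at most one trailing asterisk is dropped.
local function trim_trailing_asterisk(text)
  return (text:match '^(.-)%*?$')
end

print(trim_trailing_asterisk 'emphasis*')   -- emphasis
print(trim_trailing_asterisk 'emphasis')    -- emphasis
print(trim_trailing_asterisk 'strong**')    -- strong*
```

As the last call shows, this exact pattern removes only one asterisk, so each delimiter needs its own variant, which is the repetition I would like to avoid.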
Thanks in advance,
- Peter Odding