On Feb 16, 2018, at 8:26 PM, Albert Chan <albertmcchan@yahoo.com> wrote:

I think it was just luck that xpattern worked in my previous post.

After reading xpattern.lua a bit, I see that it does NOT do backtracking.
So the previous xpattern worked only because it had an 'and' anchor.

-- this will *NOT* work: X'(.*)' consumes everything -> always fails

p1 = X'(.*)' * (X'possibly' + 'likely' + 'definitely') * X'%s+(.*)'

Even with the 'and' anchor, xpattern was still wrong. This was the original xpattern:

p1 = X'(.*)and%s+' * (X'possibly' + 'likely' + 'definitely') * X'%s+(.*)'
p1_match = p1:compile()

It fails with text = "x and possibly y and z": the greedy (.*)and%s+ piece
runs to the last 'and', and nothing backtracks to the first one.
It was plain luck that it worked at all.
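
To make the failure concrete, here is a rough sketch in plain Lua. It is NOT
xpattern's actual code: match_pieces is a hypothetical helper that, by
assumption, imitates the behavior described above by matching each piece in
turn with an anchored string.find and never returning to an earlier piece.

```lua
-- Hypothetical helper (not part of xpattern): match the pieces one after
-- another, each anchored at the current position. An earlier piece is never
-- retried, so there is no backtracking across pieces.
local function match_pieces(text, pieces)
  local pos, caps = 1, {}
  for _, alternatives in ipairs(pieces) do
    local matched = false
    for _, alt in ipairs(alternatives) do
      -- "^" anchors the match at the init position `pos`
      local r = { text:find("^" .. alt, pos) }
      if r[1] then
        pos = r[2] + 1
        for i = 3, #r do caps[#caps + 1] = r[i] end  -- collect captures
        matched = true
        break
      end
    end
    if not matched then return nil end
  end
  return caps
end

local pieces = {
  { "(.*)and%s+" },
  { "possibly", "likely", "definitely" },
  { "%s+(.*)" },
}

-- One 'and': the greedy first piece happens to stop in the right place.
local ok = match_pieces("x and possibly y", pieces)
-- Two 'and's: the first piece stops after the last 'and', the word test
-- then sees "z", and the whole match fails.
local bad = match_pieces("x and possibly y and z", pieces)

print(ok and ok[1], ok and ok[2])  --> "x " and "y"
print(bad)                         --> nil
```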

Once you break up the problem into pieces, string.match loses its
ability to backtrack correctly (as the xpattern examples clearly showed).
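
By contrast, a single string.match call does backtrack across its whole
pattern. A minimal sketch, assuming it is acceptable to try one complete
pattern per word (Lua patterns have no alternation). The name match_whole and
the longest-prefix tie-break are illustrative choices, not part of xpattern
or lpeg.

```lua
local words = { "possibly", "likely", "definitely" }

-- Hypothetical helper: build one complete pattern per word so that
-- string.match can backtrack over the greedy (.*) by itself.
-- Assumption: when several words match, the longest prefix wins,
-- which mirrors a greedy first capture.
local function match_whole(text)
  local best_a, best_w, best_b
  for _, w in ipairs(words) do
    local a, b = text:match("^(.*)and%s+" .. w .. "%s+(.*)$")
    if a and (not best_a or #a > #best_a) then
      best_a, best_w, best_b = a, w, b
    end
  end
  return best_a, best_w, best_b
end

print(match_whole("x and possibly y and z"))  --> "x ", "possibly", "y and z"
```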

Mixing lpeg and the Lua string library could introduce the same type of bug.
Plain lpeg re is much better for the job (below, without my '>' patch).

z = re.compile[[ 'and' %s+ ('possibly' / 'likely' / 'definitely') %s+ ]]
p2 = re.compile("{(. (g <- &%z / .g))*} %z {.*}", {z=z})

For any %z match, it sets a backtrack mark just before the match
and keeps trying until it fails, THEN backtracks to the last match.
--> greedy match up to %z (without consuming %z)
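
The mark-and-continue idea can be imitated without lpeg. The sketch below is
not how lpeg works internally. It only assumes the same outcome is wanted:
remember the last position where the delimiter matches, then split there.
match_z and split_at_last_z are hypothetical names, and the word list is
copied from the z pattern above.

```lua
local words = { "possibly", "likely", "definitely" }

-- Hypothetical stand-in for %z: if the delimiter matches at `pos`,
-- return the position just past it, otherwise nil.
local function match_z(text, pos)
  for _, w in ipairs(words) do
    local _, e = text:find("^and%s+" .. w .. "%s+", pos)
    if e then return e + 1 end
  end
  return nil
end

-- Scan every position and keep a "mark" at the last place z matched.
-- Later matches overwrite earlier ones, giving the greedy split.
local function split_at_last_z(text)
  local mark, after
  for pos = 1, #text do
    local e = match_z(text, pos)
    if e then mark, after = pos, e end
  end
  if not mark then return nil end
  return text:sub(1, mark - 1), text:sub(after)
end

print(split_at_last_z("x and possibly y and z"))  --> "x ", "y and z"
```

This scans the whole text once per call, which is fine for short strings but
does more work than the lpeg version on long input.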