- Subject: Re: Possible pattern bug
- From: Coda Highland <chighland@...>
- Date: Sun, 17 Nov 2013 01:27:20 -0800
On Sat, Nov 16, 2013 at 1:48 PM, Liam Devine <liamdevine@oolua.org> wrote:
>
> On 16/11/13 21:22, Liam Devine wrote:
>>
>>
>>
>> On 16/11/13 20:59, Michael Savage wrote:
>>>
>>>
>>> Hi lua-l,
>>>
>>> ( "[a] [b] c" ):match( "%[(.-)%] c" )
>>> -> "a] [b"
>>>
>>> From http://www.lua.org/manual/5.1/manual.html#5.4.1 or
>>> http://www.lua.org/manual/5.2/manual.html#6.4.1:
>>>
>>>> a single character class followed by '-', which also matches 0 or more
>>>> repetitions of characters in the class. Unlike '*', these repetition
>>>> items will always match the shortest possible sequence;
>>>
>>>
>>> The shortest possible sequence for the above is "b". Is this a bug?
>>>
>>> Mike
>>>
>>
>> It is not, as it returns the first match [1]. If you wanted to capture
>> the last sequence in square brackets you could prefix it with a greedy
>> match:
>> ( "[a] [b] c" ):match( ".*%[(.-)%] c" )
>>
>> [1] http://www.lua.org/manual/5.2/manual.html#pdf-string.match
>
>
> I say "could" because it all depends on whether you know your data format:
> the more you know, the more specific you can be.
> - Single lower case character
> - Multiple lower case characters
> - Numbers and lower case characters
> - Upper and lower case characters
> ...
>
>> print( ( "[a] [b] c" ):match( "(%l)%] c$" ) )
> b
>> print( ( "[a] [bddd] c" ):match( "(%l-)%] c$" ) )
> bddd
>> print( ( "[a] [b1d2c] c" ):match( "([%l%d]-)%] c$" ) )
> b1d2c
>> print( ( "[a] [KJhhgGGj] c" ):match( "(%a-)%] c$" ) )
> KJhhgGGj
>
> As you can see I also like to anchor the match when possible.
>
> --
> Liam
>
From that perspective, the simplest modification of the original
pattern is to replace the .- with [^[]- so that it can't match any
intervening left brackets.
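
A minimal sketch of that change, using the string from Michael's post. The three-bracket string at the end is a made-up extra input, not something from the thread:

```lua
-- Subject string from the original post.
local s = "[a] [b] c"

-- Original pattern: the match starts at the leftmost "[" and ".-" grows
-- only as far as needed for "%] c" to succeed, so it runs across "] [".
print(s:match("%[(.-)%] c"))       --> a] [b

-- With "[" excluded from the class, the attempt at the first "[" fails
-- when it reaches the second "[", and the matcher retries further right.
print(s:match("%[([^[]-)%] c"))    --> b

-- Hypothetical input with three bracketed groups: only the group
-- directly before " c" can satisfy the pattern.
print(("[x] [y] [z] c"):match("%[([^[]-)%] c"))   --> z
```

Unlike the leading ".*" variant, this does not rely on backtracking from the end of the string; it simply refuses to let the capture cross a left bracket.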
/s/ Adam