- Subject: Re: quoting unquoted token?
- From: Duncan Cross <duncan.cross@...>
- Date: Wed, 5 Oct 2011 22:05:39 +0100
On Wed, Oct 5, 2011 at 9:29 PM, Martijn Hoekstra
<martijnhoekstra@gmail.com> wrote:
> Depends on how much sophistication you want. Something like this could
> probably be done fastest with a quick simple parser. Runtime
> performance wouldn't be great, but it would be fairly straightforward.
LPEG has already been suggested, and I would definitely agree with
that. However, if sticking to standard functions is preferable, here
is my attempt at a generic iterator-based tokenizer:
local function itokens_aux(str, startpos)
  -- First try a double-quoted token: capture the text between the quotes
  -- and the position just past the closing quote.
  local token, nextpos = string.match(str, '^%s*"(.-)"()', startpos)
  if not token then
    -- Otherwise try a bare alphanumeric word.
    token, nextpos = string.match(str, '^%s*(%w+)()', startpos)
  end
  return nextpos, token
end

function itokens(str)
  return itokens_aux, str, 1
end
Note that the first value returned by the iterator is the next position in
the string; the generic for passes it back to the iterator as the control
variable, but it is not useful inside the loop body, so it should be
ignored by assigning it to _.
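In other words, here is roughly the expansion the generic for performs
with these values (just a sketch, using the itokens/itokens_aux
definitions above):

-- equivalent of: for _, token in itokens('foo "bar baz" qux') do ... end
local f, s, ctrl = itokens('foo "bar baz" qux')
while true do
  local pos, token = f(s, ctrl)
  if pos == nil then break end   -- no more tokens
  ctrl = pos                     -- the position becomes the next startpos
  print(token)                   -- the loop body only needs the token
end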
So, an example of usage:
t = {}
for _, token in itokens [[foo "hello world" bar]] do
  t[#t+1] = [["]] .. token .. [["]]
end
print(table.concat(t, ' '))
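(For the input above this collects foo, hello world and bar, so the print
produces: "foo" "hello world" "bar".)

Since LPEG was mentioned, here is a rough sketch of the same tokenizer
written with LPEG, for comparison (just an illustration, assuming the lpeg
module is installed; the bare-word pattern mirrors the %w+ above):

local lpeg = require("lpeg")
local P, R, C, S, Ct = lpeg.P, lpeg.R, lpeg.C, lpeg.S, lpeg.Ct

local space  = S(" \t\r\n")^0
local quoted = P'"' * C((1 - P'"')^0) * P'"'   -- text between double quotes
local word   = C(R("az", "AZ", "09")^1)        -- bare alphanumeric token, like %w+
local tokens = Ct((space * (quoted + word))^0) * space * -1

local t = tokens:match([[foo "hello world" bar]])
-- t is { "foo", "hello world", "bar" }, or nil if the input doesn't fully match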
-Duncan