- Subject: Re: A citation on Lua
- From: Rici Lake <lua@...>
- Date: Tue, 19 Dec 2006 22:47:47 -0500
On 19-Dec-06, at 11:44 AM, Andrew Wilson wrote:
> Typo, in second timing loop the time function should use strsplit.
> time(splitstr, ("a"):rep(i), ",") should be
> time(strsplit, ("a"):rep(i), ",")

Quite right, sorry.

> By the way, great code, how many versions of this function can this list
> produce?

OK, here's another one. This one works even if the pattern has
captures; on each iteration except the last it returns:
<segment>, <captures>...
(if there are no captures, it returns the full separator)
The last segment is returned as a single value, making it easy to
identify the last segment in a loop. After a certain amount of
experimentation, I'm pretty convinced that this is the best interface
for 'split', at least with my coding style.
The function, now only 11 lines:
function string:split(pat)
  local st, g = 1, self:gmatch("()("..pat..")")
  local function getter(self, segs, seps, sep, cap1, ...)
    st = sep and seps + #sep
    return self:sub(segs, (seps or 0) - 1), cap1 or sep, ...
  end
  local function splitter(self)
    if st then return getter(self, st, g()) end
  end
  return splitter, self
end
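
To make the return values concrete, here is a small sketch of the iterator in use. The function is copied from above so the snippet runs on its own; `collect` and the sample string are hypothetical, introduced only for illustration.

```lua
-- Copy of string:split from this post, so the snippet is self-contained.
function string:split(pat)
  local st, g = 1, self:gmatch("()(" .. pat .. ")")
  local function getter(self, segs, seps, sep, cap1, ...)
    st = sep and seps + #sep
    return self:sub(segs, (seps or 0) - 1), cap1 or sep, ...
  end
  local function splitter(self)
    if st then return getter(self, st, g()) end
  end
  return splitter, self
end

-- collect is a hypothetical helper: it records each (segment, separator) pair.
local function collect(s, pat)
  local out = {}
  for seg, sep in s:split(pat) do
    out[#out + 1] = { seg, sep }
  end
  return out
end

-- The pattern has no captures, so the second value is the full separator.
local r = collect("a, b,c", ",%s*")
for i, pair in ipairs(r) do
  -- On the last segment the separator is nil, which is how a loop
  -- can tell that it has reached the end.
  print(i, pair[1], pair[2] == nil and "<last>" or ("[" .. pair[2] .. "]"))
end
```

A caller that only wants the segments can ignore the second value; one that needs to know which separator was matched, or whether it is on the last segment, can inspect it.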
As an example of how this might be used for a slightly non-trivial
splitting problem, consider the problem of parsing IRC protocol lines,
which consist of some number of whitespace-separated words,
possibly ending with an argument whose first character is ':'
and which extends to the end of the line (if I remember all
the details correctly).
Here's an implementation using the above split interface:
function ircsplit(cmd)
  local t = {}
  for word, colon, start in cmd:split"%s+(:?)()" do
    t[#t+1] = word
    if colon == ":" then
      t[#t+1] = cmd:sub(start)
      break
    end
  end
  return t
end
The pattern captures the leading colon, if there is one, as
well as the string index of the character following the
separator. The loop body uses the colon to decide when to stop,
and the index to take the rest of the line as the final argument.
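
To see what those two captures contain, here is a small sketch that applies the same pattern with plain `string.find`, independent of the split function. The sample line is made up for illustration and is not taken from any real IRC session.

```lua
-- Hypothetical sample line: two words, then a trailing ':' argument.
local line = "PRIVMSG #lua :hello there"

-- string.find returns the match start, the match end, then the captures.
-- "(:?)" captures an optional colon; "()" captures the position just
-- past the separator.
local s1, e1, colon1, after1 = line:find("%s+(:?)()")

-- Resume the search at the position the first match reported.
local s2, e2, colon2, after2 = line:find("%s+(:?)()", after1)

-- The second separator is followed by a colon, so everything from
-- after2 onward is the final argument.
local trailing = line:sub(after2)

print(colon1 == "", after1)   -- first separator: no colon
print(colon2, after2, trailing)
```

Because `(:?)` can match the empty string, the first capture is always a string (either `""` or `":"`) whenever a separator is found, so comparing it with `":"` is enough to spot the final argument.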