It was thus said that the Great Minh Ngo once stated:
> There are a lot of examples on splitting a string on the net.
> What I'm looking for is the ability to include the delimiter in the strings.
>
> The following desirable example with '\' as the escape character:
>
> ````
> 'foo','bar','foo,bar' = split('foo,bar,foo\\,bar' , ',')
> ````

lpeg = require "lpeg"

local Ct = lpeg.Ct
local C = lpeg.C
local P = lpeg.P

function split(s,delim)
  -- ----------------------------------------------------------------------
  -- Use the given delimiter, or if not supplied, use ':' as the delimiter.
  -- Change this if you want a different default.
  -- ----------------------------------------------------------------------

  local delim = delim or P":"

  -- ------------------------------------------------------
  -- Make sure our delimiter is an LPeg expression.
  -- ------------------------------------------------------

  if type(delim) == 'string' then
    delim = P(delim)
  end

  -- ---------------------------------------------------------------------
  -- A single character.  This can be '\' followed by any character (even
  -- the given delimiter) or a single character that doesn't match our
  -- delimiter.
  -- ---------------------------------------------------------------------

  local char = P[[\]] * P(1)
             + (P(1) - delim)

  -- ----------------------------------------------------------------------
  -- A word is 0 or more characters.  After we have collected our word,
  -- remove any '\'.  There's a bug here (a corner case), but I'll leave
  -- that as an exercise for the reader to fix.
  -- ----------------------------------------------------------------------

  local word = char^0 / function(s) s = s:gsub([[\]],'') return s end

  -- ----------------------------------------------------------------------
  -- A list of words is a word, followed by zero or more delimiter-and-word
  -- pairs.
  -- ----------------------------------------------------------------------

  local words = word * (delim * word)^0

  -- --------------------------------------
  -- Collect all the results into a table.
  -- --------------------------------------

  local match = Ct(words)
  return match:match(s)
end

-- unpack() became table.unpack() in Lua 5.2, so accept either one.
print((table.unpack or unpack)(split('foo,bar,foo\\,bar' , ',')))
foo bar foo,bar
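
One possible fix for the corner case mentioned in the comments, assuming the bug meant is that gsub() strips every '\', so an escaped backslash ('\\') vanishes instead of becoming a single '\'. This sketch drops only the escaping backslash by using a substitution capture (lpeg.Cs) in place of the gsub() call. The name split_escaped is made up for this example and is not part of the code above.

```lua
local lpeg = require "lpeg"

local Ct = lpeg.Ct
local Cs = lpeg.Cs
local P  = lpeg.P

-- Hypothetical variant of split(): same grammar, but the escape character
-- is removed while matching rather than by a gsub() afterwards.
local function split_escaped(s,delim)
  -- P() returns an existing pattern unchanged, so this accepts either a
  -- string or an LPeg expression; ':' is the default, as in split().
  delim = P(delim or ":")

  -- An escaped character: replace the '\' with nothing and keep whatever
  -- character follows it (the delimiter, another '\', anything).
  local escaped = (P[[\]] / "") * P(1)

  -- Otherwise, any single character that is not the delimiter.
  local char = escaped + (P(1) - delim)

  -- Cs() returns the matched text with the replacements above applied.
  local word  = Cs(char^0)
  local words = word * (delim * word)^0

  return Ct(words):match(s)
end

print(table.concat(split_escaped('foo,bar,foo\\,bar' , ','),' | '))
print(table.concat(split_escaped([[a\\b,c]] , ','),' | '))
```

The first call prints "foo | bar | foo,bar", the same result as before. The second prints "a\b | c", where the gsub() version would give "ab" for the first word. A '\' at the very end of the input has nothing to escape and is kept as a literal character, which is a design choice of this sketch rather than anything the original specifies.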
-spc (Is this a good enough example?)