- Subject: Re: htmlify a string?
- From: Rici Lake <lua@...>
- Date: Wed, 19 Oct 2005 01:38:34 -0500
I can't help thinking that all the proposed solutions are a lot more
complicated than necessary.
Also, this is a perfect use case for Mike Pall's patch to string.gsub,
with which I concur (although I might extend it a bit....)
Anyway, here's the simplest html escaper I know of (just the three
vital characters):
do
  local escapes = {["&"] = "&amp;", ["<"] = "&lt;", [">"] = "&gt;"}
  local function escape(c) return escapes[c] end
  function html_escape(str) return (str:gsub("[&<>]", escape)) end
end
Can't get much simpler than that, except that with Mike's patch you
wouldn't need the function "escape"; you could just provide the table
"escapes" as the last argument to gsub. (By the way, the redundant
parentheses in the last return statement are deliberate; they avoid
returning the second return value of gsub.)
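For illustration, here is a minimal sketch of the table form. It assumes a gsub that accepts a table as the replacement argument, which is what Mike's patch provides and how Lua 5.1 and later behave. The name html_escape_t is made up for this example.

```lua
do
  -- Assumption: string.gsub accepts a table as its replacement argument
  -- (Mike Pall's patch, or Lua 5.1 and later).
  local escapes = {["&"] = "&amp;", ["<"] = "&lt;", [">"] = "&gt;"}

  -- html_escape_t is a hypothetical name used only for this sketch.
  function html_escape_t(str)
    -- The outer parentheses truncate gsub's results to the first value,
    -- dropping the substitution count.
    return (str:gsub("[&<>]", escapes))
  end
end

print(html_escape_t("a < b && c > d"))

-- Without the parentheses the count comes back as a second value,
-- so this prints the escaped string followed by the number of matches.
print(string.gsub("a<b", "[&<>]", { ["<"] = "&lt;" }))
```

The parentheses matter mostly when the result is passed straight into another call, such as table.insert or print, where the extra count would be taken as an additional argument.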
If the string to be escaped is ISO-8859-1, and you really want to
escape high-ascii numerically, just extend the escapes table:
do
  local escapes = {["&"] = "&amp;", ["<"] = "&lt;", [">"] = "&gt;"}
  for i = 128, 255 do escapes[string.char(i)] = "&#"..i..";" end
  local function escape(c) return escapes[c] end
  function html_escape(str)
    return (str:gsub("[&<>\128-\255]", escape))
  end
end
If you really want named escapes, insert them after the for loop, but I
don't see the point; with numeric escapes you don't need to worry about
browser support. However:
do
  local escapes = {["&"] = "&amp;", ["<"] = "&lt;", [">"] = "&gt;"}
  for i = 128, 255 do escapes[string.char(i)] = "&#"..i..";" end
  -- This only matches if the source file is itself ISO-8859-1,
  -- so that 'á' is the single byte 225.
  escapes['á'] = "&aacute;"
  -- etc.
  local function escape(c) return escapes[c] end
  function html_escape(str)
    return (str:gsub("[&<>\128-\255]", escape))
  end
end
Perhaps it is better to just output straight in UTF-8:
do
  local escapes = {["&"] = "&amp;", ["<"] = "&lt;", [">"] = "&gt;"}
  for i = 128, 191 do
    escapes[string.char(i)] = "\194"..string.char(i)
  end
  for i = 192, 255 do
    escapes[string.char(i)] = "\195"..string.char(i-64)
  end
  local function escape(c) return escapes[c] end
  function html_escape(str)
    return (str:gsub("[&<>\128-\255]", escape))
  end
end
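As a quick sanity check of the UTF-8 variant, here is a small self-contained sketch. It assumes the input is ISO-8859-1; the function name latin1_to_utf8_html is hypothetical and chosen only to avoid clashing with html_escape above. Byte 233 (é in ISO-8859-1) should come out as the two bytes 195, 169.

```lua
do
  local escapes = {["&"] = "&amp;", ["<"] = "&lt;", [">"] = "&gt;"}
  -- Bytes 128..191 map to the UTF-8 pair 194, i.
  for i = 128, 191 do
    escapes[string.char(i)] = "\194"..string.char(i)
  end
  -- Bytes 192..255 map to the UTF-8 pair 195, i - 64.
  for i = 192, 255 do
    escapes[string.char(i)] = "\195"..string.char(i - 64)
  end
  local function escape(c) return escapes[c] end

  -- Hypothetical name; assumes str is ISO-8859-1 encoded.
  function latin1_to_utf8_html(str)
    return (str:gsub("[&<>\128-\255]", escape))
  end
end

-- "caf\233" is "café" in ISO-8859-1.
local out = latin1_to_utf8_html("caf\233 <b>")
-- Print the resulting byte values to make the two-byte sequence visible.
print(out:byte(1, -1))
```

Both the markup characters and the high bytes are handled by the same gsub call, so the string is still scanned only once.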
In no case should it be necessary to scan the string more than once.
R.