Xml Iter
(This is part of LazyKit.)
This package supplies a number of tools for iterating through child elements of an XmlTree.
Return an iterator over the attributes of tree, returning attribute names and values. Note that this only returns keys of type string. (LuaExpat uses numeric keys to mark attributes that were defaulted from the DTD.)
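The string-key filtering described above can be sketched as follows. This is a minimal illustration, not the LazyKit implementation: the helper name attr_pairs is hypothetical, and it assumes attributes live in a plain table at tree.attr, as the examples on this page suggest.

```lua
-- Hypothetical helper (not part of LazyKit): iterate over the
-- string-keyed attributes only, skipping any numeric keys.
-- Assumption: attributes are stored in a plain table at tree.attr.
local function attr_pairs(tree)
  local attr = tree.attr or {}
  local key
  return function()
    local value
    repeat
      key, value = next(attr, key)
    until key == nil or type(key) == "string"
    return key, value
  end
end

-- Hand-built element; the numeric key imitates a LuaExpat-style entry.
local elem = { name = "entry", attr = { "time", time = "12:30", id = "7" } }

local seen = {}
for k, v in attr_pairs(elem) do
  seen[k] = v
end
```

After the loop, seen holds only the time and id entries; the numeric key is never returned.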
Counts the children of tree; roughly equivalent to table.getn. This is necessary because table.getn(tree) does not read tree.n through ordinary indexing; it uses rawget(tree, "n") instead, which bypasses any metatable. Fancy tree implementations may need to use a metatable call to find the number of children.
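The difference between a raw lookup and a metatable-aware one can be illustrated with a small sketch. The helper name child_count is hypothetical, and the lazy tree below is an invented stand-in for a "fancy" implementation. The sketch avoids table.getn, which newer Lua versions no longer provide, and falls back to the # operator.

```lua
-- Hypothetical helper (not part of LazyKit): count children in a way
-- that honours metatables.
local function child_count(tree)
  local n = tree.n          -- ordinary indexing: may trigger __index
  if n ~= nil then return n end
  return #tree              -- fall back to the length of the array part
end

-- A "fancy" tree whose child count is only available via its metatable.
local lazy = setmetatable({}, {
  __index = function(t, k)
    if k == "n" then return 3 end
  end
})

-- A plain tree with three children and no explicit n field.
local plain = { name = "p", "a", { name = "z" }, "b" }
```

Here rawget(lazy, "n") yields nil, while child_count(lazy) sees the value supplied by the metatable.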
Return an iterator over tree that returns each index and its child. Example:
    parent = lazytree.parsestring("<p>a<z>cdef</z>b</p>")
    for i,x in xpairs(parent) do
      if type(x) == "string" then
        print("string:", x)
      else
        print("tag:", x.name)
      end
    end
prints:
    string: a
    tag: z
    string: b
Note that it does not descend into child elements (as "cdef" was not printed).
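For readers curious how such an iterator could work, here is a minimal sketch over a hand-built tree. The function name children is hypothetical and merely stands in for xpairs; the table layout (name, attr, n, and children at integer indices) is an assumption based on the examples on this page.

```lua
-- Hypothetical stand-in for xpairs: visit direct children in order,
-- without descending into them.
local function children(tree)
  local n = tree.n or #tree
  local i = 0
  return function()
    i = i + 1
    if i <= n then return i, tree[i] end
  end
end

-- Hand-built equivalent of <p>a<z>cdef</z>b</p>
local parent = {
  name = "p", attr = {}, n = 3,
  "a",
  { name = "z", attr = {}, n = 1, "cdef" },
  "b",
}

local lines = {}
for i, x in children(parent) do
  if type(x) == "string" then
    lines[#lines + 1] = "string: " .. x
  else
    lines[#lines + 1] = "tag: " .. x.name
  end
end
```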
Return an iterator over tree that ignores character data elements. It returns an index, subtree, and element name (which may be ignored):
    for i,x in xnpairs(parent) do
      print("tag:", x.name)
    end

    for i,x,name in xnpairs(parent) do
      print("tag:", name)
    end
Either of the above prints:
tag: z
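An element-only iterator of this kind might be sketched as below. The name element_children is hypothetical (it stands in for xnpairs), and the tree is built by hand under the assumption that character data is stored as plain strings and elements as tables.

```lua
-- Hypothetical stand-in for xnpairs: skip character data and return
-- the index, the subtree, and the element name.
local function element_children(tree)
  local n = tree.n or #tree
  local i = 0
  return function()
    while i < n do
      i = i + 1
      local child = tree[i]
      if type(child) == "table" then
        return i, child, child.name
      end
    end
  end
end

-- Hand-built equivalent of <p>a<z>cdef</z>b</p>
local parent = {
  name = "p", attr = {}, n = 3,
  "a",
  { name = "z", attr = {}, n = 1, "cdef" },
  "b",
}

local found = {}
for i, x, name in element_children(parent) do
  found[#found + 1] = i .. ":" .. name
end
```

Note that the index returned is the child's position among all children, so the z element is reported at index 2.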
Iterate through the children of parent, using function definitions from ftable.

Each child of parent is looked up in ftable. For a child "<foo/>", the function ftable.foo(child, parent) is called. For character data, ftable[""](str, parent) is called. If an unknown tag is found, the function ftable[true](parent, child) is called. If such an entry in ftable does not exist, the child is ignored (unless certain options are set).
If the handler returns a true value, switch stops iterating and returns a (possibly different) true value along with any second return value. (Interaction with consumption TBD, and possibly using the first return value as a count of how many levels to escape out of.)
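The dispatch and early-exit rules above might be sketched roughly as follows. This is a simplified illustration, not the LazyKit code: the name mini_switch is hypothetical, only function handlers are supported, and the tree is a hand-built table.

```lua
-- Hypothetical, simplified dispatcher (function handlers only).
local function mini_switch(parent, ftable)
  for i = 1, (parent.n or #parent) do
    local child = parent[i]
    local stop, extra
    if type(child) == "string" then
      -- character data is dispatched to the "" entry
      local handler = ftable[""]
      if handler then stop, extra = handler(child, parent) end
    else
      local handler = ftable[child.name]
      if handler then
        stop, extra = handler(child, parent)
      elseif ftable[true] then
        -- unknown tag; argument order follows the description above
        stop, extra = ftable[true](parent, child)
      end
    end
    -- a true first return value ends the iteration early
    if stop then return true, extra end
  end
end

-- Hand-built equivalent of
-- <log><entry time="12:30"/><checkpoint/><entry time="12:35"/></log>
local log = {
  name = "log", attr = {}, n = 3,
  { name = "entry", attr = { time = "12:30" }, n = 0 },
  { name = "checkpoint", attr = {}, n = 0 },
  { name = "entry", attr = { time = "12:35" }, n = 0 },
}

local times = {}
local stopped, where = mini_switch(log, {
  entry = function(entry) times[#times + 1] = entry.attr.time end,
  checkpoint = function() return true, "checkpoint reached" end,
})
```

Because the checkpoint handler returns a true value, the second entry element is never visited.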
Example:
    s = '<log><entry time="12:30"/><checkpoint/><entry time="12:35"/></log>'
    parent = lazytree.parsestring(s)
    ftable = {
      entry=function (entry, parent)
        print (entry.attr.time)
      end
    }
    xmliter.switch(parent, ftable)
prints:
    12:30
    12:35
(Note that since we do not care about the parent, the function could have been declared as "function (entry)".)
Entries may contain nested ftables instead of functions; switch (or switch_c) is called recursively with the nested ftable.
Example:
    s = [[
    <log>
      <entry id='0'>
        <time clock="12:50"/>
        <msg text="foo"/>
        <extra/>
      </entry>
    </log>]]
    parent = lazytree.parsestring(s)
    ftable = {
      entry={
        time=function (time) print (time.attr.clock) end;
        msg=function (msg) print (msg.attr.text) end;
      }
    }
    xmliter.switch(parent, ftable)
prints:
    12:50
    foo
As an aid to use of nested ftables, ftable[0](parent, [previous_parent]) is called before any children are processed, and ftable[-1](parent, [previous_parent]) is called after all children have been processed:
    parent = lazytree.parsestring(s)
    ftable = {
      entry={
        [0]=function (entry)
          print("id ", entry.attr.id)
          entry.message_txt = "(no message)"
          entry.time_txt = "(no time)"
          entry.level_txt = "(no level)"
        end;
        time=function (time, entry)
          entry.time_txt = time.attr.clock
        end;
        msg=function (msg, entry)
          entry.message_txt = msg.attr.text
        end;
        [-1]=function (entry)
          print("message", entry.message_txt, entry.time_txt, entry.level_txt)
        end;
      }
    }
    xmliter.switch(parent, ftable)
prints:
    id 0
    message foo 12:50 (no level)
This takes advantage of the fact that XML trees do not mind extraneous table entries (as long as you avoid "n", "attr", "name", and keys starting with an underscore).
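The order in which nested ftables and their [0] and [-1] hooks fire can be made concrete with a small sketch. The name nested_switch is hypothetical; the code is a simplified illustration (it ignores character data, ftable[true], and return values) and is not the LazyKit implementation.

```lua
-- Hypothetical dispatcher supporting nested ftables with [0]/[-1] hooks.
local function nested_switch(parent, ftable, previous_parent)
  -- [0] runs before any children are processed
  if ftable[0] then ftable[0](parent, previous_parent) end
  for i = 1, (parent.n or #parent) do
    local child = parent[i]
    if type(child) == "table" then
      local handler = ftable[child.name]
      if type(handler) == "function" then
        handler(child, parent)
      elseif type(handler) == "table" then
        -- nested ftable: recurse, remembering the enclosing parent
        nested_switch(child, handler, parent)
      end
    end
  end
  -- [-1] runs after all children have been processed
  if ftable[-1] then ftable[-1](parent, previous_parent) end
end

-- Hand-built equivalent of the <log> document used above.
local log = {
  name = "log", attr = {}, n = 1,
  { name = "entry", attr = { id = "0" }, n = 3,
    { name = "time", attr = { clock = "12:50" }, n = 0 },
    { name = "msg", attr = { text = "foo" }, n = 0 },
    { name = "extra", attr = {}, n = 0 },
  },
}

local events = {}
nested_switch(log, {
  entry = {
    [0]  = function(entry) events[#events + 1] = "begin " .. entry.attr.id end,
    time = function(time) events[#events + 1] = time.attr.clock end,
    msg  = function(msg) events[#events + 1] = msg.attr.text end,
    [-1] = function(entry) events[#events + 1] = "end " .. entry.attr.id end,
  },
})
```

The events table records the hooks and handlers in the order they ran: the [0] hook, the two child handlers, then the [-1] hook.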
Nested tables may not be the most concise way to express code, however. A simpler way of writing the previous example would be:
    parent = lazytree.parsestring(s)
    ftable = {
      entry=function (entry)
        print("id", entry.attr.id)
        local v = xmlview.element(entry)
        local message_txt = "(no message)"
        local time_txt = "(no time)"
        local level_txt = "(no level)"
        if v.time then time_txt = v.time.attr.clock end
        if v.msg then message_txt = v.msg.attr.text end
        print("message", message_txt, time_txt, level_txt)
      end
    }
    xmliter.switch(parent, ftable)
Any use of [0] and [-1] may be rewritten in terms of a function that performs the [0] action, recursively calls switch, and performs the [-1] action.
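That rewrite can be sketched as follows. To keep the example self-contained, a hypothetical minimal dispatcher named mini_switch stands in for xmliter.switch, and the tree is built by hand; both are assumptions for illustration only.

```lua
-- Hypothetical minimal dispatcher (function handlers only), standing
-- in for xmliter.switch.
local function mini_switch(parent, ftable)
  for i = 1, (parent.n or #parent) do
    local child = parent[i]
    local key = type(child) == "string" and "" or child.name
    local handler = ftable[key]
    if handler then handler(child, parent) end
  end
end

local out = {}

-- Handlers for the children of <entry>; no [0] or [-1] entries needed.
local entry_handlers = {
  time = function(time, entry) entry.time_txt = time.attr.clock end,
  msg  = function(msg, entry) entry.message_txt = msg.attr.text end,
}

local ftable = {
  entry = function(entry)
    -- what a [0] handler would have done
    entry.message_txt = "(no message)"
    entry.time_txt = "(no time)"
    -- explicit recursive call in place of a nested ftable
    mini_switch(entry, entry_handlers)
    -- what a [-1] handler would have done
    out[#out + 1] = entry.message_txt .. " " .. entry.time_txt
  end,
}

local log = {
  name = "log", attr = {}, n = 1,
  { name = "entry", attr = { id = "0" }, n = 2,
    { name = "time", attr = { clock = "12:50" }, n = 0 },
    { name = "msg", attr = { text = "foo" }, n = 0 },
  },
}

mini_switch(log, ftable)
```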
Recursive searches for elements can be performed by setting the [true] action to the ftable itself. For example:
    parent = lazytree.parsefile("xhtml-spec.xml")
    local count = 0
    local ftable
    ftable = {
      a=function (a)
        if a.attr.href then count = count + 1 end
        -- uncomment to search for <a> elements inside other <a> elements
        -- xmliter.switch(a, ftable)
      end
    }
    ftable[true] = ftable
    xmliter.switch(parent, ftable)
    print(count)
(Note that we cannot write "local ftable={... switch(ftable) }" as ftable will not be in scope for itself.)
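The scoping rule behind this note is plain Lua behaviour and can be demonstrated without any XML at all. The table names below are purely illustrative.

```lua
-- Single-statement form: inside the constructor, the name `broken`
-- is not yet the new local, so it resolves to a global (nil here).
local broken = { self = function() return broken end }

-- Two-step form: the local is declared first, so the function body
-- captures it as an upvalue.
local ftable
ftable = { self = function() return ftable end }
ftable[true] = ftable
```

Calling broken.self() returns nil, whereas ftable.self() returns the table itself, which is why the declaration and the assignment are kept separate.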
The opts table controls various options for processing.
If opts.no_chardata is set, any unexpected character data (that is, not handled by an ftable[""] entry) results in an error.
If opts.no_tags is set, any unexpected child elements (those not mentioned in ftable or handled by an ftable[true] entry) result in an error.
If opts.parent is set, it is passed to functions as the parent node of the parent argument. This is useful when calling switch recursively if the new ftable contains [0] or [-1] handlers.
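The two error options might behave roughly as in the sketch below. The name strict_switch and the error messages are hypothetical, the sketch covers only no_chardata and no_tags (not opts.parent), and it is not the LazyKit implementation.

```lua
-- Hypothetical dispatcher showing only the no_chardata / no_tags checks.
local function strict_switch(parent, ftable, opts)
  opts = opts or {}
  for i = 1, (parent.n or #parent) do
    local child = parent[i]
    if type(child) == "string" then
      local handler = ftable[""]
      if handler then
        handler(child, parent)
      elseif opts.no_chardata then
        error("unexpected character data in <" .. parent.name .. ">")
      end
    else
      local handler = ftable[child.name]
      if handler then
        handler(child, parent)
      elseif ftable[true] then
        ftable[true](parent, child)  -- argument order as described above
      elseif opts.no_tags then
        error("unexpected element <" .. child.name .. ">")
      end
    end
  end
end

-- Hand-built equivalent of <p>text<z/></p>
local p = { name = "p", attr = {}, n = 2, "text", { name = "z", attr = {}, n = 0 } }

-- No options: unhandled children are silently ignored.
local ok_lenient = pcall(strict_switch, p, {})
-- no_chardata: the unhandled string child raises an error.
local ok_chardata = pcall(strict_switch, p, {}, { no_chardata = true })
-- no_tags: character data is handled, but <z> is not.
local ok_tags, err = pcall(strict_switch, p, { [""] = function() end }, { no_tags = true })
```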