You might want to pre-process such legacy HTML files, for example with HTML Tidy <http://tidy.sourceforge.net/> or TagSoup <http://mercury.ccil.org/~cowan/XML/tagsoup/>, to turn them into well-formed XML that an XML parser will accept.
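
As a rough illustration of the Tidy route, the pre-processing step could be scripted along these lines. This is only a sketch: it assumes the `tidy` command-line program is installed and on your PATH, and the helper names `tidy_command` and `html_to_xhtml` are made up for this example. The options used (`-quiet`, `-asxhtml`, `-numeric`, `-utf8`, `--force-output`) are standard Tidy options, but check them against the version you have.

```python
import shutil
import subprocess


def tidy_command(tidy_exe="tidy"):
    """Build the argument list for an HTML-on-stdin, XHTML-on-stdout run.

    Hypothetical helper for this example, not part of Tidy itself.
    """
    return [
        tidy_exe,
        "-quiet",                 # suppress non-essential output
        "-asxhtml",               # write the result as well-formed XHTML
        "-numeric",               # numeric entities instead of named ones
        "-utf8",                  # treat input and output as UTF-8
        "--force-output", "yes",  # still write output if errors were found
    ]


def html_to_xhtml(html_text, tidy_exe="tidy"):
    """Pipe legacy HTML through Tidy and return the cleaned-up markup."""
    if shutil.which(tidy_exe) is None:
        raise FileNotFoundError("could not find %r on PATH" % tidy_exe)

    result = subprocess.run(
        tidy_command(tidy_exe),
        input=html_text.encode("utf-8"),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # Tidy exits with 0 (clean), 1 (warnings) or 2 (errors); warnings are
    # expected for legacy HTML, so only fail when nothing usable came back.
    if result.returncode > 1 and not result.stdout:
        raise RuntimeError(result.stderr.decode("utf-8", "replace"))
    return result.stdout.decode("utf-8")
```

Calling `html_to_xhtml("<p>Hello<br>world")` should then give you a complete XHTML document with the tags closed, which you can hand to your XML parser. TagSoup takes a different approach (it is a Java SAX parser rather than a filter program), so it would be wired in differently.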