html2xhtml produces invalid XML for MS Office HTML output
Bug #1706274 reported by
Thomas Weber
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
libhtml-html5-parser-perl (Ubuntu) | Confirmed | Undecided | Unassigned |
Bug Description
This is the document element created by MS Office on a Mac:
<html xmlns:x=
xmlns="http://
<!-- -->
</html>
html2xhtml outputs the following invalid XML with two xmlns namespace declarations:
<html xmlns="http://
</body></html>
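For anyone who wants to confirm the problem independently of the Perl toolchain, here is a minimal sketch in Python. It assumes the failure is a repeated `xmlns` attribute on a single element, as described above. The namespace URIs are hypothetical placeholders, not the ones from the actual document, and `is_well_formed` is a helper name made up for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        # A repeated attribute name (including xmlns) on one element
        # is a well-formedness error, so the parser rejects it.
        return False
    return True


# Placeholder URIs: stand-ins for the truncated declarations in the report.
duplicated = (
    '<html xmlns="http://example.org/ns-a" '
    'xmlns="http://example.org/ns-b"><body/></html>'
)
single = '<html xmlns="http://example.org/ns-a"><body/></html>'

print(is_well_formed(duplicated))  # False
print(is_well_formed(single))      # True
```

Running the same check on the real html2xhtml output should show whether the duplicate declaration is the only problem or whether the parser trips over something else first.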
I'm not sure what part of the Perl libraries is responsible for this and where to report this upstream. Any hints for that are very welcome.
I'm not 100% sure, since this module is a parser and not a serializer, but it appears HTML::HTML5::Parser is just building a DOM, and the serialization is then done by XML::LibXML. Therefore, it seems likely the bug is indeed in HTML::HTML5::Parser.
The upstream bug tracker is at
https://rt.cpan.org/Public/Dist/Display.html?Name=HTML-HTML5-Parser
but unfortunately, it doesn't see a lot of attention these days. Nevertheless, please submit upstream.