An easy way to test whether the output is well-formed XML is the xmllint program that
comes with the libxml2 library (http://www.xmlsoft.org). Just download the
output of xml.cgi, then run xmllint on it like this:
xmllint -noout file.xml
If it prints anything, then the file has character encoding or tag balancing issues.
You can also perform validity checks (xmllint should be able to download the DTD) with:
xmllint -noout -valid file.xml
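For anyone without xmllint at hand, roughly the same well-formedness check can be sketched with the expat parser in Python's standard library. This is only an illustration and not part of Bugzilla: the helper name well_formedness_error is made up here, and expat does not validate against a DTD, so it only covers the first of the two xmllint commands.

```python
import xml.parsers.expat


def well_formedness_error(data: bytes):
    """Return None if data is well-formed XML, otherwise the parser's error text.

    Hypothetical helper: it mirrors `xmllint -noout file.xml` in spirit only.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        # The second argument marks this as the final (and only) chunk.
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError as exc:
        return str(exc)
    return None
```

Calling it on the bytes of the downloaded xml.cgi output should return None for a clean file and an error message (with line and column) for unbalanced tags or bytes that are not valid in the assumed encoding.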
At the moment, it looks like the output from bugzilla.mozilla.org has
well-formedness problems (it isn't UTF-8, but doesn't declare its character
encoding) and validity problems (some elements appear to be out of order).
For my purposes (my Python module for talking to a Bugzilla installation), only
the well-formedness issue is a problem. It might be worth fixing the validity bug though.
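Until the encoding is declared on the server side, a client can work around it by overriding the encoding when the first parse fails. The following is a minimal sketch, not what my module necessarily does: parse_with_fallback is a hypothetical name, and ISO-8859-1 is an assumption about what the server actually sends.

```python
import xml.etree.ElementTree as ET


def parse_with_fallback(data: bytes, fallback: str = "iso-8859-1"):
    """Parse XML bytes, retrying with an explicit encoding if the first try fails.

    Assumption: the document omits its encoding declaration and is really
    in `fallback` (ISO-8859-1 here), not UTF-8.
    """
    try:
        return ET.fromstring(data)
    except ET.ParseError:
        # XMLParser(encoding=...) overrides whatever the document declares.
        parser = ET.XMLParser(encoding=fallback)
        return ET.fromstring(data, parser=parser)
```

For example, parse_with_fallback(b"<bug><short_desc>caf\xe9</short_desc></bug>") fails the default parse on the 0xE9 byte and then succeeds with the override. The retry hides the server bug rather than fixing it, so it is only a stopgap.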
Here is the output from a validity check after setting the character encoding to
ISO-8859-1 in the XML declaration:
$ xmllint -noout -valid 384.xml
384.xml:20: validity error: No declaration for element resolution
<resolution>INVALID</resolution>
^
384.xml:199: validity error: Element bug content doesn't follow the DTD
Expecting (bug_id , exporter , urlbase , bug_status , resolution? , product ,
priority , version , rep_platform , assigned_to , delta_ts , component ,
reporter , target_milestone? , bug_severity , creation_ts , qa_contact? ,
status_whiteboard? , op_sys , short_desc? , keywords* , dependson* , blocks* ,
cc* , long_desc? , attachment*), got (bug_id bug_status product priority version
rep_platform assigned_to delta_ts component reporter target_milestone
bug_severity creation_ts qa_contact op_sys resolution short_desc long_desc
long_desc long_desc long_desc long_desc long_desc long_desc long_desc long_desc )
</bug>
^
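To make the ordering complaint easier to see, here is a small sketch that compares the child order xmllint reported with the order the DTD expects. The two lists are copied from the error message above; the function name out_of_order is made up for illustration, and it checks relative order only, not occurrence counts or required elements.

```python
# Element order from the "Expecting (...)" part of the xmllint message.
EXPECTED = [
    "bug_id", "exporter", "urlbase", "bug_status", "resolution", "product",
    "priority", "version", "rep_platform", "assigned_to", "delta_ts",
    "component", "reporter", "target_milestone", "bug_severity",
    "creation_ts", "qa_contact", "status_whiteboard", "op_sys", "short_desc",
    "keywords", "dependson", "blocks", "cc", "long_desc", "attachment",
]

# Element order from the "got (...)" part of the same message.
GOT = [
    "bug_id", "bug_status", "product", "priority", "version", "rep_platform",
    "assigned_to", "delta_ts", "component", "reporter", "target_milestone",
    "bug_severity", "creation_ts", "qa_contact", "op_sys", "resolution",
    "short_desc",
] + ["long_desc"] * 9


def out_of_order(children, expected=EXPECTED):
    """Return the names that appear after a sibling the DTD places later."""
    rank = {name: i for i, name in enumerate(expected)}
    problems = []
    highest = -1
    for name in children:
        position = rank.get(name)
        if position is None:
            continue  # names unknown to the DTD are ignored in this sketch
        if position < highest:
            problems.append(name)
        else:
            highest = position
    return problems


print(out_of_order(GOT))  # prints ['resolution']
```

So by order alone, resolution is the misplaced element: it is emitted after op_sys instead of right after bug_status. The message also shows things this sketch does not check: exporter and urlbase are absent from the "got" list, and long_desc is repeated although the content model marks it with "?".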