cvs2cl outputs bad xml from mixed charset log messages
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
cvs2cl (Ubuntu) | New | Undecided | Unassigned | | |
Bug Description
Binary package hint: cvs2cl
Environment information:
Description: Ubuntu 8.04.1
Release: 8.04
Package: cvs2cl 2.59-2
What I expected: well-formed XML output regardless of the input data (e.g. mixed charsets).
What happened instead: malformed XML; log messages in mixed charsets pass straight through and break XML validation.
Scenario: serving my cvs changelog as an html page, by getting it as xml and then applying an xslt transformation:
( cvs -d /var/my_repo rlog ) | cvs2cl --rcs /var/my_repo --xml --xml-encoding=
When a CVS repository is accessed from many different operating systems, it accumulates log messages in mixed text encodings, say UTF-8, ISO-8859-1, etc.
cvs2cl writes those messages into a <msg /> tag "as is", while the document declares only the single encoding given by the --xml-encoding option.
This breaks any xsltproc transformation, because the output contains byte sequences that are not valid UTF-8.
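A possible workaround, until cvs2cl handles this itself, is to normalize the rlog output to a single encoding before cvs2cl sees it. The following is only a minimal sketch, not part of cvs2cl: the function name `to_utf8` and the per-line granularity are my own choices, and it assumes every log line is either valid UTF-8 or ISO-8859-1. A line in another charset, or an ISO-8859-1 line that happens to also be valid UTF-8, would be converted incorrectly.

```python
import sys


def to_utf8(raw: bytes, fallback: str = "iso-8859-1") -> bytes:
    """Re-encode raw bytes as UTF-8, deciding the source encoding per line.

    Assumption: each line is either valid UTF-8 or in the fallback
    encoding (ISO-8859-1 by default). This is a heuristic, not detection.
    """
    out = []
    # bytes.splitlines only splits on ASCII line boundaries, so
    # multi-byte sequences are never cut in the middle.
    for line in raw.splitlines(keepends=True):
        try:
            text = line.decode("utf-8")
        except UnicodeDecodeError:
            # Not valid UTF-8: treat the whole line as the fallback charset.
            text = line.decode(fallback)
        out.append(text.encode("utf-8"))
    return b"".join(out)


if __name__ == "__main__":
    # Filter mode: read bytes from stdin, write normalized bytes to stdout.
    sys.stdout.buffer.write(to_utf8(sys.stdin.buffer.read()))
```

Saved as a script, this could sit in the pipeline between the `cvs ... rlog` command and cvs2cl, so that the messages reaching the <msg /> tags are all UTF-8. It does not escape or remove control characters that XML forbids, which would need separate handling.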
Giving --xml-encoding=