2008-11-28 22:47:53 |
Dario |
description |
Binary package hint: cvs2cl
Environment information:
Description: Ubuntu 8.04.1
Release: 8.04
Package: cvs2cl 2.59-2
What I expected: well-formed XML output regardless of the input data's charset, e.g. by using CDATA sections.
What happened instead: malformed XML; mixed-charset input data breaks any XML validation.
Scenario: serving my CVS changelog as an HTML page, by getting it as XML and then applying an XSLT transformation:
( cvs -d /var/my_repo rlog ) | cvs2cl --rcs /var/my_repo --xml --xml-encoding=utf-8 --stdin --stdout | xsltproc /usr/local/etc/cl2html-ciaglia.xslt -
When a CVS repository is accessed from many different operating systems, it collects log messages in mixed text encodings, e.g. UTF-8 and ISO-8859-1.
cvs2cl writes those messages into the <msg> element "as is", and the --xml-encoding option lets you declare only one encoding for the whole document.
This breaks any xsltproc transformation, because some messages contain bytes that are invalid in the declared encoding.
WORKAROUND: I edited /usr/bin/cvs2cl at line 866 as follows:
$text = "<msg><![CDATA[${text}]]></msg>\n";
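One caveat with wrapping messages in CDATA: a log message that itself contains "]]>" would close the section early. A minimal sketch of the usual way around that, written in Python only to illustrate the logic (it is not a patch to cvs2cl, and the function name wrap_cdata is hypothetical):

```python
import xml.etree.ElementTree as ET


def wrap_cdata(text: str) -> str:
    """Wrap a log message in a <msg> element using CDATA.

    Any "]]>" inside the message is split across two CDATA sections
    so that it cannot terminate the section early.
    """
    safe = text.replace("]]>", "]]]]><![CDATA[>")
    return "<msg><![CDATA[" + safe + "]]></msg>\n"


if __name__ == "__main__":
    element = ET.fromstring(wrap_cdata("fix a]]>b & <c>"))
    print(element.text)
```

Note that this only protects against markup characters in the message; it does not by itself make bytes in a different encoding valid.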
======== |
Binary package hint: cvs2cl
Environment information:
Description: Ubuntu 8.04.1
Release: 8.04
Package: cvs2cl 2.59-2
What I expected: well-formed XML output regardless of the input data (e.g. mixed charsets).
What happened instead: malformed XML; mixed charsets in the input data break any XML validation.
Scenario: serving my CVS changelog as an HTML page, by getting it as XML and then applying an XSLT transformation:
( cvs -d /var/my_repo rlog ) | cvs2cl --rcs /var/my_repo --xml --xml-encoding=utf-8 --stdin --stdout | xsltproc my_stylesheet.xslt -
When a CVS repository is accessed from many different operating systems, it collects log messages in mixed text encodings, e.g. UTF-8 and ISO-8859-1.
cvs2cl writes those messages into the <msg> element "as is", assuming the single encoding given by the --xml-encoding option.
This breaks any xsltproc transformation, because the non-UTF-8 messages contain byte sequences that are invalid UTF-8.
Passing --xml-encoding=iso-8859-1 gets past validation, but the UTF-8 log messages then come out corrupted.
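One possible way to handle this would be to normalise each log message to a single encoding before it is written out. A minimal sketch in Python, assuming that every message is either valid UTF-8 or ISO-8859-1 (the function name to_text and the fallback choice are assumptions for illustration, not anything cvs2cl provides):

```python
def to_text(raw: bytes, fallback: str = "iso-8859-1") -> str:
    """Decode one log message, trying strict UTF-8 first.

    If the bytes are not valid UTF-8, assume the fallback encoding.
    ISO-8859-1 maps every byte to a character, so the fallback
    never raises, although it may guess wrong for other encodings.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback)


if __name__ == "__main__":
    messages = ["caffè".encode("utf-8"), "caffè".encode("iso-8859-1")]
    for raw in messages:
        # Both messages decode to the same text and can be re-encoded as UTF-8.
        print(to_text(raw).encode("utf-8"))
```

Trying UTF-8 first works because text in ISO-8859-1 with accented characters is rarely valid UTF-8 by accident, so the strict decode acts as a cheap detector.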
|
|