Bogus SystemID in XHTML catalog makes org.apache.xml.resolver fail

Reported by Dominique Hazael-Massieux on 2009-07-16
This bug affects 1 person
Affects: w3c-dtd-xhtml (Ubuntu)
Importance: Wishlist
Assigned to: Unassigned

Bug Description

Binary package hint: w3c-dtd-xhtml

The XML-catalog file that points to the local XHTML dtds provided by the package is using the following doctype:
<!DOCTYPE catalog PUBLIC "-//GlobalTransCorp//DTD XML Catalogs V1.0-Based Extension V1.0//EN"
  "http://globaltranscorp.org/oasis/catalog/xml/tr9401.dtd">

The URL used for the SystemId of that doctype ("http://globaltranscorp.org/oasis/catalog/xml/tr9401.dtd") does not exist any more.

When trying to use /etc/xml/catalog as the catalog for DTD resolutions with org.apache.xml.resolver, the local resolution fails:
        java -cp /usr/share/java/xml-commons-resolver-1.1.jar org.apache.xml.resolver.apps.resolver -d 2 -c /etc/xml/catalog -p "-//W3C//DTD XHTML 1.0 Strict//EN" public
        Cannot find CatalogManager.properties
        Loading catalog: ./xcatalog
        Loading catalog: /etc/xml/catalog
        Resolve PUBLIC (publicid, systemid):
          public id: -//W3C//DTD XHTML 1.0 Strict//EN
        Switching to delegated catalog(s):
         file:/etc/xml/w3c-dtd-xhtml.xml
        Loading catalog: file:/etc/xml/w3c-dtd-xhtml.xml
        Switching to delegated catalog(s):
         file:/usr/share/xml/xhtml/schema/dtd/1.0/catalog.xml
        Loading catalog: file:/usr/share/xml/xhtml/schema/dtd/1.0/catalog.xml
        Exception in thread "main" java.net.UnknownHostException: globaltranscorp.org
        [...]

This means that any XML application that relies on org.apache.xml.resolver to resolve DTDs (and there are many) cannot resolve them locally, and will thus hit the W3C Web site (cf. http://www.w3.org/blog/systeam/2008/02/08/w3c_s_excessive_dtd_traffic).

A simple fix for that problem is to replace the current SystemId with "/usr/share/xml/schema/xml-core/tr9401.dtd" (provided by the xml-core package).
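
To make the proposed change concrete, here is a minimal sketch of the substitution applied to the DOCTYPE quoted above. The class and method names (SystemIdFixSketch, withLocalSystemId) are hypothetical and only illustrate the edit; the actual fix would be made in the catalog files shipped by the package, not at runtime.

```java
public class SystemIdFixSketch {
    // The dead URL currently used as the system identifier (from the report).
    public static final String BOGUS =
        "http://globaltranscorp.org/oasis/catalog/xml/tr9401.dtd";
    // Local copy of the DTD, which the report says xml-core provides.
    public static final String LOCAL =
        "/usr/share/xml/schema/xml-core/tr9401.dtd";

    // Hypothetical helper: swap the quoted system identifier in catalog text.
    public static String withLocalSystemId(String catalogText) {
        return catalogText.replace("\"" + BOGUS + "\"", "\"" + LOCAL + "\"");
    }

    public static void main(String[] args) {
        String doctype =
            "<!DOCTYPE catalog PUBLIC \"-//GlobalTransCorp//DTD XML Catalogs V1.0-Based Extension V1.0//EN\"\n"
            + "  \"" + BOGUS + "\">";
        // Prints the DOCTYPE with the local path as its system identifier.
        System.out.println(withLocalSystemId(doctype));
    }
}
```

The public identifier is left untouched; only the system identifier changes.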

(This might be the same bug as #390604, but I'm not sure.)

Changed in w3c-dtd-xhtml (Ubuntu):
importance: Undecided → Wishlist

Err, sorry, why wishlist? This is a bug report, not an enhancement request.

Again, could this bug's importance please be re-qualified? It is not a wishlist item, it is an actual bug report (with a patch, too).

ljs (ljs) wrote :

The file: URL in the patch has too many slashes. I would suggest a plain "/usr/share/xml/schema/xml-core/tr9401.dtd" as the system identifier.

Arguably this is a bug in org.apache.xml.resolver, cf. http://www.oasis-open.org/committees/download.php/14809/xml-catalogs.html#s.bootstrap: the resolver should be able to parse catalog files without needing to resolve external entities.
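
As an illustration of that point, the sketch below parses a catalog that carries the bogus DOCTYPE without ever fetching the DTD. It uses only the JDK's JAXP API, not org.apache.xml.resolver, so it does not show how that library works internally. The class name, the helper method and the empty catalog body are made up for the example; the DOCTYPE is the one from the report.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class CatalogBootstrapSketch {
    // Hypothetical helper: parse catalog text and return the root element name.
    public static String rootElementName(String catalogXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Answer every external entity request (including the DOCTYPE's
            // external subset) with an empty stream, so nothing is fetched.
            builder.setEntityResolver(
                (publicId, systemId) -> new InputSource(new StringReader("")));
            Document doc = builder.parse(new InputSource(new StringReader(catalogXml)));
            return doc.getDocumentElement().getLocalName();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("catalog could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        // Made-up, empty catalog body; only the DOCTYPE comes from the report.
        String catalog =
            "<?xml version=\"1.0\"?>\n"
            + "<!DOCTYPE catalog PUBLIC \"-//GlobalTransCorp//DTD XML Catalogs V1.0-Based Extension V1.0//EN\"\n"
            + "  \"http://globaltranscorp.org/oasis/catalog/xml/tr9401.dtd\">\n"
            + "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\"/>";
        System.out.println(rootElementName(catalog));
    }
}
```

Because the entity resolver short-circuits the lookup, the unreachable host name in the system identifier is never contacted.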

Bertails (bertails) wrote :

This bug is still there in Ubuntu 10.04.

Could someone please apply the patch?

Bruce Merry (bmerry) wrote :

This still affects Ubuntu 12.04.3. I've attached a patch against w3c-dtd-xhtml_1.1-5ubuntu1.diff, which should make this easier to fix.

And I agree with comment #2 - regardless of which package is at fault, resolution is failing, so it is a bug in the Ubuntu system.

Bruce Merry (bmerry) wrote :

This bug also affects 13.10, and is causing the maven-xml-plugin to fail (which wasn't the case in 12.04). I've attached a patch against the source tree for version 1.2-4.
