Missing option to change namespace mappings

Bug #555602 reported by Wichert Akkerman
This bug affects 9 people
Affects  Status     Importance  Assigned to  Milestone
lxml     Confirmed  Wishlist    Unassigned

Bug Description

This is essentially a summary of a thread on the mailing list: http://codespeak.net/pipermail/lxml-dev/2010-March/005329.html

For some XML processing tools I need to be able to change the namespace mapping, specifically to insert a new entry into the namespace map. My test code looked like this:

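import lxml.etree

NS = "http://xml.zope.org/namespaces/i18n"
tree = lxml.etree.parse(input)  # input: the source document (not shown)
root = tree.getroot()
count = 1
if "i18n" not in root.nsmap:
    # Intended to declare the i18n prefix on the root element
    root.nsmap["i18n"] = NS
for el in root.iter():
    if "{%s}translate" % NS in el.attrib:
        continue
    if hasText(el):  # hasText() is a helper that is not shown in this report
        el.attrib["{%s}translate" % NS] = "string%d" % count
        count += 1
print(lxml.etree.tostring(tree))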

This did not work because root.nsmap ignores any changes made to it. An alternative implementation suggested by Simon Wiles creates a new root element with the right nsmap and moves all children over, but that approach loses data (specifically the docinfo).
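
To illustrate both points, here is a minimal sketch of the behaviour and of the workaround. It is not Simon Wiles's actual code: the helper name with_extra_namespace and the sample document are made up for illustration. Note that the new root is not attached to the original tree, which is why the docinfo is lost.

```python
import lxml.etree as etree

NS = "http://xml.zope.org/namespaces/i18n"


def with_extra_namespace(root, prefix, uri):
    """Hypothetical helper: build a new root that also declares prefix -> uri."""
    nsmap = dict(root.nsmap)
    nsmap.setdefault(prefix, uri)
    # nsmap can only be supplied when an element is created
    new_root = etree.Element(root.tag, attrib=dict(root.attrib), nsmap=nsmap)
    new_root.text = root.text
    # Appending elements that already have a parent moves them
    new_root.extend(list(root))
    return new_root


root = etree.fromstring("<html><p>Hello</p></html>")

# Writing to root.nsmap only changes a temporary dict, not the element
root.nsmap["i18n"] = NS
assert "i18n" not in root.nsmap

new_root = with_extra_namespace(root, "i18n", NS)
new_root[0].set("{%s}translate" % NS, "string1")
result = etree.tostring(new_root).decode()
print(result)
```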

The missing lxml feature to make this possible is a way to change the namespace mapping.

scoder (scoder)
Changed in lxml:
importance: Undecided → Wishlist
scoder (scoder)
Changed in lxml:
status: New → Confirmed
Bruno Narciso (brunonar) wrote :

This is critical for converting between versions of an XML format, such as GPX 1.0 to 1.1.

scoder (scoder) wrote : Re: [Bug 555602] Re: Missing option to change namespace mappings

> This is critical for converting between versions of an XML format, such as GPX 1.0 to 1.1.

Oh, I'd certainly take patches that implement this.

Bruno Narciso (brunonar) wrote :

Oh, sorry, the version I'm using is 3.3.1 (pip install). I only just saw that version 3.4 is available.
Thanks.

Lennart Regebro (regebro-gmail) wrote :

I think etree.register_namespace() might be the solution to this.

scoder (scoder) wrote :

For a global mapping/redefinition of a prefix, yes, I agree. Not sure if this is meant here.
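
For reference, a minimal sketch of the global approach, using a made-up sample document. etree.register_namespace() sets the preferred prefix for a namespace URI process-wide. It does not add a declaration to any particular element, so the xmlns:i18n declaration is only created where the namespace is first used.

```python
import lxml.etree as etree

NS = "http://xml.zope.org/namespaces/i18n"

# Process-wide: prefer the "i18n" prefix whenever lxml has to declare NS
etree.register_namespace("i18n", NS)

root = etree.fromstring("<html><p>Hello</p></html>")
# Setting a namespaced attribute makes lxml create the declaration it needs
root[0].set("{%s}translate" % NS, "string1")

result = etree.tostring(root).decode()
print(result)
```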

scoder (scoder) wrote :

This is probably very easy to do, so here is a proposal:
- copy the "_Attrib" class in etree.pyx to a new class "_NsMap"
- replace the attribute based implementation with one that looks up namespaces (see also the nsmap property)
- on read access, return all namespaces that are defined on the element or its ancestors
- on write access, add an entry on the owning element itself
- return a new instance from the "nsmap" property of _Element (not _ReadOnlyElement)
- add tests.

Jens Troeger (jens.troeger) wrote :

I keep receiving XHTML documents with incorrect namespaces, and I guess I have two choices:

 1. treat the file as plain text and search/replace to fix the namespaces; or
 2. parse the file into an element tree and fix the actual namespaces.

In order to go with my preferred solution, I think this issue would need to be implemented. Is this feature on the current roadmap at all?
