attrib cannot be copied

Bug #1104370 reported by Piotr Ożarowski
This bug affects 1 person
Affects: lxml
Status: Fix Released
Importance: Low
Assigned to: scoder
Milestone: 3.1

Bug Description

[forwarding http://bugs.debian.org/698868]

Copying an Element's "attrib" attribute with copy.copy() leads to unexpected behaviour: an empty dict in lxml 2.2.8 and a TypeError in later versions.

lxml 2.2.8
==========
$ python -c 'import lxml.etree; print lxml.etree.fromstring("""<a b="foo" c="bar" />""").attrib'
{'c': 'bar', 'b': 'foo'}

$ python -c 'import lxml.etree, copy; print copy.copy(lxml.etree.fromstring("""<a b="foo" c="bar" />""").attrib)'
{}

lxml 2.3.2 (and 2.3.5 and 3.0.1)
==========
$ python -c 'import lxml.etree; print lxml.etree.fromstring("""<a b="foo" c="bar" />""").attrib'
{'c': 'bar', 'b': 'foo'}

$ python -c 'import lxml.etree, copy; print copy.copy(lxml.etree.fromstring("""<a b="foo" c="bar" />""").attrib)'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python2.6/copy.py", line 95, in copy
    return _reconstruct(x, rv, 0)
  File "/usr/lib/python2.6/copy.py", line 323, in _reconstruct
    y = callable(*args)
  File "/usr/lib/python2.6/copy_reg.py", line 93, in __newobj__
    return cls.__new__(cls, *args)
  File "lxml.etree.pyx", line 2155, in lxml.etree._Attrib.__cinit__ (src/lxml/lxml.etree.c:48518)
TypeError: __cinit__() takes exactly 1 positional argument (0 given)

Python : sys.version_info(major=2, minor=7, micro=3, releaselevel='final', serial=0)
lxml.etree : (3, 0, 1, 0)
libxml used : (2, 8, 0)
libxml compiled : (2, 8, 0)
libxslt used : (1, 1, 26)
libxslt compiled : (1, 1, 26)
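
The TypeError in the traceback above comes from the generic fallback in the copy module: when a type defines no __copy__(), copy.copy() rebuilds the object through cls.__new__(cls) with no further arguments. A minimal pure-Python sketch of the same failure mode follows. The Proxy class is a hypothetical stand-in whose constructor needs one argument; it is not lxml's _Attrib.

```python
import copy


class Proxy:
    """Hypothetical stand-in for a type whose constructor needs an argument."""

    def __new__(cls, element):
        self = super().__new__(cls)
        self.element = element
        return self


proxy = Proxy("<a/>")

try:
    # No __copy__ is defined, so copy falls back to cls.__new__(cls),
    # which cannot supply the required "element" argument.
    copy.copy(proxy)
except TypeError as exc:
    error = exc
else:
    error = None

print(type(error).__name__)
```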

scoder (scoder) wrote :

Seriously - I'm sure no-one has *ever* tried this before. What I usually do is to call dict() on the attrib value, which is way more obvious and explicit and works just fine:

$ python -c 'import lxml.etree; print lxml.etree.fromstring("""<a b="foo" c="bar" />""").attrib'
{'c': 'bar', 'b': 'foo'}
$ python -c 'import lxml.etree; print dict(lxml.etree.fromstring("""<a b="foo" c="bar" />""").attrib)'
{'c': 'bar', 'b': 'foo'}
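
The commands in this thread use the Python 2 print statement. As a rough Python 3 sketch of the same dict() workaround, the snippet below takes a detached snapshot of the attributes and then changes the element. The fallback import of xml.etree.ElementTree is an assumption added only so the snippet still runs where lxml is not installed.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: fall back to the API-compatible stdlib module.
    import xml.etree.ElementTree as etree

element = etree.fromstring('<a b="foo" c="bar" />')

# dict() builds a plain, independent dictionary from the attribute mapping.
snapshot = dict(element.attrib)

# Later changes to the element do not affect the snapshot.
element.set("b", "changed")

print(snapshot)
print(dict(element.attrib))
```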

I'm not even sure what copy(Attrib) should return. It certainly can't return a new attrib proxy object, that wouldn't make sense at all.

For comparison, here's what CPython 3.4 gives me in a similar case, when requesting a copy of a dict items view:

>>> import copy
>>> copy.copy(dict(a=1, b=2).items())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/python3.4-opt/lib/python3.4/copy.py", line 97, in copy
    return _reconstruct(x, rv, 0)
  File "/opt/python3.4-opt/lib/python3.4/copy.py", line 287, in _reconstruct
    y = callable(*args)
  File "/opt/python3.4-opt/lib/python3.4/copyreg.py", line 88, in __newobj__
    return cls.__new__(cls, *args)
TypeError: object.__new__(dict_items) is not safe, use dict_items.__new__()

Maybe mapping __copy__() to a call to dict() would be a reasonable compromise, just to keep it from failing...
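
To illustrate the compromise suggested above, here is a minimal pure-Python sketch of a proxy type whose __copy__() returns a plain dict. AttribProxy and owner_attrs are hypothetical names, and this is not the code that went into lxml; it only shows how the copy module picks up such a hook.

```python
import copy
from collections.abc import MutableMapping


class AttribProxy(MutableMapping):
    """Hypothetical live view onto a dict owned by another object."""

    def __init__(self, owner_attrs):
        self._attrs = owner_attrs

    def __getitem__(self, key):
        return self._attrs[key]

    def __setitem__(self, key, value):
        self._attrs[key] = value

    def __delitem__(self, key):
        del self._attrs[key]

    def __iter__(self):
        return iter(self._attrs)

    def __len__(self):
        return len(self._attrs)

    def __copy__(self):
        # Return a detached plain dict rather than a second proxy.
        return dict(self)

    def __deepcopy__(self, memo):
        # Deep-copy the detached dict so nested values are copied too.
        return copy.deepcopy(dict(self), memo)


owner_attrs = {"b": "foo", "c": "bar"}
proxy = AttribProxy(owner_attrs)

snapshot = copy.copy(proxy)   # calls AttribProxy.__copy__
proxy["b"] = "changed"        # writes through to owner_attrs only
```

With this hook in place, code that calls copy.copy() or copy.deepcopy() indirectly (for example while serializing a larger object) gets an ordinary dict instead of an exception.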

Piotr Ożarowski (piotr) wrote :

Heh, that's how I fixed it (with a dict). Note that I don't call copy.copy (or copy.deepcopy) directly: it was called by another function that was serializing an object containing Element.attrib, and I narrowed the failure down to copy.copy(Element.attrib).

scoder (scoder) wrote :
Changed in lxml:
assignee: nobody → Stefan Behnel (scoder)
importance: Undecided → Low
status: New → Fix Committed
scoder (scoder) wrote :

Fix is in lxml 3.1.

Changed in lxml:
status: Fix Committed → Fix Released
scoder (scoder)
Changed in lxml:
milestone: none → 3.1