attrib cannot be copied

Bug #1104370 reported by Piotr Ożarowski on 2013-01-24
This bug affects 1 person
Affects: lxml
Importance: Low
Assigned to: scoder

Bug Description

[forwarding http://bugs.debian.org/698868]

Copying an Element's "attrib" attribute leads to unexpected behaviour:

lxml 2.2.8
==========
$ python -c 'import lxml.etree; print lxml.etree.fromstring("""<a b="foo" c="bar" />""").attrib'
{'c': 'bar', 'b': 'foo'}

$ python -c 'import lxml.etree, copy; print copy.copy(lxml.etree.fromstring("""<a b="foo" c="bar" />""").attrib)'
{}

lxml 2.3.2 (and 2.3.5 and 3.0.1)
==========
$ python -c 'import lxml.etree; print lxml.etree.fromstring("""<a b="foo" c="bar" />""").attrib'
{'c': 'bar', 'b': 'foo'}

$ python -c 'import lxml.etree, copy; print copy.copy(lxml.etree.fromstring("""<a b="foo" c="bar" />""").attrib)'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python2.6/copy.py", line 95, in copy
    return _reconstruct(x, rv, 0)
  File "/usr/lib/python2.6/copy.py", line 323, in _reconstruct
    y = callable(*args)
  File "/usr/lib/python2.6/copy_reg.py", line 93, in __newobj__
    return cls.__new__(cls, *args)
  File "lxml.etree.pyx", line 2155, in lxml.etree._Attrib.__cinit__ (src/lxml/lxml.etree.c:48518)
TypeError: __cinit__() takes exactly 1 positional argument (0 given)

Python : sys.version_info(major=2, minor=7, micro=3, releaselevel='final', serial=0)
lxml.etree : (3, 0, 1, 0)
libxml used : (2, 8, 0)
libxml compiled : (2, 8, 0)
libxslt used : (1, 1, 26)
libxslt compiled : (1, 1, 26)

scoder (scoder) wrote :

Seriously - I'm sure no-one has *ever* tried this before. What I usually do is to call dict() on the attrib value, which is way more obvious and explicit and works just fine:

$ python -c 'import lxml.etree; print lxml.etree.fromstring("""<a b="foo" c="bar" />""").attrib'
{'c': 'bar', 'b': 'foo'}
$ python -c 'import lxml.etree; print dict(lxml.etree.fromstring("""<a b="foo" c="bar" />""").attrib)'
{'c': 'bar', 'b': 'foo'}

I'm not even sure what copy(Attrib) should return. It certainly can't return a new attrib proxy object; that wouldn't make sense at all.

For comparison, here's what CPython 3.4 gives me in a similar case, when requesting a copy of a dict items view:

>>> import copy
>>> copy.copy(dict(a=1, b=2).items())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/python3.4-opt/lib/python3.4/copy.py", line 97, in copy
    return _reconstruct(x, rv, 0)
  File "/opt/python3.4-opt/lib/python3.4/copy.py", line 287, in _reconstruct
    y = callable(*args)
  File "/opt/python3.4-opt/lib/python3.4/copyreg.py", line 88, in __newobj__
    return cls.__new__(cls, *args)
TypeError: object.__new__(dict_items) is not safe, use dict_items.__new__()
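
The same comparison can be sketched as a small script: it attempts the shallow copy of a dict items view, tolerates the failure, and then takes an explicit snapshot with dict(). The variable names are illustrative only, and the exact error message differs between CPython versions.

```python
import copy

attrs = dict(a=1, b=2)
view = attrs.items()  # a live view, comparable in spirit to an attrib proxy

try:
    copied = copy.copy(view)
except TypeError as exc:
    # CPython refuses to reconstruct a view object through the copy protocol.
    copied = None
    print("copy.copy failed:", exc)

# An explicit conversion yields a detached snapshot instead.
snapshot = dict(view)

attrs["c"] = 3  # the view follows the change, the snapshot does not
print(len(view), len(snapshot))
```

The point of the comparison is that a view has no meaningful copy of its own, whereas the data it exposes can always be copied explicitly.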

Maybe mapping __copy__() to a call to dict() would be a reasonable compromise, just to keep it from failing...
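
A minimal sketch of that compromise, using a pure-Python stand-in rather than lxml's actual _Attrib type. FakeElement and AttribProxy are hypothetical names invented for this illustration, and the __deepcopy__ hook is an added assumption that goes beyond what is proposed above.

```python
import copy
from collections.abc import MutableMapping


class FakeElement:
    """Hypothetical stand-in for an element that owns its attributes."""

    def __init__(self, **attrs):
        self._attrs = dict(attrs)


class AttribProxy(MutableMapping):
    """Hypothetical live proxy onto an element's attributes (not lxml's _Attrib)."""

    def __init__(self, element):
        self._element = element

    def __getitem__(self, key):
        return self._element._attrs[key]

    def __setitem__(self, key, value):
        self._element._attrs[key] = value

    def __delitem__(self, key):
        del self._element._attrs[key]

    def __iter__(self):
        return iter(self._element._attrs)

    def __len__(self):
        return len(self._element._attrs)

    def __copy__(self):
        # Return a detached plain dict instead of a second proxy object.
        return dict(self)

    def __deepcopy__(self, memo):
        # Assumption: deep copies are detached in the same way.
        return copy.deepcopy(dict(self), memo)


element = FakeElement(b="foo", c="bar")
attrib = AttribProxy(element)

snapshot = copy.copy(attrib)  # plain dict, independent of the element
attrib["d"] = "baz"           # writes through to the element only
deep_snapshot = copy.deepcopy(attrib)
```

With this shape, copy.copy() stops failing and returns the same thing an explicit dict() call would, so code that copies the proxy indirectly gets a usable value.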

Piotr Ożarowski (piotr) wrote :

Heh, that's how I fixed it (with a dict). Note that I don't call copy.copy (or even copy.deepcopy) directly: it was called by another function that was serializing an object with Element.attrib inside, and I managed to narrow the failure down to copy.copy(Element.attrib).

scoder (scoder) wrote :

Changed in lxml:
  assignee: nobody → Stefan Behnel (scoder)
  importance: Undecided → Low
  status: New → Fix Committed

scoder (scoder) wrote :

Fix is in lxml 3.1.

Changed in lxml:
  status: Fix Committed → Fix Released

scoder (scoder) on 2013-04-28

Changed in lxml:
  milestone: none → 3.1