Minor problem appending new element
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
lxml | Triaged | Undecided | Unassigned |
Bug Description
Python : sys.version_
lxml.etree : (4, 3, 2, 0)
libxml used : (2, 9, 5)
libxml compiled : (2, 9, 5)
libxslt used : (1, 1, 30)
libxslt compiled : (1, 1, 30)
In Python I can iterate through a list, and on a certain condition append a new item to the list, which is then included in the iteration.
>>> x = ['a', 'b', 'c']
>>> for y in x:
...     print(y)
...     if y == 'b':
...         x.append('d')
...
a
b
c
d
>>> x
['a', 'b', 'c', 'd']
>>>
The same thing works in lxml:
>>> lmx = '<x><y z="a"/><y z="b"/><y z="c"/></x>'
>>> xml = etree.fromstrin
>>> for y in xml:
...     print(etree.
...     if y.get('z') == 'b':
...         xml.append(
...
b'<y z="a"/>'
b'<y z="b"/>'
b'<y z="c"/>'
b'<y z="d"/>'
>>> etree.tostring(xml)
b'<x><y z="a"/><y z="b"/><y z="c"/><y z="d"/></x>'
However, if it happens that the condition is met on the last item in the list, Python still works, but lxml does not include the appended item in the iteration. In the following, the only change is checking for 'c' instead of 'b'.
>>> x = ['a', 'b', 'c']
>>> for y in x:
...     print(y)
...     if y == 'c':
...         x.append('d')
...
a
b
c
d
>>> x
['a', 'b', 'c', 'd']
>>>
>>> lmx = '<x><y z="a"/><y z="b"/><y z="c"/></x>'
>>> xml = etree.fromstrin
>>> for y in xml:
...     print(etree.
...     if y.get('z') == 'c':
...         xml.append(
...
b'<y z="a"/>'
b'<y z="b"/>'
b'<y z="c"/>'
>>> etree.tostring(xml)
b'<x><y z="a"/><y z="b"/><y z="c"/><y z="d"/></x>'
As you can see, the last element is correctly appended, but is not included in the iteration.
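One possible way to sidestep the difference, sketched here under assumptions: drive the loop from an explicit queue rather than from the element's own iterator, and schedule each newly appended child by hand. The helper name `visit_with_appends` and the appended `<y z="d"/>` element are illustrative choices, not taken from the report. The import falls back to the stdlib ElementTree when lxml is not installed, since the calls used here exist in both.

```python
from collections import deque

try:
    from lxml import etree
except ImportError:  # fall back to the stdlib API, which offers the same calls
    import xml.etree.ElementTree as etree


def visit_with_appends(source, trigger):
    """Visit every child of the root, including one appended during the walk.

    Hypothetical helper: returns the 'z' values in visiting order and the
    final number of children.
    """
    root = etree.fromstring(source)
    pending = deque(root)  # snapshot of the children present at the start
    seen = []
    while pending:
        child = pending.popleft()
        seen.append(child.get('z'))
        if child.get('z') == trigger:
            new_child = etree.Element('y', {'z': 'd'})  # illustrative new element
            root.append(new_child)
            pending.append(new_child)  # schedule it explicitly
    return seen, len(root)


print(visit_with_appends('<x><y z="a"/><y z="b"/><y z="c"/></x>', 'c'))
```

Because the queue, not the tree iterator, decides what comes next, the result should not depend on whether the trigger is the last child.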
BTW, I see that ElementTree in the standard library does not have this problem.
ET in the stdlib is actually backed by a Python list of children, whereas lxml uses a C-level tree structure.
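For comparison, here is a small sketch of the same experiment against the stdlib `xml.etree.ElementTree`. The function name `values_seen` and the appended `<y z="d"/>` element are assumptions made for illustration; the report's own snippet is truncated, so this is not a reconstruction of it.

```python
import xml.etree.ElementTree as ET


def values_seen(trigger):
    """Iterate over the root's children, appending one child on `trigger`.

    Hypothetical helper: returns the 'z' values in the order the loop saw
    them, plus the final number of children.
    """
    root = ET.fromstring('<x><y z="a"/><y z="b"/><y z="c"/></x>')
    seen = []
    for child in root:
        seen.append(child.get('z'))
        if child.get('z') == trigger:
            root.append(ET.Element('y', {'z': 'd'}))
    return seen, len(root)


print(values_seen('b'))
print(values_seen('c'))
```

Both calls should report the appended element in the iteration, which matches the plain-list behaviour shown earlier in the report.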
The iterators in lxml tend to look ahead one item so that replacing the last returned element in the tree keeps working. I think you can only have one of the two: either that, or picking up an element that is appended while the last child is being processed.
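The trade-off can be illustrated with a toy model. This is only a sketch on a hand-rolled linked list, not lxml's actual implementation; `Node`, `Chain`, `iter_lookahead`, `iter_lazy`, `walk_appending` and `walk_detaching` are all hypothetical names. One iterator fetches the successor before handing out the current node, the other fetches it afterwards.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None


class Chain:
    """Minimal singly linked list standing in for a C-level child list."""

    def __init__(self, values):
        self.head = self.tail = None
        for value in values:
            self.append(value)

    def append(self, value):
        node = Node(value)
        if self.tail is None:
            self.head = node
        else:
            self.tail.next = node
        self.tail = node

    def iter_lookahead(self):
        # Fetch the successor *before* yielding the current node, so the
        # caller may detach or replace the current node safely.
        upcoming = self.head
        while upcoming is not None:
            current, upcoming = upcoming, upcoming.next
            yield current

    def iter_lazy(self):
        # Fetch the successor only *after* the caller is done with the node.
        current = self.head
        while current is not None:
            yield current
            current = current.next


def walk_appending(iterate, trigger):
    """Append 'd' when `trigger` is reached; return the values visited."""
    chain = Chain(['a', 'b', 'c'])
    seen = []
    for node in iterate(chain):
        seen.append(node.value)
        if node.value == trigger:
            chain.append('d')
    return seen


def walk_detaching(iterate, target):
    """Cut the `target` node's forward link, a crude stand-in for moving or
    replacing that node; return the values visited."""
    chain = Chain(['a', 'b', 'c'])
    seen = []
    for node in iterate(chain):
        seen.append(node.value)
        if node.value == target:
            node.next = None
    return seen


print(walk_appending(Chain.iter_lookahead, 'c'))  # misses the appended 'd'
print(walk_appending(Chain.iter_lazy, 'c'))       # sees the appended 'd'
print(walk_detaching(Chain.iter_lookahead, 'b'))  # still reaches 'c'
print(walk_detaching(Chain.iter_lazy, 'b'))       # stops after 'b'
```

In this model the look-ahead iterator survives changes to the current node but has already decided there is no successor when the last node is handed out; the lazy iterator behaves the other way round.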