Subscript numbers result in invalid literal for int() with base 10
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
lxml | Fix Released | Medium | Unassigned | |
Bug Description
In the latest release, when lxml.objectify parses an element that contains only superscript digits (such as ²), it seems to decide they are digits but then raises an exception: ValueError: invalid literal for int() with base 10.
This did not happen with an earlier release.
Code to reproduce:
from lxml import objectify
xml = """
<types>
<mytext>
<mynum>123</mynum>
<mysuperscript>
</types>
"""
doc = objectify.
print(objectify
Result in the latest release:
Traceback (most recent call last):
File "/home/
print(
File "src/lxml/
File "src/lxml/
File "src/lxml/
File "src/lxml/
File "src/lxml/
ValueError: invalid literal for int() with base 10: '²²²²²²²²²²'
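The mismatch described above can be illustrated in plain Python, without lxml. This is only a sketch of one plausible cause, not a confirmed diagnosis of lxml's internals: Python's `str.isdigit()` accepts superscript digits, while `int()` only parses decimal digits. The helper name `classify` is hypothetical and exists only for this illustration.

```python
def classify(text):
    """Report how Python's built-in checks treat a string (illustrative helper)."""
    try:
        value = int(text)
    except ValueError:
        # int() rejects characters that are not decimal digits
        value = None
    return {
        "isdigit": text.isdigit(),      # True for superscript digits as well
        "isdecimal": text.isdecimal(),  # True only for decimal digits
        "int": value,
    }


# Ten SUPERSCRIPT TWO characters, matching the string shown in the traceback
superscript = "\u00b2" * 10
plain = "123"

sup_result = classify(superscript)
plain_result = classify(plain)

print(sup_result)
print(plain_result)
```

If a type check relies on `isdigit()` alone, a string like this passes the check and then fails in `int()`. Checking `isdecimal()` or catching `ValueError` avoids that mismatch.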
Not working with the following:
Python : sys.version_
lxml.etree : (4, 9, 2, 0)
libxml used : (2, 9, 14)
libxml compiled : (2, 9, 14)
libxslt used : (1, 1, 35)
libxslt compiled : (1, 1, 35)
Working with the following:
Python : sys.version_
lxml.etree : (4, 5, 1, 0)
libxml used : (2, 9, 4)
libxml compiled : (2, 9, 4)
libxslt used : (1, 1, 29)
libxslt compiled : (1, 1, 29)
Changed in lxml:
milestone: none → 4.9.3
importance: Undecided → Medium
status: New → Fix Released