Incremental parsing cannot parse contents if data is split at certain positions
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
lxml | Fix Released | Low | scoder |
Bug Description
Python : sys.version_
lxml.etree : (5, 1, 0, 0)
libxml used : (2, 10, 3)
libxml compiled : (2, 10, 3)
libxslt used : (1, 1, 37)
libxslt compiled : (1, 1, 37)
See also: https:/
If the data fed to the parser is split inside an `href` attribute value, nothing is parsed.
See example:
Wrong:
```python
from lxml import etree
parser = etree.HTMLPullP
for data in (b'<root><a href="2011-03-13_', b'135411/
for _, elem in parser.
parser.close()
```
Expected:
```python
from lxml import etree
parser = etree.HTMLPullP
for data in (b'<root><a href="2011-
for _, elem in parser.
parser.close()
```
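On setups where the split shown above triggers the problem, one possible workaround is to re-split the incoming data so that no piece ends in the middle of a tag. The following is only a minimal sketch under that assumption: `rechunk_at_tag_end` and `parse_in_pieces` are hypothetical helper names, not part of lxml, and the boundary detection is naive (a literal `>` inside an attribute value or comment is treated as a tag end too).

```python
from typing import Iterable, Iterator, List


def rechunk_at_tag_end(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Yield the same bytes, re-split so each piece ends right after a '>'.

    Hypothetical helper, not part of lxml. Data after the last '>' is held
    back until a later chunk completes the tag, or until input ends.
    """
    pending = b""
    for chunk in chunks:
        pending += chunk
        cut = pending.rfind(b">") + 1  # 0 when no '>' has been seen yet
        if cut:
            yield pending[:cut]
            pending = pending[cut:]
    if pending:
        yield pending  # trailing data after the last complete tag


def parse_in_pieces(chunks: Iterable[bytes]) -> List[str]:
    """Feed re-split chunks to an HTMLPullParser and collect element tags."""
    from lxml import etree  # imported lazily; the helper above needs no lxml

    parser = etree.HTMLPullParser()
    tags = []
    for piece in rechunk_at_tag_end(chunks):
        parser.feed(piece)
        tags.extend(elem.tag for _, elem in parser.read_events())
    parser.close()
    # Collect any events that were only produced when the parser was closed.
    tags.extend(elem.tag for _, elem in parser.read_events())
    return tags
```

The re-splitting step buffers at most one incomplete tag at a time, so it keeps the incremental character of the parsing loop while avoiding a split inside an attribute value.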
Changed in lxml:
assignee: nobody → scoder (scoder)
status: Fix Committed → Fix Released
Works for me with libxml2 2.12.6. The binary wheels of lxml 5.1 should be using 2.12.5; I guess that's OK as well.