lxml parsing cripples XML tags (which do not contain anything)

Bug #1776660 reported by Jan Ondřík on 2018-06-13
This bug affects 1 person
Affects: lxml
Importance: Undecided
Assigned to: Unassigned

Bug Description

Python : sys.version_info(major=3, minor=7, micro=0, releaselevel='beta', serial=2)
lxml.etree : (4, 2, 0, 0)
libxml used : (2, 9, 7)
libxml compiled : (2, 9, 7)
libxslt used : (1, 1, 32)
libxslt compiled : (1, 1, 32)

The lxml parse method or etree.tostring cripples XML tags when there is nothing between the start and end tags:
e.g. <xxx></xxx>
is serialised as <xxx/>.
When there is something between the start and end tags (e.g. a single whitespace character), parsing works as expected:
<xxx> </xxx> is parsed and serialised by tostring correctly as <xxx> </xxx>.
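
A minimal reproduction of the behaviour described above might look like the following sketch. The variable names are only illustrative, and the output shown assumes the default XML serialisation of etree.tostring.

```python
from lxml import etree

# An element with nothing between its start and end tags.
empty = etree.fromstring("<xxx></xxx>")
# The same element containing a single space.
spaced = etree.fromstring("<xxx> </xxx>")

empty_out = etree.tostring(empty)
spaced_out = etree.tostring(spaced)

print(empty_out)    # b'<xxx/>'
print(spaced_out)   # b'<xxx> </xxx>'

# The parsed empty element carries no text at all, which is why the
# serialiser is free to pick the short form.
print(empty.text is None)  # True
```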

scoder (scoder) wrote :

From the point of view of the information set, there is no difference; it's just a different serialisation.
If you really want, you can get separate opening and closing tags by explicitly assigning an empty string to the .text attribute of empty tags.
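
The workaround suggested in this comment could be sketched roughly as follows. The helper name expand_empty_tags and the sample document are hypothetical and are not part of lxml; the only lxml feature relied on is assigning an empty string to .text.

```python
from lxml import etree


def expand_empty_tags(root):
    """Assign '' to the .text of childless, text-less elements (hypothetical helper)."""
    for el in root.iter():
        # Comments and processing instructions have a non-string tag; skip them.
        if not isinstance(el.tag, str):
            continue
        if el.text is None and len(el) == 0:
            el.text = ""
    return root


root = etree.fromstring("<root><xxx></xxx><yyy> </yyy></root>")
before = etree.tostring(root)
after = etree.tostring(expand_empty_tags(root))

print(before)  # b'<root><xxx/><yyy> </yyy></root>'
print(after)   # b'<root><xxx></xxx><yyy> </yyy></root>'
```

Note that this changes the tree in place, so every empty element in the document is then written with separate opening and closing tags, not only the ones that were written that way in the input.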

Changed in lxml:
status: New → Invalid