LXML parsing cripples XML tags (which do not contain anything)

Bug #1776660 reported by Jan Ondřík
This bug affects 1 person
Affects | Status  | Importance | Assigned to | Milestone
lxml    | Invalid | Undecided  | Unassigned  |

Bug Description

Python : sys.version_info(major=3, minor=7, micro=0, releaselevel='beta', serial=2)
lxml.etree : (4, 2, 0, 0)
libxml used : (2, 9, 7)
libxml compiled : (2, 9, 7)
libxslt used : (1, 1, 32)
libxslt compiled : (1, 1, 32)

The lxml parse method or etree.tostring cripples XML tags when there is nothing between the start and end tags:
e.g. <xxx></xxx>
is serialised as <xxx/>.
When there is something between the start and end tags (e.g. a single whitespace character), the round trip works as expected:
<xxx> </xxx> is parsed and serialised by tostring unchanged, as <xxx> </xxx>.
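
A minimal sketch that reproduces the two cases described above. The helper name `roundtrip` is made up for illustration, and `encoding="unicode"` is used only so that `tostring` returns a `str` rather than `bytes`:

```python
from lxml import etree


def roundtrip(xml: str) -> str:
    """Parse an XML string and serialise it back with lxml's default settings."""
    return etree.tostring(etree.fromstring(xml), encoding="unicode")


# Hypothetical helper, not part of lxml: it only chains fromstring and tostring.
empty = roundtrip("<xxx></xxx>")    # nothing between the tags
spaced = roundtrip("<xxx> </xxx>")  # a single space between the tags

print(empty)   # <xxx/>
print(spaced)  # <xxx> </xxx>
```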

scoder (scoder) wrote :

From the point of view of the XML information set, there is no difference; it's just a different serialisation.
If you really want, you can get separate opening and closing tags by explicitly assigning an empty string to the .text attribute of empty elements.
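
The workaround suggested in this comment could look roughly like the sketch below. The function name `expand_empty_tags` is hypothetical, and applying the change to every empty element in the tree is an assumption; the comment only says to assign an empty string to `.text`:

```python
from lxml import etree


def expand_empty_tags(root):
    """Assign "" to .text on every childless, text-less element in place.

    Hypothetical helper: lxml then serialises such elements as <tag></tag>
    instead of <tag/>.
    """
    for el in root.iter():
        # Comments and processing instructions have a non-string tag; skip them.
        if isinstance(el.tag, str) and len(el) == 0 and el.text is None:
            el.text = ""
    return root


root = etree.fromstring("<root><xxx></xxx><yyy/></root>")
before = etree.tostring(root, encoding="unicode")
after = etree.tostring(expand_empty_tags(root), encoding="unicode")

print(before)  # <root><xxx/><yyy/></root>
print(after)   # <root><xxx></xxx><yyy></yyy></root>
```

Note that this rewrites every empty element, including ones that were written as `<yyy/>` in the input, because the parsed tree does not record which form the source used.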

Changed in lxml:
status: New → Invalid