lxml

lxml parsing cripples XML tags (which do not contain anything)

Bug #1776660 reported by Jan Ondřík on 2018-06-13

This bug affects 1 person

Affects	Status	Importance	Assigned to	Milestone
lxml	Invalid	Undecided	Unassigned

Bug Description

Python : sys.version_info(major=3, minor=7, micro=0, releaselevel='beta', serial=2)
lxml.etree : (4, 2, 0, 0)
libxml used : (2, 9, 7)
libxml compiled : (2, 9, 7)
libxslt used : (1, 1, 32)
libxslt compiled : (1, 1, 32)

The lxml parse method or etree.tostring cripples XML tags in the following case, where there is nothing between the start and end tags:
e.g. <xxx></xxx>
The tags will be crippled to <xxx/>.
When there is something between the start and end tags (e.g. a single whitespace character), parsing works well:
<xxx> </xxx> -> is parsed and serialised by tostring correctly as <xxx> </xxx>.


scoder (scoder) wrote on 2018-06-15:

From the point of view of the information set, there is no difference; it's just a different serialisation.
If you really want, you can get separate opening and closing tags by explicitly assigning an empty string to the .text attribute of empty tags.
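
A minimal sketch of that workaround, assuming lxml is installed. The helper name expand_empty_tags and the sample document are made up for illustration and are not part of lxml:

```python
from lxml import etree


def expand_empty_tags(root):
    """Assign "" to .text of every element that has no children and no text.

    Hypothetical helper for illustration; it modifies the tree in place.
    """
    for elem in root.iter():
        # Comments and processing instructions have a non-string .tag; skip them.
        if not isinstance(elem.tag, str):
            continue
        if len(elem) == 0 and elem.text is None:
            elem.text = ""
    return root


root = etree.fromstring("<root><xxx></xxx><yyy> </yyy></root>")

# Serialised as parsed: the empty element is written in its short form.
collapsed = etree.tostring(root)

# After assigning empty strings, the empty element keeps separate tags.
expanded = etree.tostring(expand_empty_tags(root))

print(collapsed)
print(expanded)
```

With the versions listed in the report, the first line printed should contain <xxx/> and the second <xxx></xxx>. The element with whitespace content is left untouched in both cases.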

Changed in lxml:
status:	New → Invalid
