XML comment on first line and root element on second line get squeezed together after write
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
lxml | New | Undecided | Unassigned |
Bug Description
I am using lxml to validate, parse and modify a bunch of XML files. All the input XML files have an XML comment as the first line. After an XML file has been processed, i.e. when it has been written to a new output XML file, the XML comment on the first line and the root element on the second line have been squeezed together on one line. For me this is a bug.
Here is a minimal example showing what I mean (files also attached to bug report):
input.xml:
----------
<!-- $Revision: $ $URL: $ -->
<RootElement>
<!-- A comment -->
<ChildEleme
<!-- Another comment -->
<ChildElement2>
</ChildElem
</RootElement>
----------
processing.py:
----------
from lxml import etree as ET
root = ET.parse(
# Modify XML file
# ...
root.write(
----------
output.xml:
----------
<!-- $Revision: $ $URL: $ --><RootElement>
<!-- A comment -->
<ChildEleme
<!-- Another comment -->
<ChildElement2>
</ChildElem
</RootElement>
----------
This might not be the biggest issue in the world, but it is a bit annoying. Both the first line with the XML comment and the second line with the root element tend to be quite long in my XML files. I sometimes use these XML files as a reference and the root element contains various attributes that I'd like to see on the screen without scrolling right in the editor. I also understand that it is a very easy thing to post-process the XML file and separate the two lines with standard file operations in Python. But then again, I shouldn't have to.
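For anyone who needs the workaround in the meantime, the post-processing I mean could look roughly like the sketch below. It uses only the standard library and works on the serialized text, not on lxml objects. The helper name split_leading_comment is made up for this report and is not part of lxml. It assumes the file starts with an optional XML declaration followed by a single comment, and it only touches the text when that comment is immediately followed by a tag.

```python
import re

# Hypothetical helper for this report, not an lxml API.
# Matches an optional XML declaration plus the first top-level comment,
# but only when a "<" follows the comment directly (the squeezed case).
# The tempered pattern (?:(?!-->).)* keeps the match inside one comment.
_LEADING_COMMENT = re.compile(
    r"\A(\s*(?:<\?xml[^>]*\?>\s*)?<!--(?:(?!-->).)*-->)(?=<)",
    re.DOTALL,
)


def split_leading_comment(xml_text: str, newline: str = "\n") -> str:
    """Insert a line break between a leading comment and the next tag."""
    return _LEADING_COMMENT.sub(
        lambda match: match.group(1) + newline, xml_text, count=1
    )


squeezed = "<!-- $Revision: $ $URL: $ --><RootElement>\n</RootElement>\n"
print(split_leading_comment(squeezed))
```

Text that already has the comment on its own line, or that has no leading comment at all, is returned unchanged, so running it over every written file should be safe under the assumptions above.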
For the record: Windows (CRLF) or Linux (LF) line endings in the input.xml file make no difference.
By using the write function with the C14N method, i.e.:
root.write(
the first XML comment stays on its own line. But there are too many other changes in the XML, e.g. all attributes get sorted in alphabetical order. I would like as few changes as possible to get clear and relevant diffs.
----------
Python : sys.version_
lxml.etree : (4, 4, 2, 0)
libxml used : (2, 9, 5)
libxml compiled : (2, 9, 5)
libxslt used : (1, 1, 30)
libxslt compiled : (1, 1, 30)