Activity log for bug #1943277

Date Who What changed Old value New value Message
2021-09-10 17:38:53 Corey Bryant bug added bug
2021-09-10 17:39:31 Corey Bryant summary libxml2 causes regression in python3-lxml2 libxml2 2.9.12+dfsg-3 causes regression
2021-09-10 17:40:19 Corey Bryant description
Old value:
    root@i1:~# cat example.xml
    <domain type="qemu">
      <name>fake-name</name>
      <os>
        <type/>
      </os>
      <devices>
        <interface type="bridge">
          <mac address="22:52:25:62:e2:aa"/>
          <model type="virtio"/>
          <source bridge="br0"/>
          <mtu size="9000"/>
          <target dev="nicdc065497-3c"/>
          <bandwidth>
            <inbound average="100" peak="200" burst="300"/>
            <outbound average="10" peak="20" burst="30"/>
          </bandwidth>
        </interface>
      </devices>
    </domain>

    root@i1:~# cat xml.py
    from lxml import etree

    def parseXML(xmlFile):
        # Parse the xml
        with open(xmlFile) as fobj:
            xml = fobj.read()
        doc = etree.fromstring(xml)
        ret = doc.findall('./devices/interface')
        node_xml = etree.tostring(ret[0]).decode()
        print("node_xml={}".format(node_xml))

    if __name__ == "__main__":
        parseXML("example.xml")

    Expected result:
    root@i1:~# python3 xml.py
    node_xml=<interface type="bridge">
          <mac address="22:52:25:62:e2:aa"/>
          <model type="virtio"/>
          <source bridge="br0"/>
          <mtu size="9000"/>
          <target dev="nicdc065497-3c"/>
          <bandwidth>
            <inbound average="100" peak="200" burst="300"/>
            <outbound average="10" peak="20" burst="30"/>
          </bandwidth>
        </interface>

    Actual Result:
    root@i1:~# python3 xml.py
    node_xml=<interface type="bridge">
          <mac address="22:52:25:62:e2:aa"/>
          <model type="virtio"/>
          <source bridge="br0"/>
          <mtu size="9000"/>
          <target dev="nicdc065497-3c"/>
          <bandwidth>
            <inbound average="100" peak="200" burst="300"/>
            <outbound average="10" peak="20" burst="30"/>
          </bandwidth>
        </interface>
      </devices>
    </domain>

    This is broken in 2.9.12+dfsg-3 on impish. It is not broken in 2.9.10+dfsg-6.3ubuntu0.1 on hirsute.

    There are some fixes in master since the 2.9.12 release was cut.
    https://gitlab.gnome.org/GNOME/libxml2/-/commits/master

    I'm planning to pick all 3 of these from upstream master since they are all intertwined. With these patches the regression is fixed.

    85b1792e37b131e7a51af98a37f92472e8de5f3f: Work around lxml API abuse. Make xmlNodeDumpOutput and htmlNodeDumpFormatOutput work with corrupted parent pointers.

    13ad8736d294536da4cbcd70a96b0a2fbf47070c: Add patch from upstream to fix regression in xmlNodeDumpOutputInternal. Commit 85b1792e could cause additional whitespace if xmlNodeDump was called with a non-zero starting level.

    92d9ab4c28842a09ca2b76d3ff2f933e01b6cd6f: Add patch from upstream to fix whitespace when serializing empty HTML documents.
New value:
    root@i1:~# cat example.xml
    <domain type="qemu">
      <name>fake-name</name>
      <os>
        <type/>
      </os>
      <devices>
        <interface type="bridge">
          <mac address="22:52:25:62:e2:aa"/>
          <model type="virtio"/>
          <source bridge="br0"/>
          <mtu size="9000"/>
          <target dev="nicdc065497-3c"/>
          <bandwidth>
            <inbound average="100" peak="200" burst="300"/>
            <outbound average="10" peak="20" burst="30"/>
          </bandwidth>
        </interface>
      </devices>
    </domain>

    root@i1:~# cat xml.py
    from lxml import etree

    def parseXML(xmlFile):
        # Parse the xml
        with open(xmlFile) as fobj:
            xml = fobj.read()
        doc = etree.fromstring(xml)
        ret = doc.findall('./devices/interface')
        node_xml = etree.tostring(ret[0]).decode()
        print("node_xml={}".format(node_xml))

    if __name__ == "__main__":
        parseXML("example.xml")

    == Expected result ==
    root@i1:~# python3 xml.py
    node_xml=<interface type="bridge">
          <mac address="22:52:25:62:e2:aa"/>
          <model type="virtio"/>
          <source bridge="br0"/>
          <mtu size="9000"/>
          <target dev="nicdc065497-3c"/>
          <bandwidth>
            <inbound average="100" peak="200" burst="300"/>
            <outbound average="10" peak="20" burst="30"/>
          </bandwidth>
        </interface>

    == Actual Result ==
    root@i1:~# python3 xml.py
    node_xml=<interface type="bridge">
          <mac address="22:52:25:62:e2:aa"/>
          <model type="virtio"/>
          <source bridge="br0"/>
          <mtu size="9000"/>
          <target dev="nicdc065497-3c"/>
          <bandwidth>
            <inbound average="100" peak="200" burst="300"/>
            <outbound average="10" peak="20" burst="30"/>
          </bandwidth>
        </interface>
      </devices>
    </domain>

    This is broken in 2.9.12+dfsg-3 on impish. It is not broken in 2.9.10+dfsg-6.3ubuntu0.1 on hirsute.

    There are some fixes in master since the 2.9.12 release was cut.
    https://gitlab.gnome.org/GNOME/libxml2/-/commits/master

    I'm planning to pick all 3 of these from upstream master since they are all intertwined. With these patches the regression is fixed.

    85b1792e37b131e7a51af98a37f92472e8de5f3f: Work around lxml API abuse. Make xmlNodeDumpOutput and htmlNodeDumpFormatOutput work with corrupted parent pointers.

    13ad8736d294536da4cbcd70a96b0a2fbf47070c: Add patch from upstream to fix regression in xmlNodeDumpOutputInternal. Commit 85b1792e could cause additional whitespace if xmlNodeDump was called with a non-zero starting level.

    92d9ab4c28842a09ca2b76d3ff2f933e01b6cd6f: Add patch from upstream to fix whitespace when serializing empty HTML documents.
2021-09-10 17:41:11 Corey Bryant description
Old value:
    root@i1:~# cat example.xml
    <domain type="qemu">
      <name>fake-name</name>
      <os>
        <type/>
      </os>
      <devices>
        <interface type="bridge">
          <mac address="22:52:25:62:e2:aa"/>
          <model type="virtio"/>
          <source bridge="br0"/>
          <mtu size="9000"/>
          <target dev="nicdc065497-3c"/>
          <bandwidth>
            <inbound average="100" peak="200" burst="300"/>
            <outbound average="10" peak="20" burst="30"/>
          </bandwidth>
        </interface>
      </devices>
    </domain>

    root@i1:~# cat xml.py
    from lxml import etree

    def parseXML(xmlFile):
        # Parse the xml
        with open(xmlFile) as fobj:
            xml = fobj.read()
        doc = etree.fromstring(xml)
        ret = doc.findall('./devices/interface')
        node_xml = etree.tostring(ret[0]).decode()
        print("node_xml={}".format(node_xml))

    if __name__ == "__main__":
        parseXML("example.xml")

    == Expected result ==
    root@i1:~# python3 xml.py
    node_xml=<interface type="bridge">
          <mac address="22:52:25:62:e2:aa"/>
          <model type="virtio"/>
          <source bridge="br0"/>
          <mtu size="9000"/>
          <target dev="nicdc065497-3c"/>
          <bandwidth>
            <inbound average="100" peak="200" burst="300"/>
            <outbound average="10" peak="20" burst="30"/>
          </bandwidth>
        </interface>

    == Actual Result ==
    root@i1:~# python3 xml.py
    node_xml=<interface type="bridge">
          <mac address="22:52:25:62:e2:aa"/>
          <model type="virtio"/>
          <source bridge="br0"/>
          <mtu size="9000"/>
          <target dev="nicdc065497-3c"/>
          <bandwidth>
            <inbound average="100" peak="200" burst="300"/>
            <outbound average="10" peak="20" burst="30"/>
          </bandwidth>
        </interface>
      </devices>
    </domain>

    This is broken in 2.9.12+dfsg-3 on impish. It is not broken in 2.9.10+dfsg-6.3ubuntu0.1 on hirsute.

    There are some fixes in master since the 2.9.12 release was cut.
    https://gitlab.gnome.org/GNOME/libxml2/-/commits/master

    I'm planning to pick all 3 of these from upstream master since they are all intertwined. With these patches the regression is fixed.

    85b1792e37b131e7a51af98a37f92472e8de5f3f: Work around lxml API abuse. Make xmlNodeDumpOutput and htmlNodeDumpFormatOutput work with corrupted parent pointers.

    13ad8736d294536da4cbcd70a96b0a2fbf47070c: Add patch from upstream to fix regression in xmlNodeDumpOutputInternal. Commit 85b1792e could cause additional whitespace if xmlNodeDump was called with a non-zero starting level.

    92d9ab4c28842a09ca2b76d3ff2f933e01b6cd6f: Add patch from upstream to fix whitespace when serializing empty HTML documents.
New value:
    root@i1:~# cat example.xml
    <domain type="qemu">
      <name>fake-name</name>
      <os>
        <type/>
      </os>
      <devices>
        <interface type="bridge">
          <mac address="22:52:25:62:e2:aa"/>
          <model type="virtio"/>
          <source bridge="br0"/>
          <mtu size="9000"/>
          <target dev="nicdc065497-3c"/>
          <bandwidth>
            <inbound average="100" peak="200" burst="300"/>
            <outbound average="10" peak="20" burst="30"/>
          </bandwidth>
        </interface>
      </devices>
    </domain>

    root@i1:~# cat xml.py
    from lxml import etree

    def parseXML(xmlFile):
        # Parse the xml
        with open(xmlFile) as fobj:
            xml = fobj.read()
        doc = etree.fromstring(xml)
        ret = doc.findall('./devices/interface')
        node_xml = etree.tostring(ret[0]).decode()
        print("node_xml={}".format(node_xml))

    if __name__ == "__main__":
        parseXML("example.xml")

    == Expected result ==
    root@i1:~# python3 xml.py
    node_xml=<interface type="bridge">
          <mac address="22:52:25:62:e2:aa"/>
          <model type="virtio"/>
          <source bridge="br0"/>
          <mtu size="9000"/>
          <target dev="nicdc065497-3c"/>
          <bandwidth>
            <inbound average="100" peak="200" burst="300"/>
            <outbound average="10" peak="20" burst="30"/>
          </bandwidth>
        </interface>

    == Actual Result ==
    root@i1:~# python3 xml.py
    node_xml=<interface type="bridge">
          <mac address="22:52:25:62:e2:aa"/>
          <model type="virtio"/>
          <source bridge="br0"/>
          <mtu size="9000"/>
          <target dev="nicdc065497-3c"/>
          <bandwidth>
            <inbound average="100" peak="200" burst="300"/>
            <outbound average="10" peak="20" burst="30"/>
          </bandwidth>
        </interface>
      </devices>
    </domain>

    This is broken in 2.9.12+dfsg-3 on impish. It is not broken in 2.9.10+dfsg-6.3ubuntu0.1 on hirsute.

    There are some fixes in master since the 2.9.12 release was cut.
    https://gitlab.gnome.org/GNOME/libxml2/-/commits/master

    I'm planning to pick all 3 of the following commits from upstream master since they are all intertwined. With these patches the regression is fixed.

    85b1792e37b131e7a51af98a37f92472e8de5f3f: Work around lxml API abuse. Make xmlNodeDumpOutput and htmlNodeDumpFormatOutput work with corrupted parent pointers.

    13ad8736d294536da4cbcd70a96b0a2fbf47070c: Add patch from upstream to fix regression in xmlNodeDumpOutputInternal. Commit 85b1792e could cause additional whitespace if xmlNodeDump was called with a non-zero starting level.

    92d9ab4c28842a09ca2b76d3ff2f933e01b6cd6f: Add patch from upstream to fix whitespace when serializing empty HTML documents.
2021-09-10 17:41:28 Corey Bryant libxml2 (Ubuntu): status New Triaged
2021-09-10 17:41:30 Corey Bryant libxml2 (Ubuntu): importance Undecided High
2021-09-10 20:13:36 Mattia Rizzolo libxml2 (Ubuntu): assignee Mattia Rizzolo (mapreri)
2021-09-10 20:13:40 Mattia Rizzolo libxml2 (Ubuntu): status Triaged Fix Committed
2021-09-15 20:45:06 Launchpad Janitor libxml2 (Ubuntu): status Fix Committed Fix Released
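
The expected and actual results in the description differ only in what follows the closing </interface> tag: with the regression, the closing tags of the ancestor elements are serialized as well. Below is a minimal sketch of a check for that symptom. It is not part of the bug report; the helper name is_clean_serialization and the shortened sample strings are made up for illustration, and the helper only inspects a string such as the node_xml value printed by xml.py.

```python
def is_clean_serialization(node_xml, tag):
    """Return True if node_xml ends at the closing tag of `tag`.

    Trailing whitespace is ignored, because the whitespace that follows
    the element in the source document may be serialized along with it.
    """
    return node_xml.rstrip().endswith("</{}>".format(tag))


# Shortened, hypothetical stand-ins for the two outputs in the description.
expected = '<interface type="bridge"><mtu size="9000"/></interface>\n  '
actual = (
    '<interface type="bridge"><mtu size="9000"/></interface>\n'
    "  </devices>\n"
    "</domain>"
)

print(is_clean_serialization(expected, "interface"))  # True
print(is_clean_serialization(actual, "interface"))    # False
```

Such a check could be applied to the output of the reproducer on each libxml2 version to tell an affected build from a fixed one.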