Have you taken care of lxml, too? I just noticed that lxml always resolves and loads external entities with file:// URLs. An attacker can possibly load and retrieve all (XML) files that the service is allowed to access.
Example:
external_file.xml
==============
<!DOCTYPE external [
<!ENTITY ee SYSTEM "file:///PATH/TO/simple.xml">
]>
<root>&ee;</root>
simple.xml
=========
<!-- comment -->
<root>
<element key='value'>text</element>
<element>text</element>tail
<empty-element/>
</root>
>>> from lxml import etree
>>> tree = etree.parse("external_file.xml")
>>> print(etree.tostring(tree))
<root><!-- comment -->
<root>
<element key="value">text</element>
<element>text</element>tail
<empty-element/>
</root>
</root>
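In case it helps anyone who has to accept untrusted XML in the meantime, here is a rough sketch of a parser-independent pre-check. It is not something lxml provides: it uses the standard library's expat bindings to refuse any document whose DTD declares an external entity, before the data is handed to lxml. The names `reject_external_entities`, `ExternalEntityForbidden` and `PAYLOAD` are made up for illustration, and the sketch assumes the whole document fits in memory as bytes.

```python
from xml.parsers import expat


class ExternalEntityForbidden(ValueError):
    """Raised when a document declares an external entity (hypothetical name)."""


def reject_external_entities(data: bytes) -> None:
    """Screen XML bytes with expat; raise if the DTD declares an external entity.

    Hypothetical helper, not part of lxml or the standard library.
    """
    parser = expat.ParserCreate()

    def on_entity_decl(name, is_parameter_entity, value, base,
                       system_id, public_id, notation_name):
        # External entities carry a system and/or public identifier;
        # internal entities only carry a literal value.
        if system_id is not None or public_id is not None:
            raise ExternalEntityForbidden(
                "external entity %r -> %r" % (name, system_id))

    parser.EntityDeclHandler = on_entity_decl
    # Exceptions raised inside a handler propagate out of Parse().
    parser.Parse(data, True)


# Same document as external_file.xml above, with the placeholder path kept.
PAYLOAD = b"""<!DOCTYPE external [
<!ENTITY ee SYSTEM "file:///PATH/TO/simple.xml">
]>
<root>&ee;</root>"""
```

With this in place, `reject_external_entities(PAYLOAD)` should raise `ExternalEntityForbidden`, while a document without such a declaration passes through and can then be parsed as usual. On the lxml side, `etree.XMLParser` also accepts `resolve_entities=False`, which may be the more direct switch if changing the parser setup is an option.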