lxml should not do external entity expansion (XXE) by default

Bug #1742885 reported by Lie Ryan on 2018-01-12
Affects: lxml
Importance: Wishlist
Assigned to: Unassigned

Bug Description

libxml2 has disabled external entity expansion (XXE) by default since version 2.9.0 (https://git.gnome.org/browse/libxml2/commit/?id=4629ee02ac649c27f9c0cf98ba017c6b5526070f). However, lxml still processes external entities even when used with libxml2 >= 2.9.0.

lxml should either follow the default behavior of the installed libxml2 version or explicitly disable the processing of external entities.

Test case is attached.
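
The attached test case is not reproduced on this page. As a rough sketch of the kind of check involved (not the reporter's actual test), the snippet below parses a document whose entity points at a temporary local file, first with a default XMLParser and then with entity resolution switched off. The marker string, helper names, and temporary file are made up for illustration; `resolve_entities` and `no_network` are existing XMLParser options. What the default parser does depends on the installed lxml version, so the sketch only reports the outcome instead of assuming one.

```python
import os
import tempfile
from pathlib import Path

from lxml import etree

SECRET = "local-file-contents"  # hypothetical marker written to the temp file


def parse_with(parser):
    """Parse a document whose entity references a local file; return the serialized root."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as fh:
        fh.write(SECRET)
    try:
        doc = (
            '<!DOCTYPE root [<!ENTITY xxe SYSTEM "%s">]>'
            "<root>&xxe;</root>" % Path(fh.name).as_uri()
        )
        root = etree.fromstring(doc.encode("utf-8"), parser)
        return etree.tostring(root).decode("utf-8")
    finally:
        os.unlink(fh.name)


def outcome(parser):
    """Classify what the parser did with the external entity."""
    try:
        return "expanded" if SECRET in parse_with(parser) else "not expanded"
    except etree.XMLSyntaxError:
        # Some lxml versions may refuse the document instead of expanding it.
        return "rejected"


# Default parser: the result varies between lxml versions.
print("default parser:", outcome(etree.XMLParser()))

# Entity resolution explicitly disabled: the reference is left unexpanded.
safe_parser = etree.XMLParser(resolve_entities=False, no_network=True)
print("resolve_entities=False:", outcome(safe_parser))
```

On a setup affected by this report, the first line would be expected to say "expanded"; the second should not, because the entity reference is kept as-is rather than replaced with the file contents.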

--

Python : sys.version_info(major=2, minor=7, micro=13, releaselevel='final', serial=0)
lxml.etree : (4, 1, 1, 0)
libxml used : (2, 9, 7)
libxml compiled : (2, 9, 7)
libxslt used : (1, 1, 32)
libxslt compiled : (1, 1, 32)

Lie Ryan (lie-1296) wrote :
scoder (scoder) on 2018-03-03
information type: Private Security → Public
scoder (scoder) wrote :

This is actually documented:
http://lxml.de/FAQ.html#how-do-i-use-lxml-safely-as-a-web-service-endpoint

And the defusedxml package has additional information about security in lxml (and other XML packages):
https://bitbucket.org/tiran/defusedxml

I agree that it's something that's worth changing, even though it's a backwards incompatible change.

Pull request welcome. See the "_local_resolver()" function in "parser.pxi". A reasonable logic might be to disallow access to local files by default if the input file itself is not known to be local, but add an XMLParser option to override it. Not sure about the HTMLParser, but that probably suffers from the same issue.
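
The rule suggested above could be sketched roughly as follows. This is plain Python for illustration only, not code from "parser.pxi"; the function names and the `allow_local_override` parameter are hypothetical stand-ins for whatever XMLParser option a patch might add. It assumes POSIX-style paths and treats input without a URL (for example, a document parsed from a string) as not known to be local.

```python
from urllib.parse import urlparse


def is_local(url):
    """Return True for plain paths and file:// URLs (POSIX-style paths assumed)."""
    if not url:
        # No URL at all, e.g. input parsed from a string: not known to be local.
        return False
    return urlparse(url).scheme in ("", "file")


def may_resolve(input_url, entity_url, allow_local_override=False):
    """Decide whether an external entity may be loaded under the sketched rule.

    Only local targets are restricted here; network targets are left to a
    separate setting (such as the existing no_network option).
    """
    if not is_local(entity_url):
        return True
    return allow_local_override or is_local(input_url)


# A local document may pull in local files.
print(may_resolve("/srv/data/doc.xml", "file:///etc/passwd"))
# Input of unknown or remote origin may not, unless explicitly overridden.
print(may_resolve(None, "file:///etc/passwd"))
print(may_resolve("http://example.com/doc.xml", "/etc/passwd"))
print(may_resolve(None, "file:///etc/passwd", allow_local_override=True))
```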

Changed in lxml:
importance: Undecided → Wishlist
status: New → Confirmed