Test suite doesn't support libxml2 builds without HTTP or zlib

Bug #2066270 reported by Nick Wellnhofer
This bug affects 1 person
Affects   Status    Importance   Assigned to   Milestone
lxml      Triaged   Medium       Unassigned

Bug Description

HTTP support will eventually be removed from libxml2. Support for zlib and lzma may be removed as well.

It would be nice if the lxml test suite could skip affected tests if the features aren't available, preferably with a runtime check using xmlHasFeature(XML_WITH_HTTP) and xmlHasFeature(XML_WITH_ZLIB).
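As a rough illustration of such a runtime check, the sketch below builds a unittest skip decorator from an xmlHasFeature() lookup. The helper names (load_has_feature, requires_feature) are hypothetical and not part of lxml. The ctypes lookup finds whatever shared libxml2 is installed on the system, which is not necessarily the copy lxml is linked against, so inside lxml the call would more likely be made from Cython. The enum values are taken from libxml2's xmlFeature enum in xmlversion.h.

```python
import ctypes
import ctypes.util
import unittest

# Values of libxml2's xmlFeature enum as declared in <libxml/xmlversion.h>.
XML_WITH_HTTP = 10
XML_WITH_ZLIB = 31


def load_has_feature():
    """Return a callable wrapping xmlHasFeature(), or None if no shared
    libxml2 can be located (hypothetical helper, not an lxml API)."""
    path = ctypes.util.find_library("xml2")
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return None
    lib.xmlHasFeature.argtypes = [ctypes.c_int]
    lib.xmlHasFeature.restype = ctypes.c_int
    return lambda feature: bool(lib.xmlHasFeature(feature))


def requires_feature(has_feature, feature, name):
    """Build a skip decorator for tests that need a libxml2 feature.

    `has_feature` is a callable like the one returned by load_has_feature();
    passing None treats every feature as unavailable.
    """
    available = has_feature is not None and has_feature(feature)
    return unittest.skipUnless(
        available, "libxml2 built without %s support" % name)
```

A test would then be decorated with something like `@requires_feature(has_feature, XML_WITH_HTTP, "HTTP")`, so it is reported as skipped instead of failing when the feature is compiled out.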

Regarding output compression, lxml already has some code to compress output using Python's gzip module. It might make sense to always handle compression directly instead of relying on libxml2.
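For illustration, compressing on the Python side could look roughly like the sketch below. This is not lxml's internal code; write_gzipped is a hypothetical helper that takes already-serialized XML bytes and only uses the standard gzip module.

```python
import gzip
import io


def write_gzipped(serialized, fileobj, compresslevel=6):
    """Gzip already-serialized XML bytes into an open binary file object.

    Hypothetical helper: the caller keeps ownership of `fileobj`, which
    GzipFile leaves open when it is closed.
    """
    with gzip.GzipFile(fileobj=fileobj, mode="wb",
                       compresslevel=compresslevel) as gz:
        gz.write(serialized)


if __name__ == "__main__":
    buf = io.BytesIO()
    write_gzipped(b"<root><child/></root>", buf)
    # Round-trip to show that the output is a regular gzip stream.
    print(gzip.decompress(buf.getvalue()))
```

Note that this simple form holds the complete serialized document in memory before compressing it; a streaming variant would write chunks to the GzipFile as they are produced.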

scoder (scoder) wrote :

Thanks for the notification, Nick.

lxml can probably live without HTTP support in libxml2, given that most HTTP access actually means HTTPS these days.

But the nice thing about direct gzip support is that there's no Python object overhead for the compression and file access, no dependency on the Python GIL, and no need to hold larger uncompressed output chunks in memory. I'd rather start talking to zlib directly than lose all that. However, linking against zlib complicates the build process even further. The same applies to lzma. While both are part of a normal Python installation, linking against the C libraries that (sometimes) ship with Python isn't easy, or even possible, depending on the platform.

I'll add at least feature flags to the API so that users can detect if support is available.

Changed in lxml:
importance: Undecided → Medium
status: New → Triaged
Nick Wellnhofer (nick-aevum) wrote :

Regarding compression support, one of the major issues is that libxml2 tries to decompress input files by default, without any API to control this behavior. This allows trivial DoS attacks with zip bombs. We could add a new API option to disable decompression, but this would ultimately require changes in 100+ downstream projects, which seems unrealistic. At some point, libxml2 should be secure by default and disable automatic decompression.
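To make the zip bomb concern concrete, here is a minimal sketch of what a size-limited decompression step could look like if an application handled gzip input itself before parsing. It is an assumption-laden illustration using Python's zlib module, not something libxml2 or lxml provides; bounded_gunzip and the chosen limits are hypothetical.

```python
import gzip
import zlib


def bounded_gunzip(data, limit):
    """Decompress gzip data, refusing to produce more than `limit` bytes."""
    # MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
    decomp = zlib.decompressobj(zlib.MAX_WBITS | 16)
    # Ask for at most limit + 1 bytes so that an overrun can be detected
    # without inflating the rest of the stream.
    out = decomp.decompress(data, limit + 1)
    if len(out) > limit:
        raise ValueError("decompressed size exceeds %d bytes" % limit)
    if not decomp.eof:
        raise ValueError("truncated or incomplete gzip stream")
    return out


if __name__ == "__main__":
    small = gzip.compress(b"<root/>")
    print(bounded_gunzip(small, 1024))
    # A few kilobytes of input that would expand to 50 MB of zero bytes.
    bomb = gzip.compress(b"\0" * 50_000_000)
    try:
        bounded_gunzip(bomb, 1_000_000)
    except ValueError as exc:
        print("rejected:", exc)
```

The point of the limit is that the cost of rejecting a hostile input is bounded by the limit itself rather than by the expanded size.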
