Comment 9 for bug 2002576

David Fernandez Gonzalez (litios) wrote :

I reviewed python-xmltodict 0.13.0-1 as checked into lunar. This shouldn't be
considered a full audit but rather a quick gauge of maintainability.

> python-xmltodict is a Python module that makes working with XML feel like you are working with JSON

- CVE History:
  - No known CVEs
- Build-Depends?
  - Python3 default libraries
- pre/post inst/rm scripts?
  - It is a Python library, so it is installed alongside the other Python libraries.
  - On removal, everything is cleaned up with py3clean if available, otherwise the files are removed manually with rm.
- init scripts?
  - No
- systemd units?
  - No
- dbus services?
  - No
- setuid binaries?
  - No
- binaries in PATH?
  - No
- sudo fragments?
  - No
- polkit files?
  - No
- udev rules?
  - No
- unit tests / autopkgtests?
  - Embedded in the build process
  - Tested both with python 3.10 and 3.11
- cron jobs?
  - No
- Build logs:
  - No errors / warnings
  - Lintian errors/warnings:
    - W: python3-xmltodict: changelog-distribution-does-not-match-changes-file unstable != lunar [usr/share/doc/python3-xmltodict/changelog.Debian.gz:1]
    - W: python-xmltodict changes: distribution-and-changes-mismatch lunar unstable
    - E: python-xmltodict changes: inconsistent-maintainer David Fernandez Gonzalez <email address hidden> (changes vs. source) Sebastien Badia <email address hidden>

- Processes spawned?
  - No
- Memory management?
  - No
- File IO?
  - No I/O
- Logging?
  - No
- Environment variable usage?
  - No
- Use of privileged functions?
  - No
- Use of cryptography / random number sources etc?
  - No crypto
- Use of temp files?
  - No
- Use of networking?
  - No use of the network, but the library could be used to process input received from the network.
  - Input is not sanitized.
- Use of WebKit?
  - No
- Use of PolicyKit?
  - No

- Any significant cppcheck results?
  - No C/C++ code
- Any significant Coverity results?
  - Nothing found
- Any significant shellcheck results?
  - Nothing found
- Any significant bandit results?
  - LOW: Using XMLGenerator to parse untrusted XML data is known to be vulnerable to XML attacks. Replace XMLGenerator with the equivalent defusedxml package, or make sure defusedxml.defuse_stdlib() is called.
  - In this case, XMLGenerator is not used to parse untrusted XML data but generate it from a dict.
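To illustrate why the bandit finding does not apply, here is a minimal sketch of XMLGenerator used purely as a writer. It is not the xmltodict implementation; `dict_to_xml` is a hypothetical helper that handles only a flat dict, using only the standard library:

```python
from io import StringIO
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl


def dict_to_xml(root_tag, mapping):
    """Serialize a flat dict to XML. Hypothetical helper, not xmltodict code."""
    out = StringIO()
    gen = XMLGenerator(out, encoding="utf-8")
    gen.startDocument()
    gen.startElement(root_tag, AttributesImpl({}))
    for key, value in mapping.items():
        gen.startElement(key, AttributesImpl({}))
        # characters() escapes markup, so values cannot inject elements.
        gen.characters(str(value))
        gen.endElement(key)
    gen.endElement(root_tag)
    gen.endDocument()
    return out.getvalue()


if __name__ == "__main__":
    print(dict_to_xml("root", {"a": "1 < 2"}))
```

No XML is parsed at any point here: the generator only emits events to an output stream, so entity expansion attacks against a parser are not reachable through this path.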

The library relies on xml.parsers.expat to perform the parsing. Expat was considered vulnerable to XML attacks up to 2.4.1 (https://bugs.python.org/issue44394). The default Python version in Lunar already includes the patched expat.
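A quick way to check which expat a given interpreter is built against is sketched below. It assumes that 2.4.1 is the first release carrying the fix, which is how the linked issue reads; `expat_is_patched` and `MIN_PATCHED_EXPAT` are hypothetical names introduced only for this illustration:

```python
from xml.parsers import expat

# Assumption: 2.4.1 is treated as the first patched expat release.
MIN_PATCHED_EXPAT = (2, 4, 1)


def expat_is_patched(version_info=None, minimum=MIN_PATCHED_EXPAT):
    """Return True if the given (or running) expat version is >= minimum."""
    if version_info is None:
        # version_info is a (major, minor, micro) tuple of the expat in use.
        version_info = expat.version_info
    return tuple(version_info) >= minimum


if __name__ == "__main__":
    print(expat.EXPAT_VERSION, expat_is_patched())
```

Running this on each release being considered for promotion would show whether its default interpreter meets the threshold.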

(A request to migrate to defusedxml has been submitted to the developers in https://github.com/martinblech/xmltodict/issues/321 for older Python versions, as defusedexpat has not been supported since Python 3.4.)

xml.sax is imported, which is known to be vulnerable to XML attacks, but it is never used for parsing untrusted XML data, only to generate the XML in unparse. Nevertheless, the unparse function seems to be vulnerable to a dict bomb (https://gist.github.com/AlanCoding/a20e4f5cff19edefd98a39554ff682cb), which would cause a DoS.
This is very much an edge case because of the restrictions on creating the payload, so it is highly unlikely that this attack could occur. (Also, note that the same attack would trigger in any other function that tries to unpack the dict, such as print().)
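For callers that do pass untrusted dicts to unparse, one possible mitigation is to bound the work before serializing. The sketch below assumes the bomb works through shared references, where a small object graph expands exponentially when walked as a tree. `make_bomb` and `expanded_size_within` are hypothetical helpers, not part of xmltodict:

```python
def make_bomb(depth, width):
    """Build a small dict whose tree expansion has about width**depth nodes."""
    node = {"leaf": "x"}
    for _ in range(depth):
        # Every key points at the same child object, so memory stays small.
        node = {f"k{i}": node for i in range(width)}
    return node


def expanded_size_within(obj, budget):
    """Return True if walking obj as a tree visits at most `budget` nodes."""
    remaining = budget
    stack = [obj]
    while stack:
        node = stack.pop()
        remaining -= 1
        if remaining < 0:
            # Stop early instead of expanding the whole structure.
            return False
        if isinstance(node, dict):
            stack.extend(node.values())
        elif isinstance(node, (list, tuple)):
            stack.extend(node)
    return True


if __name__ == "__main__":
    print(expanded_size_within({"a": {"b": [1, 2, 3]}}, 100))
    print(expanded_size_within(make_bomb(30, 2), 10_000))
```

The walk is iterative and stops as soon as the budget is exhausted, so it also terminates on cyclic structures. The right budget depends on the application.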

When the tool is run from the command line, marshal is used to print the results. Marshal is not recommended for use with untrusted input, but this code path is not reached when xmltodict is used as a library.

Ceilometer only uses the parser function.

This package is only suitable for promotion in Jammy or newer because of expat version restrictions. Older releases would be vulnerable to the XML attacks described previously.

Security team ACK for promoting python-xmltodict to Lunar main.