"Uncaught Python exception: KeyError: None" in lxml.isoschematron

Bug #2058177 reported by David Lakin
This bug affects 1 person
| Affects | Status | Importance | Assigned to | Milestone |
| --- | --- | --- | --- | --- |
| lxml | New | Undecided | Unassigned | |

Bug Description

I have been experimenting with fuzzing Python libraries using the OSS-Fuzz project and I saw that this project has some fuzz targets integrated into OSS-Fuzz[0], but the build was broken and some of the fuzz tests could be improved.

While working to improve some of the fuzz tests, I believe I identified a possible bug.

## Summary of the bug:

An uncaught Python exception can be triggered in `lxml.isoschematron.Schematron` if it is passed a document that is not a valid schema, such as a single element with no namespace. Here is a minimal reproduction (also attached as a file for your convenience):

```
#!/usr/bin/env python3
# Assume file name is: isoschematron_uncaught_keyerror_poc.py

import io
from lxml import isoschematron, etree

data = io.BytesIO(b"<rr/>")
poc_schema = etree.parse(data)
schematron = isoschematron.Schematron(poc_schema)

```

The above should output something similar to the following:

```
Traceback (most recent call last):
  File "<some_path_to_poc_file>/isoschematron_uncaught_keyerror_poc.py", line 8, in <module>
    schematron = isoschematron.Schematron(poc_schema)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<some_path_to_installed_package>/.venv/lib/python3.12/site-packages/lxml/isoschematron/__init__.py", line 280, in __init__
    schematron = self._extract(root)
                 ^^^^^^^^^^^^^^^^^^^
  File "<some_path_to_installed_package>/.venv/lib/python3.12/site-packages/lxml/isoschematron/__init__.py", line 228, in _extract
    elif element.nsmap[element.prefix] == RELAXNG_NS:
         ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^
KeyError: None
```
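
The failing lookup can be examined in isolation. The following is a minimal sketch, not a proposed patch: it relies only on the public `prefix` and `nsmap` attributes of an lxml element, and `lookup_namespace` is a hypothetical helper name introduced here for illustration.

```python
from lxml import etree

# Same document as the proof of concept: a root element with no namespace.
root = etree.fromstring(b"<rr/>")

# An element without a namespace has no prefix and an empty namespace map,
# so indexing the map with the prefix raises KeyError: None.
prefix = root.prefix
nsmap = root.nsmap

try:
    nsmap[prefix]
    raised = False
except KeyError:
    raised = True


def lookup_namespace(element):
    """Hypothetical helper: return the element's namespace URI, or None."""
    return element.nsmap.get(element.prefix)


print(prefix, nsmap, raised, lookup_namespace(root))
```

Using `dict.get` here avoids the exception for elements with no namespace, while still returning the URI for elements that declare a default or prefixed namespace. Whether that is the right fix for `_extract` is for the maintainers to decide.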

## Environment Info:

I tested this:
- In two environments (macOS & Ubuntu)
- With two versions of Python (3.12.1 (venv) and 3.8)
- With lxml 5.1.0 installed from pip as well as compiled from source with C extensions.

The following is from my local environment where the attached POC script was run:

```
Python : sys.version_info(major=3, minor=12, micro=1, releaselevel='final', serial=0)
lxml.etree : (5, 1, 0, 0)
libxml used : (2, 12, 3)
libxml compiled : (2, 12, 3)
libxslt used : (1, 1, 39)
libxslt compiled : (1, 1, 39)
```

Additional info:
```
Python Version: 3.12.1 (main, Feb 5 2024, 16:23:00) [Clang 15.0.0 (clang-1500.1.0.2.5)]
OS Information: macOS-14.4-x86_64-i386-64bit
Installed Packages:
Package Version
------- -------
lxml 5.1.0
pip 24.0
```

And this is from the OSS-Fuzz container environment I discovered the issue in (running it locally):

```
Python : sys.version_info(major=3, minor=8, micro=3, releaselevel='final', serial=0)
lxml.etree : (5, 1, 0, 0)
libxml used : (2, 10, 3)
libxml compiled : (2, 10, 3)
libxslt used : (1, 1, 37)
libxslt compiled : (1, 1, 37)
```

Additional info:
```
Python Version: 3.8.3 (default, Mar 17 2024, 03:21:27)
[Clang 15.0.0 (https://github.com/llvm/llvm-project.git bf7f8d6fa6f460bf0a16ffe
OS Information: Linux-6.6.16-linuxkit-x86_64-with-glibc2.2.5
Installed Packages:
Package Version
------------------------- -------
altgraph 0.17.4
atheris 2.3.0
coverage 6.3.2
Cython 3.0.9
importlib_metadata 7.0.2
lxml 5.1.0
packaging 24.0
pip 24.0
pyinstaller 6.5.0
pyinstaller-hooks-contrib 2024.3
setuptools 69.2.0
six 1.15.0
zipp 3.18.1
```

### Fuzz test output

Below is the output of the OSS-Fuzz test run that I discovered this with (note: I haven't pushed the fuzz test itself to a public repo yet but I expect to shortly and I'm happy to share a link here when I do.)

```
=== Uncaught Python exception: ===
KeyError: None
Traceback (most recent call last):
File "fuzz_schematron.py", line 31, in TestOneInput
File "lxml/isoschematron/__init__.py", line 280, in __init__
File "lxml/isoschematron/__init__.py", line 228, in _extract
KeyError: None

==20== ERROR: libFuzzer: fuzz target exited
#0 0x7f712245d694 in __sanitizer_print_stack_trace /src/llvm-project/compiler-rt/lib/ubsan/ubsan_diag_standalone.cpp:31:3
#1 0x7f71223def48 in fuzzer::PrintStackTrace() /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerUtil.cpp:210:5
#2 0x7f71223c3cdc in fuzzer::Fuzzer::ExitCallback() /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:250:3
#3 0x7f712219c8a6 (/lib/x86_64-linux-gnu/libc.so.6+0x468a6) (BuildId: eebe5d5f4b608b8a53ec446b63981bba373ca0ca)
#4 0x7f712219ca5f in exit (/lib/x86_64-linux-gnu/libc.so.6+0x46a5f) (BuildId: eebe5d5f4b608b8a53ec446b63981bba373ca0ca)
#5 0x7f712194dc78 in Py_Exit /tmp/Python-3.8.3/Python/pylifecycle.c:2299:5
#6 0x7f71219526cf in handle_system_exit /tmp/Python-3.8.3/Python/pythonrun.c:658:9
#7 0x7f71219526cf in _PyErr_PrintEx /tmp/Python-3.8.3/Python/pythonrun.c:668:5
#8 0x403d90 (/out/fuzz_schematron.pkg+0x403d90) (BuildId: 4900f1057c817d78f6abf8c33793107b79dcd1a7)
#9 0x404003 (/out/fuzz_schematron.pkg+0x404003) (BuildId: 4900f1057c817d78f6abf8c33793107b79dcd1a7)
#10 0x7f712217a082 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24082) (BuildId: eebe5d5f4b608b8a53ec446b63981bba373ca0ca)
#11 0x40259d (/out/fuzz_schematron.pkg+0x40259d) (BuildId: 4900f1057c817d78f6abf8c33793107b79dcd1a7)

DEDUP_TOKEN: __sanitizer_print_stack_trace--fuzzer::PrintStackTrace()--fuzzer::Fuzzer::ExitCallback()
SUMMARY: libFuzzer: fuzz target exited
MS: 2 ChangeByte-CrossOver-; base unit: 1a96de07008e0f09ec8c011c19ac4978bfad782f
0x3c,0x72,0x72,0x2f,0x3e,0x0,
<rr/>\000
artifact_prefix='./'; Test unit written to ./crash-c3c014aead328b908bde3d09289b45b5ec1f3581
Base64: PHJyLz4A
```
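
For anyone who wants to recover the exact crashing input from the log above, the Base64 line can be decoded with the standard library. This is a small sketch using only values copied from that output; the variable names are illustrative.

```python
import base64

# Base64 string copied from the libFuzzer output above.
crash_b64 = "PHJyLz4A"
crash_bytes = base64.b64decode(crash_b64)

# Byte values copied from the same output ("0x3c,0x72,0x72,0x2f,0x3e,0x0,").
listed_bytes = bytes([0x3C, 0x72, 0x72, 0x2F, 0x3E, 0x00])

print(crash_bytes)
print(crash_bytes == listed_bytes)
```

The decoded input is the same `<rr/>` document used in the proof of concept, followed by one NUL byte. How the fuzz harness handles that trailing byte before parsing is not shown in this report.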

[0]: https://introspector.oss-fuzz.com/project-profile?project=lxml
