Comment 10 for bug 2009544

David Leadbeater (launchpad-net-dgl) wrote:

We came across this and noticed that parsing the CA certificates is itself one of the biggest slowdowns. One reason OpenSSL ends up doing that parsing is that ca-certificates installs the same certificates twice: as individual files in /etc/ssl/certs/*.pem ("CApath") and as a single bundle in /etc/ssl/certs/ca-certificates.crt ("CAfile"):

$ wc -l /etc/ssl/certs/ca-certificates.crt <(cat /etc/ssl/certs/*.pem)
   3431 /etc/ssl/certs/ca-certificates.crt
   3431 /dev/fd/63
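
The line counts only show that the two copies are the same size. A minimal Python sketch of a stricter check, comparing the certificate bodies themselves, might look like the following. The helper names (pem_blocks, compare) are hypothetical and not part of any tool mentioned in this comment; the default paths are the ones quoted above.

```python
import glob
import os

BEGIN = "-----BEGIN CERTIFICATE-----"
END = "-----END CERTIFICATE-----"


def pem_blocks(text):
    """Return the set of base64 certificate bodies found in PEM text."""
    blocks = set()
    pos = 0
    while True:
        start = text.find(BEGIN, pos)
        if start == -1:
            break
        end = text.find(END, start)
        if end == -1:
            break  # truncated block, ignore it
        body = text[start + len(BEGIN):end]
        # Drop line breaks so formatting differences do not matter.
        blocks.add("".join(body.split()))
        pos = end + len(END)
    return blocks


def compare(cafile="/etc/ssl/certs/ca-certificates.crt", capath="/etc/ssl/certs"):
    """Return (certs only in cafile, certs only in capath/*.pem)."""
    with open(cafile, encoding="ascii", errors="replace") as f:
        in_file = pem_blocks(f.read())
    in_dir = set()
    for path in glob.glob(os.path.join(capath, "*.pem")):
        with open(path, encoding="ascii", errors="replace") as f:
            in_dir |= pem_blocks(f.read())
    return in_file - in_dir, in_dir - in_file
```

If both sets returned by compare() are empty, the bundle and the directory hold exactly the same certificates.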

On a mantic system, as a baseline:

$ time python3.11 main.py
Distro: Ubuntu 23.10
Python Version: 3.11.6 (main, Oct 8 2023, 05:06:43) [GCC 13.2.0]
OpenSSL Version: OpenSSL 3.0.10 1 Aug 2023
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
real 0m2.768s
user 0m9.666s
sys 0m0.124s

If I replace /etc/ssl/certs/ca-certificates.crt with a single, arbitrarily chosen certificate (sadly an empty file results in errors), so that there is much less to parse, things are much faster (real time drops from 2.768s to 0.675s, user CPU time from 9.666s to 0.781s):

$ sudo sh -c 'cat /etc/ssl/certs/002c0b4f.0 > /etc/ssl/certs/ca-certificates.crt'
$ time python3.11 main.py
Distro: Ubuntu 23.10
Python Version: 3.11.6 (main, Oct 8 2023, 05:06:43) [GCC 13.2.0]
OpenSSL Version: OpenSSL 3.0.10 1 Aug 2023
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
real 0m0.675s
user 0m0.781s
sys 0m0.059s
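
main.py itself is not shown here, so the following is only a sketch of how the parsing cost could be timed in isolation, using Python's standard ssl module. The helper names time_call and load_cafile are made up for illustration, and that this isolates the same cost the test script pays is an assumption, not something measured in this comment.

```python
import ssl
import time


def time_call(fn, repeat=5):
    """Return the best wall-clock time in seconds over `repeat` calls of fn()."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def load_cafile(cafile="/etc/ssl/certs/ca-certificates.crt"):
    """Parse every certificate in the bundle into a fresh context."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.load_verify_locations(cafile=cafile)
    return ctx
```

Running time_call(load_cafile) before and after shrinking the bundle should indicate how much of the difference comes from certificate parsing rather than from the rest of the script.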

This also reproduces as a visible difference in user CPU time when simply running curl against an HTTPS site. Both Python and curl still happily validate peers using the files from CApath.

I do wonder whether it would be possible to disable the CA file and rely only on the CA path. That is likely easier said than done, particularly with the API changes in OpenSSL 3.x (which split the file loading out into SSL_CTX_load_verify_file).
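
At the level of an individual Python program, rather than system-wide, relying only on the CA path can be sketched with the standard ssl module. This is a minimal sketch, not a fix for the bug: the function names capath_only_context and fetch are hypothetical, it assumes the directory contains the hashed entries OpenSSL expects (such as the 002c0b4f.0 file used above), and whether it avoids the slowdown in the test script has not been measured here.

```python
import ssl
import urllib.request


def capath_only_context(capath="/etc/ssl/certs"):
    """Build a client context that trusts only the hashed directory."""
    # PROTOCOL_TLS_CLIENT enables certificate and hostname checks by default.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # No cafile is given, so no bundle is parsed up front; OpenSSL looks
    # certificates up in capath on demand during verification.
    ctx.load_verify_locations(capath=capath)
    return ctx


def fetch(url, capath="/etc/ssl/certs"):
    """Fetch a URL, validating the peer against capath only."""
    with urllib.request.urlopen(url, context=capath_only_context(capath)) as resp:
        return resp.read()
```

Note that this builds the context explicitly; it does not change what the system default context, or libraries that create their own, will load.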

There's also some Python-specific discussion in https://github.com/python/cpython/issues/95031. I did try setting requests.get(..., verify='/etc/ssl/certs/') in the test script to get requests to read only the directory, per [1], but that didn't seem to work.

[1]: https://requests.readthedocs.io/en/latest/user/advanced/#ssl-cert-verification