use-after-free in af_alg_accept() due to bh_lock_sock()
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Fix Released | Medium | Mauricio Faria de Oliveira |
Xenial | Fix Released | Medium | Mauricio Faria de Oliveira |
Bionic | Fix Released | Medium | Mauricio Faria de Oliveira |
Eoan | Won't Fix | Medium | Mauricio Faria de Oliveira |
Focal | Fix Released | Medium | Mauricio Faria de Oliveira |
Groovy | Fix Released | Undecided | Unassigned |
Bug Description
[Impact]
* Users of the Linux kernel's crypto userspace API
reported BUG() / kernel NULL pointer dereference
errors after kernel upgrades.
* The stack trace signature is an accept() syscall
going through af_alg_accept() and hitting errors
usually in one of:
- apparmor_
- apparmor_
- release_sock()
[Fix]
* This is a regression introduced by upstream commit
37f96694cf73 ("crypto: af_alg - Use bh_lock_sock
in sk_destruct") which made its way through stable.
* The offending patch allows the critical regions
of af_alg_accept() and af_alg_
to run concurrently; with the "right" sequence of events
on 2 CPUs, it might drop the non-atomic reference counter
of the alg_sock and then that of the sock, thus releasing
a sock that is still in use.
* The fix is upstream commit 34c86f4c4a7b ("crypto:
af_alg - fix use-after-free in af_alg_accept() due
to bh_lock_sock()") [1]. It changes alg_sock's ref
counter to atomic, which addresses the root cause.
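The failure mode described in these notes is a lost update on a plain counter. The following is a minimal Python sketch of that idea, not the kernel code: it replays one hand-picked interleaving of two CPUs, and the names `Cpu`, `run_nonatomic` and `run_atomic` are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cpu:
    delta: int       # +1 models taking a reference, -1 models dropping one
    loaded: int = 0  # CPU-private copy of the counter between load and store


def run_nonatomic(start, cpus, schedule):
    """Replay (cpu_index, 'load' | 'store') steps against a plain counter."""
    counter = start
    for cpu_index, step in schedule:
        cpu = cpus[cpu_index]
        if step == "load":
            cpu.loaded = counter              # read the shared counter
        elif step == "store":
            counter = cpu.loaded + cpu.delta  # write back a possibly stale value
        else:
            raise ValueError(f"unknown step: {step}")
    return counter


def run_atomic(start, cpus, order):
    """Each update is one indivisible read-modify-write, applied in `order`."""
    counter = start
    for cpu_index in order:
        counter += cpus[cpu_index].delta
    return counter


# One reference exists; CPU 0 takes another while CPU 1 drops one.
# Both CPUs load the counter before either stores its result.
racy = run_nonatomic(
    1,
    [Cpu(+1), Cpu(-1)],
    [(0, "load"), (1, "load"), (0, "store"), (1, "store")],
)
either_order = {run_atomic(1, [Cpu(+1), Cpu(-1)], order) for order in ([0, 1], [1, 0])}

print(racy)          # 0: the counter claims "no users" although one remains
print(either_order)  # {1}: atomic updates give the right count in both orders
```

In this replay the counter ends at 0 although one reference is still held; code that frees the object once the counter reaches zero would then release memory that is still in use. With indivisible updates, no ordering of the two CPUs can lose the increment or the decrement.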
[Test Case]
* There is a synthetic test case available, which
uses a kprobes kernel module to synchronize the
concurrent CPUs on the instructions responsible
for the problem; and a userspace part to run it.
* The organic reproducer is the Varnish Cache Plus
software with the Crypto vmod (which uses kernel
crypto userspace API) under long, very high load.
* The patch has been verified on both reproducers
with the 4.15 and 5.7 kernels.
* More tests performed with 'stress-ng --af-alg'
with 11 CPUs on Xenial/
(all on same version of stress-ng, V0.11.14)
No regressions observed from original kernel.
(the af-alg stressor can exercise almost all
kernel crypto modules shipped with the kernel;
so it checks more paths/crypto alg interfaces.)
[Regression Potential]
* The fix patch does a fundamental change in how
alg_sock reference counters work, plus another
change to the 'nokey' counting. This of course
*has* a risk of regression.
* Regressions theoretically could manifest as use
after free errors (in case of undercounting) in
the af_alg functions or silent memory leaks (in
case of overcounting), but also other behaviors
since reference counting is key to many things.
* FWIW, this patch has been written by the crypto
subsystem maintainer, who certainly knows the
normal and corner cases well, which lends the
patch more credibility.
* Testing with the organic reproducer ran as long
as 5 days, without issues, so it does look good.
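The two regression modes named above (undercounting and overcounting) can be illustrated with a toy reference-counted object. This is a sketch under stated assumptions, not the af_alg implementation: `RefCounted`, `UseAfterFree`, and the two demo functions are hypothetical names, and "freeing" is modelled by a flag.

```python
class UseAfterFree(Exception):
    """Raised when a freed object is touched (a stand-in for a kernel oops)."""


class RefCounted:
    def __init__(self):
        self.refs = 1       # the creator holds the first reference
        self.freed = False

    def get(self):
        if self.freed:
            raise UseAfterFree("get() on freed object")
        self.refs += 1

    def put(self):
        if self.freed:
            raise UseAfterFree("put() on freed object")
        self.refs -= 1
        if self.refs == 0:
            self.freed = True  # the last reference is gone: release the object


def undercount_demo():
    """A second user never calls get(), so the object dies under it."""
    obj = RefCounted()
    obj.put()              # first user is done; object is freed
    try:
        obj.put()          # second user drops a reference it never took
    except UseAfterFree:
        return True
    return False


def overcount_demo():
    """One get() too many: every user calls put(), yet the object survives."""
    obj = RefCounted()
    obj.get()              # legitimate second user
    obj.get()              # stray extra reference
    obj.put()
    obj.put()
    return obj.freed, obj.refs


print(undercount_demo())  # True
print(overcount_demo())   # (False, 1)
```

The undercount shows up loudly as an access to a freed object, while the overcount produces no error at all: the object simply never reaches zero, which is why the leak case is described as silent.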
[Other Info]
* Not sending for Groovy (should get via Unstable).
[Stack Trace Examples]
Examples:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
...
RIP: 0010:apparmor_
...
Call Trace:
security_
af_
alg_
SYSC_
SyS_
do_
entry_
general protection fault: 0000 [#1] SMP PTI
...
RIP: 0010:__
...
Call Trace:
release_
af_
alg_
SYSC_
SyS_
do_
entry_
Changed in linux (Ubuntu):
status: New → Confirmed
importance: Undecided → Medium
assignee: nobody → Mauricio Faria de Oliveira (mfo)
tags: added: sts
description: updated
description: updated
Changed in linux (Ubuntu Xenial):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Mauricio Faria de Oliveira (mfo)
Changed in linux (Ubuntu Bionic):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Mauricio Faria de Oliveira (mfo)
Changed in linux (Ubuntu Eoan):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Mauricio Faria de Oliveira (mfo)
Changed in linux (Ubuntu Focal):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Mauricio Faria de Oliveira (mfo)
Changed in linux (Ubuntu Groovy):
status: Confirmed → Won't Fix
importance: Medium → Undecided
assignee: Mauricio Faria de Oliveira (mfo) → nobody
Changed in linux (Ubuntu):
status: Confirmed → In Progress
description: updated
Changed in linux (Ubuntu Xenial):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Eoan):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Focal):
status: In Progress → Fix Committed
Changed in linux (Ubuntu):
status: In Progress → Fix Released
Focal: testing
=====
$ ./stress-ng --version
stress-ng, version 0.11.14 (gcc 9.3, x86_64 Linux 5.4.0-38-generic) 💻🔥
$ sudo modprobe -a \
/lib/modules/$(uname -r)/kernel/crypto/*.ko \
/lib/modules/$(uname -r)/kernel/arch/*/crypto/*.ko \
$(modinfo \
| grep -ow 'crypto-.*')
No error/strange kernel messages logged in /var/log/kern.log.
original:
--------
$ uname -rv
5.4.0-38-generic #42-Ubuntu SMP Mon Jun 8 14:14:24 UTC 2020
$ ./stress-ng --af-alg 0 --timeout 1h 2>&1 | tee ../stress-ng.log.focal.orig
stress-ng: info: [27052] dispatching hogs: 11 af-alg
stress-ng: info: [27054] stress-ng-af-alg: 62 cryptographic algorithms found in /proc/crypto
stress-ng: info: [27054] stress-ng-af-alg: 101 cryptographic algorithms max (with defconfigs)
stress-ng: info: [27052] successful run completed in 3600.38s (1 hour, 0.38 secs)
modified:
--------
$ uname -rv
5.4.0-38-generic #42+test20200623b1 SMP Tue Jun 23 09:37:56 -03 2020
$ ./stress-ng --af-alg 0 --timeout 1h 2>&1 | tee ../stress-ng.log.focal.mod.2
stress-ng: info: [2577] dispatching hogs: 11 af-alg
stress-ng: info: [2579] stress-ng-af-alg: 62 cryptographic algorithms found in /proc/crypto
stress-ng: info: [2579] stress-ng-af-alg: 101 cryptographic algorithms max (with defconfigs)
stress-ng: info: [2577] successful run completed in 3600.52s (1 hour, 0.52 secs)
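Comparing the original and modified runs by eye works for two logs; for more runs, a small helper could do the comparison. The sketch below is an assumption-laden convenience, not part of stress-ng: `summarize` and `same_coverage` are hypothetical names, and the regular expressions assume the exact message wording seen in the stress-ng 0.11.14 output above. The sample strings are lines copied from those two logs.

```python
import re

# Patterns assume the message wording shown in the logs above.
_FOUND_RE = re.compile(r"(\d+) cryptographic algorithms found in /proc/crypto")
_DONE_RE = re.compile(r"successful run completed in (\d+(?:\.\d+)?)s")


def summarize(log_text):
    """Extract the algorithm count and run time, or None where a line is missing."""
    found = _FOUND_RE.search(log_text)
    done = _DONE_RE.search(log_text)
    return {
        "algorithms": int(found.group(1)) if found else None,
        "seconds": float(done.group(1)) if done else None,
    }


def same_coverage(orig_log, mod_log):
    """True if both runs completed and reported the same number of algorithms."""
    orig, mod = summarize(orig_log), summarize(mod_log)
    return (
        orig["seconds"] is not None
        and mod["seconds"] is not None
        and orig["algorithms"] is not None
        and orig["algorithms"] == mod["algorithms"]
    )


ORIG_LOG = """\
stress-ng: info: [27054] stress-ng-af-alg: 62 cryptographic algorithms found in /proc/crypto
stress-ng: info: [27052] successful run completed in 3600.38s (1 hour, 0.38 secs)
"""
MOD_LOG = """\
stress-ng: info: [2579] stress-ng-af-alg: 62 cryptographic algorithms found in /proc/crypto
stress-ng: info: [2577] successful run completed in 3600.52s (1 hour, 0.52 secs)
"""

print(summarize(ORIG_LOG))               # {'algorithms': 62, 'seconds': 3600.38}
print(same_coverage(ORIG_LOG, MOD_LOG))  # True
```

A matching algorithm count only shows that both kernels exposed the same set size to the stressor; checking /var/log/kern.log for errors, as done above, is still needed.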