kernel BUG/Oops errors from modprobe while the DRBG has not yet initialized (focal/fips-updates)

Bug #1981487 reported by Mauricio Faria de Oliveira
This bug affects 1 person
Affects          Status         Importance   Assigned to                  Milestone
linux (Ubuntu)   Invalid        Undecided    Unassigned
  Bionic         Invalid        Undecided    Unassigned
  Focal          Fix Released   Medium       Mauricio Faria de Oliveira
  Jammy          Invalid        Undecided    Unassigned

Bug Description

[Impact]

 * The Focal FIPS kernel in fips-updates hits kernel BUG/Oops
   errors during boot when the FIPS OpenSSL library is installed
   (the errors don't cause further issues). They occur when the
   kernel runs modprobe via request_module() while looking up
   crypto algorithms/modules.

 * The modprobe command happens to call the OpenSSL library,
   the FIPS version of OpenSSL calls getrandom(), and the
   FIPS kernel calls the DRBG for that. However, the DRBG is
   _not yet_ initialized at the point in early boot where the
   kernel can already run modprobe via request_module()
   (e.g., at IPv6 initialization time).

 * The issue impacts the kernels in fips-updates only, per:
   "UBUNTU: SAUCE: random: Use Crypto API DRBG for urandom in FIPS mode"
   which exists in Focal, but not in Xenial/Bionic/Jammy.

 * The issue only happens with the crypto algorithms, even
   if they're built-in (i.e., modprobe is not needed).

[Fix]

 * Fall back to CRNG while the DRBG is not yet initialized.
   (Marcelo Cerri confirmed it's OK per other discussions.)

 * The fix doesn't change the list and details of algorithms
   as in /proc/crypto (e.g., name, driver, module, priority)
   by the time the DRBG is initialized / initramfs started,
   so even though behavior changes, the net effect doesn't.

 * (Note: it's not possible to just use an initcall level
    earlier than rootfs_initcall(), so that modprobe isn't
    there yet, because fips_drbg_init() must run _after_ the
    module_init() level for crypto_rng_reset() to work, even
    though its required module is built-in too.)

[Test Steps]

 * Install the kernel and openssl from fips-updates,
   boot with fips=1, check dmesg for BUG/Oops errors:

   $ sudo apt install linux-image-fips libssl1.1 # fips-updates
   $ sudo vim /etc/default/grub # append fips=1 boot option
   $ sudo update-grub && sudo reboot
   $ sudo dmesg | grep BUG:

 * Check/store the /proc/crypto file for comparisons.
   You can boot with break=top as well, to check that
   as early as possible, and copy into /run/initramfs/
   then exit, to get it later in the rootfs.
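
The dmesg check in the steps above can be wrapped into a small pass/fail helper for repeated test boots. This is only a sketch, assuming the log was first saved to a file; the function names and the file name are hypothetical, not part of any Ubuntu tooling:

```shell
#!/bin/sh
# Hypothetical helper: count "BUG:" lines in a saved dmesg log.
count_bug_lines() {
    # grep -c prints 0 but exits non-zero when nothing matches; ignore that status.
    grep -c 'BUG:' "$1" || true
}

# Hypothetical helper: succeed only if the saved log has no BUG/Oops lines.
check_dmesg_log() {
    n=$(count_bug_lines "$1")
    if [ "$n" -eq 0 ]; then
        echo "PASS: no BUG/Oops lines in $1"
    else
        echo "FAIL: $n BUG/Oops lines in $1"
        return 1
    fi
}
```

Usage, after booting the kernel under test with fips=1:

   $ sudo dmesg > dmesg.log
   $ check_dmesg_log dmesg.log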

[Regression Potential]

 * The fix falls back to the regular CRNG for a while
   in early boot. The CRNG is used permanently in the
   non-FIPS kernels (and in FIPS kernels w/out fips=1),
   so the code path is exercised/tested frequently.

 * Regressions would most likely occur in calls to
   getrandom() before the DRBG is initialized, but
   that currently hits a BUG/Oops anyway.

[Original Bug Description]

$ sudo apt install --yes linux-image-fips # fips-updates
$ sudo vim /etc/default/grub # fips=1
$ sudo update-grub && sudo reboot

$ uname -r
5.4.0-1056-fips

$ cat /proc/cmdline
... fips=1

No errors with the original/non-FIPS openssl, because it does NOT call getrandom():

$ dmesg | grep -c BUG:
0

$ dpkg -s libssl1.1 | grep ^Version:
Version: 1.1.1f-1ubuntu2.15

$ strace -e getrandom modprobe --version
kmod version 27
+ZSTD +XZ -ZLIB +LIBCRYPTO -EXPERIMENTAL
+++ exited with 0 +++

But with the FIPS openssl installed, modprobe calls getrandom(), and BUG/Oops errors occur on the next boot:

$ sudo apt install libssl1.1 # updates initramfs

$ dpkg -s libssl1.1 | grep ^Version:
Version: 1.1.1f-1ubuntu2.fips.13.1

$ strace -e getrandom modprobe --version
getrandom("\xc4\x84\x26\x25\x6f\xd4\xed\x38\xdf\xa9\x67\xee\x15\x1c\xe3\x98", 16, GRND_NONBLOCK) = 16
getrandom("\xa1\xd6\x67\x3e\xe4\x90\xb3\x8b\xdf\xe6\x34\x2a\xa7\x50\xbc\x2f", 16, GRND_NONBLOCK) = 16
getrandom("\xf1\x3e\xe4\x27\x9d\x47\x8c\x4b\x8a\x39\x8c\xe1\x2e\xee\xfa\x45", 16, GRND_NONBLOCK) = 16
getrandom("\xfb\x34\x18\x44\xd8\x23\x4c\x87\x13\x2e\x6b\x03\x79\xa7\x99\xf8", 16, 0) = 16
getrandom("\xdd\x83\xa7\x02\x10\x51\x2b\x4f\x21\x6b\xc1\xf1\x0d\xe7\x44\xb7", 16, 0) = 16
kmod version 27
+ZSTD +XZ -ZLIB +LIBCRYPTO -EXPERIMENTAL
+++ exited with 0 +++

$ sudo reboot

$ dmesg | grep -c BUG:
22

$ dmesg
...
[ 1.595759] NET: Registered protocol family 10
[ 1.600256] BUG: kernel NULL pointer dereference, address: 0000000000000038
...
[ 1.603829] CPU: 2 PID: 137 Comm: modprobe Not tainted 5.4.0-1056-fips #64-Ubuntu
[ 1.603829] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[ 1.603829] RIP: 0010:urandom_read+0x268/0x480
...
[ 1.603829] Call Trace:
[ 1.603829] __x64_sys_getrandom+0x7f/0x130
[ 1.603829] do_syscall_64+0x57/0x190
[ 1.603829] entry_SYSCALL_64_after_hwframe+0x44/0xa9
...

All BUG/Oops errors are the same:

$ dmesg | grep BUG: | sed 's/^.*BUG:/BUG:/' | uniq -c
     22 BUG: kernel NULL pointer dereference, address: 0000000000000038

And they stop after the DRBG is initialized:

[ 3.651566] random: DRBG (drbg_nopr_ctr_aes256) initialized!
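
The claim that the errors stop once the DRBG is initialized can be checked mechanically against a saved dmesg log. The following is a sketch under assumptions: it relies on the "[ seconds] message" timestamp format and the "random: DRBG (...) initialized!" line shown above, and the function name is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print how many BUG: lines were logged
# after the "random: DRBG (...) initialized!" message.
bugs_after_drbg_init() {
    awk '
        # Parse the leading "[ seconds.micros]" timestamp of the line.
        match($0, /^\[ *[0-9]+\.[0-9]+\]/) {
            ts = substr($0, 2, RLENGTH - 2) + 0
        }
        # Remember when the DRBG reported that it was initialized.
        /random: DRBG \(.*\) initialized!/ { init = ts; seen = 1 }
        # Count BUG lines with a later timestamp than that message.
        /BUG:/ && seen && ts > init { late++ }
        END { print late + 0 }
    ' "$1"
}
```

A result of 0 for a log from the affected kernel would be consistent with the observation above; any other value would mean BUG/Oops errors continued after initialization.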

[Fix Impact Analysis]

The patch adds a dynamic debug message that can be enabled
on the kernel cmdline, so the dmesg logs can be compared; we
can also compare /proc/crypto to confirm there are no changes.

dyndbg="func urandom_read +p"

Also add break=top, so we can copy /proc/crypto and dmesg
right after DRBG is initialized (when initramfs is started).

@ break=top time

 suffix=original # or modified
 cat /proc/crypto > /run/initramfs/proc-crypto.$suffix
 dmesg > /run/initramfs/dmesg.$suffix
 exit

@ login time
 sudo -s
 cp /run/initramfs/{proc-crypto.*,dmesg.*} .
 reboot # next test

There's no difference in the list/details of loaded crypto algorithms at all, with any combination:

# md5sum proc-crypto.*
0b91bd619078fa342c6b4da039cb1582 proc-crypto.modified
0b91bd619078fa342c6b4da039cb1582 proc-crypto.original
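
The same comparison can be scripted so that it fails on any difference. A minimal sketch, assuming two /proc/crypto snapshots saved as described above; the function name is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: compare two saved /proc/crypto snapshots.
compare_proc_crypto() {
    # cmp -s is silent and exits 0 only if the files are byte-identical.
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
        # Show the first differences to help spot changed algorithms.
        diff -u "$1" "$2" | head -n 20
        return 1
    fi
}
```

Usage:

   # compare_proc_crypto proc-crypto.original proc-crypto.modified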

The kernel with the fix does not hit BUG/Oops errors:

# grep ^ -m1 dmesg.*
dmesg.modified:[ 0.000000] Linux version 5.4.0-1060-fips ... #68+fipsdrbgnullcheck ...
dmesg.original:[ 0.000000] Linux version 5.4.0-1060-fips ... #68-Ubuntu ...

# grep -c BUG: dmesg.*
dmesg.modified:0
dmesg.original:22

# grep -c 'random: DRBG uninitialized! crng fallback' dmesg.*
dmesg.modified:110
dmesg.original:0

The count of 110 is 5 * 22: each modprobe run calls getrandom() 5 times (see strace above).
The original kernel logs only 1 BUG/Oops per modprobe run because the first one kills the modprobe task.
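
The arithmetic behind that count can be reproduced from saved logs. This is a sketch only; the helper names are hypothetical, and it assumes the strace output was written to a file (for example with strace's -o option):

```shell
#!/bin/sh
# Hypothetical helper: count getrandom() calls in saved strace output.
count_getrandom_calls() {
    # grep -c prints 0 but exits non-zero when nothing matches; ignore that status.
    grep -c '^getrandom(' "$1" || true
}

# Hypothetical helper: expected number of fallback messages, given the
# getrandom() calls per modprobe run and the number of modprobe runs.
expected_fallbacks() {
    echo $(( $1 * $2 ))
}
```

With the figures from this report (5 calls per run, 22 runs), `expected_fallbacks 5 22` prints 110, matching the number of 'crng fallback' messages in dmesg.modified.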

summary: - kernel BUG/Oops errors from modprobe in early boot while the DRBG is not
+ kernel BUG/Oops errors from modprobe while the DRBG has not yet
initialized (focal/fips-updates)
information type: Private → Public
Changed in linux (Ubuntu Jammy):
status: New → Invalid
Changed in linux (Ubuntu Bionic):
status: New → Invalid
Changed in linux (Ubuntu Focal):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Mauricio Faria de Oliveira (mfo)
description: updated
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1981487

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Mauricio Faria de Oliveira (mfo) wrote :

[Focal/FIPS][PATCH 0/4] Fix kernel BUG/Oops errors from modprobe while the DRBG has not yet initialized (focal/fips-updates)

Changed in linux (Ubuntu):
status: Incomplete → Invalid
description: updated
Mauricio Faria de Oliveira (mfo) wrote :

For reference purposes, I'm attaching a dmesg log with debugging boot options to:

- print initcalls entry/return
- print calls to __request_module() while the FIPS DRBG is uninitialized

Method / boot options:

 fips=1
 random.fips_urandom_drbg_crypto_noload=0
 random.fips_urandom_drbg_modprobe_panic=0
 "dyndbg=func __request_module +p"
 initcall_debug
 break=top

List of modules/aliases requested to be loaded (regardless of CONFIG=y|m) while the FIPS DRBG is uninitialized, per conversation with Marcelo Cerri.

$ cat dmesg.initramfs-break-top.log | grep '__request_module()' | grep -v -- 'crypto-.*-all$' | cut -d: -f4- | sort | uniq -c
      1 crypto-cbc(aes)
      1 crypto-cryptd(__cbc-aes-aesni)
      1 crypto-cryptd(__ctr-aes-aesni)
      1 crypto-cryptd(__ecb-aes-aesni)
      1 crypto-cryptd(__generic-gcm-aesni)
      1 crypto-cryptd(__rfc4106-gcm-aesni)
      1 crypto-cryptd(__xts-aes-aesni)
      2 crypto-ctr(aes)
      1 crypto-gcm(aes)
      1 crypto-hmac(sha1)
      2 crypto-hmac(sha256)
      1 crypto-pkcs1pad(rsa,sha512)

This is the gist of it:

@ dmesg.initramfs-break-top.log

[ 0.000000] Command line: <...> fips=1 random.fips_urandom_drbg_crypto_noload=0 random.fips_urandom_drbg_modprobe_panic=0 "dyndbg=func __request_module +p" initcall_debug break=top
...
[ 1.244744] calling populate_rootfs+0x0/0x110 @ 1
[ 1.245245] Trying to unpack rootfs image as initramfs...
[ 1.893124] Freeing initrd memory: 82992K
[ 1.893689] initcall populate_rootfs+0x0/0x110 returned 0 after 633268 usecs
...
[ 2.672049] calling inet6_init+0x0/0x39a @ 1
[ 2.673434] FIPS DRBG uninitialized: __request_module(): module: crypto-hmac(sha1)
[ 2.677101] BUG: kernel NULL pointer dereference, address: 0000000000000038
[ 2.711252] FIPS DRBG uninitialized: __request_module(): module: crypto-hmac(sha1)-all
[ 2.715082] BUG: kernel NULL pointer dereference, address: 0000000000000038
[ 2.756484] FIPS DRBG uninitialized: __request_module(): module: crypto-hmac(sha256)
[ 2.759045] BUG: kernel NULL pointer dereference, address: 0000000000000038
[ 2.793932] FIPS DRBG uninitialized: __request_module(): module: crypto-hmac(sha256)-all
[ 2.797236] BUG: kernel NULL pointer dereference, address: 0000000000000038
[ 2.833370] initcall inet6_init+0x0/0x39a returned 0 after 157136 usecs
...
[ 2.864104] calling aesni_init+0x0/0x135 @ 1
[ 2.866739] FIPS DRBG uninitialized: __request_module(): module: crypto-cryptd(__ecb-aes-aesni)
[ 2.870655] BUG: kernel NULL pointer dereference, address: 0000000000000038
[ 2.909789] FIPS DRBG uninitialized: __request_module(): module: crypto-cryptd(__ecb-aes-aesni)-all
[ 2.912948] BUG: kernel NULL pointer dereference, address: 0000000000000038
[ 2.950626] FIPS DRBG uninitialized: __request_module(): module: crypto-cryptd(__cbc-aes-aesni)
[ 2.954469] BUG: kernel NULL pointer dereference, address: 0000000000000038
[ 2.989342] FIPS DRBG uninitialized: __request_module(): module: crypto-cryptd(__cbc-aes-aesni)-all
[ 2.992103] BUG: kernel NULL pointer dereference, address: 0000000000000038
[ 3.021153] FIPS DRBG uninitialized: __request_module(): module: crypto-cr...


description: updated
description: updated
Mauricio Faria de Oliveira (mfo) wrote :

The patchset in comment #2
  [Focal/FIPS][PATCH 0/4] Fix kernel BUG/Oops errors from modprobe while the DRBG has not yet initialized (focal/fips-updates)

is now superseded by patch:
  [Focal/FIPS][PATCH] UBUNTU: SAUCE: random: fallback to CRNG while DRBG is not yet initialized

Mauricio Faria de Oliveira (mfo) wrote :

[Focal/FIPS][PATCH v2] UBUNTU: SAUCE: random: fallback to CRNG while DRBG is not yet initialized
(code-style change)

Changed in linux (Ubuntu Focal):
status: In Progress → Fix Committed
Cory Todd (corytodd) wrote :

Due to upstream stable changes to the random number generator, the patch created for this bug has been ported.

original
commit b8d9a2d0c69e ("UBUNTU: SAUCE: random: fallback to CRNG while DRBG is not yet initialized")

port
commit 58ce102c669e ("UBUNTU: SAUCE: random: fallback to CRNG while DRBG is not yet initialized")

Mauricio Faria de Oliveira (mfo) wrote :

Marking this bug as Fix Released:

The mentioned patch was applied in 5.4.0-1061.69,
and is present in 5.4.0-1068.77 (in fips-updates).

Mauricio Faria de Oliveira (mfo) wrote :

$ git log --oneline -2 Ubuntu-fips-5.4.0-1060.68..Ubuntu-fips-5.4.0-1061.69 -- drivers/char/random.c
58ce102c669e UBUNTU: SAUCE: random: fallback to CRNG while DRBG is not yet initialized
9579b4bc92c5 UBUNTU: SAUCE: random: Use Crypto API DRBG for urandom in FIPS mode

$ git log --oneline -2 Ubuntu-fips-5.4.0-1068.77 -- drivers/char/random.c
03fc91a3c155 UBUNTU: SAUCE: random: fallback to CRNG while DRBG is not yet initialized
c85e8a2b046b UBUNTU: SAUCE: random: Use Crypto API DRBG for urandom in FIPS mode

Changed in linux (Ubuntu Focal):
status: Fix Committed → Fix Released
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-azure-fips/5.15.0-1049.56+fips1 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy-linux-azure-fips' to 'verification-done-jammy-linux-azure-fips'. If the problem still exists, change the tag 'verification-needed-jammy-linux-azure-fips' to 'verification-failed-jammy-linux-azure-fips'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-azure-fips-v2 verification-needed-jammy-linux-azure-fips
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-gcp-fips/5.15.0-1048.56+fips1 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy-linux-gcp-fips' to 'verification-done-jammy-linux-gcp-fips'. If the problem still exists, change the tag 'verification-needed-jammy-linux-gcp-fips' to 'verification-failed-jammy-linux-gcp-fips'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-gcp-fips-v2 verification-needed-jammy-linux-gcp-fips