impish:linux 5.13 panic during systemd autotest

Bug #1946001 reported by Andrea Righi
This bug affects 2 people
Affects                      Status         Importance   Assigned to   Milestone
linux-aws (Ubuntu)           Fix Released   Critical     Unassigned
  Impish                     Fix Released   Critical     Unassigned
  Jammy                      New            Undecided    Unassigned
  Kinetic                    Fix Released   Critical     Unassigned
linux-intel-iotg (Ubuntu)    New            Undecided    Unassigned
  Impish                     New            Undecided    Unassigned
  Jammy                      New            Undecided    Unassigned
  Kinetic                    New            Undecided    Unassigned

Bug Description

Found this when running the systemd autopkgtest on linux-aws 5.13.0-1004.5 (it apparently affects nested KVM only):

systemd-testsuite login: [ 70.235559] int3: 0000 [#1] SMP NOPTI
[ 70.237824] CPU: 0 PID: 326 Comm: systemd-journal Not tainted 5.13.0-1004-aws #5-Ubuntu
[ 70.237852] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
[ 70.237864] RIP: 0010:kmem_cache_alloc+0x57/0x240
[ 70.237875] Code: 08 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 48 c7 45 c8 00 00 00 00 e8 d6 bb ff ff 49 89 c4 48 85 c0 0f 84 cd 00 00 00 cc <0a> 7
[ 70.237994] RSP: 0018:ffffaabd001cfd78 EFLAGS: 00000286
[ 70.239655] RAX: ffff97e8411da400 RBX: 0000000000001000 RCX: 0000000000000400
[ 70.239670] RDX: 0000000000000001 RSI: 0000000000000cc0 RDI: ffff97e8411da400
[ 70.239679] RBP: ffffaabd001cfdb8 R08: 0000000000000000 R09: 0000000000009802
[ 70.239688] R10: 0000000000000000 R11: 0000000000000000 R12: ffff97e8411da400
[ 70.239696] R13: ffff97e8411da400 R14: 0000000000000cc0 R15: ffffffffb2737cf0
[ 70.239705] FS: 00007fc6fc243380(0000) GS:ffff97e85c000000(0000) knlGS:0000000000000000
[ 70.239713] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 70.239720] CR2: 00007fc6f9249000 CR3: 0000000003c5c000 CR4: 00000000000006f0
[ 70.239727] Call Trace:
[ 70.239733] getname_flags.part.0+0x30/0x1b0
[ 70.239741] getname+0x35/0x50
[ 70.239746] do_sys_openat2+0x64/0x150
[ 70.239753] __x64_sys_openat+0x55/0x90
[ 70.239759] do_syscall_64+0x61/0xb0
[ 70.239766] ? do_syscall_64+0x6e/0xb0
[ 70.239772] ? do_sync_core+0x26/0x30
[ 70.239779] ? flush_smp_call_function_queue+0x119/0x190
[ 70.239786] ? exit_to_user_mode_prepare+0x37/0xb0
[ 70.239793] ? irqentry_exit_to_user_mode+0x9/0x20
[ 70.239800] ? irqentry_exit+0x19/0x30
[ 70.239807] ? sysvec_call_function+0x4e/0x90
[ 70.239813] ? asm_sysvec_call_function+0xa/0x20
[ 70.239819] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 70.239826] RIP: 0033:0x7fc6fca946e4
[ 70.239834] Code: 24 20 eb 8f 66 90 44 89 54 24 0c e8 16 d2 f7 ff 44 8b 54 24 0c 44 89 e2 48 89 ee 41 89 c0 bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 4
[ 70.239845] RSP: 002b:00007ffc9699f1a0 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
[ 70.240063] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc6fca946e4
[ 70.240075] RDX: 0000000000080802 RSI: 0000556e8d5885d0 RDI: 00000000ffffff9c
[ 70.240085] RBP: 0000556e8d5885d0 R08: 0000000000000000 R09: ffffffffffffffff
[ 70.240095] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000080802
[ 70.240104] R13: 00000000ffffffff R14: 0000556e8d59dd90 R15: 00000000fffffffa
[ 70.240113] Modules linked in: btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raidy
[ 70.281001] ---[ end trace 044cf87b8c867a36 ]---
[ 70.281099] RIP: 0010:kmem_cache_alloc+0x57/0x240
[ 70.281107] Code: 08 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 48 c7 45 c8 00 00 00 00 e8 d6 bb ff ff 49 89 c4 48 85 c0 0f 84 cd 00 00 00 cc <0a> 7
[ 70.281115] RSP: 0018:ffffaabd001cfd78 EFLAGS: 00000286
[ 70.281132] RAX: ffff97e8411da400 RBX: 0000000000001000 RCX: 0000000000000400
[ 70.281138] RDX: 0000000000000001 RSI: 0000000000000cc0 RDI: ffff97e8411da400
[ 70.281143] RBP: ffffaabd001cfdb8 R08: 0000000000000000 R09: 0000000000009802
[ 70.281148] R10: 0000000000000000 R11: 0000000000000000 R12: ffff97e8411da400
[ 70.281154] R13: ffff97e8411da400 R14: 0000000000000cc0 R15: ffffffffb2737cf0
[ 70.281159] FS: 00007fc6fc243380(0000) GS:ffff97e85c000000(0000) knlGS:0000000000000000
[ 70.281173] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 70.281178] CR2: 00007fc6f9249000 CR3: 0000000003c5c000 CR4: 00000000000006f0
[ 70.281183] Kernel panic - not syncing: Fatal exception in interrupt
[ 70.282418] Kernel Offset: 0x31400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
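For anyone matching addresses from this trace against an unrelocated vmlinux, the "Kernel Offset" line gives the KASLR slide to subtract. The following is only a minimal sketch of that arithmetic, using the offset and the R15 value from the log above. The helper names are made up for illustration, and whether R15 actually points at a useful symbol is an assumption; it is used here only because it falls inside the reported relocation range.

```python
# Values copied from the panic log above.
KERNEL_OFFSET = 0x31400000           # "Kernel Offset: 0x31400000"
LINK_BASE = 0xFFFFFFFF81000000       # "... from 0xffffffff81000000"
RELOC_RANGE = (0xFFFFFFFF80000000, 0xFFFFFFFFBFFFFFFF)


def in_relocation_range(addr: int) -> bool:
    """True if addr lies inside the relocation range printed by the kernel."""
    low, high = RELOC_RANGE
    return low <= addr <= high


def unrelocate(runtime_addr: int, offset: int = KERNEL_OFFSET) -> int:
    """Map a runtime kernel address back to its link-time address."""
    return runtime_addr - offset


# Runtime base of the kernel image for this boot.
runtime_base = LINK_BASE + KERNEL_OFFSET

# R15 from the register dump, used purely as an example input.
r15 = 0xFFFFFFFFB2737CF0
if in_relocation_range(r15):
    print(hex(unrelocate(r15)))
```

The resulting link-time address can then be looked up in the symbol table of the matching debug kernel, if one is available.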

Andrea Righi (arighi)
affects: linux-oem-5.6 (Ubuntu) → linux-aws (Ubuntu)
description: updated
Andrea Righi (arighi) wrote :

Update: disabling CONFIG_KFENCE_STATIC_KEYS prevents this problem from happening, so we may consider disabling this option as a temporary workaround, or even disabling KFENCE entirely, since it is mostly a debugging feature (a low-overhead out-of-bounds / use-after-free / invalid-free memory detector).
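To check how a given kernel was built with respect to these options, something like the sketch below could be run against the kernel config text. It only assumes the usual kernel config syntax (`CONFIG_FOO=y` and `# CONFIG_FOO is not set`). The function name and the sample text are hypothetical and are not taken from any of the affected kernels.

```python
import re


def kfence_options(config_text: str) -> dict:
    """Collect CONFIG_KFENCE* settings from kernel config text.

    Options reported as "is not set" are mapped to "n".
    """
    opts = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        enabled = re.fullmatch(r"(CONFIG_KFENCE\w*)=(.*)", line)
        if enabled:
            opts[enabled.group(1)] = enabled.group(2)
            continue
        disabled = re.fullmatch(r"# (CONFIG_KFENCE\w*) is not set", line)
        if disabled:
            opts[disabled.group(1)] = "n"
    return opts


# Illustrative sample only, not the config of any kernel named in this bug.
SAMPLE = """\
CONFIG_HAVE_ARCH_KFENCE=y
CONFIG_KFENCE=y
# CONFIG_KFENCE_STATIC_KEYS is not set
CONFIG_KFENCE_SAMPLE_INTERVAL=0
"""

for name, value in kfence_options(SAMPLE).items():
    print(f"{name} = {value}")
```

On an Ubuntu system the text would normally come from the config file shipped under /boot for the running kernel; reading that file is left out here to keep the sketch self-contained.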

Changed in linux-aws (Ubuntu Impish):
milestone: none → ubuntu-21.10
tags: added: rls-ff-incoming
Andrea Righi (arighi) wrote :

Update: apparently disabling CONFIG_KFENCE can only reduce the probability of triggering the bug, so it's not a reliable fix. In all the traces that we have collected, the soft lockup always happens in a SLUB allocation function (__kmalloc, kmem_cache_alloc, and similar), with the instruction pointer on a static branch. In SLUB, static branches are used only by KFENCE, hence the test of disabling KFENCE.
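One way to sanity-check the static-branch observation on a given trace is to look at the "Code:" dump. The byte in angle brackets marks the saved instruction pointer, and since int3 is a trap, the 0xcc opcode would be expected just before it. The sketch below is only illustrative: the helper name is made up, and the byte string is the (truncated) Code line from the bug description, of which only the part up to the marker is used.

```python
def bytes_around_rip(code_dump: str):
    """Return (byte_before_rip, byte_at_rip) from an oops "Code:" dump.

    The token wrapped in <..> marks the byte at the saved instruction
    pointer. Returns None if no such marker is present.
    """
    tokens = code_dump.split()
    for i, tok in enumerate(tokens):
        if tok.startswith("<") and tok.endswith(">"):
            at_rip = int(tok[1:-1], 16)
            before = int(tokens[i - 1], 16) if i > 0 else None
            return before, at_rip
    return None


# Code bytes copied from the trace in the bug description (tail is truncated there).
CODE = (
    "08 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 48 c7 45 c8 "
    "00 00 00 00 e8 d6 bb ff ff 49 89 c4 48 85 c0 0f 84 cd 00 00 00 cc <0a> 7"
)

INT3_OPCODE = 0xCC
before, at_rip = bytes_around_rip(CODE)
print(f"byte before RIP: {before:#04x}, is int3: {before == INT3_OPCODE}")
```

For this trace the byte before the marker is 0xcc, which is consistent with an int3 breakpoint sitting at a code-patching site, as static branch updates on x86 use. This is an interpretation of the dump, not something the kernel log states directly.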

summary: - impish:linux-aws 5.13 panic during systemd autotest
+ impish:linux 5.13 panic during systemd autotest
Changed in linux-aws (Ubuntu Impish):
importance: Undecided → Critical
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-aws - 5.13.0-1005.6

---------------
linux-aws (5.13.0-1005.6) impish; urgency=medium

  * impish/linux-aws: 5.13.0-1005.6 -proposed tracker (LP: #1946328)

  * linux-tools-aws package does not contain libperf-jvmti.so (LP: #1944754)
    - [Packaging] aws: Support building libperf-jvmti.so

  * Miscellaneous Ubuntu changes
    - [Config] aws: update configs and annotations after rebase

  [ Ubuntu: 5.13.0-19.19 ]

  * impish/linux: 5.13.0-19.19 -proposed tracker (LP: #1946337)
  * impish:linux-aws 5.13 panic during systemd autotest (LP: #1946001)
    - [Config] disable KFENCE

  [ Ubuntu: 5.13.0-18.18 ]

  * impish/linux: 5.13.0-18.18 -proposed tracker (LP: #1945995)
  * [21.10 FEAT] KVM: Use interpretation of specification exceptions
    (LP: #1932157)
    - KVM: s390: Enable specification exception interpretation

 -- Andrea Righi <email address hidden> Fri, 08 Oct 2021 08:16:10 +0200

Changed in linux-aws (Ubuntu Impish):
status: New → Fix Released
no longer affects: ubuntu-release-notes
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-hwe-5.13/5.13.0-19.19~20.04.1 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-focal
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-raspi/5.13.0-1010.11 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-impish' to 'verification-done-impish'. If the problem still exists, change the tag 'verification-needed-impish' to 'verification-failed-impish'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-impish
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-intel-iotg/5.15.0-1006.8 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification-done-jammy'. If the problem still exists, change the tag 'verification-needed-jammy' to 'verification-failed-jammy'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-jammy