BUG: soft lockup - CPU#0 stuck for 36s! rcu_core_si kernel/rcu/tree.c:2807

Bug #1983436 reported by saltf1sh
This bug affects 1 person
Affects                  Status  Importance  Assigned to  Milestone
linux-hwe-5.13 (Ubuntu)  New     Undecided   Unassigned
linux-hwe-5.15 (Ubuntu)  New     Undecided   Unassigned

Bug Description

We would like to report the following bug, which was found by our modified version of syzkaller.
rcu_core_si in kernel/rcu/tree.c:2807 in the Linux kernel through 5.13 allows attackers to cause a denial of service (soft lockup) via a large number of different function calls.
description: BUG: soft lockup in rcu_core_si
affected file: kernel/rcu/tree.c
kernel version: 5.13
The kernel config, syzkaller reproducer, and raw console output are all in the attachments.
======================================================
Crash log:
======================================================
watchdog: BUG: soft lockup - CPU#0 stuck for 36s! [syz-executor.6:14479]
Modules linked in:
CPU: 0 PID: 14479 Comm: syz-executor.6 Not tainted 5.13.19+ #6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
RIP: 0010:cred_label security/apparmor/include/cred.h:27 [inline]
RIP: 0010:apparmor_cred_free+0x5f/0x1a0 security/apparmor/lsm.c:69
Code: 01 00 00 48 63 1d a1 fd 4d 02 49 03 5c 24 78 48 b8 00 00 00 00 00 fc ff df 48 89 da 48 c1 ea 03 80 3c 02 00 0f 85 08 01 00 00 <4c> 8b 2b 4d 85 ed 74 68 e8 74 53 4b ff be 04 00 00 00 4c 89 ef bb
RSP: 0018:ffff888056609dc8 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: ffff888005c8fc80 RCX: ffffffff967eb4fd
RDX: 1ffff11000b91f90 RSI: 0000000000000100 RDI: ffff888005821000
RBP: ffff888056609de8 R08: 0000000000000001 R09: ffffed1000b04201
R10: ffff888005821003 R11: ffffed1000b04200 R12: ffff888005821000
R13: ffff888005821000 R14: ffff888005821078 R15: ffff888007ba8000
FS: 00007f5f81a40700(0000) GS:ffff888056600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffe8252bb80 CR3: 0000000003cf6006 CR4: 0000000000770ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 <IRQ>
 security_cred_free+0x83/0x130 security/security.c:1881
 put_cred_rcu+0x71/0x360 kernel/cred.c:115
 rcu_do_batch kernel/rcu/tree.c:2559 [inline]
 rcu_core+0x536/0x12f0 kernel/rcu/tree.c:2794
 rcu_core_si+0xe/0x10 kernel/rcu/tree.c:2807
 __do_softirq+0x187/0x576 kernel/softirq.c:559
 invoke_softirq kernel/softirq.c:433 [inline]
 __irq_exit_rcu kernel/softirq.c:637 [inline]
 irq_exit_rcu+0x120/0x150 kernel/softirq.c:649
 sysvec_apic_timer_interrupt+0x79/0x90 arch/x86/kernel/apic/apic.c:1100
 </IRQ>
 <TASK>
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:638
RIP: 0010:0xffffffffc01e0801
Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 <55> 48 89 e5 53 41 55 31 c0 45 31 ed 48 89 fb b8 ff ff ff 7f 41 5d
RSP: 0018:ffff888006ec7d58 EFLAGS: 00000246
RAX: ffffffffc01e07fc RBX: 000000007fff0000 RCX: ffffffff95ccef6a
RDX: ffff888007ba8000 RSI: ffffc90000763048 RDI: ffff888006ec7e10
RBP: ffff888006ec7eb8 R08: 0000000000000001 R09: ffffed1005374c87
R10: ffff888029ba6437 R11: ffffed1005374c86 R12: ffff888006ec7e10
R13: ffff888029ba6400 R14: ffffc90000763000 R15: dffffc0000000000
 </TASK>
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 PID: 27 Comm: khungtaskd Not tainted 5.13.19+ #6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
RIP: 0010:check_hung_uninterruptible_tasks kernel/hung_task.c:190 [inline]
RIP: 0010:watchdog+0x1e1/0xa60 kernel/hung_task.c:294
Code: 45 a8 e8 e2 74 fd ff 49 8d 87 40 03 00 00 48 b9 00 00 00 00 00 fc ff df 48 89 45 c0 48 c1 e8 03 80 3c 08 00 0f 85 dd 07 00 00 <49> 8b 9f 40 03 00 00 48 be 00 00 00 00 00 fc ff df 4c 8d 63 10 4c
RSP: 0000:ffff888001d77ea0 EFLAGS: 00010246
RAX: 1ffff11000dea98b RBX: ffff88800669b630 RCX: dffffc0000000000
RDX: ffff888001d6a080 RSI: 0000000000000000 RDI: ffff888005e4c918
RBP: ffff888001d77f00 R08: 0000000000000001 R09: fffffbfff34354d9
R10: ffffffff9a1aa6c7 R11: fffffbfff34354d8 R12: ffff88800669c010
R13: 00000000003fff85 R14: 0000000100021a4b R15: ffff888006f54918
FS: 0000000000000000(0000) GS:ffff888056700000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffcc3c83ce8 CR3: 0000000004310004 CR4: 0000000000770ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 <TASK>
 kthread+0x352/0x430 kernel/kthread.c:319
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
 </TASK>
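
The call-trace frames above follow the usual "symbol+offset/size file:line" layout, with inlined frames carrying only a file and line. A minimal sketch of a parser for that layout is shown below; the helper name `parse_frame` and the regular expression are illustrative assumptions, not part of any kernel or syzkaller tooling.

```python
import re

# Hypothetical helper: matches frames such as
#   "rcu_core+0x536/0x12f0 kernel/rcu/tree.c:2794"
#   "rcu_do_batch kernel/rcu/tree.c:2559 [inline]"
#   "[ 104.390225] dump_stack+0x95/0xc8"
FRAME_RE = re.compile(
    r"^\s*(?:\[\s*\d+\.\d+\]\s*)?"      # optional dmesg timestamp
    r"(?:\? )?"                          # optional "unreliable frame" marker
    r"(?P<symbol>[\w.]+)"
    r"(?:\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+))?"
    r"(?:\s+(?P<file>\S+):(?P<line>\d+))?"
    r"(?P<inline>\s+\[inline\])?\s*$"
)


def parse_frame(text):
    """Return a dict describing one call-trace frame, or None if not a frame."""
    m = FRAME_RE.match(text)
    if m is None:
        return None
    # A real frame has at least an offset or a source location.
    if m.group("offset") is None and m.group("file") is None:
        return None
    return {
        "symbol": m.group("symbol"),
        "offset": int(m.group("offset"), 16) if m.group("offset") else None,
        "size": int(m.group("size"), 16) if m.group("size") else None,
        "file": m.group("file"),
        "line": int(m.group("line")) if m.group("line") else None,
        "inline": m.group("inline") is not None,
    }


if __name__ == "__main__":
    print(parse_frame(" rcu_core_si+0xe/0x10 kernel/rcu/tree.c:2807"))
```

Marker lines such as "<IRQ>", "Call Trace:" and the register dumps do not match and yield None, so the function can be mapped over a whole log to keep only the frames.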
----------------
Code disassembly (best guess):
   0: 01 00 add %eax,(%rax)
   2: 00 48 63 add %cl,0x63(%rax)
   5: 1d a1 fd 4d 02 sbb $0x24dfda1,%eax
   a: 49 03 5c 24 78 add 0x78(%r12),%rbx
   f: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax
  16: fc ff df
  19: 48 89 da mov %rbx,%rdx
  1c: 48 c1 ea 03 shr $0x3,%rdx
  20: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1)
  24: 0f 85 08 01 00 00 jne 0x132
* 2a: 4c 8b 2b mov (%rbx),%r13 <-- trapping instruction
  2d: 4d 85 ed test %r13,%r13
  30: 74 68 je 0x9a
  32: e8 74 53 4b ff callq 0xff4b53ab
  37: be 04 00 00 00 mov $0x4,%esi
  3c: 4c 89 ef mov %r13,%rdi
  3f: bb .byte 0xbb
--
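
The instructions just before the marked load (shift right by 3, index from 0xdffffc0000000000, compare a byte with zero) look like the shadow-memory check that generic KASAN emits on x86_64 before a memory access. That reading is an assumption based on the constants in the disassembly, not something the report states. Under that assumption, the arithmetic can be checked against the register dump with a small sketch (the function name is hypothetical):

```python
# Assumed constants: taken from the movabs operand and the "shr $0x3" in the
# disassembly above, interpreted as the KASAN shadow offset and scale shift.
KASAN_SHADOW_OFFSET = 0xDFFFFC0000000000
KASAN_SHADOW_SCALE_SHIFT = 3
MASK64 = (1 << 64) - 1


def kasan_shadow_addr(addr):
    """Map a kernel address to the shadow byte that describes it."""
    return ((addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET) & MASK64


if __name__ == "__main__":
    rbx = 0xFFFF888005C8FC80  # RBX from the CPU 0 register dump
    print(hex(rbx >> KASAN_SHADOW_SCALE_SHIFT))  # should equal RDX in the dump
    print(hex(kasan_shadow_addr(rbx)))
```

The shifted value matches RDX (1ffff11000b91f90) in the CPU 0 dump, which is consistent with the CPU having been interrupted in ordinary instrumented code inside apparmor_cred_free rather than at a faulting access.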

Tags: lockup soft
saltf1sh (saltf1sh) wrote :
affects: linux (Ubuntu) → linux-hwe-5.13 (Ubuntu)
saltf1sh (saltf1sh)
information type: Private Security → Public Security
Simon Déziel (sdeziel) wrote :

@saltf1sh, thanks for reporting this. However, the 5.13 kernel is no longer supported (since July 2022). Are you able to reproduce the problem on a kernel version that's still supported (like 5.4 or 5.15)?

saltf1sh (saltf1sh) wrote :

@sdeziel, thanks for your reply. Today I reproduced the bug on kernel 5.4 and on kernel 5.15, so the bug still exists in both versions.
I used this command:
"./syz-execprog -executor=./syz-executor -repeat=0 -procs=8 -cover=0 -threaded=0 -collide=0 ./SyzReproRcu".
The crash can still be triggered in the kernel.
You can also download CReproducerRcu.c and build it locally with "gcc CReproducerRcu.c -o CReproducerRcu" to get the CReproducerRcu binary. Run "./CReproducerRcu" and wait about one minute; the message "rcu: INFO: rcu_sched self-detected stall on CPU" should then appear. I'm not sure whether this is a bug.
You can use vm_5dot4.log and vm_5dot15.log to view each step of my operation in QEMU. According to the stack trace, the problem may be in rcu_sched_clock_irq, but I'm not sure which kernel source file corresponds to the bug.
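
When repeating these steps, the console output can be checked for the two messages quoted in this report. The following is a minimal sketch of such a check; the function name and pattern table are hypothetical, and the lines are assumed to come from "dmesg" or a saved QEMU console log.

```python
import re

# Patterns built from the two messages quoted in this report.
PATTERNS = {
    "soft_lockup": re.compile(r"BUG: soft lockup - CPU#(\d+) stuck for (\d+)s!"),
    "rcu_stall": re.compile(r"rcu: INFO: (\w+) self-detected stall on CPU"),
}


def scan_console_log(lines):
    """Return (line_number, kind, text) for every watchdog or RCU stall hit."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for kind, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((number, kind, line.strip()))
    return hits


if __name__ == "__main__":
    sample = [
        "[ 79.840127] hrtimer: interrupt took 14052 ns",
        "[ 104.359596] rcu: INFO: rcu_sched self-detected stall on CPU",
    ]
    for hit in scan_console_log(sample):
        print(hit)
```

An empty result after the one-minute wait would suggest the reproducer did not trigger on that kernel build, although it would not rule out other failure messages.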
======================================================
The following are links related to kernel 5.4:
Description: rcu: INFO: rcu_sched self-detected stall on CPU
Kernel version: 5.4
Kernel config: https://github.com/LancyRiver/SoftLockup/blob/master/rcu_core_si/v5.4/.config
C reproducer: https://github.com/LancyRiver/SoftLockup/blob/master/rcu_core_si/v5.4/CReproducerRcu.c
Crash log: https://github.com/LancyRiver/SoftLockup/blob/master/rcu_core_si/v5.4/CrashLog.txt
Syz repro: https://github.com/LancyRiver/SoftLockup/blob/master/rcu_core_si/v5.4/SyzReproRcu
Vm.log: https://github.com/LancyRiver/SoftLockup/blob/master/rcu_core_si/v5.4/vm_5dot4.log

Crash log:
[ 65.129421] audit: type=1400 audit(1661428172.369:15): avc: denied { execmem } for pid=360 comm="CReproducerRcu" scontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=process permissive=1
[ 79.840127] hrtimer: interrupt took 14052 ns
[ 104.359596] rcu: INFO: rcu_sched self-detected stall on CPU
[ 104.368030] rcu: 0-....: (20995 ticks this GP) idle=eee/1/0x4000000000000004 softirq=3077/3077 fqs=4694
[ 104.373374] (t=21000 jiffies g=4473 q=220481)
[ 104.375926] NMI backtrace for cpu 0
[ 104.378133] CPU: 0 PID: 12886 Comm: CReproducerRcu Not tainted 5.4.192 #1
[ 104.381887] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[ 104.387042] Call Trace:
[ 104.388557] <IRQ>
[ 104.390225] dump_stack+0x95/0xc8
[ 104.392112] nmi_cpu_backtrace.cold+0x55/0x94
[ 104.394758] ? lapic_can_unplug_cpu+0x70/0x70
[ 104.397442] nmi_trigger_cpumask_backtrace+0x15a/0x1b0
[ 104.400354] rcu_dump_cpu_stacks+0x15d/0x1a7
[ 104.402749] rcu_sched_clock_irq.cold+0x4c8/0x90b
[ 104.405510] ? hrtimer_run_queues+0x1d/0x310
[ 104.407876] update_process_times+0x24/0x60
[ 104.410279] tick_sched_handle+0x10f/0x150
[ 104.412622] tick_sched_timer+0x41/0x120
[ 104.414804] __hrtimer_run_queues+0x308/0x7c0
[ 104.417218] ? tick_sched_do_timer+0x160/0x160
[ 104.419750] ? enqueue_hrtimer+0x230/0x230
[ 104.422074] ? kvm_clock_get_cycles+0xd/0x10
[ 104.424437] ? ktime_get_update_offsets_now+0x17d/0x250
[ 104.427363] hrtimer_interrupt+0x2c9/0x6c0
[ 104.429693] smp_apic_timer_interrupt+0xd4/0x380
[ 104.432226] apic_timer_interrupt+0xf/0...
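
The summary line "(t=21000 jiffies g=4473 q=220481)" in the log above can be split into its fields. As far as I understand the kernel's RCU stall-warning documentation, t is the number of jiffies since the grace period started, g is the grace-period number, and q is the number of RCU callbacks queued across all CPUs. The sketch below is illustrative only: the helper name is hypothetical, and converting jiffies to seconds needs the HZ value from the linked .config, which is not shown here, so the 1000 used in the example is purely an assumption.

```python
import re

STALL_RE = re.compile(r"\(t=(?P<t>\d+) jiffies g=(?P<g>\d+) q=(?P<q>\d+)\)")


def parse_stall_summary(line, hz=None):
    """Extract t/g/q from an RCU stall summary line; None if it does not match.

    hz is the kernel tick rate (CONFIG_HZ). When it is not given, no
    conversion to seconds is attempted.
    """
    m = STALL_RE.search(line)
    if m is None:
        return None
    info = {key: int(value) for key, value in m.groupdict().items()}
    info["seconds"] = info["t"] / hz if hz else None
    return info


if __name__ == "__main__":
    line = "[ 104.373374] (t=21000 jiffies g=4473 q=220481)"
    # hz=1000 is an assumed value, not taken from the report.
    print(parse_stall_summary(line, hz=1000))
```

A q value in the hundreds of thousands would be consistent with the original trace, where the CPU was busy draining put_cred_rcu callbacks in rcu_do_batch.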
