Comment 5 for bug 1983436

Revision history for this message
saltf1sh (saltf1sh) wrote :

@sdeziel,thanks for your prompt. Today I have reproduced the bug in kernel version 5.4 and kernel version 5.15 respectively. The experiment proves that the bug still exists.
I use this command:
"./syz-execprog -executor=./syz-executor -repeat=0 -procs=8 -cover=0 -threaded=0 -collide=0 ./SyzReproRcu".
Crash can still be triggered in the kernel.
You can download CReproducerRcu.c. Locally through "gcc CReproducerRcu.c -o CReproducerRcu" gets the file CReproducerRcu. Run "./CReproducerRcu" locally and wait for about one minute. You can see the message "rcu: INFO: rcu_sched self-detected stall on CPU". I'm not sure if this is a bug.
You can use vm_5dot4.log and vm_5dot15.log to view each step of my operation in QEMU. According to stack trace, the problem may be rcu_sched_clock_irq? I'm not sure which file in the kernel corresponds to the bug.
======================================================
The following are links related to kernel 5.4:
Description: rcu: INFO: rcu_sched self-detected stall on CPU
Kernel version: 5.4
Kernel config: https://github.com/LancyRiver/SoftLockup/blob/master/rcu_core_si/v5.4/.config
C reproducer: https://github.com/LancyRiver/SoftLockup/blob/master/rcu_core_si/v5.4/CReproducerRcu.c
Crash log: https://github.com/LancyRiver/SoftLockup/blob/master/rcu_core_si/v5.4/CrashLog.txt
Syz repro: https://github.com/LancyRiver/SoftLockup/blob/master/rcu_core_si/v5.4/SyzReproRcu
Vm.log: https://github.com/LancyRiver/SoftLockup/blob/master/rcu_core_si/v5.4/vm_5dot4.log

Crash log:
[ 65.129421] audit: type=1400 audit(1661428172.369:15): avc: denied { execmem } for pid=360 comm="CReproducerRcu" scontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=process permissive=1
[ 79.840127] hrtimer: interrupt took 14052 ns
[ 104.359596] rcu: INFO: rcu_sched self-detected stall on CPU
[ 104.368030] rcu: 0-....: (20995 ticks this GP) idle=eee/1/0x4000000000000004 softirq=3077/3077 fqs=4694
[ 104.373374] (t=21000 jiffies g=4473 q=220481)
[ 104.375926] NMI backtrace for cpu 0
[ 104.378133] CPU: 0 PID: 12886 Comm: CReproducerRcu Not tainted 5.4.192 #1
[ 104.381887] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[ 104.387042] Call Trace:
[ 104.388557] <IRQ>
[ 104.390225] dump_stack+0x95/0xc8
[ 104.392112] nmi_cpu_backtrace.cold+0x55/0x94
[ 104.394758] ? lapic_can_unplug_cpu+0x70/0x70
[ 104.397442] nmi_trigger_cpumask_backtrace+0x15a/0x1b0
[ 104.400354] rcu_dump_cpu_stacks+0x15d/0x1a7
[ 104.402749] rcu_sched_clock_irq.cold+0x4c8/0x90b
[ 104.405510] ? hrtimer_run_queues+0x1d/0x310
[ 104.407876] update_process_times+0x24/0x60
[ 104.410279] tick_sched_handle+0x10f/0x150
[ 104.412622] tick_sched_timer+0x41/0x120
[ 104.414804] __hrtimer_run_queues+0x308/0x7c0
[ 104.417218] ? tick_sched_do_timer+0x160/0x160
[ 104.419750] ? enqueue_hrtimer+0x230/0x230
[ 104.422074] ? kvm_clock_get_cycles+0xd/0x10
[ 104.424437] ? ktime_get_update_offsets_now+0x17d/0x250
[ 104.427363] hrtimer_interrupt+0x2c9/0x6c0
[ 104.429693] smp_apic_timer_interrupt+0xd4/0x380
[ 104.432226] apic_timer_interrupt+0xf/0x20
[ 104.434517] RIP: 0010:clocksource_watchdog+0xa/0x8b0
[ 104.437264] Code: 48 c7 c7 48 ed 37 bd e8 34 11 2e 00 e9 74 ff ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 41 57 41 56 41 55 41 54 55 53 <48> 83 ec 50 e8 9d 30 0b 00 48 c7 c7 40 7d ca bd e8 91 e5 22 02 8b
[ 104.447427] RSP: 0018:ffff88806d209d88 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
[ 104.451622] RAX: ffffffffb9e945c0 RBX: ffff88806d209e80 RCX: ffffffffb9e7fe19
[ 104.455952] RDX: 0000000000000100 RSI: ffffffffb9e945c0 RDI: ffffffffbdca7d80
[ 104.460191] RBP: 0000000000000100 R08: 0000000000000001 R09: ffffed100da45aae
[ 104.464058] R10: ffffed100da45aad R11: ffff88806d22d56f R12: ffffffffbdca7d80
[ 104.467990] R13: 00000000fffcaa29 R14: 00000000fffcaa29 R15: ffff88806d22d540
[ 104.471933] ? apic_timer_interrupt+0xa/0x20
[ 104.474346] ? __clocksource_unstable+0x170/0x170
[ 104.476956] ? call_timer_fn+0x39/0x2b0
[ 104.479082] ? __clocksource_unstable+0x170/0x170
[ 104.481732] call_timer_fn+0x45/0x2b0
[ 104.483781] ? __clocksource_unstable+0x170/0x170
[ 104.486381] ? __clocksource_unstable+0x170/0x170
[ 104.488998] run_timer_softirq+0x53b/0x1230
[ 104.491289] ? call_timer_fn+0x2b0/0x2b0
[ 104.493461] ? apic_timer_interrupt+0xa/0x20
[ 104.495802] __do_softirq+0x17c/0x627
[ 104.497954] irq_exit+0x114/0x140
[ 104.499798] smp_apic_timer_interrupt+0xde/0x380
[ 104.502340] apic_timer_interrupt+0xf/0x20
[ 104.504607] </IRQ>
======================================================
The following are links related to kernel 5.15:
Description: rcu: INFO: rcu_sched self-detected stall on CPU
Kernel version: hwe-5.15
Kernel config: https://github.com/LancyRiver/SoftLockup/blob/master/rcu_core_si/v5.15/.config
C reproducer: https://github.com/LancyRiver/SoftLockup/blob/master/rcu_core_si/v5.15/CReproducerRcu.c
Crash log: https://github.com/LancyRiver/SoftLockup/blob/master/rcu_core_si/v5.15/CrashLog.txt
Syz repro: https://github.com/LancyRiver/SoftLockup/blob/master/rcu_core_si/v5.15/SyzReproRcu
Vm.log: https://github.com/LancyRiver/SoftLockup/blob/master/rcu_core_si/v5.15/vm_5dot15.log

Crash log:
[ 115.490784] audit: type=1400 audit(1661428486.901:15): avc: denied { execmem } for pid=281 comm="CReproducerRcu" scontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=process permissive=1
[ 132.604057] hrtimer: interrupt took 13982 ns
[ 136.826220] rcu: INFO: rcu_sched self-detected stall on CPU
[ 136.828843] rcu: 0-....: (1 GPs behind) idle=ef3/1/0x4000000000000000 softirq=2418/2419 fqs=5209
[ 136.832988] (t=21004 jiffies g=3289 q=201640)
[ 136.835107] NMI backtrace for cpu 0
[ 136.836798] CPU: 0 PID: 291 Comm: CReproducerRcu Not tainted 5.15.39 #1
[ 136.839721] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[ 136.843880] Call Trace:
[ 136.845158] <IRQ>
[ 136.846144] dump_stack_lvl+0x4d/0x66
[ 136.848247] nmi_cpu_backtrace.cold+0xd0/0xd5
[ 136.850270] ? _raw_write_lock_irqsave+0xd0/0xd0
[ 136.852437] ? lapic_can_unplug_cpu+0x70/0x70
[ 136.854520] nmi_trigger_cpumask_backtrace+0x15a/0x1b0
[ 136.856961] rcu_dump_cpu_stacks+0x1b8/0x202
[ 136.858971] rcu_sched_clock_irq.cold+0x1d2/0x417
[ 136.860977] ? hrtimer_run_queues+0x1d/0x310
[ 136.863037] update_process_times+0x11b/0x180
[ 136.865024] ? tick_sched_handle.isra.0+0x150/0x150
[ 136.867320] tick_sched_handle.isra.0+0x105/0x150
[ 136.869365] tick_sched_timer+0xca/0xf0
[ 136.871140] __hrtimer_run_queues+0x2e2/0x6a0
[ 136.873114] ? enqueue_hrtimer+0x1c0/0x1c0
[ 136.874967] ? kvm_clock_get_cycles+0xd/0x10
[ 136.876857] ? ktime_get_update_offsets_now+0x17f/0x240
[ 136.879252] hrtimer_interrupt+0x2c9/0x6c0
[ 136.881099] __sysvec_apic_timer_interrupt+0xc8/0x260
[ 136.883367] sysvec_apic_timer_interrupt+0x65/0x90
[ 136.885456] </IRQ>
[ 136.886389] <TASK>
[ 136.887369] asm_sysvec_apic_timer_interrupt+0x12/0x20
[ 136.889669] RIP: 0010:task_function_call+0xf8/0x140
[ 136.891991] Code: 44 8b 64 24 38 e8 e8 bd f3 ff 41 83 fc f5 74 b0 e8 dd bd f3 ff 48 b8 00 00 00 00 00 fc ff df 49 01 c5 41 c7 45 00 00 00 00 00 <41> c7 45 08 00 00 00 00 48 8b 44 24 78 65 48 33 04 25 28 00 00 00
[ 136.900280] RSP: 0018:ffff888006cd7c30 EFLAGS: 00000282
[ 136.902629] RAX: dffffc0000000000 RBX: ffff888005bbba00 RCX: ffffffffac847503
[ 136.905834] RDX: ffff888005bbba00 RSI: 0000000000000000 RDI: ffff88800372b008
[ 136.908889] RBP: ffffed1000b77749 R08: ffffffffb0c86018 R09: ffffffffb0c8600f
[ 136.932729] R10: ffffffffb0c86017 R11: ffffffffb0c8601b R12: 0000000000000000
[ 136.936527] R13: ffffed1000d9af86 R14: ffff888005bbba48 R15: dffffc0000000000
[ 136.940063] ? task_function_call+0xe3/0x140
[ 136.942475] ? perf_event_addr_filters_exec+0x300/0x300
[ 136.944767] ? ctx_resched+0x220/0x220
[ 136.946522] ? exclusive_event_installable+0x1a1/0x220
[ 136.948839] perf_install_in_context+0x25f/0x540
[ 136.951008] ? perf_group_attach+0x380/0x380
[ 136.952979] ? mutex_lock+0x89/0xd0
[ 136.954676] ? __mutex_lock_slowpath+0x10/0x10
[ 136.956706] ? exclusive_event_installable+0x1a1/0x220
[ 136.959052] __do_sys_perf_event_open+0x15ba/0x2340
[ 136.961345] ? perf_remove_from_context+0x1a0/0x1a0
[ 136.963591] ? kasan_unpoison+0x23/0x50
[ 136.965394] ? fpregs_assert_state_consistent+0x74/0xb0
[ 136.967753] ? exit_to_user_mode_prepare+0x2f/0x150
[ 136.969969] do_syscall_64+0x3b/0x90
[ 136.971778] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 136.974181] RIP: 0033:0x7f2fcb723469
[ 136.977080] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ff 49 2b 00 f7 d8 64 89 01 48
[ 136.985256] RSP: 002b:00007ffeb8cd4c88 EFLAGS: 00000246 ORIG_RAX: 000000000000012a
[ 136.988673] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2fcb723469
[ 136.991908] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000020000080
[ 136.995134] RBP: 00007ffeb8cd4ca0 R08: 0000000000000000 R09: 00007ffeb8cd4ca0
[ 136.998343] R10: 00000000ffffffff R11: 0000000000000246 R12: 000056542ba00ce0
[ 137.001573] R13: 00007ffeb8cd4dc0 R14: 0000000000000000 R15: 0000000000000000
[ 137.004773] </TASK>
======================================================