ftrace:test.d--ftrace--func_traceonoff_triggers.tc in ubuntu_kselftests_ftrace triggers kernel NULL pointer dereference on node blanka
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
ubuntu-kernel-tests | New | Undecided | Unassigned | | |
Bug Description
Issue found on node "blanka" with Focal 5.4.0-190.210 in s2024.06.10
After some manual tests I noticed that this issue is not 100% reproducible:
* Focal 5.4.0-189.209 + -189 source failed in 5 out of 5 attempts
* Focal 5.4.0-190.210 + -190 source failed in 2 out of 5 attempts
* Focal 5.4.0-192.212 + -192 source failed in 3 out of 5 attempts
Despite this high failure rate, I can't find this failure in our test history going back to 2024.01.08 (except with 5.4.0-190.210). Perhaps it was retested and passed?
$ sudo ./ftracetest -v test.d/
=== Ftrace unit tests ===
[1] ftrace - test for function traceon/off triggers
dmesg output:
[ 7112.186092] Scheduler tracepoints stat_sleep, stat_iowait, stat_blocked and stat_runtime require the kernel parameter schedstats=enable or kernel.
[ 7128.566260] BUG: kernel NULL pointer dereference, address: 0000000000000050
[ 7128.574031] #PF: supervisor read access in kernel mode
[ 7128.579763] #PF: error_code(0x0000) - not-present page
[ 7128.585495] PGD 0 P4D 0
[ 7128.588320] Oops: 0000 [#1] SMP NOPTI
[ 7128.592405] CPU: 129 PID: 0 Comm: swapper/129 Tainted: G OE 5.4.0-190-generic #210-Ubuntu
[ 7128.602887] Hardware name: NVIDIA DGXA100 920-23687-
[ 7128.612119] RIP: 0010:trace_
[ 7128.618820] Code: 59 80 e5 02 0f 85 8f 00 00 00 4c 89 e6 ba 34 00 00 00 48 8d 7d a0 e8 b0 ab c9 ff 49 89 c4 48 85 c0 74 37 49 8b 87 b8 03 00 00 <48> 8b 70 50 48 85 f6 74 45 49 8d 7c 24 08 ba 20 00 00 00 e8 c9 12
[ 7128.639774] RSP: 0018:ffffb1779b
[ 7128.645604] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000080000100
[ 7128.653565] RDX: ffff8fe7ae7052f8 RSI: 0000000000000100 RDI: ffff8fe7ae7052f4
[ 7128.661525] RBP: ffffb1779b140e08 R08: ffff8fe7ae7052f4 R09: 0000000000000100
[ 7128.669486] R10: ffffd1777fdcb960 R11: 0000000000000000 R12: ffff8fe7ae7052f8
[ 7128.677445] R13: 00000000ffffffff R14: 0000000000000002 R15: ffff9047fe8f5000
[ 7128.685409] FS: 000000000000000
[ 7128.694437] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7128.700847] CR2: 0000000000000050 CR3: 000000cb2760a000 CR4: 0000000000340ee0
[ 7128.708809] Call Trace:
[ 7128.711535] <IRQ>
[ 7128.713782] ? show_regs.
[ 7128.718058] ? __die+0x90/0xd9
[ 7128.721467] ? no_context+
[ 7128.725555] ? ring_buffer_
[ 7128.730996] ? ring_buffer_
[ 7128.736436] ? __bad_area_
[ 7128.741585] ? bad_area_
[ 7128.746445] ? do_user_
[ 7128.751306] ? trace_event_
[ 7128.757718] ? __do_page_
[ 7128.762092] ? do_page_
[ 7128.766276] ? page_fault+
[ 7128.770167] ? trace_event_
[ 7128.776190] wb_timer_
[ 7128.780179] ? blk_mq_
[ 7128.785522] blk_stat_
[ 7128.790094] call_timer_
[ 7128.794179] __run_timers.
[ 7128.798945] ? trace_event_
[ 7128.804677] run_timer_
[ 7128.809053] __do_softirq+
[ 7128.813041] irq_exit+0xae/0xb0
[ 7128.816545] smp_apic_
[ 7128.821696] apic_timer_
[ 7128.826264] </IRQ>
[ 7128.828603] RIP: 0010:native_
[ 7128.833655] Code: 7b ff ff ff eb bd 90 90 90 90 90 90 e9 07 00 00 00 0f 00 2d a6 14 50 00 f4 c3 66 90 e9 07 00 00 00 0f 00 2d 96 14 50 00 fb f4 <c3> 90 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 53 e8 ed 46 61 ff 65
[ 7128.854607] RSP: 0018:ffffb17799
[ 7128.863053] RAX: 0000000000023800 RBX: ffff90676cadbf28 RCX: 000000000003514a
[ 7128.871013] RDX: 000000000003514a RSI: 0000000000000000 RDI: ffffffffb06c6120
[ 7128.878974] RBP: ffffb177992fbe90 R08: 00000000000001ac R09: 0000000000000000
[ 7128.886936] R10: 0000000000020000 R11: 0000000000000002 R12: 0000000000000081
[ 7128.894895] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 7128.902858] ? default_
[ 7128.907040] arch_cpu_
[ 7128.911027] default_
[ 7128.915403] do_idle+0x1fb/0x270
[ 7128.919004] ? complete+0x49/0x50
[ 7128.922699] cpu_startup_
[ 7128.927074] start_secondary
[ 7128.931452] secondary_
[ 7128.936116] Modules linked in: nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua amd64_edac_mod edac_mce_amd kvm_amd kvm ipmi_ssif input_leds binfmt_misc mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) ccp k10temp ipmi_si ipmi_devintf ipmi_msghandler mac_hid sch_fq_codel msr ramoops reed_solomon efi_pstore ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ses enclosure ast crct10dif_pclmul crc32_pclmul drm_vram_helper ghash_clmulni_intel mlx5_core(OE) ttm aesni_intel crypto_simd drm_kms_helper pci_hyperv_intf mlxdevm(OE) cryptd syscopyarea sysfillrect auxiliary(OE) glue_helper igb tls sysimgblt uas hid_generic mpt3sas mlxfw(OE) psample dca raid_class usbhid fb_sys_fops scsi_transport_sas nvme i2c_algo_bit usb_storage hid drm mlx_compat(OE) nvme_core i2c_piix4
[ 7129.023659] CR2: 0000000000000050
[ 7129.027471] ---[ end trace 5d27e00102fa9701 ]---
[ 7129.052391] RIP: 0010:trace_
[ 7129.059093] Code: 59 80 e5 02 0f 85 8f 00 00 00 4c 89 e6 ba 34 00 00 00 48 8d 7d a0 e8 b0 ab c9 ff 49 89 c4 48 85 c0 74 37 49 8b 87 b8 03 00 00 <48> 8b 70 50 48 85 f6 74 45 49 8d 7c 24 08 ba 20 00 00 00 e8 c9 12
[ 7129.080046] RSP: 0018:ffffb1779b
[ 7129.085875] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000080000100
[ 7129.093835] RDX: ffff8fe7ae7052f8 RSI: 0000000000000100 RDI: ffff8fe7ae7052f4
[ 7129.101795] RBP: ffffb1779b140e08 R08: ffff8fe7ae7052f4 R09: 0000000000000100
[ 7129.109757] R10: ffffd1777fdcb960 R11: 0000000000000000 R12: ffff8fe7ae7052f8
[ 7129.117718] R13: 00000000ffffffff R14: 0000000000000002 R15: ffff9047fe8f5000
[ 7129.125680] FS: 000000000000000
[ 7129.134706] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7129.141115] CR2: 0000000000000050 CR3: 000000cb2760a000 CR4: 0000000000340ee0
[ 7129.149075] Kernel panic - not syncing: Fatal exception in interrupt
[ 7129.158266] Kernel Offset: 0x2dc00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000
[ 7129.193426] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
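As a cross-check of the oops above, the faulting instruction can be decoded by hand. In the `Code:` line the byte in angle brackets (`<48>`) marks the start of the instruction at RIP. The following is only a minimal sketch: `decode_mov_load` is a hypothetical helper written for this note, and it handles just the simple `REX.W 8B /r` load form without a SIB byte, which is enough for the two instructions of interest here.

```python
# Register names indexed by their 4-bit x86-64 encoding.
REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
        "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def decode_mov_load(code: bytes):
    """Decode a 64-bit `mov reg, [base + disp]` (REX.W 8B /r, no SIB byte).

    Hypothetical helper for illustration; returns (dest, base, disp).
    """
    rex, opcode, modrm = code[0], code[1], code[2]
    if rex & 0xF8 != 0x48 or opcode != 0x8B:
        raise ValueError("not a REX.W MOV r64, r/m64")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod == 3 or rm == 4 or (mod == 0 and rm == 5):
        raise ValueError("addressing form not handled by this sketch")
    reg |= (rex & 4) << 1  # REX.R extends the destination register
    rm |= (rex & 1) << 3   # REX.B extends the base register
    size = {0: 0, 1: 1, 2: 4}[mod]  # displacement width in bytes
    disp = int.from_bytes(code[3:3 + size], "little", signed=True)
    return REGS[reg], REGS[rm], disp


# Bytes taken from the "Code:" line of the oops.
faulting = decode_mov_load(bytes.fromhex("488b7050"))
previous = decode_mov_load(bytes.fromhex("498b87b8030000"))
print(faulting)  # ('rsi', 'rax', 80)
print(previous)  # ('rax', 'r15', 952)
```

The faulting instruction reads from RAX + 0x50, and the register dump shows RAX = 0, which matches CR2 = 0x50. The instruction just before it loads RAX from R15 + 0x3b8, so the pointer stored there appears to have been NULL at that moment. Which structure and field this corresponds to is not identified here.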
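A quick bisect shows that the failure was introduced between 5.4.0-181 and 5.4.0-182. It is very likely a test case issue, because the result follows the version of the source code rather than the version of the running kernel: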
* 5.4.0-181 + 181 source code - OK
* 5.4.0-182 + 182 source code - NOT OK
* 5.4.0-181 + 182 source code - NOT OK
* 5.4.0-182 + 181 source code - OK
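The reasoning behind the four combinations above can be sketched as a small check. This is only an illustration: the `results` table restates the list, and the helper name `explains` is made up for this note. It asks which single component (kernel or test source) fully determines the outcome.

```python
# (kernel build, test source build) -> did the run pass?
results = {
    (181, 181): True,
    (182, 182): False,
    (181, 182): False,
    (182, 181): True,
}


def explains(results, index):
    """Return True if the outcome depends only on the component at `index`."""
    seen = {}
    for combo, passed in results.items():
        # The same component version must always give the same outcome.
        if seen.setdefault(combo[index], passed) != passed:
            return False
    # The component must also distinguish passing from failing runs.
    return len(set(seen.values())) > 1


culprits = [name
            for index, name in enumerate(("kernel", "test source"))
            if explains(results, index)]
print(culprits)  # ['test source']
```

Since the failure is intermittent on this node, a single OK run is weaker evidence than a NOT OK run, so the OK combinations may be worth repeating a few times.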
The next step is to check what changed in the test case between these two versions.