adt bpf tests crash 5.4.0-7 on ppc64el on power box
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
linux (Ubuntu) | In Progress | High | Colin Ian King | |
Bug Description
Running the ADT tests on a power box, the bpf tests crash the kernel as follows:
[ 2745.079592] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
[ 2745.079808] Faulting instruction address: 0x00000000
[ 2745.079824] Oops: Kernel access of bad area, sig: 11 [#1]
[ 2745.079993] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
[ 2745.080011] Modules linked in: af_packet_diag tcp_diag udp_diag raw_diag inet_diag binfmt_misc dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua joydev input_leds mac_hid ofpart cmdlinepart powernv_flash mtd ibmpowernv at24 uio_pdrv_genirq uio ipmi_powernv ipmi_devintf ipmi_msghandler opal_prd powernv_rng vmx_crypto sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid ast drm_vram_helper i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops crct10dif_vpmsum crc32c_vpmsum drm tg3 ahci libahci drm_panel_
tifier_
[ 2745.080195] CPU: 0 PID: 1111366 Comm: reuseport_bpf_c Not tainted 5.4.0-7-generic #8
[ 2745.080214] NIP: 0000000000000000 LR: c000000000ce8710 CTR: 0000000000000000
[ 2745.080233] REGS: c0000007ff6eb550 TRAP: 0400 Not tainted (5.4.0-7-generic)
[ 2745.080250] MSR: 9000000040009033 <SF,HV,
[ 2745.080272] CFAR: c00000000000de44 IRQMASK: 0
[ 2745.080272] GPR00: c000000000d67c9c c0000007ff6eb7e0 c000000001a5bf00 c0000004258e10e0
[ 2745.080272] GPR04: c008000002830038 c0000004258e10e0 0000000000000028 000000000000e3c2
[ 2745.080272] GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 2745.080272] GPR12: 0000000000000000 c000000001cf0000 0000000000000000 0000000000000001
[ 2745.080272] GPR16: 00000000000022b8 000000000100007f 000000000000e3c2 000000000100007f
[ 2745.080272] GPR20: c00000000198c100 0000000000000000 0000000000000000 00000000000022b8
[ 2745.080272] GPR24: 0000000000000000 0000000000000028 0000000000000080 000000000100007f
[ 2745.080272] GPR28: c008000002830000 0000000018ed5e01 c0000004258e10e0 c00000075f0ff000
[ 2745.080409] NIP [0000000000000000] 0x0
[ 2745.080423] LR [c000000000ce8710] reuseport_
[ 2745.080439] Call Trace:
[ 2745.080448] [c0000007ff6eb7e0] [c0000007ff6eb8a0] 0xc0000007ff6eb8a0 (unreliable)
[ 2745.080469] [c0000007ff6eb880] [c000000000d67c9c] inet_lhash2_
[ 2745.080490] [c0000007ff6eb900] [c000000000d6849c] __inet_
[ 2745.080509] [c0000007ff6eb9d0] [c000000000d96608] tcp_v4_
[ 2745.080527] [c0000007ff6ebb00] [c000000000d5a480] ip_protocol_
[ 2745.080547] [c0000007ff6ebb50] [c000000000d5a740] ip_local_
[ 2745.080566] [c0000007ff6ebb70] [c000000000d5a7ec] ip_local_
[ 2745.080585] [c0000007ff6ebbe0] [c000000000d59aec] ip_rcv_
[ 2745.080602] [c0000007ff6ebc20] [c000000000d5a9a0] ip_rcv+0x100/0x110
[ 2745.080619] [c0000007ff6ebca0] [c000000000cab220] __netif_
[ 2745.080638] [c0000007ff6ebce0] [c000000000cac4f0] process_
[ 2745.080657] [c0000007ff6ebd50] [c000000000cadc68] net_rx_
[ 2745.080674] [c0000007ff6ebe70] [c000000000ee2a7c] __do_softirq+
[ 2745.080692] [c0000007ff6ebf90] [c000000000030678] call_do_
[ 2745.080709] [c00000070656f7c0] [c00000000001bf58] do_softirq_
[ 2745.080729] [c00000070656f7e0] [c000000000143d60] do_softirq.
[ 2745.080914] [c00000070656f810] [c000000000143e54] __local_
[ 2745.080933] [c00000070656f830] [c000000000d5f8fc] ip_finish_
[ 2745.080953] [c00000070656f8d0] [c000000000d61fe4] ip_output+
[ 2745.080971] [c00000070656f960] [c000000000d61444] ip_local_
[ 2745.080988] [c00000070656f9a0] [c000000000d61838] __ip_queue_
[ 2745.081007] [c00000070656fa30] [c000000000d90a3c] ip_queue_
[ 2745.081024] [c00000070656fa50] [c000000000d887e4] __tcp_transmit_
[ 2745.081044] [c00000070656fb00] [c000000000d89a88] tcp_connect+
[ 2745.081060] [c00000070656fbb0] [c000000000d93148] tcp_v4_
[ 2745.082755] [c00000070656fc40] [c000000000db876c] __inet_
[ 2745.084563] [c00000070656fcf0] [c000000000db8b5c] inet_stream_
[ 2745.085528] [c00000070656fd30] [c000000000c75dec] __sys_connect+
[ 2745.086424] [c00000070656fe00] [c000000000c75e58] sys_connect+
[ 2745.087343] [c00000070656fe20] [c00000000000b278] system_
[ 2745.089157] Instruction dump:
[ 2745.089169] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[ 2745.090048] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[ 2745.096394] ---[ end trace d347ca85a257c66f ]---
[ 2745.208020]
[ 2746.208219] Kernel panic - not syncing: Aiee, killing interrupt handler!
[ 2746.316857] Rebooting in 10 seconds..
[ 2796.226294116,5] OPAL: Reboot request...
The final ADT test output recorded was:
17:03:13 DEBUG| [stdout] # ---- IPv6 TCP ----
17:03:13 DEBUG| [stdout] # Testing EBPF mod 10...
17:03:13 DEBUG| [stdout] # Socket 0: 0
17:03:13 DEBUG| [stdout] # Socket 1: 1
17:03:13 DEBUG| [stdout] # Socket 2: 2
17:03:13 DEBUG| [stdout] # Socket 3: 3
... etc ...
17:03:13 DEBUG| [stdout] # Socket 4: 4
17:03:13 DEBUG| [stdout] # Socket 5: 5
17:03:13 DEBUG| [stdout] # Socket 9: 19
17:03:13 DEBUG| [stdout] # Reprograming, testing mod 5...
17:03:13 DEBUG| [stdout] # Socket 0: 0
...
17:03:13 DEBUG| [stdout] # Socket 3: 18
17:03:13 DEBUG| [stdout] # Socket 4: 19
...
17:03:13 DEBUG| [stdout] # Testing CBPF mod 10...
17:03:13 DEBUG| [stdout] # Socket 0: 0
...
17:03:13 DEBUG| [stdout] # Reprograming, testing mod 5...
17:03:13 DEBUG| [stdout] # Socket 0: 0
17:03:13 DEBUG| [stdout] # Socket 1: 1
...
17:03:13 DEBUG| [stdout] # Socket 4: 19
17:03:13 DEBUG| [stdout] # Testing too many filters...
17:03:13 DEBUG| [stdout] # Testing filters on non-SO_REUSEPORT socket...
17:03:13 DEBUG| [stdout] # ---- IPv6 TCP w/ mapped IPv4 ----
17:03:13 DEBUG| [stdout] # Testing EBPF mod 10...
17:03:13 DEBUG| [stdout] # Socket 0: 0
17:03:13 DEBUG| [stdout] # Socket 1: 1
...
17:03:13 DEBUG| [stdout] # Reprograming, testing mod 5...
17:03:13 DEBUG| [stdout] # Socket 0: 0
17:03:13 DEBUG| [stdout] # Socket 1: 1
...
17:03:13 DEBUG| [stdout] # Testing CBPF mod 10...
17:03:13 DEBUG| [stdout] # Socket 0: 0
17:03:13 DEBUG| [stdout] # Socket 1: 1
...
17:03:13 DEBUG| [stdout] # Reprograming, testing mod 5...
17:03:13 DEBUG| [stdout] # Socket 0: 0
17:03:13 DEBUG| [stdout] # Socket 1: 1
...
17:03:13 DEBUG| [stdout] # Testing filter add without bind...
17:03:13 DEBUG| [stdout] # SUCCESS
17:03:13 DEBUG| [stdout] ok 1 selftests: net: reuseport_bpf
17:03:13 DEBUG| [stdout] # selftests: net: reuseport_bpf_cpu
17:03:13 DEBUG| [stdout] # ---- IPv4 UDP ----
17:03:13 DEBUG| [stdout] # send cpu 0, receive socket 0
17:03:13 DEBUG| [stdout] # send cpu 1, receive socket 1
...
17:03:13 DEBUG| [stdout] # send cpu 125, receive socket 125
17:03:13 DEBUG| [stdout] # send cpu 127, receive socket 127
17:03:13 DEBUG| [stdout] # ---- IPv4 TCP ----
[ end of output as the machine panicked ]
...so the crash occurred around or shortly after this point. I'll re-run this with the ipmi tool attached to the console to see how far the test got before the kernel panicked.
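For comparing console captures across re-runs, a small helper along these lines may be handy. It is only a sketch: the function name `summarize_oops` and the `TRAP_NAMES` table are my own, not part of any ADT or kernel tooling, and the trap descriptions are my reading of the Power interrupt vectors rather than something taken from this report. It pulls NIP, LR, CTR, TRAP and Comm out of an oops register dump and flags a NULL instruction pointer.

```python
import re

# Assumption: descriptions for the TRAP vectors seen in this report, based on
# my reading of the Power interrupt vector offsets. Extend as needed.
TRAP_NAMES = {
    "0400": "instruction storage interrupt (bad instruction fetch)",
    "0e40": "hypervisor emulation assistance (illegal instruction)",
}

TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")
REG = re.compile(r"\b(NIP|LR|CTR|TRAP):\s*([0-9a-fA-F]+)")
COMM = re.compile(r"Comm:\s*(\S+)")


def summarize_oops(log_text):
    """Collect NIP, LR, CTR, TRAP and Comm from the first oops in log_text."""
    summary = {}
    for raw in log_text.splitlines():
        # Drop the "[ seconds.micros]" prefix that printk adds to each line.
        line = TIMESTAMP.sub("", raw)
        for name, value in REG.findall(line):
            # setdefault keeps the first occurrence, i.e. the first oops.
            summary.setdefault(name, value.lower())
        match = COMM.search(line)
        if match:
            summary.setdefault("Comm", match.group(1))
    if "NIP" in summary:
        summary["nip_is_null"] = int(summary["NIP"], 16) == 0
    if "TRAP" in summary:
        summary["trap_name"] = TRAP_NAMES.get(summary["TRAP"], "unknown")
    return summary


# Three lines copied from the first oops in this report.
SAMPLE = """\
[ 2745.080195] CPU: 0 PID: 1111366 Comm: reuseport_bpf_c Not tainted 5.4.0-7-generic #8
[ 2745.080214] NIP: 0000000000000000 LR: c000000000ce8710 CTR: 0000000000000000
[ 2745.080233] REGS: c0000007ff6eb550 TRAP: 0400 Not tainted (5.4.0-7-generic)
"""

if __name__ == "__main__":
    for key, value in summarize_oops(SAMPLE).items():
        print(f"{key}: {value}")
```

On the lines quoted above this reports a NULL NIP with TRAP 0400 in `reuseport_bpf_c`, which matches the "Unable to handle kernel instruction fetch (NULL pointer?)" message.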
Changed in linux (Ubuntu):
assignee: nobody → Colin Ian King (colin-king)
status: New → In Progress
importance: Undecided → High
17:59:24 DEBUG| [stdout] # send cpu 63, receive socket 63
17:59:24 DEBUG| [stdout] # send cpu 65, receive socket 65
17:59:24 DEBUG| [stdout] # send cpu 67, receive socket 67
17:59:24 DEBUG| [stdout] # send cpu 69, receive socket 69
17:59:24 DEBUG| [stdout] # send cpu 71, receive socket 71
17:59:24 DEBUG| [stdout] # send cpu 73, receive socket 73
[ 3269.552837] test_bpf: #0 TAX jited:1
[ 3269.552885] Oops: Exception in kernel mode, sig: 4 [#1]
[ 3269.552916] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
[ 3269.552928] Modules linked in: test_bpf(+) tls af_packet_diag tcp_diag udp_diag raw_diag inet_diag binfmt_misc dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua joydev input_leds mac_hid ofpart cmdlinepart powernv_flash mtd at24 opal_prd uio_pdrv_genirq uio ipmi_powernv ipmi_devintf ipmi_msghandler ibmpowernv vmx_crypto powernv_rng sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid ast drm_vram_helper i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops crct10dif_vpmsum crc32c_vpmsum drm ahci tg3 libahci drm_panel_orientation_quirks [last unloaded: notifier_error_inject]
[ 3269.847244] CPU: 55 PID: 1111137 Comm: modprobe Not tainted 5.4.0-7-generic #8
[ 3269.926547] NIP: c0080000029f80b4 LR: c00800000465106c CTR: c0080000029f80b4
[ 3269.927427] REGS: c000000712eb3410 TRAP: 0e40 Not tainted (5.4.0-7-generic)
[ 3269.928286] MSR: 900000000288b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 28222422 XER: 20000000
[ 3270.036372] CFAR: c00000000000de44 IRQMASK: 0
[ 3270.036372] GPR00: c008000004651044 c000000712eb36a0 c00800000465dd00 c000000415ee1600
[ 3270.036372] GPR04: c008000002850038 ffffffffffffffff 0000000001f401dc 00025a599268f4d4
[ 3270.036372] GPR08: 0000000000000018 0000018acb48de01 000000000018f194 c008000004651ac0
[ 3270.036372] GPR12: c0080000029f80b4 c0000007ff741c80 0000000000000008 000000000000007b
[ 3270.036372] GPR16: ffff00081234aaab 0000000000000241 000000000000024c 20c49ba5e353f7cf
[ 3270.036372] GPR20: c000000415ee1600 c008000004656dc9 c008000004656e74 00000000000003e8
[ 3270.036372] GPR24: c008000002850038 000000001234aaaa c008000004656e50 c008000002850000
[ 3270.036372] GPR28: 0000000000000000 000002f94279bb09 0000000000000000 c008000004655dc0
[ 3270.306180] NIP [c0080000029f80b4] 0xc0080000029f80b4
[ 3270.307006] LR [c00800000465106c] run_one+0x2b0/0x41c [test_bpf]
[ 3270.307912] Call Trace:
[ 3270.307923] [c000000712eb36a0] [c008000004651044] run_one+0x288/0x41c [test_bpf] (unreliable)
[ 3270.415622] [c000000712eb37b0] [c008000004651474] test_bpf+0x29c/0x3d8 [test_bpf]
[ 3270.416485] [c000000712eb38a0] [c008000004651714] test_bpf_init+0x164/0x468 [test_bpf]
[ 3270.505901] [c000000712eb3990] [c0000000000100c4] do_one_initcall+0x64/0x2b0
[ 3270.506777] [c000000712eb3a60] [c000000000225bec] do_init_module+0x7c/0x2e0
[ 3270.507674] [c000000712eb3af0] [c000000000228e88] load_module+0x1628/0x1a40
[ 3270.606197] [c000000712eb3d00] [c0000000002295a8] __do_sys_finit_module+0xc8/0x150
[ 3270.607134] [c000000712eb3e20] [c0...