ADT bpf tests crash the 5.4.0-7 kernel on a ppc64el POWER box

Bug #1855151 reported by Colin Ian King
This bug affects 1 person
Affects         Status       Importance  Assigned to     Milestone
linux (Ubuntu)  In Progress  High        Colin Ian King

Bug Description

When the ADT tests are run on a POWER box, the bpf tests crash the kernel as follows:

[ 2745.079592] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
[ 2745.079808] Faulting instruction address: 0x00000000
[ 2745.079824] Oops: Kernel access of bad area, sig: 11 [#1]
[ 2745.079993] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
[ 2745.080011] Modules linked in: af_packet_diag tcp_diag udp_diag raw_diag inet_diag binfmt_misc dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua joydev input_leds mac_hid ofpart cmdlinepart powernv_flash mtd ibmpowernv at24 uio_pdrv_genirq uio ipmi_powernv ipmi_devintf ipmi_msghandler opal_prd powernv_rng vmx_crypto sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid ast drm_vram_helper i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops crct10dif_vpmsum crc32c_vpmsum drm tg3 ahci libahci drm_panel_orientation_quirks [last unloaded: notifier_error_inject]
[ 2745.080195] CPU: 0 PID: 1111366 Comm: reuseport_bpf_c Not tainted 5.4.0-7-generic #8
[ 2745.080214] NIP: 0000000000000000 LR: c000000000ce8710 CTR: 0000000000000000
[ 2745.080233] REGS: c0000007ff6eb550 TRAP: 0400 Not tainted (5.4.0-7-generic)
[ 2745.080250] MSR: 9000000040009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24002282 XER: 20000000
[ 2745.080272] CFAR: c00000000000de44 IRQMASK: 0
[ 2745.080272] GPR00: c000000000d67c9c c0000007ff6eb7e0 c000000001a5bf00 c0000004258e10e0
[ 2745.080272] GPR04: c008000002830038 c0000004258e10e0 0000000000000028 000000000000e3c2
[ 2745.080272] GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 2745.080272] GPR12: 0000000000000000 c000000001cf0000 0000000000000000 0000000000000001
[ 2745.080272] GPR16: 00000000000022b8 000000000100007f 000000000000e3c2 000000000100007f
[ 2745.080272] GPR20: c00000000198c100 0000000000000000 0000000000000000 00000000000022b8
[ 2745.080272] GPR24: 0000000000000000 0000000000000028 0000000000000080 000000000100007f
[ 2745.080272] GPR28: c008000002830000 0000000018ed5e01 c0000004258e10e0 c00000075f0ff000
[ 2745.080409] NIP [0000000000000000] 0x0
[ 2745.080423] LR [c000000000ce8710] reuseport_select_sock+0x100/0x400
[ 2745.080439] Call Trace:
[ 2745.080448] [c0000007ff6eb7e0] [c0000007ff6eb8a0] 0xc0000007ff6eb8a0 (unreliable)
[ 2745.080469] [c0000007ff6eb880] [c000000000d67c9c] inet_lhash2_lookup+0x1ec/0x220
[ 2745.080490] [c0000007ff6eb900] [c000000000d6849c] __inet_lookup_listener+0x1ec/0x1f0
[ 2745.080509] [c0000007ff6eb9d0] [c000000000d96608] tcp_v4_rcv+0x6e8/0xe70
[ 2745.080527] [c0000007ff6ebb00] [c000000000d5a480] ip_protocol_deliver_rcu+0x60/0x2b0
[ 2745.080547] [c0000007ff6ebb50] [c000000000d5a740] ip_local_deliver_finish+0x70/0x90
[ 2745.080566] [c0000007ff6ebb70] [c000000000d5a7ec] ip_local_deliver+0x8c/0x140
[ 2745.080585] [c0000007ff6ebbe0] [c000000000d59aec] ip_rcv_finish+0xbc/0xf0
[ 2745.080602] [c0000007ff6ebc20] [c000000000d5a9a0] ip_rcv+0x100/0x110
[ 2745.080619] [c0000007ff6ebca0] [c000000000cab220] __netif_receive_skb_one_core+0x70/0xb0
[ 2745.080638] [c0000007ff6ebce0] [c000000000cac4f0] process_backlog+0xd0/0x230
[ 2745.080657] [c0000007ff6ebd50] [c000000000cadc68] net_rx_action+0x1e8/0x520
[ 2745.080674] [c0000007ff6ebe70] [c000000000ee2a7c] __do_softirq+0x15c/0x3b8
[ 2745.080692] [c0000007ff6ebf90] [c000000000030678] call_do_softirq+0x14/0x24
[ 2745.080709] [c00000070656f7c0] [c00000000001bf58] do_softirq_own_stack+0x38/0x50
[ 2745.080729] [c00000070656f7e0] [c000000000143d60] do_softirq.part.0+0x80/0xb0
[ 2745.080914] [c00000070656f810] [c000000000143e54] __local_bh_enable_ip+0xc4/0xf0
[ 2745.080933] [c00000070656f830] [c000000000d5f8fc] ip_finish_output2+0x1fc/0x740
[ 2745.080953] [c00000070656f8d0] [c000000000d61fe4] ip_output+0xd4/0x190
[ 2745.080971] [c00000070656f960] [c000000000d61444] ip_local_out+0x64/0x90
[ 2745.080988] [c00000070656f9a0] [c000000000d61838] __ip_queue_xmit+0x168/0x4d0
[ 2745.081007] [c00000070656fa30] [c000000000d90a3c] ip_queue_xmit+0x1c/0x30
[ 2745.081024] [c00000070656fa50] [c000000000d887e4] __tcp_transmit_skb+0x574/0xda0
[ 2745.081044] [c00000070656fb00] [c000000000d89a88] tcp_connect+0x4b8/0x600
[ 2745.081060] [c00000070656fbb0] [c000000000d93148] tcp_v4_connect+0x478/0x5b0
[ 2745.082755] [c00000070656fc40] [c000000000db876c] __inet_stream_connect+0x12c/0x4c0
[ 2745.084563] [c00000070656fcf0] [c000000000db8b5c] inet_stream_connect+0x5c/0x90
[ 2745.085528] [c00000070656fd30] [c000000000c75dec] __sys_connect+0x11c/0x160
[ 2745.086424] [c00000070656fe00] [c000000000c75e58] sys_connect+0x28/0x40
[ 2745.087343] [c00000070656fe20] [c00000000000b278] system_call+0x5c/0x68
[ 2745.089157] Instruction dump:
[ 2745.089169] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[ 2745.090048] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[ 2745.096394] ---[ end trace d347ca85a257c66f ]---
[ 2745.208020]
[ 2746.208219] Kernel panic - not syncing: Aiee, killing interrupt handler!
[ 2746.316857] Rebooting in 10 seconds..
[ 2796.226294116,5] OPAL: Reboot request...
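
For triage, the key facts in the oops above are that NIP is zero while LR points into reuseport_select_sock, which would be consistent with a call through a NULL function pointer from that function (an interpretation, not something the log states outright). A minimal Python sketch that pulls those fields out of a ppc64 oops follows; the helper name `summarize_oops` is hypothetical and the sample text is copied from the log above.

```python
import re

# Three lines copied verbatim from the oops above.
OOPS = """\
[ 2745.079592] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
[ 2745.080214] NIP: 0000000000000000 LR: c000000000ce8710 CTR: 0000000000000000
[ 2745.080423] LR [c000000000ce8710] reuseport_select_sock+0x100/0x400
"""


def summarize_oops(text):
    """Return (nip, lr, lr_symbol) parsed from a ppc64 oops dump.

    Hypothetical helper: it only understands the 'NIP: ... LR: ...'
    register line and the 'LR [addr] symbol' line shown in this report.
    """
    regs = re.search(r"NIP:\s+([0-9a-f]+)\s+LR:\s+([0-9a-f]+)", text)
    sym = re.search(r"LR \[[0-9a-f]+\]\s+(\S+)", text)
    nip = int(regs.group(1), 16)
    lr = int(regs.group(2), 16)
    return nip, lr, sym.group(1) if sym else None


nip, lr, lr_symbol = summarize_oops(OOPS)
if nip == 0:
    # NIP == 0 means the CPU tried to fetch an instruction at address 0;
    # LR holds the return address of the caller that made the jump.
    print(f"jump to NULL; return address {lr:#x} in {lr_symbol}")
```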

The final ADT test output recorded was:

17:03:13 DEBUG| [stdout] # ---- IPv6 TCP ----
17:03:13 DEBUG| [stdout] # Testing EBPF mod 10...
17:03:13 DEBUG| [stdout] # Socket 0: 0
17:03:13 DEBUG| [stdout] # Socket 1: 1
17:03:13 DEBUG| [stdout] # Socket 2: 2
17:03:13 DEBUG| [stdout] # Socket 3: 3
... etc ...
17:03:13 DEBUG| [stdout] # Socket 4: 4
17:03:13 DEBUG| [stdout] # Socket 5: 5
17:03:13 DEBUG| [stdout] # Socket 9: 19
17:03:13 DEBUG| [stdout] # Reprograming, testing mod 5...
17:03:13 DEBUG| [stdout] # Socket 0: 0
...
17:03:13 DEBUG| [stdout] # Socket 3: 18
17:03:13 DEBUG| [stdout] # Socket 4: 19
...
17:03:13 DEBUG| [stdout] # Testing CBPF mod 10...
17:03:13 DEBUG| [stdout] # Socket 0: 0
...
17:03:13 DEBUG| [stdout] # Reprograming, testing mod 5...
17:03:13 DEBUG| [stdout] # Socket 0: 0
17:03:13 DEBUG| [stdout] # Socket 1: 1
...
17:03:13 DEBUG| [stdout] # Socket 4: 19
17:03:13 DEBUG| [stdout] # Testing too many filters...
17:03:13 DEBUG| [stdout] # Testing filters on non-SO_REUSEPORT socket...
17:03:13 DEBUG| [stdout] # ---- IPv6 TCP w/ mapped IPv4 ----
17:03:13 DEBUG| [stdout] # Testing EBPF mod 10...
17:03:13 DEBUG| [stdout] # Socket 0: 0
17:03:13 DEBUG| [stdout] # Socket 1: 1
...
17:03:13 DEBUG| [stdout] # Reprograming, testing mod 5...
17:03:13 DEBUG| [stdout] # Socket 0: 0
17:03:13 DEBUG| [stdout] # Socket 1: 1
...
17:03:13 DEBUG| [stdout] # Testing CBPF mod 10...
17:03:13 DEBUG| [stdout] # Socket 0: 0
17:03:13 DEBUG| [stdout] # Socket 1: 1
...
17:03:13 DEBUG| [stdout] # Reprograming, testing mod 5...
17:03:13 DEBUG| [stdout] # Socket 0: 0
17:03:13 DEBUG| [stdout] # Socket 1: 1
...
17:03:13 DEBUG| [stdout] # Testing filter add without bind...
17:03:13 DEBUG| [stdout] # SUCCESS
17:03:13 DEBUG| [stdout] ok 1 selftests: net: reuseport_bpf
17:03:13 DEBUG| [stdout] # selftests: net: reuseport_bpf_cpu
17:03:13 DEBUG| [stdout] # ---- IPv4 UDP ----
17:03:13 DEBUG| [stdout] # send cpu 0, receive socket 0
17:03:13 DEBUG| [stdout] # send cpu 1, receive socket 1
...
17:03:13 DEBUG| [stdout] # send cpu 125, receive socket 125
17:03:13 DEBUG| [stdout] # send cpu 127, receive socket 127
17:03:13 DEBUG| [stdout] # ---- IPv4 TCP ----
[ end of output as the machine panics ]

So the crash occurred at or shortly after this point. I'll re-run this with the IPMI tool attached to the console to see how far the test gets before the kernel panics.
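
To pin down how far a run got, the last test name and section header in the captured ADT output can be extracted mechanically. The sketch below is illustrative only: the function name `last_progress` is hypothetical, and it assumes the `# selftests: ...` and `# ---- ... ----` line shapes seen in the output above.

```python
import re

# Tail of the ADT output quoted in this report.
LOG = """\
17:03:13 DEBUG| [stdout] ok 1 selftests: net: reuseport_bpf
17:03:13 DEBUG| [stdout] # selftests: net: reuseport_bpf_cpu
17:03:13 DEBUG| [stdout] # ---- IPv4 UDP ----
17:03:13 DEBUG| [stdout] # send cpu 127, receive socket 127
17:03:13 DEBUG| [stdout] # ---- IPv4 TCP ----
"""


def last_progress(log):
    """Return (test, section) for the last markers seen in the log."""
    test = section = None
    for line in log.splitlines():
        m = re.search(r"# selftests: (.+)$", line)
        if m:
            # A new test started, so any earlier section no longer applies.
            test, section = m.group(1), None
            continue
        m = re.search(r"# ---- (.+?) ----$", line)
        if m:
            section = m.group(1)
    return test, section


test, section = last_progress(LOG)
print(f"last test: {test}; last section: {section}")
```

For the output above this reports reuseport_bpf_cpu in its IPv4 TCP section, which lines up with the `Comm: reuseport_bpf_c` task name and the `tcp_v4_rcv` frame in the first oops.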

Changed in linux (Ubuntu):
assignee: nobody → Colin Ian King (colin-king)
status: New → In Progress
importance: Undecided → High
Colin Ian King (colin-king) wrote :

17:59:24 DEBUG| [stdout] # send cpu 63, receive socket 63
17:59:24 DEBUG| [stdout] # send cpu 65, receive socket 65
17:59:24 DEBUG| [stdout] # send cpu 67, receive socket 67
17:59:24 DEBUG| [stdout] # send cpu 69, receive socket 69
17:59:24 DEBUG| [stdout] # send cpu 71, receive socket 71
17:59:24 DEBUG| [stdout] # send cpu 73, receive socket 73
[ 3269.552837] test_bpf: #0 TAX jited:1
[ 3269.552885] Oops: Exception in kernel mode, sig: 4 [#1]
[ 3269.552916] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
[ 3269.552928] Modules linked in: test_bpf(+) tls af_packet_diag tcp_diag udp_diag raw_diag inet_diag binfmt_misc dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua joydev input_leds mac_hid ofpart cmdlinepart powernv_flash mtd at24 opal_prd uio_pdrv_genirq uio ipmi_powernv ipmi_devintf ipmi_msghandler ibmpowernv vmx_crypto powernv_rng sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid ast drm_vram_helper i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops crct10dif_vpmsum crc32c_vpmsum drm ahci tg3 libahci drm_panel_orientation_quirks [last unloaded: notifier_error_inject]
[ 3269.847244] CPU: 55 PID: 1111137 Comm: modprobe Not tainted 5.4.0-7-generic #8
[ 3269.926547] NIP: c0080000029f80b4 LR: c00800000465106c CTR: c0080000029f80b4
[ 3269.927427] REGS: c000000712eb3410 TRAP: 0e40 Not tainted (5.4.0-7-generic)
[ 3269.928286] MSR: 900000000288b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 28222422 XER: 20000000
[ 3270.036372] CFAR: c00000000000de44 IRQMASK: 0
[ 3270.036372] GPR00: c008000004651044 c000000712eb36a0 c00800000465dd00 c000000415ee1600
[ 3270.036372] GPR04: c008000002850038 ffffffffffffffff 0000000001f401dc 00025a599268f4d4
[ 3270.036372] GPR08: 0000000000000018 0000018acb48de01 000000000018f194 c008000004651ac0
[ 3270.036372] GPR12: c0080000029f80b4 c0000007ff741c80 0000000000000008 000000000000007b
[ 3270.036372] GPR16: ffff00081234aaab 0000000000000241 000000000000024c 20c49ba5e353f7cf
[ 3270.036372] GPR20: c000000415ee1600 c008000004656dc9 c008000004656e74 00000000000003e8
[ 3270.036372] GPR24: c008000002850038 000000001234aaaa c008000004656e50 c008000002850000
[ 3270.036372] GPR28: 0000000000000000 000002f94279bb09 0000000000000000 c008000004655dc0
[ 3270.306180] NIP [c0080000029f80b4] 0xc0080000029f80b4
[ 3270.307006] LR [c00800000465106c] run_one+0x2b0/0x41c [test_bpf]
[ 3270.307912] Call Trace:
[ 3270.307923] [c000000712eb36a0] [c008000004651044] run_one+0x288/0x41c [test_bpf] (unreliable)
[ 3270.415622] [c000000712eb37b0] [c008000004651474] test_bpf+0x29c/0x3d8 [test_bpf]
[ 3270.416485] [c000000712eb38a0] [c008000004651714] test_bpf_init+0x164/0x468 [test_bpf]
[ 3270.505901] [c000000712eb3990] [c0000000000100c4] do_one_initcall+0x64/0x2b0
[ 3270.506777] [c000000712eb3a60] [c000000000225bec] do_init_module+0x7c/0x2e0
[ 3270.507674] [c000000712eb3af0] [c000000000228e88] load_module+0x1628/0x1a40
[ 3270.606197] [c000000712eb3d00] [c0000000002295a8] __do_sys_finit_module+0xc8/0x150
[ 3270.607134] [c000000712eb3e20] [c0...

Colin Ian King (colin-king) wrote :

Same root issue as bug 1854968
