Comment 0 for bug 2008085

Po-Hsu Lin (cypressyew) wrote: net:veth.sh in ubuntu_kernel_selftests hangs with J-intel-iotg

Issue found on node "onibi" with J-intel-iotg 5.15.0-1026.31 this cycle.

The veth.sh test in the net category hangs and times out, leaving the test report incomplete.

When running the test manually, I can see a kernel oops trace in dmesg.

ubuntu@onibi:~/autotest/client/tmp/ubuntu_kernel_selftests/src/linux/tools/testing/selftests/net$ sudo ./veth.sh
default - gro flag ok
        - peer gro flag ok
        - tso flag ok
        - peer tso flag ok
        - aggregation ok
        - aggregation with TSO off ok
with gro on - gro flag ok
        - peer gro flag ok
        - tso flag ok
        - peer tso flag ok
        - aggregation with TSO off ok
default channels ok
with gro enabled on link down - gro flag ok
        - peer gro flag ok
        - tso flag ok
        - peer tso flag ok
        - aggregation with TSO off ok
setting tx channels ok
setting both rx and tx channels ok
bad setting: combined channels ok
setting invalid channels nr fail rx:3:3 tx:3:5 combined:n/a:n/a
bad setting: XDP with RX nr less than TX ok
(hangs here)

dmesg output:
[ 547.520923] BUG: unable to handle page fault for address: ffffb73800000001
[ 547.520999] #PF: supervisor write access in kernel mode
[ 547.521045] #PF: error_code(0x0002) - not-present page
[ 547.521089] PGD 100000067 P4D 100000067 PUD 0
[ 547.521133] Oops: 0002 [#1] SMP PTI
[ 547.521168] CPU: 1 PID: 1559 Comm: ip Not tainted 5.15.0-1026-intel-iotg #31-Ubuntu
[ 547.521233] Hardware name: Dell Inc. PowerEdge R310/05XKKK, BIOS 1.8.2 08/17/2011
[ 547.521293] RIP: 0010:veth_xdp+0x18f/0x1e0 [veth]
[ 547.521342] Code: ff 41 89 9d 1c 01 00 00 49 21 85 e8 00 00 00 e9 74 ff ff ff 48 c7 c7 80 e3 b0 c0 e8 2b 3b 06 c1 b8 e4 ff ff ff 4d 85 ff 74 85 <49> c7 07 80 e3 b0 c0 e9 79 ff ff ff 48 c7 c7 20 e4 b0 c0 e8 09 3b
[ 547.521488] RSP: 0018:ffffb738c254f420 EFLAGS: 00010282
[ 547.521535] RAX: 00000000ffffffe4 RBX: 0000000000000db2 RCX: ffffb738c254fb20
[ 547.521594] RDX: ffffffffc0b0bf90 RSI: ffffb738c254f468 RDI: ffffffffc0b0e380
[ 547.521653] RBP: ffffb738c254f450 R08: 0000000000000001 R09: ffffb738c0081000
[ 547.521711] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8c65ced90000
[ 547.521769] R13: ffff8c65c12f6000 R14: 0000000000000000 R15: ffffb73800000001
[ 547.521828] FS: 00007faa028b3b80(0000) GS:ffff8c66f7640000(0000) knlGS:0000000000000000
[ 547.521895] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 547.521943] CR2: ffffb73800000001 CR3: 000000010f068000 CR4: 00000000000006e0
[ 547.522004] Call Trace:
[ 547.522029] <TASK>
[ 547.522052] ? veth_open+0x90/0x90 [veth]
[ 547.522094] dev_xdp_install+0x66/0xf0
[ 547.522135] dev_xdp_attach+0x1fc/0x590
[ 547.522171] ? __bpf_prog_get+0x1f/0xe0
[ 547.522212] dev_change_xdp_fd+0x200/0x240
[ 547.522252] do_setlink+0xba2/0xc70
[ 547.522288] ? dev_get_alias+0x35/0x50
[ 547.522326] __rtnl_newlink+0x61e/0xa20
[ 547.522363] ? security_sock_rcv_skb+0x2f/0x50
[ 547.522406] ? skb_queue_tail+0x48/0x60
[ 547.522444] ? sock_def_readable+0x4b/0x80
[ 547.522485] ? __netlink_sendskb+0x62/0x80
[ 547.522528] ? netlink_unicast+0x2fb/0x340
[ 547.522566] ? rtnl_getlink+0x398/0x420
[ 547.522611] ? kmem_cache_alloc_trace+0x17e/0x2a0
[ 547.522657] rtnl_newlink+0x49/0x70
[ 547.522692] rtnetlink_rcv_msg+0x15d/0x400
[ 547.522731] ? rtnl_calcit.isra.0+0x130/0x130
[ 547.524524] netlink_rcv_skb+0x56/0x100
[ 547.526314] rtnetlink_rcv+0x15/0x20
[ 547.528102] netlink_unicast+0x223/0x340
[ 547.529837] netlink_sendmsg+0x24b/0x4c0
[ 547.531505] sock_sendmsg+0x69/0x70
[ 547.533114] ____sys_sendmsg+0x252/0x290
[ 547.534667] ? import_iovec+0x31/0x40
[ 547.536164] ? sendmsg_copy_msghdr+0x7f/0xa0
[ 547.537609] ___sys_sendmsg+0x81/0xc0
[ 547.539024] ? rseq_ip_fixup+0x72/0x170
[ 547.540420] ? __rseq_handle_notify_resume+0x2d/0xc0
[ 547.541824] ? exit_to_user_mode_loop+0x10d/0x160
[ 547.543227] ? exit_to_user_mode_prepare+0x37/0xb0
[ 547.544623] ? syscall_exit_to_user_mode+0x27/0x50
[ 547.545993] ? __x64_sys_close+0x11/0x50
[ 547.547334] __sys_sendmsg+0x62/0xc0
[ 547.548650] __x64_sys_sendmsg+0x1d/0x30
[ 547.549918] do_syscall_64+0x5c/0xc0
[ 547.551134] ? exc_page_fault+0x89/0x170
[ 547.552322] entry_SYSCALL_64_after_hwframe+0x61/0xcb
[ 547.553511] RIP: 0033:0x7faa02a07b17
[ 547.554680] Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
[ 547.557206] RSP: 002b:00007ffdbbca3678 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 547.558517] RAX: ffffffffffffffda RBX: 0000000063f5ffbc RCX: 00007faa02a07b17
[ 547.559818] RDX: 0000000000000000 RSI: 00007ffdbbca36e0 RDI: 0000000000000003
[ 547.561110] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000555ea5465830
[ 547.562386] R10: 00007faa02afa340 R11: 0000000000000246 R12: 0000000000000001
[ 547.563647] R13: 00007ffdbbca3790 R14: 0000000000000000 R15: 0000555ea4edb040
[ 547.564924] </TASK>
[ 547.566184] Modules linked in: algif_hash af_alg veth intel_powerclamp ipmi_ssif coretemp joydev input_leds binfmt_misc kvm_intel ipmi_si kvm dcdbas ipmi_devintf ipmi_msghandler intel_cstate mac_hid acpi_power_meter i7core_edac sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ramoops pstore_blk reed_solomon pstore_zone efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear mgag200 i2c_algo_bit hid_generic gpio_ich drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops mpt3sas cec rc_core usbhid raid_class drm pata_acpi hid lpc_ich bnx2 scsi_transport_sas
[ 547.575407] CR2: ffffb73800000001
[ 547.577070] ---[ end trace 3ebb9a2cada35096 ]---
[ 547.586349] RIP: 0010:veth_xdp+0x18f/0x1e0 [veth]
[ 547.588039] Code: ff 41 89 9d 1c 01 00 00 49 21 85 e8 00 00 00 e9 74 ff ff ff 48 c7 c7 80 e3 b0 c0 e8 2b 3b 06 c1 b8 e4 ff ff ff 4d 85 ff 74 85 <49> c7 07 80 e3 b0 c0 e9 79 ff ff ff 48 c7 c7 20 e4 b0 c0 e8 09 3b
[ 547.591612] RSP: 0018:ffffb738c254f420 EFLAGS: 00010282
[ 547.593432] RAX: 00000000ffffffe4 RBX: 0000000000000db2 RCX: ffffb738c254fb20
[ 547.595282] RDX: ffffffffc0b0bf90 RSI: ffffb738c254f468 RDI: ffffffffc0b0e380
[ 547.597143] RBP: ffffb738c254f450 R08: 0000000000000001 R09: ffffb738c0081000
[ 547.599012] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8c65ced90000
[ 547.600888] R13: ffff8c65c12f6000 R14: 0000000000000000 R15: ffffb73800000001
[ 547.602764] FS: 00007faa028b3b80(0000) GS:ffff8c66f7640000(0000) knlGS:0000000000000000
[ 547.604675] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 547.606593] CR2: ffffb73800000001 CR3: 000000010f068000 CR4: 00000000000006e0

As this node did not run this test in the previous cycle, it has yet to be determined whether this is a regression.
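
For anyone triaging the oops above, here is a minimal Python sketch that decodes the #PF error code and checks which general-purpose register holds the faulting address reported in CR2. The helper names (decode_pf_error_code, registers_matching_cr2) are made up for illustration and are not part of any kernel tool. The bit meanings assume the standard x86 page-fault error code layout, and the sample text is a few lines copied from the dmesg output.

```python
import re

# Low five bits of the x86 page-fault error code (assumed standard layout).
# Each entry: (mask, meaning when set, meaning when clear or None to skip).
PF_BITS = [
    (0x01, "protection violation", "not-present page"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "supervisor mode"),
    (0x08, "reserved bit set", None),
    (0x10, "instruction fetch", None),
]


def decode_pf_error_code(code):
    """Return human-readable descriptions for a #PF error code."""
    out = []
    for mask, if_set, if_clear in PF_BITS:
        if code & mask:
            out.append(if_set)
        elif if_clear is not None:
            out.append(if_clear)
    return out


def registers_matching_cr2(oops_text):
    """Return names of general-purpose registers whose value equals CR2."""
    # Matches "RAX: <16 hex digits>"; RIP/RSP lines carry a segment prefix
    # (e.g. "0018:...") and are intentionally not matched.
    regs = dict(re.findall(r"\b(R[A-Z0-9]{1,2}): ([0-9a-f]{16})\b", oops_text))
    cr2 = re.search(r"\bCR2: ([0-9a-f]{16})\b", oops_text)
    if cr2 is None:
        return []
    return sorted(name for name, value in regs.items() if value == cr2.group(1))


# A few lines copied from the dmesg output in this report.
sample = """\
RAX: 00000000ffffffe4 RBX: 0000000000000db2 RCX: ffffb738c254fb20
R13: ffff8c65c12f6000 R14: 0000000000000000 R15: ffffb73800000001
CR2: ffffb73800000001 CR3: 000000010f068000 CR4: 00000000000006e0
"""

print(decode_pf_error_code(0x0002))
print(registers_matching_cr2(sample))
```

On this trace the decode agrees with the "supervisor write access" and "not-present page" lines, and R15 is the only register equal to CR2. That suggests the faulting write went through a pointer held in R15. What that pointer represents inside veth_xdp would still need to be confirmed against the source.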