mlx5_core: Error cqe on cqn
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
linux (Ubuntu) | Expired | Undecided | Unassigned | | |
linux-oem-5.6 (Ubuntu) | Expired | Undecided | Unassigned | | |
Bug Description
I encountered the following repeating error with kernel 5.6.0-1018-oem. The network was disrupted, and the error kept repeating for one hour until the system hung.
[316294.820469] mlx5_core 0000:44:00.1 enp68s0f1: Error cqe on cqn 0x816, ci 0xc5, sqn 0x1908, opcode 0xd, syndrome 0x4, vendor syndrome 0x51
[316294.833103] 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[316294.833106] 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[316294.833110] 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[316294.833116] 00000030: 00 00 00 00 04 00 51 04 0e 00 19 08 53 64 dc d2
[316294.833118] WQE DUMP: WQ size 1024 WQ cur size 0, WQE index 0x364, len: 128
[316294.833120] 00000000: 00 53 64 0e 00 19 08 07 00 00 00 08 00 00 00 00
[316294.833121] 00000010: 00 00 00 00 c0 00 05 a0 00 00 00 00 00 42 00 a3
[316294.833123] 00000020: 8e bf 47 d7 86 14 ad f8 ef 46 08 00 45 00 12 34
[316294.833124] 00000030: 76 d8 40 00 40 06 77 97 c3 a8 4a 4a 5f 67 cc fa
[316294.833126] 00000040: 01 bb d8 2a 5c 7e 3d a0 b0 c5 3e 74 80 18 00 0b
[316294.833127] 00000050: 4c 7b 00 00 01 01 08 0a 63 59 a1 46 00 41 05 b4
[316294.833129] 00000060: 00 00 12 00 00 08 01 01 00 00 00 00 c2 c6 0b 74
[316294.833130] 00000070: 00 00 00 44 00 08 01 01 00 00 00 00 c3 09 6c fc
[316294.833144] mlx5_core 0000:44:00.1 enp68s0f1: ERR CQE on SQ: 0x1908
[316294.996328] enp68s0f1: hw csum failure
[316295.000262] skb len=1500 headroom=78 headlen=1500 tailroom=22
[316295.000262] mac=(64,14) net=(78,40) trans=118
[316295.000262] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
[316295.000262] csum(0x81a5 ip_summed=2 complete_sw=0 valid=0 level=0)
[316295.000262] hash(0x322a7dd7 sw=0 l4=1) proto=0x86dd pkttype=0 iif=0
[316295.029909] dev name=enp68s0f1 feat=0x0x0010a1
...
[316295.943994] Hardware name: ASUSTeK COMPUTER INC. RS500A-
[316295.943995] Call Trace:
[316295.943997] <IRQ>
[316295.944002] dump_stack+
[316295.944006] netdev_
[316295.944007] __skb_gro_
[316295.944009] tcp6_gro_
[316295.944010] ipv6_gro_
[316295.944012] ? kmem_cache_
[316295.944017] dev_gro_
[316295.996284] ? mlx5e_build_
[316296.010778] napi_gro_
[316296.010793] mlx5e_handle_
[316296.010808] mlx5e_poll_
[316296.010825] mlx5e_napi_
[316296.010843] ? mlx5_eq_
[316296.010850] net_rx_
[316296.010859] __do_softirq+
[316296.010862] irq_exit+0xae/0xb0
[316296.010863] do_IRQ+0x5a/0xf0
[316296.010865] common_
[316296.010866] </IRQ>
[316296.010868] RIP: 0010:cpuidle_
[316296.010869] Code: ff e8 aa 7d 7e ff 80 7d c7 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 ea 02 00 00 31 ff e8 2d 01 85 ff fb 66 0f 1f 44 00 00 <45> 85 e4 0f 88 3f 02 00 00 49 63 d4 4c 8b 7d d0 4c 2b 7d c8 48 8d
[316296.010870] RSP: 0018:ffff9d8400
[316296.010872] RAX: ffff91110b62ce00 RBX: ffff9110ac1d1c00 RCX: 000000000000001f
[316296.010872] RDX: 0000000000000000 RSI: 00000000334bfb91 RDI: 0000000000000000
[316296.010873] RBP: ffff9d84002cfe78 R08: 00011fab2ae67109 R09: 00011faebfd6b300
[316296.010873] R10: ffff91110b62bac4 R11: ffff91110b62baa4 R12: 0000000000000002
[316296.010874] R13: ffffffff8f978700 R14: 0000000000000002 R15: ffff9110ac1d1c00
[316296.010876] ? cpuidle_
[316296.010878] cpuidle_
[316296.010880] call_cpuidle+
[316296.010881] do_idle+0x1e7/0x280
[316296.010882] cpu_startup_
[316296.010885] start_secondary
[316296.010886] secondary_
# lspci -v -s 0000:44:00.1
44:00.1 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
Subsystem: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
Flags: bus master, fast devsel, latency 0, IRQ 254, NUMA node 0
Memory at b0000000 (64-bit, prefetchable) [size=32M]
Expansion ROM at b5300000 [disabled] [size=1M]
Capabilities: [60] Express Endpoint, MSI 00
Capabilities: [48] Vital Product Data
Capabilities: [9c] MSI-X: Enable+ Count=64 Masked-
Capabilities: [c0] Vendor Specific Information: Len=18 <?>
Capabilities: [40] Power Management version 3
Capabilities: [100] Advanced Error Reporting
Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
Capabilities: [180] Single Root I/O Virtualization (SR-IOV)
Capabilities: [230] Access Control Services
Kernel driver in use: mlx5_core
Kernel modules: mlx5_core
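Since the error repeats many times, it may help during triage to extract the fields of each `Error cqe on cqn` line from the kernel log and compare them across occurrences. The following is a minimal Python sketch, assuming the message format matches the line quoted above. The names `ERR_CQE_RE` and `parse_error_cqe` are made up for this illustration and are not part of any mlx5 tooling.

```python
import re

# Regex for the "Error cqe on cqn" line as it appears in this report.
# Assumption: other kernel versions may format the message differently.
ERR_CQE_RE = re.compile(
    r"\[?\s*(?P<ts>\d+\.\d+)\]\s+mlx5_core\s+(?P<pci>[0-9a-f:.]+)\s+"
    r"(?P<ifname>\S+):\s+"
    r"Error cqe on cqn (?P<cqn>0x[0-9a-f]+), ci (?P<ci>0x[0-9a-f]+), "
    r"sqn (?P<sqn>0x[0-9a-f]+), opcode (?P<opcode>0x[0-9a-f]+), "
    r"syndrome (?P<syndrome>0x[0-9a-f]+), "
    r"vendor syndrome (?P<vendor_syndrome>0x[0-9a-f]+)",
    re.IGNORECASE,
)

HEX_FIELDS = ("cqn", "ci", "sqn", "opcode", "syndrome", "vendor_syndrome")


def parse_error_cqe(line):
    """Return a dict of fields from one error CQE log line, or None."""
    match = ERR_CQE_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["ts"] = float(fields["ts"])  # seconds since boot, as logged
    for key in HEX_FIELDS:
        fields[key] = int(fields[key], 16)  # convert "0x..." to int
    return fields


# The line from this report, used as sample input.
SAMPLE = (
    "[316294.820469] mlx5_core 0000:44:00.1 enp68s0f1: Error cqe on cqn 0x816, "
    "ci 0xc5, sqn 0x1908, opcode 0xd, syndrome 0x4, vendor syndrome 0x51"
)
```

Calling `parse_error_cqe(SAMPLE)` returns a dictionary keyed by field name. Feeding it every line of `dmesg` output and grouping by `sqn` or `syndrome` would show whether the same send queue fails each time.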
affects: ubuntu → linux-oem-5.6 (Ubuntu)
Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Freenode.
To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/1887723/+editstatus and add the package name in the text box next to the word Package.
[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]