mlx5: Tail padding HW Checksum crash in Ubuntu 18.04

Bug #1850135 reported by Mohammad Heib
Affects: linux (Ubuntu)
Status: Incomplete
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Hi,
We have the following issue, which affects many of our customers. It has been fixed upstream, and the fixes need to be added to Ubuntu 18.04.

Mlx5 driver: Tail padding HW Checksum crash in Ubuntu 18.04 kernel 4.15.0-58-generic.

Crash log:

[Fri Aug 30 07:15:23 2019] enp59s0f0.101: hw csum failure
[Fri Aug 30 07:15:23 2019] CPU: 40 PID: 0 Comm: swapper/40 Not tainted 4.15.0-58-generic #64-Ubuntu
[Fri Aug 30 07:15:23 2019] Hardware name: Supermicro SYS-2029BT-A2-ADBE-FS011/X11DPT-B, BIOS 3.1 04/30/2019
[Fri Aug 30 07:15:23 2019] Call Trace:
[Fri Aug 30 07:15:23 2019] <IRQ>
[Fri Aug 30 07:15:23 2019] dump_stack+0x63/0x8b
[Fri Aug 30 07:15:23 2019] netdev_rx_csum_fault+0x38/0x40
[Fri Aug 30 07:15:23 2019] __skb_checksum_complete+0xbc/0xd0
[Fri Aug 30 07:15:23 2019] nf_ip_checksum+0xc3/0xf0
[Fri Aug 30 07:15:23 2019] tcp_error+0x162/0x1c0 [nf_conntrack]
[Fri Aug 30 07:15:23 2019] ? update_load_avg+0x57f/0x6e0
[Fri Aug 30 07:15:23 2019] ? __update_load_avg_se.isra.38+0x1c0/0x1d0
[Fri Aug 30 07:15:23 2019] nf_conntrack_in+0x14f/0x500 [nf_conntrack]
[Fri Aug 30 07:15:23 2019] ? csum_partial_ext+0x9/0x10
[Fri Aug 30 07:15:23 2019] ? __skb_checksum+0x6b/0x300
[Fri Aug 30 07:15:23 2019] ipv4_conntrack_in+0x1c/0x20 [nf_conntrack_ipv4]
[Fri Aug 30 07:15:23 2019] nf_hook_slow+0x48/0xc0
[Fri Aug 30 07:15:23 2019] ? skb_send_sock+0x50/0x50
[Fri Aug 30 07:15:23 2019] ip_rcv+0x2fa/0x360
[Fri Aug 30 07:15:23 2019] ? inet_del_offload+0x40/0x40
[Fri Aug 30 07:15:23 2019] __netif_receive_skb_core+0x432/0xb40
[Fri Aug 30 07:15:23 2019] ? tcp4_gro_receive+0x137/0x1a0
[Fri Aug 30 07:15:23 2019] __netif_receive_skb+0x18/0x60
[Fri Aug 30 07:15:23 2019] ? __netif_receive_skb+0x18/0x60
[Fri Aug 30 07:15:23 2019] netif_receive_skb_internal+0x45/0xe0
[Fri Aug 30 07:15:23 2019] napi_gro_receive+0xc5/0xf0
[Fri Aug 30 07:15:23 2019] mlx5e_handle_rx_cqe_mpwrq+0x465/0x860 [mlx5_core]
[Fri Aug 30 07:15:23 2019] mlx5e_poll_rx_cq+0xd1/0x8b0 [mlx5_core]
[Fri Aug 30 07:15:23 2019] mlx5e_napi_poll+0x9d/0x290 [mlx5_core]
[Fri Aug 30 07:15:23 2019] net_rx_action+0x140/0x3a0
[Fri Aug 30 07:15:23 2019] __do_softirq+0xe4/0x2d4
[Fri Aug 30 07:15:23 2019] irq_exit+0xc5/0xd0
[Fri Aug 30 07:15:23 2019] do_IRQ+0x8a/0xe0
[Fri Aug 30 07:15:23 2019] common_interrupt+0x8c/0x8c
[Fri Aug 30 07:15:23 2019] </IRQ>
[Fri Aug 30 07:15:23 2019] RIP: 0010:cpu_idle_poll+0x3b/0x14

Additional info:
This issue was fixed upstream by the following list of patches:
 net/mlx5e: Rx, Fix checksum calculation for new hardware --> db849faa9bef993a1379dc510623f750a72fa7ce
 net/mlx5e: Rx, Check ip headers sanity --> 0318a7b7fcad9765931146efa7ca3a034194737c
 net/mlx5e: Rx, Fixup skb checksum for packets with tail padding --> 0aa1d18615c163f92935b806dcaff9157645233a
 net/mlx5e: XDP, Avoid checksum complete when XDP prog is loaded --> 5d0bb3bac4b9f6c22280b04545626fdfd99edc6b
 mlx5: fix get_ip_proto() --> ef6fcd455278c2be3032a346cc66d9dd9866b787
 net/mlx5e: Allow reporting of checksum unnecessary --> b856df28f9230a47669efbdd57896084caadb2b3
 net/mlx5e: don't set CHECKSUM_COMPLETE on SCTP packets --> fe1dc069990c1f290ef6b99adb46332c03258f38
 net/mlx5e: Set ECN for received packets using CQE indication --> f007c13d4ad62f494c83897eda96437005df4a91
 net/mlx5e: Add likely to the common RX checksum flow --> 63a612f984a1fae040ab6f1c6a0f1fdcdf1954b8 --> taking
 net/mlx5e: CHECKSUM_COMPLETE offload for VLAN/QinQ packets --> f938daeee95eb36ef6b431bf054a5cc6cdada112 --> in ubuntu

We are trying to backport this list of patches to the bionic kernel master branch and will add the adjusted patches once the backporting is done.
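
As background on the failure mode named in the title, here is a minimal Python sketch of why tail padding matters for a full-packet hardware checksum. It is not the mlx5 driver or kernel code; the 40-byte packet, the 6 padding bytes, and the helper names `ones_complement_sum` and `csum_sub` are all hypothetical, chosen only for illustration. The idea it shows: a 16-bit ones' complement sum taken over the packet alone differs from one taken over the packet plus non-zero padding, so if the device and the stack disagree about whether the padding is covered, the hardware value will not verify even though the packet itself is valid.

```python
def ones_complement_sum(data: bytes) -> int:
    """16-bit ones' complement sum (RFC 1071 style), without final inversion."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def csum_sub(csum: int, part: int) -> int:
    """Remove the contribution of `part` from `csum` (ones' complement subtract)."""
    total = csum + (~part & 0xFFFF)
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


# Hypothetical 40-byte IP packet, padded to the 46-byte Ethernet minimum payload.
packet = bytes(range(1, 41))
padding = b"\xde\xad\xbe\xef\x00\x01"  # assumed non-zero padding bytes

sum_packet_only = ones_complement_sum(packet)
sum_with_padding = ones_complement_sum(packet + padding)

# The two sums differ, so a checksum that covers the padding cannot be used
# as-is for a packet that has been trimmed to its IP length (and vice versa).
mismatch = sum_with_padding != sum_packet_only

# Subtracting the padding's own sum recovers the packet-only value.
fixed_up = csum_sub(sum_with_padding, ones_complement_sum(padding))
```

In this toy model, `fixed_up` equals `sum_packet_only`, which is the general shape of a checksum fixup for padded frames. Which side includes the padding on real hardware is exactly what the patches listed above address, so treat this only as an illustration of the arithmetic.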

Ubuntu Foundations Team Bug Bot (crichton) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Freenode.

To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/1850135/+editstatus and add the package name in the text box next to the word Package.

[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]

tags: added: bot-comment
affects: ubuntu → linux (Ubuntu)
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1850135

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: bionic
Matthew Ruffell (mruffell) wrote :

Hello,

I believe this bug is a duplicate of Bug 1840854, and the fix should be released in 4.15.0-59. Can you please review Bug 1840854 and try installing a newer kernel, say 4.15.0-69, and let me know if it fixes the problem?

Thanks,
Matthew
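
For anyone checking several machines, a small sketch like the following can compare a kernel release string against the 4.15.0-59 release mentioned above. The helper names `parse_release` and `has_expected_fix` are hypothetical, and the threshold is only the version this comment expects to carry the fix, not a confirmed fact. The comparison is meaningful only within the bionic 4.15.0 series.

```python
import platform
import re


def parse_release(release: str) -> tuple:
    """Parse e.g. '4.15.0-58-generic' into (4, 15, 0, 58)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(group) for group in match.groups())


# Assumption taken from the comment above: the fix is expected in 4.15.0-59.
EXPECTED_FIX = parse_release("4.15.0-59")


def has_expected_fix(release: str) -> bool:
    """True if `release` is at or above the expected fixed release."""
    return parse_release(release) >= EXPECTED_FIX


if __name__ == "__main__":
    running = platform.release()  # same string as `uname -r` on Linux
    try:
        print(running, "->", has_expected_fix(running))
    except ValueError as err:
        print(err)
```

With the versions from this report, the crashing kernel 4.15.0-58-generic is below the threshold, while the suggested 4.15.0-69 is above it.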
