Comment 0 for bug 1854842

Mohammad Heib (mohamadh) wrote :

Hi,
We have the following issue, which affects many of our customers. It is already fixed upstream, and the fixes need to be added to Ubuntu 18.04.

Mlx5 driver: Tail padding HW Checksum crash in Ubuntu 18.04 kernel Ubuntu-4.15.0-72

Crash log:

[ 785.337368] Call Trace:
[ 785.337372] <IRQ>
[ 785.337388] dump_stack+0x63/0x8e
[ 785.337397] netdev_rx_csum_fault+0x38/0x40
[ 785.337403] __skb_checksum_complete+0xbc/0xd0
[ 785.337408] nf_ip_checksum+0xc3/0xf0
[ 785.337417] icmp_error+0x27d/0x310 [nf_conntrack_ipv4]
[ 785.337431] nf_conntrack_in+0x15a/0x510 [nf_conntrack]
[ 785.337437] ? __skb_checksum+0x68/0x330
[ 785.337441] ipv4_conntrack_in+0x1c/0x20 [nf_conntrack_ipv4]
[ 785.337449] nf_hook_slow+0x48/0xc0
[ 785.337452] ? skb_send_sock+0x50/0x50
[ 785.337460] ip_rcv+0x301/0x360
[ 785.337463] ? inet_del_offload+0x40/0x40
[ 785.337468] __netif_receive_skb_core+0x432/0xb80
[ 785.337473] __netif_receive_skb+0x18/0x60
[ 785.337477] ? __netif_receive_skb+0x18/0x60
[ 785.337481] netif_receive_skb_internal+0x45/0xe0
[ 785.337483] napi_gro_receive+0xc5/0xf0
[ 785.337517] mlx5e_handle_rx_cqe+0x48d/0x5e0 [mlx5_core]
[ 785.337524] ? enqueue_task_rt+0x1b4/0x2e0
[ 785.337546] mlx5e_poll_rx_cq+0xd1/0x8c0 [mlx5_core]
[ 785.337566] mlx5e_napi_poll+0x9d/0x290 [mlx5_core]
[ 785.337569] net_rx_action+0x140/0x3a0
[ 785.337574] __do_softirq+0xe4/0x2d4
[ 785.337580] irq_exit+0xc5/0xd0
[ 785.337583] do_IRQ+0x86/0xe0
[ 785.337588] common_interrupt+0x8c/0x8c
[ 785.337590] </IRQ>
[ 785.337598] RIP: 0010:cpuidle_enter_state+0xa4/0x2f0
[ 785.337600] RSP: 0018:ffffad8d8329fe68 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd9
[ 785.337604] RAX: ffff8a6c7f7e1840 RBX: 000000b6d9bf6a06 RCX: 000000000000001f
[ 785.337605] RDX: 000000b6d9bf6a06 RSI: ffd4a4b4c86359ce RDI: 0000000000000000
[ 785.337607] RBP: ffffad8d8329fea8 R08: 0000000000000004 R09: 0000000000021080
[ 785.337609] R10: ffffad8d8329fe38 R11: 0056b80166a42400 R12: ffff8a6c7f7ece18
[ 785.337610] R13: 0000000000000005 R14: ffffffffaff73438 R15: 0000000000000000

[HOW TO REPRODUCE]:
With scapy on the sender side, run the following commands:
1) a=Ether(dst='ff:ff:ff:ff:ff:ff')/IP(dst='127.0.0.1')/ICMP()/Padding(load='\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe')

2) sendp(a, iface='enp6s0f0')

3) Check dmesg on the receiver side.
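
For readers without scapy at hand, the layout of the frame from step 1 can be sketched with the Python standard library alone. This is only an illustration, not the script used for this report: the source MAC, the source IP, the IP ID/TTL values and the default pad length are made-up assumptions, and the helper names are hypothetical. The point it shows is that the 0xfe bytes sit after the end of the IP datagram (the IP total length covers only the IP and ICMP headers), which is what "tail padding" refers to here.

```python
import struct


def inet_checksum(data: bytes) -> int:
    """Ones' complement Internet checksum over 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_padded_icmp_frame(pad_len: int = 64) -> bytes:
    """Build Ethernet/IPv4/ICMP echo request bytes followed by 0xfe padding.

    pad_len is an assumption for illustration; the report uses a long
    run of 0xfe bytes in Padding(load=...).
    """
    dst_mac = b"\xff" * 6                      # broadcast, as in the report
    src_mac = bytes.fromhex("020000000001")    # assumption: made-up local MAC
    eth = dst_mac + src_mac + struct.pack("!H", 0x0800)  # EtherType IPv4

    # ICMP echo request (type 8, code 0), id 0, seq 0, no data.
    icmp = struct.pack("!BBHHH", 8, 0, 0, 0, 0)
    icmp = struct.pack("!BBHHH", 8, 0, inet_checksum(icmp), 0, 0)

    # IPv4 header without options; total length excludes the padding.
    total_len = 20 + len(icmp)
    src_ip = bytes([192, 0, 2, 1])             # assumption: documentation address
    dst_ip = bytes([127, 0, 0, 1])             # destination used in the report
    ip = struct.pack("!BBHHHBBH4s4s",
                     0x45, 0, total_len, 1, 0, 64, 1, 0, src_ip, dst_ip)
    ip = ip[:10] + struct.pack("!H", inet_checksum(ip)) + ip[12:]

    padding = b"\xfe" * pad_len                # bytes beyond the IP total length
    return eth + ip + icmp + padding


if __name__ == "__main__":
    frame = build_padded_icmp_frame()
    ip_total_len = struct.unpack("!H", frame[16:18])[0]
    print("frame bytes:", len(frame))
    print("IP total length:", ip_total_len)
    print("tail padding bytes:", len(frame) - 14 - ip_total_len)
```

The sketch only builds the bytes. Putting them on the wire still needs scapy's sendp as in step 2, or a raw socket with root privileges.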

[ADDITIONAL INFO]:
This issue is fixed upstream by the following set of patches:
net/mlx5e: Rx, Fix checksum calculation for new hardware --> db849faa9bef993a1379dc510623f750a72fa7ce
 net/mlx5e: Rx, Check ip headers sanity --> 0318a7b7fcad9765931146efa7ca3a034194737c
 net/mlx5e: Rx, Fixup skb checksum for packets with tail padding --> 0aa1d18615c163f92935b806dcaff9157645233a
 net/mlx5e: XDP, Avoid checksum complete when XDP prog is loaded --> 5d0bb3bac4b9f6c22280b04545626fdfd99edc6b
 mlx5: fix get_ip_proto() --> ef6fcd455278c2be3032a346cc66d9dd9866b787
 net/mlx5e: Allow reporting of checksum unnecessary --> b856df28f9230a47669efbdd57896084caadb2b3
 net/mlx5e: don't set CHECKSUM_COMPLETE on SCTP packets --> fe1dc069990c1f290ef6b99adb46332c03258f38
 net/mlx5e: Set ECN for received packets using CQE indication --> f007c13d4ad62f494c83897eda96437005df4a91
 net/mlx5e: Add likely to the common RX checksum flow --> 63a612f984a1fae040ab6f1c6a0f1fdcdf1954b8
 net/mlx5e: CHECKSUM_COMPLETE offload for VLAN/QinQ packets --> f938daeee95eb36ef6b431bf054a5cc6cdada112

The /var/log/kern.log file is attached.