hw csum failure when IPv6 interfaces configured in netdev_rx_csum_fault+0x38/0x40

Bug #1614953 reported by Andrew McDermott
This bug affects 3 people
Affects         Status   Importance  Assigned to  Milestone
linux (Ubuntu)  Expired  Medium      Unassigned
Xenial          Expired  Medium      Unassigned
Yakkety         Expired  Medium      Unassigned

Bug Description

Since I started using IPv6 I noticed the following kernel error:

[Thu Aug 18 15:05:01 2016] <unknown>: hw csum failure
[Thu Aug 18 15:05:01 2016] CPU: 5 PID: 11086 Comm: Chrome_IOThread Tainted: P OE 4.4.0-34-generic #53-Ubuntu
[Thu Aug 18 15:05:01 2016] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./Z77-D3H, BIOS F22 11/14/2013
[Thu Aug 18 15:05:01 2016] 0000000000000286 00000000c05989f2 ffff8807573dfcd0 ffffffff813f11b3
[Thu Aug 18 15:05:01 2016] 0000000000000000 0000000000000008 ffff8807573dfce8 ffffffff8171f6e8
[Thu Aug 18 15:05:01 2016] 0000000000000074 ffff8807573dfd30 ffffffff8171639b 00000000fe091360
[Thu Aug 18 15:05:01 2016] Call Trace:
[Thu Aug 18 15:05:01 2016] [<ffffffff813f11b3>] dump_stack+0x63/0x90
[Thu Aug 18 15:05:01 2016] [<ffffffff8171f6e8>] netdev_rx_csum_fault+0x38/0x40
[Thu Aug 18 15:05:01 2016] [<ffffffff8171639b>] skb_copy_and_csum_datagram_msg+0xeb/0x100
[Thu Aug 18 15:05:01 2016] [<ffffffff817ec0b3>] udpv6_recvmsg+0x233/0x670
[Thu Aug 18 15:05:01 2016] [<ffffffff8179dc4e>] inet_recvmsg+0x7e/0xb0
[Thu Aug 18 15:05:01 2016] [<ffffffff817061fb>] sock_recvmsg+0x3b/0x50
[Thu Aug 18 15:05:01 2016] [<ffffffff81706451>] SYSC_recvfrom+0xe1/0x160
[Thu Aug 18 15:05:01 2016] [<ffffffff810ac0b0>] ? wake_up_q+0x70/0x70
[Thu Aug 18 15:05:01 2016] [<ffffffff8170785e>] SyS_recvfrom+0xe/0x10
[Thu Aug 18 15:05:01 2016] [<ffffffff8182def2>] entry_SYSCALL_64_fastpath+0x16/0x71
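
The trace ends in the UDP-over-IPv6 receive path, where the checksum is verified as the datagram is copied to user space. For reference, here is a minimal Python sketch of that checksum as it is defined for UDP over IPv6: the Internet checksum taken over a pseudo-header plus the UDP segment. The helper names, addresses, ports and payload are illustrative assumptions, not taken from the report, and this is not the kernel's implementation.

```python
import ipaddress
import struct

def ones_complement_sum(data):
    """16-bit ones' complement sum (RFC 1071), zero-padded to even length."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total

def pseudo_header(src, dst, length):
    """IPv6 pseudo-header: addresses, upper-layer length, next header 17."""
    return (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        + struct.pack("!I", length)
        + b"\x00\x00\x00\x11"                # 3 zero bytes, then 17 (UDP)
    )

def udp6_checksum(src, dst, segment):
    """Checksum for a UDP segment whose checksum field (bytes 6-7) is zero."""
    total = ones_complement_sum(pseudo_header(src, dst, len(segment)) + segment)
    checksum = ~total & 0xFFFF
    return checksum or 0xFFFF                # a computed zero is sent as 0xFFFF

def udp6_checksum_ok(src, dst, segment):
    """Receiver-side check: the sum over everything must be 0xFFFF."""
    total = ones_complement_sum(pseudo_header(src, dst, len(segment)) + segment)
    return total == 0xFFFF

# Illustrative values only (documentation-prefix addresses, arbitrary payload)
src, dst = "2001:db8::1", "2001:db8::2"
payload = b"hello"
header = struct.pack("!HHHH", 12345, 53, 8 + len(payload), 0)
csum = udp6_checksum(src, dst, header + payload)
packet = header[:6] + struct.pack("!H", csum) + payload
print(udp6_checksum_ok(src, dst, packet))
```

A "hw csum failure" message concerns a disagreement about a value like this one between the receive path and the kernel's own computation. The sketch only shows what is being checked, not why the two disagree here.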

It repeats periodically:

$ dmesg -T | grep 'hw csum failure' | wc -l
201
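
The count shows how often the message fires, but not which processes are on the receiving end. A minimal Python sketch for breaking it down, assuming the `dmesg -T` output has been saved as text and that each `hw csum failure` line is followed by a `CPU: ... Comm: <name>` line as in the trace above. The function name `tally_failures` is hypothetical.

```python
import re
from collections import Counter

FAILURE = "hw csum failure"
COMM = re.compile(r"Comm: (\S+)")

def tally_failures(log_text):
    """Count 'hw csum failure' events per process name (Comm field)."""
    counts = Counter()
    pending = False   # True after a failure line until its Comm line is seen
    for line in log_text.splitlines():
        if FAILURE in line:
            if pending:
                counts["<no Comm line>"] += 1   # previous event had no Comm
            pending = True
            continue
        if pending:
            match = COMM.search(line)
            if match:
                counts[match.group(1)] += 1
                pending = False
    if pending:
        counts["<no Comm line>"] += 1
    return counts

sample = """\
[Thu Aug 18 15:05:01 2016] <unknown>: hw csum failure
[Thu Aug 18 15:05:01 2016] CPU: 5 PID: 11086 Comm: Chrome_IOThread Tainted: P OE 4.4.0-34-generic #53-Ubuntu
[Thu Aug 18 15:05:01 2016] Call Trace:
"""
print(tally_failures(sample))
```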

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: linux-image-4.4.0-34-generic 4.4.0-34.53
ProcVersionSignature: Ubuntu 4.4.0-34.53-generic 4.4.15
Uname: Linux 4.4.0-34-generic x86_64
NonfreeKernelModules: zfs zunicode zcommon znvpair zavl nvidia_uvm nvidia_modeset nvidia
ApportVersion: 2.20.1-0ubuntu2.1
Architecture: amd64
CurrentDesktop: Unity
Date: Fri Aug 19 13:47:54 2016
HibernationDevice: RESUME=UUID=066c8903-7f5e-43d5-b48f-76d33c4558f2
InstallationDate: Installed on 2016-04-24 (116 days ago)
InstallationMedia: Ubuntu 16.04 LTS "Xenial Xerus" - Release amd64 (20160420.1)
MachineType: Gigabyte Technology Co., Ltd. To be filled by O.E.M.
ProcFB: 0 EFI VGA
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.4.0-34-generic.efi.signed root=/dev/mapper/rootpool-rootvol ro intel_iommu=on
RelatedPackageVersions:
 linux-restricted-modules-4.4.0-34-generic N/A
 linux-backports-modules-4.4.0-34-generic N/A
 linux-firmware 1.157.3
RfKill:

SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 11/14/2013
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: F22
dmi.board.asset.tag: To be filled by O.E.M.
dmi.board.name: Z77-D3H
dmi.board.vendor: Gigabyte Technology Co., Ltd.
dmi.board.version: x.x
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 3
dmi.chassis.vendor: Gigabyte Technology Co., Ltd.
dmi.chassis.version: To Be Filled By O.E.M.
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrF22:bd11/14/2013:svnGigabyteTechnologyCo.,Ltd.:pnTobefilledbyO.E.M.:pvrTobefilledbyO.E.M.:rvnGigabyteTechnologyCo.,Ltd.:rnZ77-D3H:rvrx.x:cvnGigabyteTechnologyCo.,Ltd.:ct3:cvrToBeFilledByO.E.M.:
dmi.product.name: To be filled by O.E.M.
dmi.product.version: To be filled by O.E.M.
dmi.sys.vendor: Gigabyte Technology Co., Ltd.

Andrew McDermott (frobware) wrote :

I mentioned that it repeats. Here's the uptime, to give a feeling for the frequency:

$ uptime
 13:51:59 up 1 day, 4:17, 2 users, load average: 0.40, 0.67, 0.71
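
Put together with the earlier count of 201 matches, that uptime gives a rough average rate. A small Python sketch of the arithmetic follows; the function name is hypothetical. Treat the result as an estimate, since the count and the uptime were not sampled at the same moment and older messages may already have dropped out of the kernel ring buffer.

```python
def failures_per_hour(count, days, hours, minutes):
    """Average event rate over an uptime given as days, hours and minutes."""
    uptime_hours = days * 24 + hours + minutes / 60
    return count / uptime_hours

# 201 matches in dmesg, uptime "1 day, 4:17" as reported above
rate = failures_per_hour(201, days=1, hours=4, minutes=17)
print(f"{rate:.1f} failures per hour")   # prints "7.1 failures per hour"
```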

Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Did this issue start happening after an update/upgrade? Was there a prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.8 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.8-rc3

Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: added: kernel-da-key
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Andrew McDermott (frobware) wrote :

I only noticed because I had been experimenting with setting up IPv6 support for Juju. I enabled IPv6 on my router, then made sure that "eth0" got an IP address via ENI (/etc/network/interfaces), and only then did I notice the messages. This is a recent experiment, so I have no evidence of whether the problem started with an upgrade.

I can try the latest upstream kernel but it won't be for a day or two.

Andrew McDermott (frobware) wrote :

Currently running:

$ uname -a
Linux spicy 4.8.0-040800rc3-generic #201608212032 SMP Mon Aug 22 00:34:39 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Will report back tomorrow with a finding either way.

Andrew McDermott (frobware) wrote :

Running:

$ uname -a
Linux spicy 4.8.0-040800rc3-generic #201608212032 SMP Mon Aug 22 00:34:39 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

$ uptime
 08:18:06 up 1 day, 15:40, 2 users, load average: 0.60, 0.80, 2.73

$ dmesg -T | grep 'hw csum failure' | wc -l
0

Marking this as fixed upstream.

tags: added: kernel-fixed-upstream
Joseph Salisbury (jsalisbury) wrote :

That is good news. That means we can perform a "Reverse" kernel bisect to identify the commit that fixes this. Can you test some kernels to perform the bisect? I can build them and post links to them here. It would mean testing about 10-12 kernels.

To start the bisect, we first need to identify the last BAD kernel and the first GOOD kernel. Currently we know the 4.4 based kernel in Xenial is bad. The 4.8-rc3 kernel is good. Can you next test the upstream 4.7 final kernel? It can be downloaded from:

http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.7/

If that one is good, you can try 4.6, then 4.5, etc.

Thanks!
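
The search described above can be sketched as a binary search over an ordered list of candidates. This is a minimal Python sketch, assuming the candidates are in chronological order and that every candidate from the fix onward is good. `first_good` and the `is_good` callback are hypothetical stand-ins for booting a kernel and checking dmesg, and the commit history in the example is made up.

```python
def first_good(candidates, is_good):
    """Return (index of the first good candidate, number of tests run).

    Assumes candidates[0] is known bad, candidates[-1] is known good, and
    that once the fix lands every later candidate is good as well.
    """
    lo, hi = 0, len(candidates) - 1   # lo: known bad, hi: known good
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1
        if is_good(candidates[mid]):
            hi = mid                  # fix is at mid or earlier
        else:
            lo = mid                  # fix is later than mid
    return hi, tests

# Coarse pass over the releases named in this thread (4.4 bad, 4.8-rc3 good);
# the lambda pretends, purely for illustration, that the fix landed in v4.6.
releases = ["v4.4", "v4.5", "v4.6", "v4.7", "v4.8-rc3"]
index, tests = first_good(releases, lambda tag: tag in {"v4.6", "v4.7", "v4.8-rc3"})
print(releases[index], tests)

# Made-up history of 4096 commits with the fix at position 2500
index, tests = first_good(list(range(4096)), lambda c: c >= 2500)
print(index, tests)
```

Each test halves the remaining range, so 10 to 12 tests cover on the order of 1,000 to 4,000 candidate commits.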

Changed in linux (Ubuntu):
status: Incomplete → Triaged
tags: added: performing-bisect
Changed in linux (Ubuntu Xenial):
status: New → Triaged
importance: Undecided → Medium
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Yakkety):
assignee: nobody → Joseph Salisbury (jsalisbury)
status: Triaged → In Progress
Changed in linux (Ubuntu Xenial):
status: Triaged → In Progress
Joseph Salisbury (jsalisbury) wrote :

Does this bug still happen with the latest Xenial updates?

Changed in linux (Ubuntu):
status: In Progress → Incomplete
Changed in linux (Ubuntu Xenial):
status: In Progress → Incomplete
Changed in linux (Ubuntu Yakkety):
status: In Progress → Incomplete
Changed in linux (Ubuntu):
assignee: Joseph Salisbury (jsalisbury) → nobody
Changed in linux (Ubuntu Xenial):
assignee: Joseph Salisbury (jsalisbury) → nobody
Changed in linux (Ubuntu Yakkety):
assignee: Joseph Salisbury (jsalisbury) → nobody
Andrew McDermott (frobware) wrote : Re: [Bug 1614953] Re: hw csum failure when IPv6 interfaces configured in netdev_rx_csum_fault+0x38/0x40

I haven't tried recently as I disabled IPv6 on my machine.

On 26 January 2017 at 16:31, Joseph Salisbury <email address hidden> wrote:

> Does this bug still happen with the latest Xenial updates?

Heikki Hannikainen (hessu) wrote :

I am currently getting these on 4.4.0-66 on 14.04 LTS (trusty), in a Xen VM. Unfortunately I am unable to migrate to 16.04 LTS immediately (largish platform with many custom packages built for the distribution). I don't get these on the 3.13.0 kernels, but I would like to move to 4.4 because of other fixes present there.

Linux hostname 4.4.0-66-generic #87~14.04.1-Ubuntu SMP Fri Mar 3 17:32:36 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

[ 0.171412] xen_netfront: Initialising Xen virtual ethernet driver

[134355.195249] <unknown>: hw csum failure
[134355.195264] CPU: 0 PID: 1226 Comm: named Not tainted 4.4.0-66-generic #87~14.04.1-Ubuntu
[134355.195266] 0000000000000000 ffff88007b47bc20 ffffffff813dc80c 0000000000000000
[134355.195270] ffff880005840b00 ffff88007b47bc38 ffffffff816ff14a 0000000000000008
[134355.195272] ffff88007b47bc78 ffffffff816f5b70 0000000000000004 a9ab56547af50000
[134355.195274] Call Trace:
[134355.195284] [<ffffffff813dc80c>] dump_stack+0x63/0x87
[134355.195290] [<ffffffff816ff14a>] netdev_rx_csum_fault+0x3a/0x40
[134355.195294] [<ffffffff816f5b70>] skb_copy_and_csum_datagram_msg+0xd0/0xe0
[134355.195299] [<ffffffff817c727f>] udpv6_recvmsg+0x22f/0x6f0
[134355.195302] [<ffffffff8177a5af>] inet_recvmsg+0x6f/0x80
[134355.195305] [<ffffffff816e660b>] sock_recvmsg+0x3b/0x50
[134355.195306] [<ffffffff816e752b>] ___sys_recvmsg+0xdb/0x1f0
[134355.195312] [<ffffffff810fa58d>] ? get_futex_key+0x19d/0x280
[134355.195317] [<ffffffff810c6581>] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
[134355.195320] [<ffffffff810fab21>] ? futex_wake+0x81/0x150
[134355.195323] [<ffffffff810fd124>] ? do_futex+0xf4/0x520
[134355.195327] [<ffffffff810a632a>] ? finish_task_switch+0x7a/0x290
[134355.195330] [<ffffffff816e7fc2>] __sys_recvmsg+0x42/0x80
[134355.195332] [<ffffffff816e8012>] SyS_recvmsg+0x12/0x20
[134355.195335] [<ffffffff81807df6>] entry_SYSCALL_64_fastpath+0x16/0x75
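
This trace goes through the same `netdev_rx_csum_fault`, `skb_copy_and_csum_datagram_msg` and `udpv6_recvmsg` frames as the one in the bug description. A minimal Python sketch for pulling the function names out of traces in this `[<address>] symbol+0xoffset/0xsize` format, so that two reports can be compared mechanically. The function name `frames` is hypothetical, and the sample is an excerpt of the lines above.

```python
import re

FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

def frames(trace_text, include_unreliable=False):
    """Return function names from '[<addr>] symbol+0xoff/0xsize' lines.

    Frames marked with '?' are addresses found on the stack that the kernel
    could not confirm as part of the call chain; they are skipped unless
    include_unreliable is True.
    """
    names = []
    for match in FRAME.finditer(trace_text):
        unreliable, symbol = match.group(1), match.group(2)
        if unreliable and not include_unreliable:
            continue
        names.append(symbol)
    return names

sample = """\
[134355.195290] [<ffffffff816ff14a>] netdev_rx_csum_fault+0x3a/0x40
[134355.195299] [<ffffffff817c727f>] udpv6_recvmsg+0x22f/0x6f0
[134355.195312] [<ffffffff810fa58d>] ? get_futex_key+0x19d/0x280
[134355.195330] [<ffffffff816e7fc2>] __sys_recvmsg+0x42/0x80
"""
print(frames(sample))
```

Comparing two reports is then a matter of intersecting the returned lists, for example `set(frames(a)) & set(frames(b))`.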

Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu Yakkety) because there has been no activity for 60 days.]

Changed in linux (Ubuntu Yakkety):
status: Incomplete → Expired
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu Xenial) because there has been no activity for 60 days.]

Changed in linux (Ubuntu Xenial):
status: Incomplete → Expired
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired
Jenna Nelson (jem) wrote :

Yep, this is happening to me:

Jun 03 18:24:19 nara.apposite.com.au kernel: <unknown>: hw csum failure
Jun 03 18:24:19 nara.apposite.com.au kernel: CPU: 1 PID: 10841 Comm: transmission-da Not tainted 4.4.0-116-generic #140-Ubuntu
Jun 03 18:24:19 nara.apposite.com.au kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z77 Professional, BIOS P1.30 07/13/2012
Jun 03 18:24:19 nara.apposite.com.au kernel: 0000000000000286 00018821fdfca761 ffff880115a97cd8 ffffffff813ffc13
Jun 03 18:24:19 nara.apposite.com.au kernel: 0000000000000000 0000000000000008 ffff880115a97cf0 ffffffff8173e268
Jun 03 18:24:19 nara.apposite.com.au kernel: 0000000000000068 ffff880115a97d38 ffffffff81734a1b 000000001e995388
Jun 03 18:24:19 nara.apposite.com.au kernel: Call Trace:
Jun 03 18:24:19 nara.apposite.com.au kernel: [<ffffffff813ffc13>] dump_stack+0x63/0x90

Linux nara.apposite.com.au 4.15.0-22-generic #24-Ubuntu SMP Wed May 16 12:15:17 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Jens Elkner (jelmd) wrote :

Getting it all the time:

[Apr13 04:28] kino6_0: hw csum failure
[ +0.003777] CPU: 18 PID: 0 Comm: swapper/18 Tainted: P OE 4.15.0-58-generic #64-Ubuntu
[ +0.000003] Hardware name: GIGABYTE G291-281-00/MG51-G21-00, BIOS R06 11/19/2019
[ +0.000001] Call Trace:
[ +0.000003] <IRQ>
[ +0.000010] dump_stack+0x63/0x8b
[ +0.000008] netdev_rx_csum_fault+0x38/0x40
[ +0.000003] __skb_checksum_complete+0xbc/0xd0
[ +0.000005] nf_ip_checksum+0xc3/0xf0
[ +0.000017] tcp_error+0x162/0x1c0 [nf_conntrack]
[ +0.000007] ? kfree_skbmem+0x5f/0x70
[ +0.000004] ? consume_skb+0x34/0x90
[ +0.000011] nf_conntrack_in+0x14f/0x500 [nf_conntrack]
[ +0.000009] ? csum_partial_ext+0x9/0x10
[ +0.000004] ? __skb_checksum+0x6b/0x300
[ +0.000006] ipv4_conntrack_in+0x1c/0x20 [nf_conntrack_ipv4]
[ +0.000005] nf_hook_slow+0x48/0xc0
[ +0.000004] ? skb_send_sock+0x50/0x50
[ +0.000005] ip_rcv+0x2fa/0x360
[ +0.000003] ? inet_del_offload+0x40/0x40
[ +0.000004] __netif_receive_skb_core+0x432/0xb40
[ +0.000004] ? tcp4_gro_receive+0x137/0x1a0
[ +0.000003] __netif_receive_skb+0x18/0x60
[ +0.000003] ? __netif_receive_skb+0x18/0x60
[ +0.000004] netif_receive_skb_internal+0x45/0xe0
[ +0.000003] napi_gro_receive+0xc5/0xf0
[ +0.000034] mlx5e_handle_rx_cqe_mpwrq+0x465/0x860 [mlx5_core]
[ +0.000029] mlx5e_poll_rx_cq+0xd1/0x8b0 [mlx5_core]
[ +0.000025] mlx5e_napi_poll+0x9d/0x290 [mlx5_core]
[ +0.000004] net_rx_action+0x140/0x3a0
[ +0.000006] __do_softirq+0xe4/0x2d4
[ +0.000006] irq_exit+0xc5/0xd0
[ +0.000004] do_IRQ+0x8a/0xe0
[ +0.000003] common_interrupt+0x8c/0x8c
[ +0.000002] </IRQ>
[ +0.000005] RIP: 0010:cpuidle_enter_state+0xa7/0x2f0
[ +0.000005] RSP: 0018:ffffa278002b3e68 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdd
[ +0.000004] RAX: ffff970ebeea2840 RBX: 0008672cdbad1064 RCX: 000000000000001f
[ +0.000002] RDX: 0008672cdbad1064 RSI: fffff3a434de6285 RDI: 0000000000000000
[ +0.000002] RBP: ffffa278002b3ea8 R08: 0000000000000004 R09: 0000000000022080
[ +0.000002] R10: ffffa278002b3e38 R11: 00137021632d21b0 R12: ffffc277ff683298
[ +0.000001] R13: 0000000000000003 R14: ffffffffaa172e78 R15: 0000000000000000
[ +0.000005] ? cpuidle_enter_state+0x97/0x2f0
[ +0.000003] cpuidle_enter+0x17/0x20
[ +0.000005] call_cpuidle+0x23/0x40
[ +0.000004] do_idle+0x18c/0x1f0
[ +0.000005] cpu_startup_entry+0x73/0x80
[ +0.000004] start_secondary+0x1ab/0x200
[ +0.000005] secondary_startup_64+0xa5/0xb0

Kai-Heng Feng (kaihengfeng) wrote :

Should be fixed by LP: #1840854.
