Intel Ethernet I218-V [8086:15a1] Subsystem [1043:85c4] detected Hardware Unit Hang

Bug #1766377 reported by Robert Dinse
This bug affects 12 people
Affects          Status     Importance  Assigned to  Milestone
linux (Ubuntu)   Expired    High        Unassigned
Bionic           Confirmed  High        Unassigned

Bug Description

     With Bionic kernel 4.15.0-15 and 4.15.0-17 I am experiencing periodic hanging of the LAN connection. This is happening on an Asus X99-DELUX motherboard, controller specifications:
Intel® I218V, 1 x Gigabit LAN Controller(s)
Intel® I211-AT, 1 x Gigabit LAN
Dual Gigabit LAN controllers- 802.3az Energy Efficient Ethernet (EEE) appliance
Support Teaming Technology
ASUS Turbo LAN Utility
The CPU is an i7-6850 and it is configured with 128GB of DDR4 RAM.
This machine has a number of Qemu/KVM virtual guests and is using a software bridge to share the interface.
This did not happen with 17.10 and 4.13.0 kernel. It is happening on multiple machines here.
Here are the messages from dmesg:
[1016198.957850] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
                   TDH <ea>
                   TDT <2d>
                   next_to_use <2d>
                   next_to_clean <e9>
                 buffer_info[next_to_clean]:
                   time_stamp <13c8d0008>
                   next_to_watch <ea>
                   jiffies <13c8d0880>
                   next_to_watch.status <0>
                 MAC Status <80083>
                 PHY Status <796d>
                 PHY 1000BASE-T Status <3c00>
                 PHY Extended Status <3000>
                 PCI Status <10>
[1016200.942072] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
                   TDH <ea>
                   TDT <2d>
                   next_to_use <2d>
                   next_to_clean <e9>
                 buffer_info[next_to_clean]:
                   time_stamp <13c8d0008>
                   next_to_watch <ea>
                   jiffies <13c8d1040>
                   next_to_watch.status <0>
                 MAC Status <80083>
                 PHY Status <796d>
                 PHY 1000BASE-T Status <3c00>
                 PHY Extended Status <3000>
                 PCI Status <10>
[1016202.413607] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
[1016202.413701] bridge0: port 1(eno1) entered disabled state
[1016202.413732] bridge0: topology change detected, propagating
[1016206.666676] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[1016206.666708] bridge0: port 1(eno1) entered blocking state
[1016206.666712] bridge0: port 1(eno1) entered listening state
[1016216.750911] bridge0: port 1(eno1) entered learning state
[1016232.110291] bridge0: port 1(eno1) entered forwarding state
[1016232.110294] bridge0: topology change detected, sending tcn bpdu
[1017834.390579] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[1017834.390770] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[1017834.414792] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
[1017834.414794] cfg80211: failed to load regulatory.db
If there is any other information I can provide to aid in resolution, please contact me, <email address hidden>. Thank you!
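The hex fields in the hang reports above can be read with a little shell arithmetic. The following is only a sketch for interpreting the log, not part of the driver: `jiffies_delta` and `pending_descriptors` are hypothetical helper names, and the ring size of 256 is an assumption, since the log does not state it. The two reports are about 1.984 s apart by timestamp and 1984 jiffies apart, which would be consistent with a 1000 Hz tick on this kernel.

```shell
# Elapsed jiffies between when the stuck descriptor was queued (time_stamp)
# and when the watchdog printed the report (jiffies). Both are hex in dmesg.
jiffies_delta() {
    echo $(( 0x$2 - 0x$1 ))
}

# Descriptors between head (TDH) and tail (TDT), modulo the ring size.
# The ring size is NOT in the log; it is passed as the third argument.
pending_descriptors() {
    echo $(( (0x$2 - 0x$1 + $3) % $3 ))
}

# Values copied from the two reports in the description.
jiffies_delta 13c8d0008 13c8d0880      # prints 2168
jiffies_delta 13c8d0008 13c8d1040      # prints 4152
pending_descriptors ea 2d 256          # prints 67 (256 is an assumed ring size)
```

Identical TDH, TDT and time_stamp values in consecutive reports suggest the transmit ring made no progress between them.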

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-4.15.0-15-lowlatency 4.15.0-15.16
ProcVersionSignature: Ubuntu 4.15.0-15.16-lowlatency 4.15.15
Uname: Linux 4.15.0-15-lowlatency x86_64
ApportVersion: 2.20.9-0ubuntu6
Architecture: amd64
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/hwC1D3', '/dev/snd/hwC1D2', '/dev/snd/hwC1D1', '/dev/snd/hwC1D0', '/dev/snd/pcmC1D9p', '/dev/snd/pcmC1D8p', '/dev/snd/pcmC1D7p', '/dev/snd/pcmC1D3p', '/dev/snd/controlC1', '/dev/snd/by-path', '/dev/snd/hwC0D0', '/dev/snd/pcmC0D2c', '/dev/snd/pcmC0D1p', '/dev/snd/pcmC0D0c', '/dev/snd/pcmC0D0p', '/dev/snd/controlC0', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CurrentDesktop: MATE
Date: Mon Apr 23 16:45:30 2018
HibernationDevice: RESUME=UUID=963cb206-8962-4fc0-82a1-fc4f02a9b5c5
InstallationDate: Installed on 2017-05-05 (353 days ago)
InstallationMedia: Ubuntu-MATE 17.04 "Zesty Zapus" - Release amd64 (20170412)
MachineType: ASUS All Series
ProcFB: 0 nouveaufb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.15.0-15-lowlatency root=UUID=28825f5b-a6fd-4e09-982c-0513ae4d2842 ro quiet splash vt.handoff=1
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
 linux-restricted-modules-4.15.0-15-lowlatency N/A
 linux-backports-modules-4.15.0-15-lowlatency N/A
 linux-firmware 1.173
RfKill:

SourcePackage: linux
UpgradeStatus: Upgraded to bionic on 2018-04-12 (11 days ago)
dmi.bios.date: 08/11/2017
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 1801
dmi.board.asset.tag: Default string
dmi.board.name: X99-E
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: Rev 1.xx
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr1801:bd08/11/2017:svnASUS:pnAllSeries:pvrSystemVersion:rvnASUSTeKCOMPUTERINC.:rnX99-E:rvrRev1.xx:cvnDefaultstring:ct3:cvrDefaultstring:
dmi.product.family: ASUS MB
dmi.product.name: All Series
dmi.product.version: System Version
dmi.sys.vendor: ASUS

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Joseph Salisbury (jsalisbury) wrote : Re: Ethernet E1000 Controller Hangs

Would it be possible for you to test the proposed kernel and post back if it resolves this bug?
See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed.

Thank you in advance!

Changed in linux (Ubuntu):
importance: Undecided → High
tags: added: kernel-key
Changed in linux (Ubuntu Bionic):
status: Confirmed → Incomplete
Robert Dinse (nanook) wrote : Re: [Bug 1766377] Re: Ethernet E1000 Controller Hangs

      Yes, though I will need to reboot at night when usage is low. Can you tell
me what the kernel version is so I can be sure to get the correct kernel?

-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-
  Eskimo North Linux Friendly Internet Access, Shell Accounts, and Hosting.
    Knowledgeable human assistance, not telephone trees or script readers.
  See our web site: http://www.eskimo.com/ (206) 812-0051 or (800) 246-6874.


Robert Dinse (nanook) wrote :

     There is a 4.15.0-20.21 in Proposed; is this the correct kernel?


Joseph Salisbury (jsalisbury) wrote : Re: Ethernet E1000 Controller Hangs

Yes, that is the correct kernel version.

Robert Dinse (nanook) wrote : Re: [Bug 1766377] Re: Ethernet E1000 Controller Hangs

      Ok, will reboot tonight when traffic is low and let you know how it goes.


Robert Dinse (nanook) wrote :

      Looks like you nailed it. Three machines running it haven't barfed in
over 11 hours; it used to happen several times an hour.


Robert Dinse (nanook) wrote : Re: Ethernet E1000 Controller Hangs

Hate to say it, but it happened again. It was only once, which is a lot better in terms of frequency, but it is still happening. Here are the details:

[23144.764734] hrtimer: interrupt took 46767 ns
[41628.563552] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
                 TDH <db>
                 TDT <65>
                 next_to_use <65>
                 next_to_clean <da>
               buffer_info[next_to_clean]:
                 time_stamp <1027691ea>
                 next_to_watch <db>
                 jiffies <102769c40>
                 next_to_watch.status <0>
               MAC Status <80083>
               PHY Status <796d>
               PHY 1000BASE-T Status <7c00>
               PHY Extended Status <3000>
               PCI Status <10>
[41630.611608] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
                 TDH <db>
                 TDT <65>
                 next_to_use <65>
                 next_to_clean <da>
               buffer_info[next_to_clean]:
                 time_stamp <1027691ea>
                 next_to_watch <db>
                 jiffies <10276a440>
                 next_to_watch.status <0>
               MAC Status <80083>
               PHY Status <796d>
               PHY 1000BASE-T Status <7c00>
               PHY Extended Status <3000>
               PCI Status <10>
[41632.595800] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
                 TDH <db>
                 TDT <65>
                 next_to_use <65>
                 next_to_clean <da>
               buffer_info[next_to_clean]:
                 time_stamp <1027691ea>
                 next_to_watch <db>
                 jiffies <10276ac00>
                 next_to_watch.status <0>
               MAC Status <80083>
               PHY Status <796d>
               PHY 1000BASE-T Status <7c00>
               PHY Extended Status <3000>
               PCI Status <10>
[41634.579772] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
                 TDH <db>
                 TDT <65>
                 next_to_use <65>
                 next_to_clean <da>
               buffer_info[next_to_clean]:
                 time_stamp <1027691ea>
                 next_to_watch <db>
                 jiffies <10276b3c0>
                 next_to_watch.status <0>
               MAC Status <80083>
               PHY Status <796d>
               PHY 1000BASE-T Status <7c00>
               PHY Extended Status <3000>
               PCI Status <10>
[41635.667409] ------------[ cut here ]------------
[41635.667411] NETDEV WATCHDOG: eno1 (e1000e): transmit queue 0 timed out
[41635.667424] WARNING: CPU: 9 PID: 65 at /build/linux-5s7Xkn/linux-4.15.0/net/sched/sch_generic.c:323 dev_watchdog+0...


Robert Dinse (nanook) wrote :

Still happening; not sure why, but far more frequently on the i7-6850k platform than on the i7-6700k.

Robert Dinse (nanook) wrote :

I discovered a way to cause this instantly. I attempted to change the size of the ring buffers from 512 entries to the hardware maximum of 4096 using: ethtool -G eno1 rx 4096 tx 4096. It instantly hung the interface with the following in dmesg:

[458611.154752] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
                  TDH <48>
                  TDT <73>
                  next_to_use <73>
                  next_to_clean <47>
                buffer_info[next_to_clean]:
                  time_stamp <11b5117a3>
                  next_to_watch <48>
                  jiffies <11b511d40>
                  next_to_watch.status <0>
                MAC Status <80083>
                PHY Status <796d>
                PHY 1000BASE-T Status <7c00>
                PHY Extended Status <3000>
                PCI Status <10>
[458613.138731] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
                  TDH <48>
                  TDT <73>
                  next_to_use <73>
                  next_to_clean <47>
                buffer_info[next_to_clean]:
                  time_stamp <11b5117a3>
                  next_to_watch <48>
                  jiffies <11b512500>
                  next_to_watch.status <0>
                MAC Status <80083>
                PHY Status <796d>
                PHY 1000BASE-T Status <7c00>
                PHY Extended Status <3000>
                PCI Status <10>
[458615.122888] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
                  TDH <48>
                  TDT <73>
                  next_to_use <73>
                  next_to_clean <47>
                buffer_info[next_to_clean]:
                  time_stamp <11b5117a3>
                  next_to_watch <48>
                  jiffies <11b512cc0>
                  next_to_watch.status <0>
                MAC Status <80083>
                PHY Status <796d>
                PHY 1000BASE-T Status <7c00>
                PHY Extended Status <3000>
                PCI Status <10>
[458617.106832] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
                  TDH <48>
                  TDT <73>
                  next_to_use <73>
                  next_to_clean <47>
                buffer_info[next_to_clean]:
                  time_stamp <11b5117a3>
                  next_to_watch <48>
                  jiffies <11b513480>
                  next_to_watch.status <0>
                MAC Status <80083>
                PHY Status <796d>
                PHY 1000BASE-T Status <7c00>
                PHY Extended Status <3000>
                PCI Status <10>
[458619.154912] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
                  TDH <48>
      ...
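Before trying the `ethtool -G` reproduction, it may help to record the ring sizes the driver currently reports, which `ethtool -g` (lowercase) prints. The following is a sketch only: `tx_ring_sizes` is a hypothetical helper, and it assumes the usual `ethtool -g` layout with a "Pre-set maximums:" section followed by a "Current hardware settings:" section.

```shell
# Read `ethtool -g <iface>` output on stdin and print "<current TX> <max TX>".
# Assumes each section contains a line whose first field is exactly "TX:".
tx_ring_sizes() {
    awk '
        /^Pre-set maximums:/          { section = "max" }
        /^Current hardware settings:/ { section = "cur" }
        $1 == "TX:"                   { size[section] = $2 }
        END                           { print size["cur"], size["max"] }
    '
}

# Usage on the affected host (interface name taken from the report):
#   ethtool -g eno1 | tx_ring_sizes
```

Running it before and after a resize attempt shows whether the new size was actually applied.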


Joseph Salisbury (jsalisbury) wrote :

Can you see if this bug also happens with the latest mainline kernel, or if it was already fixed upstream? It can be downloaded from:

http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.17-rc3

Robert Dinse (nanook) wrote : Re: [Bug 1766377] Re: Ethernet E1000 Controller Hangs

      This kernel did make it so I could not reproduce the hang on demand using
ethtool -G; however, it broke so many other things that I could not leave it
running to see whether it fixed the spontaneous hangs.

      Strangely, it broke nfs-kernel-server on the i7-6850k machine but not on
the i7-6700k machine. I did much stare and compare to make sure they were
configured the same. This forced me to back out this kernel.

      In addition to NFS, the nouveau driver needed on the i7-6850k machine
had a bug that would pixelize much of the screen in a semi-random fashion.
Also, for whatever reason, x2goserver would not work properly with that kernel.

      On one of the i7-6700k machines I had to restart lightdm several times
to get it to actually start; it did not start on boot. On another I was unable
to get lightdm to start at all and only console graphics worked, and for some
reason they were in yellow instead of white. The i7-6700k machines use the
internal graphics of the i7-6700k processor, clocked very slowly to minimize
the impact on the heat budget.

      On the kernel developers' PPA I saw another test kernel, 4.15.0-21, and
installed it. It also stopped ethtool -G from inducing the Ethernet hang, but
on the i7-6850 it has already hung once spontaneously:

[ 4112.809034] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
                  TDH <4f>
                  TDT <6f>
                  next_to_use <6f>
                  next_to_clean <4e>
                buffer_info[next_to_clean]:
                  time_stamp <1003a2221>
                  next_to_watch <4f>
                  jiffies <1003a2d80>
                  next_to_watch.status <0>
                MAC Status <80083>
                PHY Status <796d>
                PHY 1000BASE-T Status <7c00>
                PHY Extended Status <3000>
                PCI Status <10>
[ 4114.793198] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
                  TDH <4f>
                  TDT <6f>
                  next_to_use <6f>
                  next_to_clean <4e>
                buffer_info[next_to_clean]:
                  time_stamp <1003a2221>
                  next_to_watch <4f>
                  jiffies <1003a3540>
                  next_to_watch.status <0>
                MAC Status <80083>
                PHY Status <796d>
                PHY 1000BASE-T Status <7c00>
                PHY Extended Status <3000>
                PCI Status <10>
[ 4116.008748] ------------[ cut here ]------------
[ 4116.008750] NETDEV WATCHDOG: eno1 (e1000e): transmit queue 0 timed out
[ 4116.008765] WARNING: CPU: 8 PID: 59 at /build/linux-QLn4bB/linux-4.15.0/net/sched/sch_generic.c:323 dev_watchdog+0x21d/0x230
[ 4116.008765] Modules linked in: tcp_diag inet_diag vhost_net vhost tap xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt...

Robert Dinse (nanook) wrote : Re: Ethernet E1000 Controller Hangs

I read an article here https://serverfault.com/questions/616485/e1000e-reset-adapter-unexpectedly-detected-hardware-unit-hang?utm_medium=organic&utm_source=google_rich_qa&utm_campaign=google_rich_qa which stated that some people had success preventing this by disabling hardware offloading with ethtool -K eth0 gso off gro off tso off. I did this on the i7-6850 machine, which has been the most problematic, running the 4.15.0-21-lowlatency kernel (#22-Ubuntu SMP PREEMPT Tue May 1 15:47:42 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux), and have not had a hang since. Obviously this comes with a performance penalty, so it is undesirable as a permanent fix, but I am hoping it might help narrow down the cause.
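To confirm that the `ethtool -K` change took effect, the feature list from `ethtool -k` (lowercase) can be checked. This is a minimal sketch, assuming the top-level feature names `tcp-segmentation-offload`, `generic-segmentation-offload` and `generic-receive-offload` that `ethtool -k` normally prints for tso, gso and gro; `offloads_still_on` is a hypothetical helper name.

```shell
# Read `ethtool -k <iface>` output on stdin and list which of the three
# offloads (tso, gso, gro) are still reported as enabled.
# Indented sub-feature lines start with a tab and so never match.
offloads_still_on() {
    awk -F': ' '
        $1 == "tcp-segmentation-offload" ||
        $1 == "generic-segmentation-offload" ||
        $1 == "generic-receive-offload" {
            if ($2 ~ /^on/) print $1
        }
    '
}

# Usage on the affected host (interface name is an assumption):
#   ethtool -k eno1 | offloads_still_on
```

No output would mean all three offloads are reported as off.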

tags: added: kernel-da-key
removed: kernel-key
Robert Dinse (nanook) wrote : Re: [Bug 1766377] Re: Ethernet E1000 Controller Hangs

      Just to make sure you got the latest: the 4.17.x kernel did not work well
enough to leave it running; it broke nfs-kernel-server, among other things.

      I am presently running 4.15.0-21, and with this kernel I would still
get these hangs, except that I discovered that disabling certain hardware
offload functions stops them. So presently, in my /etc/rc.local file on the
affected servers, I have: /sbin/ethtool -K eno1 gso off gro off tso off

      With this in place there are no hangs; there is a slight performance
penalty, but no hangs.
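For anyone applying the same workaround at boot, a slightly more defensive form of the rc.local line might look like the sketch below. It is an illustration only: `apply_offload_workaround` and the `DRY_RUN` variable are hypothetical names, and checking `/sys/class/net/<iface>` is just one way to confirm that the interface exists.

```shell
# Hypothetical boot-time helper: apply the offload workaround only if the
# interface exists. Set DRY_RUN=1 to print the command instead of running it.
apply_offload_workaround() {
    iface=$1
    cmd="/sbin/ethtool -K $iface gso off gro off tso off"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$cmd"
    elif [ -d "/sys/class/net/$iface" ]; then
        $cmd
    else
        echo "interface $iface not found, skipping" >&2
        return 1
    fi
}

# In /etc/rc.local on an affected server (interface name from the report):
#   apply_offload_workaround eno1
```

The dry-run path makes it possible to check the exact command on a machine that does not have the NIC.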


Joseph Salisbury (jsalisbury) wrote : Re: Ethernet E1000 Controller Hangs

We could perform a kernel bisect to identify the commit that introduced this regression. To perform a bisect, we need to identify the last kernel that did not have the bug and the first kernel version that did.

Do you recall the last kernel that didn't exhibit the bug? If not, would you be able to test some kernels to narrow it down? I could post a link to the kernels to test.
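For readers unfamiliar with bisection, the workflow could be sketched roughly as follows. This is not the Ubuntu kernel team's actual procedure: `bisect_plan` and `bisect_rounds` are hypothetical helpers, the plan is printed rather than executed, and the tag names in the usage comment are assumptions about a mainline git checkout.

```shell
# Print (do not run) the git bisect commands for a last-good / first-bad pair.
bisect_plan() {
    good=$1
    bad=$2
    echo "git bisect start $bad $good"
    echo "# build, boot and exercise the NIC, then mark the result:"
    echo "git bisect good   # no hang observed"
    echo "git bisect bad    # hang reproduced"
    echo "git bisect reset  # when finished"
}

# Rough upper bound on build-and-test rounds for n candidate commits:
# ceil(log2(n)), computed by repeated halving.
bisect_rounds() {
    n=$1
    rounds=0
    while [ "$n" -gt 1 ]; do
        n=$(( (n + 1) / 2 ))
        rounds=$(( rounds + 1 ))
    done
    echo "$rounds"
}

# Example (tags assumed to exist in the tree being bisected):
#   bisect_plan v4.13 v4.15
#   bisect_rounds 1000      # prints 10
```

Each round needs a kernel build, a reboot and enough runtime to see whether a hang occurs, which is why narrowing the good/bad range first is worthwhile.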

Robert Dinse (nanook) wrote : Re: [Bug 1766377] Re: Ethernet E1000 Controller Hangs

      I did not see this behavior with 4.13.0 and did with 4.15.0.


Revision history for this message
Joseph Salisbury (jsalisbury) wrote : Re: Ethernet E1000 Controller Hangs

Could you test the following two upstream kernels, so we can narrow down the last good and first bad further:

v4.14 Final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.14/
v4.15-rc1: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.15-rc1/

Revision history for this message
Robert Dinse (nanook) wrote : Re: [Bug 1766377] Re: Ethernet E1000 Controller Hangs

      I booted the most problematic machine (the i7-6850K, probably the worst
affected because it carries the most traffic) on the 4.14.0 kernel. So far so
good.

On Thu, 10 May 2018, Joseph Salisbury wrote:

> Date: Thu, 10 May 2018 19:26:46 -0000
> From: Joseph Salisbury <email address hidden>
> Reply-To: Bug 1766377 <email address hidden>
> To: <email address hidden>
> Subject: [Bug 1766377] Re: Ethernet E1000 Controller Hangs
>
> Could you test the following two upstream kernels, so we can narrow down
> the last good and first bad further:
>
> v4.14 Final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.14/
> v4.15-rc1: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.15-rc1/

Revision history for this message
Robert Dinse (nanook) wrote :

      4.14 has the problem. Do I still need to try the 4.15-rc1 kernel as
well, given that 4.14 is already affected?

Revision history for this message
Joseph Salisbury (jsalisbury) wrote : Re: Ethernet E1000 Controller Hangs

Thanks for testing. We should work backwards towards 4.13 now. Can you test the following:

4.14-rc1: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.14-rc1/
4.14-rc4: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.14-rc4/
4.14-rc7: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.14-rc7/

You don't have to test every kernel, just up until the kernel that first has this bug.

Thanks in advance!

Revision history for this message
Robert Dinse (nanook) wrote : Re: [Bug 1766377] Re: Ethernet E1000 Controller Hangs

      4.14.0 final crashed hard after running for 17 hours. Not only was the
Ethernet unresponsive, so was the console; even the magic SysRq key did
nothing. I had to power-cycle the machine to recover it. I then booted
4.14.0-rc1, which crashed immediately. I had set a 20-second timeout, though,
so the machine rebooted itself into 4.15.0-21, and I turned hardware
offloading back off.

      I cannot continue testing this on production machines, and the one Intel
machine I have with that interface chip is currently broken. I'll work on
getting it running, set up a web server on it, and use some test software
to put a load on it.

      One thing I found while digging through GitHub is that there has been
only one commit against the E1000 driver in the last three years, on
February 13th, so it might be worth looking at.

On Fri, 11 May 2018, Joseph Salisbury wrote:

> Date: Fri, 11 May 2018 12:31:14 -0000
> From: Joseph Salisbury <email address hidden>
> Reply-To: Bug 1766377 <email address hidden>
> To: <email address hidden>
> Subject: [Bug 1766377] Re: Ethernet E1000 Controller Hangs
>
> Thanks for testing. We should work backwards towards 4.13 now. Can you
> test the following:
>
> 4.14-rc1: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.14-rc1/
> 4.14-rc4: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.14-rc4/
> 4.14-rc7: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.14-rc7/
>
>
> You don't have to test every kernel, just up until the kernel that first has this bug.
>
> Thanks in advance!

Revision history for this message
Launchpad Janitor (janitor) wrote : Re: Ethernet E1000 Controller Hangs

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu Bionic) because there has been no activity for 60 days.]

Changed in linux (Ubuntu Bionic):
status: Incomplete → Expired
Revision history for this message
Emlyn Bolton (emlynbtech) wrote :

I'm still seeing this issue on the 4.15.0-39-generic kernel:

[467738.375137] e1000 0000:05:01.0 enp5s1f0: Detected Tx Unit Hang
                  Tx Queue <0>
                  TDH <44>
                  TDT <44>
                  next_to_use <44>
                  next_to_clean <25>
                buffer_info[next_to_clean]
                  time_stamp <106f70e55>
                  next_to_watch <26>
                  jiffies <106f70ff8>
                  next_to_watch.status <0>
[467740.391180] e1000 0000:05:01.0 enp5s1f0: Detected Tx Unit Hang
                  Tx Queue <0>
                  TDH <44>
                  TDT <44>
                  next_to_use <44>
                  next_to_clean <25>
                buffer_info[next_to_clean]
                  time_stamp <106f70e55>
                  next_to_watch <26>
                  jiffies <106f711f0>
                  next_to_watch.status <0>
[467742.407313] e1000 0000:05:01.0 enp5s1f0: Detected Tx Unit Hang
                  Tx Queue <0>
                  TDH <44>
                  TDT <44>
                  next_to_use <44>
                  next_to_clean <25>
                buffer_info[next_to_clean]
                  time_stamp <106f70e55>
                  next_to_watch <26>
                  jiffies <106f713e8>
                  next_to_watch.status <0>
[467744.423373] e1000 0000:05:01.0 enp5s1f0: Detected Tx Unit Hang
                  Tx Queue <0>
                  TDH <44>
                  TDT <44>
                  next_to_use <44>
                  next_to_clean <25>
                buffer_info[next_to_clean]
                  time_stamp <106f70e55>
                  next_to_watch <26>
                  jiffies <106f715e0>
                  next_to_watch.status <0>
[467745.319133] e1000 0000:05:01.0 enp5s1f0: Reset adapter
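The hex fields in these reports can be read mechanically. The sketch below is illustrative and not part of the driver: it parses two consecutive reports from the log above, estimates the kernel tick rate from how far `jiffies` advanced between their dmesg timestamps, and uses that to express the gap between `time_stamp` and `jiffies` in seconds. The function names are made up for this example, and the estimate assumes the dmesg timestamps and `jiffies` advance at the same rate.

```python
import re

# Two consecutive reports copied from the log above, trimmed to the fields used here.
SAMPLE = """\
[467738.375137] e1000 0000:05:01.0 enp5s1f0: Detected Tx Unit Hang
  TDH <44>
  TDT <44>
  next_to_clean <25>
  time_stamp <106f70e55>
  jiffies <106f70ff8>
[467740.391180] e1000 0000:05:01.0 enp5s1f0: Detected Tx Unit Hang
  TDH <44>
  TDT <44>
  next_to_clean <25>
  time_stamp <106f70e55>
  jiffies <106f711f0>
"""

HEADER = re.compile(r"\[\s*(\d+\.\d+)\].*Detected (?:Tx|Hardware) Unit Hang")
FIELD = re.compile(r"^\s*([A-Za-z_][\w .-]*?)\s*<([0-9a-f]+)>\s*$")


def parse_hang_reports(log):
    """Return one dict per report: the dmesg timestamp plus each hex field as an int."""
    reports = []
    for line in log.splitlines():
        header = HEADER.search(line)
        if header:
            reports.append({"dmesg_time": float(header.group(1))})
            continue
        field = FIELD.match(line)
        if field and reports:
            reports[-1][field.group(1)] = int(field.group(2), 16)
    return reports


def estimate_hz(earlier, later):
    """Estimate ticks per second from two reports (an inference, not a kernel API)."""
    ticks = later["jiffies"] - earlier["jiffies"]
    seconds = later["dmesg_time"] - earlier["dmesg_time"]
    return ticks / seconds


reports = parse_hang_reports(SAMPLE)
hz = estimate_hz(reports[0], reports[1])
stuck_ticks = reports[0]["jiffies"] - reports[0]["time_stamp"]
print(f"estimated HZ: {hz:.0f}")
print(f"descriptor age at first report: {stuck_ticks} ticks, about {stuck_ticks / hz:.2f} s")
```

On this log the estimate comes out at roughly 250 ticks per second, which would make the stuck buffer a little under 1.7 seconds old when the first report was printed.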

05:01.0 Ethernet controller: Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) (rev 03)
 Subsystem: Intel Corporation PRO/1000 MT Dual Port Server Adapter
 Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 16
 Memory at f7220000 (64-bit, non-prefetchable) [size=128K]
 Memory at f71c0000 (64-bit, non-prefetchable) [size=256K]
 I/O ports at d040 [size=64]
 Expansion ROM at f7180000 [disabled] [size=256K]
 Capabilities: [dc] Power Management version 2
 Capabilities: [e4] PCI-X non-bridge device
 Capabilities: [f0] MSI: Enable- Count=1/1 Maskable- 64bit+
 Kernel driver in use: e1000
 Kernel modules: e1000

05:01.1 Ethernet controller: Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) (rev 03)
 Subsystem: Intel Corporation PRO/1000 MT Dual Port Server Adapter
 Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 17
 Memory at f7200000 (64-bit, non-prefetchable) [size=128K]
 Memory at f7140000 (64-bit, non-prefetchable) [size=256K]
 I/O ports at d000 [size=64]
...

Revision history for this message
Robert Dinse (nanook) wrote : Re: [Bug 1766377] Re: Ethernet E1000 Controller Hangs

      I gave up on it ever being fixed and bought some Broadcom interfaces
for my machines.

On Fri, 23 Nov 2018, Emlyn Bolton wrote:

> Date: Fri, 23 Nov 2018 17:37:46 -0000
> From: Emlyn Bolton <email address hidden>
> To: <email address hidden>
> Subject: [Bug 1766377] Re: Ethernet E1000 Controller Hangs
>
> I'm still seeing this issue on the 4.15.0-39-generic kernel:
> ...

Revision history for this message
Emlyn Bolton (emlynbtech) wrote : Re: Ethernet E1000 Controller Hangs

Thanks for letting me know that! I might switch OS...

Revision history for this message
Halvor Lyche Strandvoll (halvors) wrote :

I made a workaround for this: it basically just disables TSO (TCP segmentation offload).
This fix persists across boots.

Copy the following unit file and save it as /etc/systemd/system/disable-<email address hidden>

##################################################

[Unit]
Description=Disable TSO for %i
BindsTo=sys-subsystem-net-devices-%i.device
After=sys-subsystem-net-devices-%i.device

[Service]
Type=oneshot
ExecStart=/sbin/ethtool -K %i tso off

[Install]
WantedBy=sys-subsystem-net-devices-%i.device

##################################################

Run "systemctl enable disable-tso@enp0s25", replacing "enp0s25" (the name of my interface) with your own interface name.

If you have more problematic interfaces, just enable the service for those as well.
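To confirm that the unit took effect, the offload state can be read back with `ethtool -k` (lower-case `-k` shows features, upper-case `-K` sets them). A minimal sketch, assuming the usual `name: on|off` output format; the sample text is illustrative and was not captured from an affected machine, and `offload_states` is a name made up for this example:

```python
# Illustrative `ethtool -k enp0s25` output, shortened; real output lists many more features.
SAMPLE = """\
Features for enp0s25:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: off
    tx-tcp-segmentation: off
    tx-tcp6-segmentation: off [fixed]
generic-segmentation-offload: on
"""


def offload_states(ethtool_output):
    """Map each feature name to True (on) or False (off)."""
    states = {}
    for line in ethtool_output.splitlines():
        name, sep, value = line.strip().partition(":")
        # The "Features for <iface>:" header has nothing after the colon; skip it.
        if sep and value.strip():
            states[name.strip()] = value.split()[0] == "on"
    return states


states = offload_states(SAMPLE)
print("TSO enabled:", states["tcp-segmentation-offload"])
```

On a live system the text would come from running `ethtool -k <interface>`, for example via `subprocess.run(["ethtool", "-k", "enp0s25"], capture_output=True, text=True).stdout`.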

Revision history for this message
Halvor Lyche Strandvoll (halvors) wrote :

I have reported this in multiple places upstream and am still waiting for a fix.
I gave up on Canonical fixing this many years ago.

Revision history for this message
Robert Dinse (nanook) wrote : Re: [Bug 1766377] Re: Ethernet E1000 Controller Hangs

      I had a better fix: after becoming convinced Canonical was not going to
fix the drivers in my lifetime, I went and bought some non-Intel NICs.

On Wed, 24 Apr 2019, Halvor Lyche Strandvoll wrote:

> Date: Wed, 24 Apr 2019 07:08:36 -0000
> From: Halvor Lyche Strandvoll <email address hidden>
> Reply-To: Bug 1766377 <email address hidden>
> To: <email address hidden>
> Subject: [Bug 1766377] Re: Ethernet E1000 Controller Hangs
>
> I made a workaround for this. That is basically just disable tso.
> This fix persists across boots.
>
> Copy the following script and save it as /etc/systemd/system/disable-
> <email address hidden>
>
> ##################################################
>
> [Unit]
> Description=Disable TSO for %i
> BindsTo=sys-subsystem-net-devices-%i.device
> After=sys-subsystem-net-devices-%i.device
>
> [Service]
> Type=oneshot
> ExecStart=/sbin/ethtool -K %i tso off
>
> [Install]
> WantedBy=sys-subsystem-net-devices-%i.device
>
> ##################################################
>
> Run "systemctl enable disable-tso@enp0s25" with your interface name
> instead of "enp0s25" that is what mine is named.
>
> If you have more problematic interfaces just enable service for those as
> well.

Revision history for this message
Kai-Heng Feng (kaihengfeng) wrote : Re: Ethernet E1000 Controller Hangs
Revision history for this message
Chris Puttick (cputtick) wrote :

We're seeing this bug consistently, although not on a consistent schedule: the error does not occur immediately after boot and doesn't immediately disconnect the server from the network. It takes anything from 3 hours to 3 weeks before the errors start, and a day or so after they start before the server gets permanently disconnected.

# uname -a
Linux <hostname> 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz

Ubuntu 18.04.3 LTS

Issue occurred on earlier 18.04 and kernel versions. Server was built as 18.04.1 and is a fairly minimal install acting as a virtual host with KVM/QEMU. It's not heavily loaded and doesn't have a huge amount of traffic. It's a co-located rental so we don't have a detailed spec, but if it's of interest more details are probably available from the provider.

We have other servers with identical hardware from the same provider running 16.04 without any issue.

Changed in linux (Ubuntu Bionic):
status: Expired → Confirmed
Revision history for this message
Robert Dinse (nanook) wrote : Re: [Bug 1766377] Re: Ethernet E1000 Controller Hangs

      I gave up any hope that it would ever be fixed, bought some NICs with a
different chipset, and moved on.

On Thu, 5 Sep 2019, Chris Puttick wrote:

> Date: Thu, 05 Sep 2019 08:35:29 -0000
> From: Chris Puttick <email address hidden>
> Reply-To: Bug 1766377 <email address hidden>
> To: <email address hidden>
> Subject: [Bug 1766377] Re: Ethernet E1000 Controller Hangs
>
> We're seeing this bug consistently (although not consistent per se - the
> error does not occur immediately after boot, and doesn't immediately
> disconnect the server from the network; anything from 3 weeks to 3 hours
> before it starts, and a day or so after is starts before it gets
> permanently disconnected).
>
> # uname -a
> Linux <hostname> 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
>
> Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
>
> Ubuntu 18.04.3 LTS
>
> Issue occurred on earlier 18.04 and kernel versions. Server was built as
> 18.04.1 and is a fairly minimal install acting as a virtual host with
> KVM/QEMU. It's not heavily loaded and doesn't have a huge amount of
> traffic. It's a co-located rental so we don't have a detailed spec, but
> if it's of interest more details are probably available from the
> provider.
>
> We have other servers with identical hardware from the same provider
> running 16.04 without any issue.

Revision history for this message
Kai-Heng Feng (kaihengfeng) wrote : Re: Ethernet E1000 Controller Hangs

Please test latest mainline kernel:
https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.3-rc7/

Revision history for this message
Robert Dinse (nanook) wrote : Re: [Bug 1766377] Re: Ethernet E1000 Controller Hangs

      I am no longer using E1000-based cards.

On Thu, 5 Sep 2019, Kai-Heng Feng wrote:

> Date: Thu, 05 Sep 2019 09:13:13 -0000
> From: Kai-Heng Feng <email address hidden>
> Reply-To: Bug 1766377 <email address hidden>
> To: <email address hidden>
> Subject: [Bug 1766377] Re: Ethernet E1000 Controller Hangs
>
> Please test latest mainline kernel:
> https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.3-rc7/

Revision history for this message
Crenshaw Thorpe (crenshawthorpe) wrote : Re: Ethernet E1000 Controller Hangs

FYI, in reference to some of the comments here about this being intermittent: the way I can trigger this reliably is by streaming to Twitch, which is not even something I do with any regularity. I will see the stream die as if buffering, and inevitably that corresponds with this error in the kernel ring buffer.

Normal Internet usage will not do this with any regularity, but streaming my desktop with OBS via Twitch will.

Revision history for this message
Klaas DC (klaasdc) wrote :

I also have this problem on an Intel DQ45CB mainboard with the onboard Intel 82567LM-3 Gigabit Ethernet controller. The syslog fills up with the following entries while the network connection is being reset. It does not happen immediately after boot, but high throughput seems to trigger it. I use the adapter in a bridge on a KVM server.

[105278.169850] e1000e 0000:00:19.0 enp0s25: Detected Hardware Unit Hang:
                  TDH <4a>
                  TDT <79>
                  next_to_use <79>
                  next_to_clean <49>
                buffer_info[next_to_clean]:
                  time_stamp <1019073b8>
                  next_to_watch <4a>
                  jiffies <101907568>
                  next_to_watch.status <0>
                MAC Status <80083>
                PHY Status <796d>
                PHY 1000BASE-T Status <3800>
                PHY Extended Status <3000>
                PCI Status <10>
[105280.153549] e1000e 0000:00:19.0 enp0s25: Detected Hardware Unit Hang:
                  TDH <4a>
                  TDT <79>
                  next_to_use <79>
                  next_to_clean <49>
                buffer_info[next_to_clean]:
                  time_stamp <1019073b8>
                  next_to_watch <4a>
                  jiffies <101907758>
                  next_to_watch.status <0>
                MAC Status <80083>
                PHY Status <796d>
                PHY 1000BASE-T Status <3800>
                PHY Extended Status <3000>
                PCI Status <10>
[105282.169795] e1000e 0000:00:19.0 enp0s25: Detected Hardware Unit Hang:
                  TDH <4a>
                  TDT <79>
                  next_to_use <79>
                  next_to_clean <49>
                buffer_info[next_to_clean]:
                  time_stamp <1019073b8>
                  next_to_watch <4a>
                  jiffies <101907950>
                  next_to_watch.status <0>
                MAC Status <80083>
                PHY Status <796d>
                PHY 1000BASE-T Status <3800>
                PHY Extended Status <3000>
                PCI Status <10>
[105283.385343] e1000e 0000:00:19.0 enp0s25: Reset adapter unexpectedly
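TDH and TDT in these reports are the transmit descriptor head and tail indices, and they wrap around at the ring size, so distances between them have to be taken modulo that size. The sketch below assumes a 256-entry transmit ring, which I believe is the driver default but which is adjustable (for example with `ethtool -G`); `ring_distance` is a helper name made up for this example.

```python
RING_SIZE = 256  # assumption: default Tx ring size; verify with `ethtool -g <interface>`


def ring_distance(start, end, ring_size=RING_SIZE):
    """Number of descriptors from start up to (not including) end, with wrap-around."""
    return (end - start) % ring_size


# Values from the report above: TDH <4a>, TDT <79>, next_to_clean <49>.
queued = ring_distance(0x4A, 0x79)      # handed to the NIC, head has not reached them yet
uncleaned = ring_distance(0x49, 0x4A)   # passed by the head, not yet reclaimed by the driver
print(f"queued behind head: {queued}, awaiting cleanup: {uncleaned}")

# Wrap-around case from the original report in this bug: TDH <ea>, TDT <2d>.
print(f"original report, queued behind head: {ring_distance(0xEA, 0x2D)}")
```

In all three reports above the indices are identical while `jiffies` advances, which is what a stalled ring looks like: the head never moves past the descriptor the driver is waiting on.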

Revision history for this message
Ted Gerold (tgwaste) wrote :

I am also experiencing this issue.

Distribution: Ubuntu 20.04 LTS
Kernel: 5.4.0-26-generic
Hardware: NUC10i7FNH - 2TB SSD - 64GB RAM

[128476.886034] e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:
                  TDH <c8>
                  TDT <fe>
                  next_to_use <fe>
                  next_to_clean <c7>
                buffer_info[next_to_clean]:
                  time_stamp <101e8eb91>
                  next_to_watch <c8>
                  jiffies <101e8f598>
                  next_to_watch.status <0>
                MAC Status <40080083>
                PHY Status <796d>
                PHY 1000BASE-T Status <3800>
                PHY Extended Status <3000>
                PCI Status <10>
[128478.069750] e1000e 0000:00:1f.6 eno1: Reset adapter unexpectedly
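The MAC Status value here (40080083) differs from the 80083 seen in the earlier reports. A small decoder makes the difference visible. The bit meanings below are written from memory of the `E1000_STATUS_*` definitions in the e1000e driver source, so treat them as assumptions and check them against your kernel tree; bits the table does not know are reported as unknown rather than guessed.

```python
# Assumed bit meanings (from memory of the e1000e E1000_STATUS_* defines; please verify).
FLAG_BITS = {
    0x00000001: "full duplex",
    0x00000002: "link up",
    0x00000010: "transmission paused",
    0x00000200: "LAN init done",
    0x00000400: "PHY reset asserted",
    0x00080000: "GIO master enabled",
}
FUNC_MASK = 0x0000000C   # PCI function number field
SPEED_MASK = 0x000000C0  # link speed field
SPEEDS = {0x00: "10 Mb/s", 0x40: "100 Mb/s", 0x80: "1000 Mb/s"}


def decode_mac_status(value):
    """Split a MAC Status value into named flags, link speed, and leftover bits."""
    flags = [name for bit, name in FLAG_BITS.items() if value & bit]
    speed = SPEEDS.get(value & SPEED_MASK, "unrecognized")
    known = FUNC_MASK | SPEED_MASK
    for bit in FLAG_BITS:
        known |= bit
    return {"flags": flags, "speed": speed, "unknown": value & ~known}


for raw in (0x80083, 0x40080083):
    decoded = decode_mac_status(raw)
    print(f"{raw:#010x}: {decoded['flags']}, {decoded['speed']}, "
          f"unknown bits {decoded['unknown']:#x}")
```

Under these assumptions both values describe a full-duplex gigabit link that is up; the only difference is bit 30, which this table does not cover.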

Revision history for this message
Robert Dinse (nanook) wrote : Re: [Bug 1766377] Re: Ethernet E1000 Controller Hangs

      I gave up on this ever being fixed, since it has existed since 2.6, and
went and bought some Realtek Ethernet cards; no problems since. You can,
however, work around this bug by turning hardware offloading off if
efficiency isn't an issue.

On Mon, 6 Jul 2020, Ted Gerold wrote:

> I am also experiencing this issue.

You-Sheng Yang (vicamo)
tags: added: hwe-networking-ethernet
You-Sheng Yang (vicamo)
summary: - Ethernet E1000 Controller Hangs
+ Intel Ethernet I218-V [8086:15a1] Subsystem [1043:85c4] detected
+ Hardware Unit Hang
Revision history for this message
Ted Gerold (tgwaste) wrote :

Unfortunately I can't just install another card in a NUC. What performance hit does turning hardware offloading off have? How is it done, and how is it reverted?

Wish they would just fix the problem.

Revision history for this message
Dee Jay Randall (randal-g) wrote :

I don't know the performance impact. You can find the command in earlier comments, but you basically want a variation of:
/sbin/ethtool -K eth0 gso off gro off tso off

with your network device substituted for "eth0".

If you simply run this manually after boot up, rebooting should turn it back on (well, should restore default behaviour). Otherwise you can probably revert with: /sbin/ethtool -K eth0 gso on gro on tso on
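
A minimal sketch of the disable/revert pair above as a dry-run shell helper. The function name `offload_cmd` and the `IFACE` variable are made up for illustration, and `eno1` is only the interface name that appears in the dmesg output in this report. The script just prints the commands; run the printed line as root to actually apply it.

```shell
#!/bin/sh
# Sketch: build the ethtool command that toggles gso/gro/tso for one interface.
# offload_cmd and IFACE are illustrative names, not part of ethtool.

# Print the command that sets gso, gro and tso to "on" or "off".
offload_cmd() {
    iface=$1
    state=$2
    case "$state" in
        on|off) ;;
        *) echo "state must be 'on' or 'off'" >&2; return 1 ;;
    esac
    printf '/sbin/ethtool -K %s gso %s gro %s tso %s\n' \
        "$iface" "$state" "$state" "$state"
}

# Dry run: show the workaround and the revert for the chosen interface.
offload_cmd "${IFACE:-eno1}" off   # workaround
offload_cmd "${IFACE:-eno1}" on    # revert
```

Settings changed this way are not persistent, which is why a reboot restores the defaults as described above.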

Revision history for this message
Robert Dinse (nanook) wrote : Re: [Bug 1766377] Re: Intel Ethernet I218-V [8086:15a1] Subsystem [1043:85c4] detected Hardware Unit Hang

      I can't tell you what the performance hit will be; that depends on
your application. It will hit TCP apps harder than UDP, because UDP can't
take advantage of it at all. The performance hit will be highest in
net-centric apps like routers.

      apt install ethtool
      ethtool --offload eth0 rx off tx off
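
In the command above, `--offload` is the long form of `-K`, and `rx`/`tx` are ethtool's short names for RX and TX checksumming. To see what actually changed, `ethtool -k` (lowercase) lists the current feature states. Below is a small sketch of a helper that pulls one feature's state out of that listing. `feature_state` is a hypothetical name, and it assumes the usual `name: on|off` line format of the listing.

```shell
#!/bin/sh
# Sketch: read `ethtool -k` style output on stdin and print the state
# ("on" or "off") of the feature named in $1. Returns non-zero if the
# feature is not listed. feature_state is an illustrative name.
feature_state() {
    awk -v name="$1" -F': *' '
        $1 == name {
            split($2, parts, " ")   # drop trailing notes such as "[fixed]"
            print parts[1]
            found = 1
        }
        END { exit !found }
    '
}

# Usage (needs ethtool installed; eno1 is an assumed interface name):
#   ethtool -k eno1 | feature_state rx-checksumming
#   ethtool -k eno1 | feature_state tx-checksumming
```

Running the check before and after the change shows whether the driver accepted the setting.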

On Tue, 7 Jul 2020, Ted Gerold wrote:

> Unfortunately I cant just install another card in an NUC. What
> performance hit does turning hardware offloading off have? How is it
> done and how is it reverted?

Revision history for this message
Ted Gerold (tgwaste) wrote :

Thank you for the feedback everyone!
