Linux 6.2.0-26-generic, network crashes intermittently.

Bug #2030673 reported by Garry
This bug affects 4 people
Affects: linux-signed-hwe-6.2 (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

After updating the kernel to version 6.2, the network drops once a day.

Description: Ubuntu 22.04.3 LTS
Release: 22.04

Linux MyHomeAssistant 6.2.0-26-generic #26~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Jul 13 16:27:29 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

авг 07 12:32:25 MyHomeAssistant kernel: ------------[ cut here ]------------
авг 07 12:32:25 MyHomeAssistant kernel: NETDEV WATCHDOG: enp2s0 (r8169): transmit queue 0 timed out
авг 07 12:32:25 MyHomeAssistant kernel: WARNING: CPU: 3 PID: 0 at net/sched/sch_generic.c:525 dev_watchdog+0x21f/0x230
авг 07 12:32:25 MyHomeAssistant kernel: Modules linked in: ntfs3 rfcomm tls xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user xfrm_algo xt_addrtype nft_compat nf_tables libcrc32c nfnetlink br_netfilter bridge stp llc snd_hda_codec_hdmi cmac algif_hash algif_skcipher af_alg bnep overlay snd_sof_pci_intel_icl snd_sof_intel_hda_common soundwire_intel soundwire_generic_allocation binfmt_misc soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_soc_hdac_hda snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi soundwire_bus nls_iso8859_1 snd_soc_core snd_compress ac97_bus snd_pcm_dmaengine snd_hda_intel x86_pkg_temp_thermal mei_pxp mei_hdcp intel_rapl_msr snd_intel_dspcfg intel_powerclamp snd_intel_sdw_acpi i915 snd_hda_codec kvm_intel snd_usb_audio iwlmvm snd_hda_core snd_usbmidi_lib kvm mc snd_hwdep drm_buddy btusb mac80211 snd_pcm irqbypass ttm snd_seq_midi btrtl snd_seq_midi_event libarc4 btbcm
авг 07 12:32:25 MyHomeAssistant kernel: crct10dif_pclmul snd_rawmidi drm_display_helper btintel polyval_generic snd_seq ghash_clmulni_intel cec sha512_ssse3 rc_core iwlwifi snd_seq_device aesni_intel btmtk processor_thermal_device_pci_legacy snd_timer drm_kms_helper processor_thermal_device crypto_simd processor_thermal_rfim snd i2c_algo_bit cryptd processor_thermal_mbox syscopyarea joydev input_leds bluetooth intel_cstate cdc_acm sysfillrect processor_thermal_rapl cmdlinepart sysimgblt cfg80211 spi_nor soundcore ecdh_generic intel_rapl_common 8250_dw mei_me ee1004 wmi_bmof ecc mtd int340x_thermal_zone intel_soc_dts_iosf mei acpi_pad mac_hid acpi_tad sch_fq_codel coretemp msr parport_pc ppdev lp parport ramoops pstore_blk drm reed_solomon pstore_zone efi_pstore ip_tables x_tables autofs4 hid_generic usbhid uas hid usb_storage spi_pxa2xx_platform dw_dmac dw_dmac_core crc32_pclmul spi_intel_pci i2c_i801 xhci_pci ahci r8169 spi_intel i2c_smbus intel_lpss_pci intel_lpss realtek libahci idma64 xhci_pci_renesas video
авг 07 12:32:25 MyHomeAssistant kernel: wmi pinctrl_jasperlake
авг 07 12:32:25 MyHomeAssistant kernel: CPU: 3 PID: 0 Comm: swapper/3 Not tainted 6.2.0-26-generic #26~22.04.1-Ubuntu
авг 07 12:32:25 MyHomeAssistant kernel: Hardware name: AZW U59/U59, BIOS JTKT001 05/05/2022
авг 07 12:32:25 MyHomeAssistant kernel: RIP: 0010:dev_watchdog+0x21f/0x230
авг 07 12:32:25 MyHomeAssistant kernel: Code: 00 e9 31 ff ff ff 4c 89 e7 c6 05 f5 a9 78 01 01 e8 c6 ff f7 ff 44 89 f1 4c 89 e6 48 c7 c7 08 30 c4 a7 48 89 c2 e8 31 0b 2c ff <0f> 0b e9 22 ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90
авг 07 12:32:25 MyHomeAssistant kernel: RSP: 0018:ffff99cf001dce70 EFLAGS: 00010246
авг 07 12:32:25 MyHomeAssistant kernel: RAX: 0000000000000000 RBX: ffff8e93816704c8 RCX: 0000000000000000
авг 07 12:32:25 MyHomeAssistant kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
авг 07 12:32:25 MyHomeAssistant kernel: RBP: ffff99cf001dce98 R08: 0000000000000000 R09: 0000000000000000
авг 07 12:32:25 MyHomeAssistant kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff8e9381670000
авг 07 12:32:25 MyHomeAssistant kernel: R13: ffff8e938167041c R14: 0000000000000000 R15: 0000000000000000
авг 07 12:32:25 MyHomeAssistant kernel: FS: 0000000000000000(0000) GS:ffff8e96f0180000(0000) knlGS:0000000000000000
авг 07 12:32:25 MyHomeAssistant kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
авг 07 12:32:25 MyHomeAssistant kernel: CR2: 0000564d214f12cc CR3: 00000002c6010000 CR4: 0000000000350ee0
авг 07 12:32:25 MyHomeAssistant kernel: Call Trace:
авг 07 12:32:25 MyHomeAssistant kernel: <IRQ>
авг 07 12:32:25 MyHomeAssistant kernel: ? __pfx_dev_watchdog+0x10/0x10
авг 07 12:32:25 MyHomeAssistant kernel: call_timer_fn+0x29/0x160
авг 07 12:32:25 MyHomeAssistant kernel: ? __pfx_dev_watchdog+0x10/0x10
авг 07 12:32:25 MyHomeAssistant kernel: __run_timers.part.0+0x1fb/0x2b0
авг 07 12:32:25 MyHomeAssistant kernel: ? ktime_get+0x43/0xc0
авг 07 12:32:25 MyHomeAssistant kernel: ? __pfx_tick_sched_timer+0x10/0x10
авг 07 12:32:25 MyHomeAssistant kernel: ? lapic_next_deadline+0x2c/0x50
авг 07 12:32:25 MyHomeAssistant kernel: ? clockevents_program_event+0xb2/0x140
авг 07 12:32:25 MyHomeAssistant kernel: run_timer_softirq+0x2a/0x60
авг 07 12:32:25 MyHomeAssistant kernel: __do_softirq+0xda/0x330
авг 07 12:32:25 MyHomeAssistant kernel: ? hrtimer_interrupt+0x12b/0x250
авг 07 12:32:25 MyHomeAssistant kernel: __irq_exit_rcu+0xa2/0xd0
авг 07 12:32:25 MyHomeAssistant kernel: irq_exit_rcu+0xe/0x20
авг 07 12:32:25 MyHomeAssistant kernel: sysvec_apic_timer_interrupt+0x96/0xb0
авг 07 12:32:25 MyHomeAssistant kernel: </IRQ>
авг 07 12:32:25 MyHomeAssistant kernel: <TASK>
авг 07 12:32:25 MyHomeAssistant kernel: asm_sysvec_apic_timer_interrupt+0x1b/0x20
авг 07 12:32:25 MyHomeAssistant kernel: RIP: 0010:cpuidle_enter_state+0xde/0x6f0
авг 07 12:32:25 MyHomeAssistant kernel: Code: a6 11 59 e8 b4 5f 45 ff 8b 53 04 49 89 c7 0f 1f 44 00 00 31 ff e8 72 3e 44 ff 80 7d d0 00 0f 85 e8 00 00 00 fb 0f 1f 44 00 00 <45> 85 f6 0f 88 0f 02 00 00 4d 63 ee 49 83 fd 09 0f 87 c4 04 00 00
авг 07 12:32:25 MyHomeAssistant kernel: RSP: 0018:ffff99cf0013be28 EFLAGS: 00000246
авг 07 12:32:25 MyHomeAssistant kernel: RAX: 0000000000000000 RBX: ffff8e96f01bd928 RCX: 0000000000000000
авг 07 12:32:25 MyHomeAssistant kernel: RDX: 0000000000000003 RSI: 0000000000000000 RDI: 0000000000000000
авг 07 12:32:25 MyHomeAssistant kernel: RBP: ffff99cf0013be78 R08: 0000000000000000 R09: 0000000000000000
авг 07 12:32:25 MyHomeAssistant kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa86c22c0
авг 07 12:32:25 MyHomeAssistant kernel: R13: 0000000000000001 R14: 0000000000000001 R15: 00003d4575d9009d
авг 07 12:32:25 MyHomeAssistant kernel: ? cpuidle_enter_state+0xce/0x6f0
авг 07 12:32:25 MyHomeAssistant kernel: cpuidle_enter+0x2e/0x50
авг 07 12:32:25 MyHomeAssistant kernel: cpuidle_idle_call+0x14f/0x1e0
авг 07 12:32:25 MyHomeAssistant kernel: do_idle+0x82/0x110
авг 07 12:32:25 MyHomeAssistant kernel: cpu_startup_entry+0x20/0x30
авг 07 12:32:25 MyHomeAssistant kernel: start_secondary+0x122/0x160
авг 07 12:32:25 MyHomeAssistant kernel: secondary_startup_64_no_verify+0xe5/0xeb
авг 07 12:32:25 MyHomeAssistant kernel: </TASK>
авг 07 12:32:25 MyHomeAssistant kernel: ---[ end trace 0000000000000000 ]---

ProblemType: Bug
DistroRelease: Ubuntu 22.04
Package: linux-image-6.2.0-26-generic 6.2.0-26.26~22.04.1
ProcVersionSignature: Ubuntu 6.2.0-26.26~22.04.1-generic 6.2.13
Uname: Linux 6.2.0-26-generic x86_64
ApportVersion: 2.20.11-0ubuntu82.5
Architecture: amd64
CasperMD5CheckResult: unknown
Date: Mon Aug 7 22:07:05 2023
InstallationDate: Installed on 2021-11-30 (615 days ago)
InstallationMedia: Ubuntu 20.04.3 LTS "Focal Fossa" - Release amd64 (20210819)
SourcePackage: linux-signed-hwe-6.2
UpgradeStatus: Upgraded to jammy on 2022-08-14 (358 days ago)

Garry (garry0garry) wrote :

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-signed-hwe-6.2 (Ubuntu):
status: New → Confirmed
Brett D (brettface) wrote :

Yeah, I'm seeing this too. r8169 as well.

I get 1-3 days of working ethernet, but then it'll randomly crap out with a "NETDEV WATCHDOG: enp1s0 (r8169): transmit queue 0 timed out" error in the log. It started happening about a month or so ago.

I'd roll back to an earlier kernel but I'm not sure how to do that headless.
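
For the headless case, one common approach is to tell GRUB which menu entry to boot by its title. The sketch below is an illustration only, not something posted in this thread: it assumes the usual Ubuntu grub.cfg layout, where older kernels sit inside a submenu, and `grub_default_for` is a hypothetical helper name. It scans the text of /boot/grub/grub.cfg and builds the "submenu>entry" string for a given kernel release.

```python
import re

# Matches lines such as:
#   menuentry 'Ubuntu, with Linux 6.2.0-24-generic' --class ubuntu ... {
#   submenu 'Advanced options for Ubuntu' $menuentry_id_option ... {
ENTRY_RE = re.compile(r"^(\s*)(submenu|menuentry)\s+'([^']+)'")


def grub_default_for(grub_cfg_text, kernel_release):
    """Return a "submenu>entry" title path for kernel_release, or None.

    Assumption: entries inside a submenu are indented and top-level
    entries are not, as in the grub.cfg that Ubuntu generates.
    """
    current_submenu = None
    for line in grub_cfg_text.splitlines():
        match = ENTRY_RE.match(line)
        if not match:
            continue
        indent, kind, title = match.groups()
        if kind == "submenu":
            current_submenu = title
            continue
        if not indent:
            # An unindented menuentry is top-level, so we left the submenu.
            current_submenu = None
        # endswith() skips the "(recovery mode)" variant of the same kernel.
        if title.endswith(kernel_release):
            return f"{current_submenu}>{title}" if current_submenu else title
    return None
```

If the result looks right, it could be set as GRUB_DEFAULT="..." in /etc/default/grub, followed by `sudo update-grub` and a reboot. Check the entry title against your own grub.cfg first, since a wrong value on a headless machine is awkward to recover from.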

dokuro (markcurrancavan) wrote :

I'm also seeing this problem, and I can trigger it fairly easily. I'm running Plex Media Server, and when I force a metadata refresh of my media, which lives on a NAS mounted via NFSv3, the network crashes within a few minutes. As the log below shows, this has happened many times since my system was upgraded to 6.2.

plex@dokuro:~$ journalctl | grep "NETDEV WATCHDOG"
Aug 05 14:48:12 dokuro kernel: NETDEV WATCHDOG: enp63s0 (r8169): transmit queue 0 timed out
Aug 06 09:41:27 dokuro kernel: NETDEV WATCHDOG: enp63s0 (r8169): transmit queue 0 timed out
Aug 06 10:19:54 dokuro kernel: NETDEV WATCHDOG: enp63s0 (r8169): transmit queue 0 timed out
Aug 06 11:27:59 dokuro kernel: NETDEV WATCHDOG: enp63s0 (r8169): transmit queue 0 timed out
Aug 06 16:50:51 dokuro kernel: NETDEV WATCHDOG: enp63s0 (r8169): transmit queue 0 timed out
Aug 08 09:27:28 dokuro kernel: NETDEV WATCHDOG: enp63s0 (r8169): transmit queue 0 timed out
Aug 08 13:06:14 dokuro kernel: NETDEV WATCHDOG: enp63s0 (r8169): transmit queue 0 timed out
plex@dokuro:~$

This started when 6.2.0-26 was rolled out to Ubuntu 22.04. When I boot into the previous 5.19 kernel the problem does not happen, so to me at least it's a clear regression in 6.2.

Also, while I have no evidence beyond a gut feeling, I suspect the problem may be related to NFS, and I'd be interested to know whether others with this issue have NFS mounts.

Brett D (brettface) wrote :

It's interesting that you say that. I am also seeing this happen on a server that runs Plex Media Server. However mine accesses its media on a NAS via autofs/cifs/smb, not nfs.

This machine had been running for about 5 years without incident until these errors started in mid-July.

plex@plex:~$ journalctl | grep "NETDEV WATCHDOG"
Jul 17 12:57:10 plex kernel: NETDEV WATCHDOG: enp1s0 (r8169): transmit queue 0 timed out
Jul 17 18:58:02 plex kernel: NETDEV WATCHDOG: enp1s0 (r8169): transmit queue 0 timed out
Jul 21 14:44:11 plex kernel: NETDEV WATCHDOG: enp1s0 (r8169): transmit queue 0 timed out
Jul 22 20:57:42 plex kernel: NETDEV WATCHDOG: enp1s0 (r8169): transmit queue 0 timed out
Jul 24 11:00:57 plex kernel: NETDEV WATCHDOG: enp1s0 (r8169): transmit queue 0 timed out
Jul 24 18:00:06 plex kernel: NETDEV WATCHDOG: enp1s0 (r8169): transmit queue 0 timed out
Jul 25 21:10:53 plex kernel: NETDEV WATCHDOG: enp1s0 (r8169): transmit queue 0 timed out
Jul 27 04:58:54 plex kernel: NETDEV WATCHDOG: enp1s0 (r8169): transmit queue 0 timed out
Jul 31 18:02:27 plex kernel: NETDEV WATCHDOG: enp1s0 (r8169): transmit queue 0 timed out
Aug 02 09:08:28 plex kernel: NETDEV WATCHDOG: enp1s0 (r8169): transmit queue 0 timed out
Aug 08 04:45:26 plex kernel: NETDEV WATCHDOG: enp1s0 (r8169): transmit queue 0 timed out
Aug 08 22:07:41 plex kernel: NETDEV WATCHDOG: enp1s0 (r8169): transmit queue 0 timed out
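
To compare how often the timeouts hit on different machines, the `journalctl | grep` listings above can be reduced to the gaps between consecutive events. This is a minimal sketch under stated assumptions: `watchdog_gaps_hours` is a hypothetical helper, the year defaults to 2023 because the log lines do not include one, and month names must be English abbreviations (lines logged with a localized month such as "авг" would need converting first).

```python
from datetime import datetime


def watchdog_gaps_hours(log_text, year=2023):
    """Return the hours between consecutive NETDEV WATCHDOG lines.

    Expects journalctl-style lines: "Aug 06 09:41:27 host kernel: ...".
    """
    stamps = []
    for line in log_text.splitlines():
        if "NETDEV WATCHDOG" not in line:
            continue
        # The first three fields are month, day and time; the year is
        # not logged, so it is supplied by the caller.
        month, day, clock = line.split()[:3]
        stamps.append(
            datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")
        )
    # Pair each timestamp with the next one and convert the difference to hours.
    return [
        round((later - earlier).total_seconds() / 3600, 1)
        for earlier, later in zip(stamps, stamps[1:])
    ]
```

Feeding it the saved output of `journalctl | grep "NETDEV WATCHDOG"` gives one number per pair of crashes, which makes a pattern such as "several in one afternoon, then quiet for days" easier to spot.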

Garry (garry0garry) wrote :

I don't use NFS mounts.

demonpengu (andy-demonpenguin) wrote :

Same issue for me. I'm not using Plex, but I am mounting shares with both NFS and CIFS.

Aug 09 09:30:24 gwen kernel: NETDEV WATCHDOG: enp4s0 (r8169): transmit queue 0 timed out
Aug 12 15:48:57 gwen kernel: NETDEV WATCHDOG: enp4s0 (r8169): transmit queue 0 timed out

The only way to get the network back is a full reboot.

dokuro (markcurrancavan) wrote :

The issue appears to be with kernel 6.2 and the older Realtek chipset. My system has the following (taken from sudo lshw -c network):

description: Ethernet interface
product: RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
vendor: Realtek Semiconductor Co., Ltd.
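
Besides lshw, the driver bound to each interface can be read from sysfs, which is a quick way to check whether a machine uses r8169 at all. The sketch below is illustrative and was not posted in this thread: `interfaces_using_driver` is a hypothetical helper, and it assumes the standard Linux layout where /sys/class/net/<iface>/device/driver is a symlink to the driver's directory.

```python
import os


def interfaces_using_driver(driver="r8169", sysfs_net="/sys/class/net"):
    """List network interfaces whose bound kernel driver is `driver`."""
    matches = []
    for iface in sorted(os.listdir(sysfs_net)):
        link = os.path.join(sysfs_net, iface, "device", "driver")
        # Virtual interfaces such as lo have no device/driver link.
        if not os.path.islink(link):
            continue
        # The symlink target's last path component is the driver name.
        if os.path.basename(os.path.realpath(link)) == driver:
            matches.append(iface)
    return matches
```

On the systems described here, a call with the defaults would be expected to return names like enp2s0 or enp63s0. An empty list suggests the watchdog timeouts, if any, come from a different driver.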

For now I have moved back to 5.19.0-50, and after a few days of use the network is rock solid again. On 6.2.0-26 I could not get more than a day without having to reboot, and sometimes had to reboot several times a day. I'd recommend everyone do the same until this issue is resolved, hopefully in a future kernel update.

Brett D (brettface) wrote :

I went back and looked at my apt install logs alongside my network crash logs. I'm pretty sure I was able to pinpoint the version that started it all.

I had no crashes with kernel 6.2.0-24.24 or anything prior.

Once I upped to 6.2.0-25.25, that's when things started breaking.

Lo and behold, there's a bunch of r8169 stuff in that version's changelog:
http://changelogs.ubuntu.com/changelogs/pool/main/l/linux/linux_6.2.0-25.25/changelog

  * Fix only reach PC3 when ethernet is plugged r8169 (LP: #1946433)
    - r8169: use spinlock to protect mac ocp register access
    - r8169: use spinlock to protect access to registers Config2 and Config5
    - r8169: enable cfg9346 config register access in atomic context
    - r8169: prepare rtl_hw_aspm_clkreq_enable for usage in atomic context
    - r8169: disable ASPM during NAPI poll
    - r8169: remove ASPM restrictions now that ASPM is disabled during NAPI poll

That log references this bug https://bugs.launchpad.net/ubuntu/+source/linux-oem-5.14/+bug/1946433

I don't really understand what's going on there, but at least we have a possible suspect.

Brett D (brettface) wrote :

I was able to roll back to 6.2.0-24-generic.

As of today it's been one full week and I haven't experienced a single loss of ethernet.

Just posting to confirm the problem definitely started in 6.2.0-25.

Garry (garry0garry) wrote :

6.2.0-31-generic - bug not fixed

Garry (garry0garry) wrote :

Linux MyHomeAssistant 6.2.0-31-generic #31~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Wed Aug 16 13:45:26 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

Description: Ubuntu 22.04.3 LTS
Release: 22.04

авг 31 03:42:41 MyHomeAssistant kernel: ------------[ cut here ]------------
авг 31 03:42:41 MyHomeAssistant kernel: NETDEV WATCHDOG: enp2s0 (r8169): transmit queue 0 timed out
авг 31 03:42:41 MyHomeAssistant kernel: WARNING: CPU: 2 PID: 0 at net/sched/sch_generic.c:525 dev_watchdog+0x21f/0x230
авг 31 03:42:41 MyHomeAssistant kernel: Modules linked in: rfcomm tls xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user xfrm_algo xt_addrtype nft_compat nf_tables libc>
авг 31 03:42:41 MyHomeAssistant kernel: libarc4 cec snd_rawmidi btbcm polyval_generic mc snd_seq btintel ghash_clmulni_intel rc_core processor_thermal_device_pci_legacy btmtk sha512_ssse3 joydev snd_pcm input_leds drm_kms_helper snd_seq_d>
авг 31 03:42:41 MyHomeAssistant kernel: CPU: 2 PID: 0 Comm: swapper/2 Not tainted 6.2.0-31-generic #31~22.04.1-Ubuntu
авг 31 03:42:41 MyHomeAssistant kernel: Hardware name: AZW U59/U59, BIOS JTKT001 05/05/2022
авг 31 03:42:41 MyHomeAssistant kernel: RIP: 0010:dev_watchdog+0x21f/0x230
авг 31 03:42:41 MyHomeAssistant kernel: Code: 00 e9 31 ff ff ff 4c 89 e7 c6 05 f6 8a 78 01 01 e8 b6 ff f7 ff 44 89 f1 4c 89 e6 48 c7 c7 70 41 04 a6 48 89 c2 e8 d1 e6 2b ff <0f> 0b e9 22 ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90
авг 31 03:42:41 MyHomeAssistant kernel: RSP: 0018:ffffb79b401a8e70 EFLAGS: 00010246
авг 31 03:42:41 MyHomeAssistant kernel: RAX: 0000000000000000 RBX: ffff88ab5293c4c8 RCX: 0000000000000000
авг 31 03:42:41 MyHomeAssistant kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
авг 31 03:42:41 MyHomeAssistant kernel: RBP: ffffb79b401a8e98 R08: 0000000000000000 R09: 0000000000000000
авг 31 03:42:41 MyHomeAssistant kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff88ab5293c000
авг 31 03:42:41 MyHomeAssistant kernel: R13: ffff88ab5293c41c R14: 0000000000000000 R15: 0000000000000000
авг 31 03:42:41 MyHomeAssistant kernel: FS: 0000000000000000(0000) GS:ffff88aeb0100000(0000) knlGS:0000000000000000
авг 31 03:42:41 MyHomeAssistant kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
авг 31 03:42:41 MyHomeAssistant kernel: CR2: 0000563400a2f3dc CR3: 00000002cb810000 CR4: 0000000000350ee0
авг 31 03:42:41 MyHomeAssistant kernel: Call Trace:
авг 31 03:42:41 MyHomeAssistant kernel: <IRQ>
авг 31 03:42:41 MyHomeAssistant kernel: ? __pfx_dev_watchdog+0x10/0x10
авг 31 03:42:41 MyHomeAssistant kernel: call_timer_fn+0x29/0x160
авг 31 03:42:41 MyHomeAssistant kernel: ? __pfx_dev_watchdog+0x10/0x10
авг 31 03:42:41 MyHomeAssistant kernel: __run_timers.part.0+0x1fb/0x2b0
авг 31 03:42:41 MyHomeAssistant kernel: ? ktime_get+0x43/0xc0
авг 31 03:42:41 MyHomeAssistant kernel: ? __pfx_tick_sched_timer+0x10/0x10
авг 31 03:42:41 MyHomeAssistant kernel: ? lapic_next_deadline+0x2c/0x50
авг 31 03:42:41 MyHomeAssistant kernel: ? clockevents_program_event+0xb2/0x140
авг 31 03:42:41 MyHomeAssistant kernel: run_timer_softirq+0x2a/0x60
авг 31 03:42:41 MyHomeAssistant kernel: __d...


Garry (garry0garry) wrote :

Brett D (brettface) wrote :

According to the link Garry posted to bug #2031537, this may be fixed in kernel 6.2.0-36.37, which was released a week ago.

However, it's not clear to me whether this bug and that one share a root cause (i.e., are duplicates).

Is anyone here brave enough to try the latest and report back?

Garry (garry0garry) wrote :

6.2.0-36. A week without problems.
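
Putting the reports in this thread together, the affected builds appear to run from 6.2.0-25 (first bad, per the apt logs above) up to, but not including, 6.2.0-36 (reported stable for a week). The sketch below encodes only that anecdotal range: the helper name and both boundary constants are assumptions drawn from these comments, not from an official fix announcement.

```python
import re

FIRST_BAD_ABI = 25   # 6.2.0-25: first build reported broken in this thread
FIRST_GOOD_ABI = 36  # 6.2.0-36: reported problem-free for a week in this thread


def in_reported_bad_range(release):
    """Classify a `uname -r` style string against the range reported here.

    Returns True or False for 6.2.0-N kernels, and None for any other
    series, since this thread gives no range for those.
    """
    match = re.fullmatch(r"6\.2\.0-(\d+)(?:-\w+)?", release)
    if match is None:
        return None
    return FIRST_BAD_ABI <= int(match.group(1)) < FIRST_GOOD_ABI
```

For the running kernel, `in_reported_bad_range(platform.release())` (after `import platform`) would apply the same check. A True result only means the kernel falls in the range people here reported as bad, and says nothing about whether the NIC is an r8169 device.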

Brett D (brettface) wrote :

I rolled the dice a week ago and upped to Ubuntu 23.10, taking me to kernel 6.5.0-10-generic.

I've had no loss of ethernet since then. This one seems fixed for me!
