Intel AC 9260 (iwlwifi) completely unstable and halts cores in 19.10

Bug #1855637 reported by Wayne Schroeder
This bug affects 3 people
Affects: linux (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

After upgrading from 18.04 to 19.10, the entire system freezes and eventually recovers, leaving many errors and backtraces in dmesg around the iwlwifi module. I eventually restored an 18.04 backup, but not before logging a lot of information and confirming that the problem also reproduces with the 19.10 "Try Ubuntu" live session from the ISO, so it certainly isn't an artifact of the upgrade (ubuntu-drivers did appear a bit confused about iwlwifi during the upgrade, but that appears to be unrelated).

It is worth noting that I also use the Bluetooth module in this card, but it is connected by cable to an internal USB 2 header on the system board, and I believe it uses another driver entirely.

I have additional external drives on which I can install test systems to try fixes if need be. This is rock solid on 18.04.3, and the issue reproduces within 15 minutes of watching YouTube videos on 19.10.

dmesg output from the start of the issues is attached.

Wayne
---
ProblemType: Bug
ApportVersion: 2.20.11-0ubuntu8.2
Architecture: amd64
CurrentDesktop: ubuntu:GNOME
DistroRelease: Ubuntu 19.10
InstallationDate: Installed on 2019-12-09 (0 days ago)
InstallationMedia: Ubuntu 19.10 "Eoan Ermine" - Release amd64 (20191017)
MachineType: Gigabyte Technology Co., Ltd. Z97-HD3
Package: linux (not installed)
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 nouveaudrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.3.0-24-generic root=UUID=24f8119c-dff3-46da-9242-15c3b9c76355 ro quiet splash vt.handoff=7
ProcVersionSignature: Ubuntu 5.3.0-24.26-generic 5.3.10
RelatedPackageVersions:
 linux-restricted-modules-5.3.0-24-generic N/A
 linux-backports-modules-5.3.0-24-generic N/A
 linux-firmware 1.183.2
Tags: eoan
Uname: Linux 5.3.0-24-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip lpadmin lxd plugdev sambashare sudo
_MarkForUpload: True
dmi.bios.date: 07/31/2015
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: F9
dmi.board.asset.tag: To be filled by O.E.M.
dmi.board.name: Z97-HD3
dmi.board.vendor: Gigabyte Technology Co., Ltd.
dmi.board.version: x.x
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 3
dmi.chassis.vendor: Gigabyte Technology Co., Ltd.
dmi.chassis.version: To Be Filled By O.E.M.
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrF9:bd07/31/2015:svnGigabyteTechnologyCo.,Ltd.:pnZ97-HD3:pvrTobefilledbyO.E.M.:rvnGigabyteTechnologyCo.,Ltd.:rnZ97-HD3:rvrx.x:cvnGigabyteTechnologyCo.,Ltd.:ct3:cvrToBeFilledByO.E.M.:
dmi.product.family: To be filled by O.E.M.
dmi.product.name: Z97-HD3
dmi.product.sku: To be filled by O.E.M.
dmi.product.version: To be filled by O.E.M.
dmi.sys.vendor: Gigabyte Technology Co., Ltd.

Ubuntu Foundations Team Bug Bot (crichton) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Freenode.

To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/1855637/+editstatus and add the package name in the text box next to the word Package.

[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]

tags: added: bot-comment
affects: ubuntu → linux (Ubuntu)
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1855637

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Wayne Schroeder (razathorn) wrote : apport information

Attached: AlsaInfo.txt, AudioDevicesInUse.txt, CRDA.txt, CurrentDmesg.txt, IwConfig.txt, Lspci.txt, Lsusb.txt, ProcCpuinfo.txt, ProcCpuinfoMinimal.txt, ProcInterrupts.txt, ProcModules.txt, PulseList.txt, RfKill.txt, UdevDb.txt, WifiSyslog.txt

tags: added: apport-collected eoan
description: updated

Wayne Schroeder (razathorn) wrote :

Please note that I went ahead and reinstalled on another drive and fully updated it, so there is a system ready to go with this issue; I just have to boot into it. It's a slow USB external drive, so, gross, but it works. Also, I'm a former C developer with some past kernel dev experience. I'm a bit (lot) rusty, but just so you're aware, I can probably deliver some value in troubleshooting this.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
You-Sheng Yang (vicamo) wrote :

@Wayne, I believe this can be fixed in https://launchpad.net/~vicamo/+archive/ubuntu/ppa-1855825, which should land in Eoan/Focal soon, if you prefer a DKMS-based solution. Fixing the generic kernel in Eoan will take more time.

Wayne Schroeder (razathorn) wrote :

@You-Sheng, I honestly have been VERY out of the loop on Linux kernel/hardware since the 2.4-2.6 days. I've been stuck in AWS for years ;)

Can you verify I'm understanding the situation correctly? It appears that the kernel in Eoan ships a driver with this issue, and the backport-iwlwifi package will receive the fix first in the form of a "much more recent driver from Intel" made possible by DKMS. Is this interpretation correct?

I'll go ahead and try this PPA and verify whether it fixes the issue!

Thinking about it a bit, does this mean that the firmware Eoan ships is incompatible with the upstream kernel's driver, or that the upstream kernel driver just has a bug? Are the firmware and kernels tied together, and how do we get the right firmware that will work with the backport-iwlwifi packages?

Thanks in advance for the clarification. Trying to come up to speed again.

Wayne Schroeder (razathorn) wrote :

I tried the PPA and authentication failed to the network. I think the applicable messages are as follows:

Dec 11 15:24:28 malt NetworkManager[888]: <info> [1576099468.0140] device (wlp6s0): supplicant interface state: associating -> associated
Dec 11 15:24:28 malt NetworkManager[888]: <info> [1576099468.0140] device (p2p-dev-wlp6s0): supplicant management interface state: associating -> associated
Dec 11 15:24:28 malt NetworkManager[888]: <info> [1576099468.0206] device (wlp6s0): supplicant interface state: associated -> 4-way handshake
Dec 11 15:24:28 malt NetworkManager[888]: <info> [1576099468.0206] device (p2p-dev-wlp6s0): supplicant management interface state: associated -> 4-way handshake
Dec 11 15:24:28 malt wpa_supplicant[889]: nl80211: kernel reports: NLA_F_NESTED is missing
Dec 11 15:24:28 malt wpa_supplicant[889]: wlp6s0: WPA: Failed to set PTK to the driver (alg=3 keylen=16 bssid=9c:3d:cf:1e:d1:05)
Dec 11 15:24:28 malt kernel: [ 543.037235] wlp6s0: deauthenticating from 9c:3d:cf:1e:d1:05 by local choice (Reason: 1=UNSPECIFIED)
Dec 11 15:24:28 malt wpa_supplicant[889]: wlp6s0: CTRL-EVENT-DISCONNECTED bssid=9c:3d:cf:1e:d1:05 reason=1 locally_generated=1
Dec 11 15:24:28 malt wpa_supplicant[889]: wlp6s0: WPA: 4-Way Handshake failed - pre-shared key may be incorrect
Dec 11 15:24:28 malt wpa_supplicant[889]: wlp6s0: CTRL-EVENT-SSID-TEMP-DISABLED id=0 ssid="chewies-5G" auth_failures=1 duration=10 reason=WRONG_KEY
Dec 11 15:24:28 malt wpa_supplicant[889]: wlp6s0: CTRL-EVENT-SSID-TEMP-DISABLED id=0 ssid="chewies-5G" auth_failures=2 duration=20 reason=CONN_FAILED

I am adding the full syslog relevant section and dmesg from previous boot with the ppa version loaded.
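
For anyone triaging similar logs, the pattern in the excerpt above can be picked out mechanically. The following is only a minimal sketch, assuming syslog lines in the same format as the ones quoted; the function name `classify` and the `suspect_driver` heuristic are illustrative and not part of any tool mentioned in this report.

```python
import re

# Sample lines copied from the log excerpt above.
LOG = """\
Dec 11 15:24:28 malt wpa_supplicant[889]: nl80211: kernel reports: NLA_F_NESTED is missing
Dec 11 15:24:28 malt wpa_supplicant[889]: wlp6s0: WPA: Failed to set PTK to the driver (alg=3 keylen=16 bssid=9c:3d:cf:1e:d1:05)
Dec 11 15:24:28 malt wpa_supplicant[889]: wlp6s0: WPA: 4-Way Handshake failed - pre-shared key may be incorrect
Dec 11 15:24:28 malt wpa_supplicant[889]: wlp6s0: CTRL-EVENT-SSID-TEMP-DISABLED id=0 ssid="chewies-5G" auth_failures=1 duration=10 reason=WRONG_KEY
"""

# Matches "<timestamp> <host> wpa_supplicant[<pid>]: <message>".
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} +\d+ [\d:]{8}) (?P<host>\S+) wpa_supplicant\[\d+\]: (?P<msg>.*)$"
)


def classify(log_text):
    """Return (driver-side key errors, reasons reported by TEMP-DISABLED events)."""
    key_errors, reasons = [], []
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # not a wpa_supplicant line
        msg = m.group("msg")
        if "Failed to set PTK to the driver" in msg or "NLA_F_NESTED is missing" in msg:
            key_errors.append(msg)
        r = re.search(r"CTRL-EVENT-SSID-TEMP-DISABLED .*reason=(\w+)", msg)
        if r:
            reasons.append(r.group(1))
    return key_errors, reasons


key_errors, reasons = classify(LOG)
# Heuristic (an assumption, not a rule): WRONG_KEY reported together with a
# driver-side key installation failure hints that the passphrase may be fine.
suspect_driver = bool(key_errors) and "WRONG_KEY" in reasons
print(len(key_errors), reasons, suspect_driver)
```

In this excerpt the WRONG_KEY reason appears right after the driver refused the key, which is why the passphrase itself may not be at fault.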

You-Sheng Yang (vicamo) wrote :

@Wayne,

TL;DR: simply remove backport-iwlwifi-dkms from your Eoan system with the Intel AC 9260 installed.

First, "the entire system will freeze" is to be resolved in bug 1855825, which will fix backport-iwlwifi-dkms in Eoan/Focal for generic Ubuntu, and Bionic for OEM projects. Eventually generic Bionic will have the fix in bug 1856024.

Second, the association failure when backport-iwlwifi-dkms is installed is currently a known issue on my list. There is probably already a bug filed for it, but it will take some time to track it down. TBD.

Finally, this issue (1855637/1855825) is caused by a mainline patch, backported to stable kernels, that modifies a data structure layout. A DKMS package may carry and build against its own copy of the kernel headers, so its view of that layout can differ from the running kernel's, and such a mismatch may cause unexpected behavior; in this case, it hangs your system. Ubuntu kernels merge stable patches, so all Ubuntu kernels that include this patch are affected. It follows that if you're running a slightly older kernel, it will be fine; if you don't use backport-iwlwifi-dkms, it will be fine, too.
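
To illustrate why a changed data structure layout is dangerous, here is a rough sketch using Python's ctypes. The structures `StateOld` and `StateNew` and their fields are hypothetical; this comment does not name the actual structure that changed. The sketch only shows the general effect of code built against one layout reading memory written with another.

```python
import ctypes


class StateOld(ctypes.Structure):
    """Hypothetical layout the out-of-tree module was compiled against."""
    _fields_ = [
        ("flags", ctypes.c_uint32),
        ("queue_len", ctypes.c_uint32),
    ]


class StateNew(ctypes.Structure):
    """Hypothetical layout after a patch inserts a field in the middle."""
    _fields_ = [
        ("flags", ctypes.c_uint32),
        ("generation", ctypes.c_uint32),  # newly inserted field
        ("queue_len", ctypes.c_uint32),
    ]


# The "kernel" fills in the structure using the new layout.
kernel_obj = StateNew(flags=1, generation=0xDEAD, queue_len=64)

# The "module" interprets the same bytes using the old layout.
module_view = StateOld.from_buffer_copy(bytes(kernel_obj))

old_offset = StateOld.queue_len.offset
new_offset = StateNew.queue_len.offset
# The module reads 'generation' where it expects 'queue_len'.
print(old_offset, new_offset, hex(module_view.queue_len))
```

The field `queue_len` sits at a different offset in the two layouts, so the old-layout view silently returns the value of the inserted field. In kernel code the same kind of mismatch can corrupt state or hang a CPU rather than just print a wrong number.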

DKMS packages exist to deliver features/fixes that are not in the generic kernel "at the time", so they are not always "newer" than the generic kernel.

Wayne Schroeder (razathorn) wrote :

Current results...

Eoan iso live boot: cores hang and recover
Eoan install + all updates: cores hang and recover
Eoan install + all updates + backport iwlwifi ppa: cannot associate
Eoan install + all updates + backport iwlwifi from Eoan: Kernel panic

Bionic + all updates: Everything works fine.

Not sure where to go from here, but wanted to communicate the results of all the testing I did.

Is there some special version of the linux-firmware package I should be using?

Wayne

You-Sheng Yang (vicamo) wrote :

@Wayne,

Simply remove backport-iwlwifi-dkms on your Eoan system.

Wayne Schroeder (razathorn) wrote :

@You-Sheng,

I have done that. It has run all day without issue. Having said that, I am very confused, because I previously had the issue on Eoan with all updates but without backport-iwlwifi-dkms installed. I don't know what has changed.

In the apport information, it seems to confirm this--updated stock generic kernel with issue. I'm so confused at this point.
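
One way to narrow down this kind of confusion is to check which iwlwifi module file is actually in use. The sketch below assumes the usual Ubuntu convention that DKMS-built modules are installed under `updates/dkms/` and in-tree modules under `kernel/drivers/`. The helper `module_origin` and the example paths are hypothetical; on a real system the path could be taken from `modinfo -n iwlwifi`.

```python
def module_origin(path):
    """Classify a kernel module file path as 'dkms', 'in-tree', or 'unknown'."""
    if "/updates/dkms/" in path:
        return "dkms"
    if "/kernel/drivers/" in path:
        return "in-tree"
    return "unknown"


# Hypothetical example paths for the kernel version in this report.
dkms_path = "/lib/modules/5.3.0-24-generic/updates/dkms/iwlwifi.ko"
stock_path = (
    "/lib/modules/5.3.0-24-generic/kernel/drivers/net/wireless/intel/iwlwifi/iwlwifi.ko"
)

print(module_origin(dkms_path), module_origin(stock_path))
```

If the path points at the DKMS location after the package was supposedly removed, a leftover build would be a plausible explanation worth ruling out.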

Wayne Schroeder (razathorn) wrote :

Just following up to note that this still reproduces (CPU core hang and backtrace) on Eoan with the backport-iwlwifi-dkms package removed.

Alejandro Mery (amery) wrote :

The problem still happens on the latest 20.04, with and without backport-iwlwifi-dkms.
