Intel Wireless-AC 7260 (rev bb) wifi card crashes randomly

Bug #1913350 reported by Przemek K.
This bug affects 1 person
Affects         Status     Importance  Assigned to  Milestone
Linux           Confirmed  High
linux (Ubuntu)  Confirmed  Undecided   Unassigned

Bug Description

Intel Wireless-AC 7260 (rev bb) wifi card crashes randomly.
This happens sometimes after about 1 h of using my laptop, and sometimes after resuming from sleep.
The card does not work again until I reboot my laptop.
The most relevant error message is:
sty 25 14:42:13 leetbook kernel: iwlwifi 0000:25:00.0: Failed to wake NIC for hcmd
sty 25 14:42:13 leetbook kernel: iwlwifi 0000:25:00.0: Error sending STATISTICS_CMD: enqueue_hcmd failed: -5
I'm using Ubuntu 20.04 that was upgraded from 18.04 and earlier 16.04.
I've searched for similar bugs, but it looks like my error message and card model are a bit different.

The workaround script from bug 1673344 works to fix the wifi without rebooting.
The card works fine in Windows 10 so it's not a hardware issue.
This is not the stock card for the HP EliteBook 8470w; I replaced the original to gain 802.11ac transfer speeds.
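
For anyone checking whether they are hitting the same failure, the two messages quoted above can be filtered out of the kernel log. This is a minimal sketch and not part of the original report: the function name iwl_wake_errors is made up for illustration, and the pattern assumes the exact message wording quoted above.

```shell
# Hypothetical helper: print only the iwlwifi lines that match the two
# error messages quoted in this report. Reads log text on stdin.
iwl_wake_errors() {
    grep -E 'iwlwifi [0-9a-f:.]+: (Failed to wake NIC for hcmd|Error sending [A-Z_]+: enqueue_hcmd failed)'
}

# Typical use on a systemd-based system (kernel messages, current boot):
#   journalctl -k -b --no-pager | iwl_wake_errors
```

If the function prints nothing, the crash seen on that machine may be a different failure from the one reported here.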

ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: linux-image-5.4.0-64-generic 5.4.0-64.72
ProcVersionSignature: Ubuntu 5.4.0-64.72-generic 5.4.78
Uname: Linux 5.4.0-64-generic x86_64
ApportVersion: 2.20.11-0ubuntu27.14
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: azrael 2915 F.... pulseaudio
 /dev/snd/controlC1: azrael 2915 F.... pulseaudio
CasperMD5CheckResult: skip
CurrentDesktop: ubuntu:GNOME
Date: Tue Jan 26 23:23:06 2021
EcryptfsInUse: Yes
InstallationDate: Installed on 2016-08-08 (1632 days ago)
InstallationMedia: Ubuntu 16.04.1 LTS "Xenial Xerus" - Release amd64 (20160719)
MachineType: Hewlett-Packard HP EliteBook 8470w
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=pl_PL.UTF-8
 SHELL=/bin/bash
ProcFB: 0 radeondrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.4.0-64-generic root=/dev/mapper/rootvg-rootlv ro quiet splash vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-5.4.0-64-generic N/A
 linux-backports-modules-5.4.0-64-generic N/A
 linux-firmware 1.187.8
SourcePackage: linux
UpgradeStatus: Upgraded to focal on 2021-01-10 (16 days ago)
dmi.bios.date: 04/11/2019
dmi.bios.vendor: Hewlett-Packard
dmi.bios.version: 68ICF Ver. F.74
dmi.board.name: 179B
dmi.board.vendor: Hewlett-Packard
dmi.board.version: KBC Version 42.38
dmi.chassis.asset.tag: CNU2499X8F
dmi.chassis.type: 10
dmi.chassis.vendor: Hewlett-Packard
dmi.modalias: dmi:bvnHewlett-Packard:bvr68ICFVer.F.74:bd04/11/2019:svnHewlett-Packard:pnHPEliteBook8470w:pvrA1029D1102:rvnHewlett-Packard:rn179B:rvrKBCVersion42.38:cvnHewlett-Packard:ct10:cvr:
dmi.product.family: 103C_5336AN G=N L=BUS B=HP S=ELI
dmi.product.name: HP EliteBook 8470w
dmi.product.sku: LY541EA#AKD
dmi.product.version: A1029D1102
dmi.sys.vendor: Hewlett-Packard
---
ProblemType: Bug
ApportVersion: 2.20.11-0ubuntu27.17
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: azrael 3033 F.... pulseaudio
 /dev/snd/pcmC0D0p: azrael 3033 F...m pulseaudio
 /dev/snd/controlC1: azrael 3033 F.... pulseaudio
CasperMD5CheckResult: skip
DistroRelease: Ubuntu 20.04
InstallationDate: Installed on 2016-08-08 (1734 days ago)
InstallationMedia: Ubuntu 16.04.1 LTS "Xenial Xerus" - Release amd64 (20160719)
MachineType: Hewlett-Packard HP EliteBook 8470w
Package: linux (not installed)
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=pl_PL.UTF-8
 SHELL=/bin/bash
ProcFB: 0 radeondrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.8.0-50-generic root=/dev/mapper/rootvg-rootlv ro quiet splash vt.handoff=7
ProcVersionSignature: Ubuntu 5.8.0-50.56~20.04.1-generic 5.8.18
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
 linux-restricted-modules-5.8.0-50-generic N/A
 linux-backports-modules-5.8.0-50-generic N/A
 linux-firmware 1.187.12
Tags: focal
Uname: Linux 5.8.0-50-generic x86_64
UpgradeStatus: Upgraded to focal on 2021-01-10 (117 days ago)
UserGroups: N/A
_MarkForUpload: True
dmi.bios.date: 04/11/2019
dmi.bios.release: 15.116
dmi.bios.vendor: Hewlett-Packard
dmi.bios.version: 68ICF Ver. F.74
dmi.board.name: 179B
dmi.board.vendor: Hewlett-Packard
dmi.board.version: KBC Version 42.38
dmi.chassis.asset.tag: CNU2499X8F
dmi.chassis.type: 10
dmi.chassis.vendor: Hewlett-Packard
dmi.ec.firmware.release: 66.56
dmi.modalias: dmi:bvnHewlett-Packard:bvr68ICFVer.F.74:bd04/11/2019:br15.116:efr66.56:svnHewlett-Packard:pnHPEliteBook8470w:pvrA1029D1102:rvnHewlett-Packard:rn179B:rvrKBCVersion42.38:cvnHewlett-Packard:ct10:cvr:
dmi.product.family: 103C_5336AN G=N L=BUS B=HP S=ELI
dmi.product.name: HP EliteBook 8470w
dmi.product.sku: LY541EA#AKD
dmi.product.version: A1029D1102
dmi.sys.vendor: Hewlett-Packard

Revision history for this message
In , jagjot.no1 (jagjot.no1-linux-kernel-bugs) wrote :

Created attachment 288875
output of "dmesg | grep iwlwifi"

I'm facing random crashes of the iwlwifi driver. I'm using Fedora 32, but the issue can also be reproduced on Ubuntu 20.04. I've searched for hours for a fix without finding one, so filing a bug here is my last resort. I have an "Intel Corporation Dual Band Wireless-AC 7260" card.
Here is the output of rfkill:
ID TYPE      DEVICE            SOFT      HARD
 0 wlan      ideapad_wlan      unblocked unblocked
 1 bluetooth ideapad_bluetooth unblocked unblocked
 2 wlan      phy0              unblocked unblocked
 3 bluetooth hci0              unblocked unblocked

Here is the output of ethtool -i wlp8s0 | grep firmware:

firmware-version: 17.3216344376.0 7260-17.ucode

In , jagjot.no1 (jagjot.no1-linux-kernel-bugs) wrote :

I forgot to mention that I have to reboot to get wlan working again.

In , jagjot.no1 (jagjot.no1-linux-kernel-bugs) wrote :

Can we have some info on this? I even tried upgrading to kernel 5.6.10, but the issue is still there! I'm not the only one with this problem; please see https://forum.mxlinux.org/viewtopic.php?t=55392

This is a firmware/driver issue.

In , paula (paula-linux-kernel-bugs) wrote :

I'm going to tag along on this bug, as I'm likely seeing the same problem.
The crash that I see is not that random: I can easily trigger it with a 1GB curl transfer, which never completes because the crash happens too frequently. As far as I can tell, a reboot is necessary to clear the fault.
I use one machine as a testbed for wireless cards, as it has an mPCIe slot.
Thus far I've used Intel 5100, 6205, AX200, and now 7260AC cards in this machine.
I've only had problems with 7260AC cards, of which I have two; both exhibit the same problem.
So far I have found that the problem only occurs when connected at 2.4GHz.
Also, only when connected at 40MHz; 20MHz connections are OK, with a transfer bandwidth of ~10MB/s.
Additionally, only when connected at 40MHz at boot.
I found that if I connect at 2.4GHz/20MHz and make a test transfer, I can then reconfigure the AP to force 40MHz bandwidth, after which subsequent transfers complete successfully at ~24MB/s, which is what I expect from that configuration.
The 5GHz band does not exhibit any firmware crashes but does suffer from highly variable transfer bandwidth.
When connected at 80MHz bandwidth, throughput maxes out at 30MB/s, but that's not consistent. Most of the time I see only half that, 15MB/s, and that's without moving anything.
I'm using Debian Buster with kernel 4.19, though I've also tried backported 5.6 and 5.7 kernels and backported firmware. The same problem is seen with the newer kernels and firmware.
I've looked at a number of wireless cards over the years and I haven't seen this kind of flaky problem before. I hope this report is taken seriously.
I regard this card as particularly important in that it is the newest and most capable Intel wireless card available in the mPCIe form factor. I would really like to see it working properly under Linux.

qm77 motherboard, 3820QM CPU
Linux imb170 4.19.0-10-amd64 #1 SMP Debian 4.19.132-1 (2020-07-24) x86_64 GNU/Linux
03:00.0 Network controller: Intel Corporation Wireless 7260 (rev bb)
       description: Wireless interface
       product: Wireless 7260
       vendor: Intel Corporation
       physical id: 0
       bus info: pci@0000:03:00.0
       logical name: wlan0
       version: bb
       serial: 00:16:6f:e7:16:2a
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress bus_master cap_list ethernet physical wireless
       configuration: broadcast=yes driver=iwlwifi driverversion=4.19.0-10-amd64 firmware=17.3216344376.0 ip=192.168.1.126 latency=0 link=yes multicast=yes wireless=IEEE 802.11
       resources: irq:38 memory:f7a00000-f7a01fff
Module = "iwlwifi"

  Attributes:
    coresize = "249856"
    initsize = "0"
    initstate = "live"
    refcnt = "1"
    taint = ""
    uevent = <store method only>

  Parameters:
    11n_disable = "0"
    amsdu_size = "0"
    antenna_coupling = "0"
    bt_coex_active = "Y"
    d0i3_disable = "Y"
    d0i3_timeout = "1000"
    disable_11ac = "N"
    disable_11ax = "N"
    fw_monitor = "N"
    fw_restart = "Y"
    lar_disable = "N"
    led_mode = "0"
    nvm_file = "(null)"
  ...

In , paula (paula-linux-kernel-bugs) wrote :

I'm adding lspci output for my test machine. As one can see, it's pretty much all Intel.

00:00.0 Host bridge: Intel Corporation 3rd Gen Core processor DRAM Controller (rev 09)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor PCI Express Root Port (rev 09)
00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09)
00:14.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB xHCI Host Controller (rev 04)
00:16.0 Communication controller: Intel Corporation 7 Series/C216 Chipset Family MEI Controller #1 (rev 04)
00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network Connection (rev 04)
00:1a.0 USB controller: Intel Corporation 7 Series/C216 Chipset Family USB Enhanced Host Controller #2 (rev 04)
00:1b.0 Audio device: Intel Corporation 7 Series/C216 Chipset Family High Definition Audio Controller (rev 04)
00:1c.0 PCI bridge: Intel Corporation 7 Series/C216 Chipset Family PCI Express Root Port 1 (rev c4)
00:1c.2 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 3 (rev c4)
00:1c.4 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 5 (rev c4)
00:1d.0 USB controller: Intel Corporation 7 Series/C216 Chipset Family USB Enhanced Host Controller #1 (rev 04)
00:1f.0 ISA bridge: Intel Corporation QM77 Express Chipset LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 7 Series Chipset Family 6-port SATA Controller [AHCI mode] (rev 04)
00:1f.3 SMBus: Intel Corporation 7 Series/C216 Chipset Family SMBus Controller (rev 04)
01:00.0 Non-Volatile memory controller: Sandisk Corp WD Black 2018/PC SN720 NVMe SSD
03:00.0 Network controller: Intel Corporation Wireless 7260 (rev bb)
04:00.0 Ethernet controller: Intel Corporation 82583V Gigabit Network Connection

In , paula (paula-linux-kernel-bugs) wrote :

After further testing, I'm revising yesterday's assessment of how band/bandwidth combinations influence the crash. It turns out that the crash can happen on both the 2.4 and 5GHz bands, and at both 20 and 40MHz bandwidths on the 2.4GHz band. It's just less likely to occur at 20MHz than at 40MHz on the 2.4GHz band. It also appears that allowing legacy b rates at the AP increases the frequency of the crash. Configuring the AP to disassociate on low acknowledgement may also increase the crash likelihood.

By disabling legacy b rates and disassociation on low ack at the AP, and by performing enough runs, I was able to get transfer throughput measurements in my test harness to compare the 7260 with other cards that I've tested. My throughput test is a simple 1GB file transfer using curl, remotehost -> /dev/null. The transfer direction is toward the wireless device; the data is pulled from a wired host.

Card    2.4/20   2.4/40  5.2/80  Notes
N5100   5.1MB/s  24.2    5-10*   *Variable, human body modulates BW
N6205   10.1     21.7    22.0    Theoretical max 150/300Mb/s, >1/2 OK
AX200   11.2     21.0    49.6    Theoretical 150/300/866Mb/s, ~1/2 OK
7260AC  11.0     22.2    23.2    MCS 15 OK, MCS 15 OK, VHT-MCS 7-9 poor AC
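
The transfer test behind these numbers can be sketched roughly as follows. This is a reconstruction under stated assumptions, not the actual harness: the URL http://remotehost/1GB.bin and both function names are hypothetical, and MB is taken to mean 1,000,000 bytes. It relies on curl's speed_download write-out variable, which reports the average download speed in bytes per second.

```shell
# Convert a bytes-per-second figure to MB/s (1 MB = 1,000,000 bytes assumed).
bytes_to_mbs() {
    awk -v b="$1" 'BEGIN { printf "%.1f\n", b / 1000000 }'
}

# Pull a file from a wired host to /dev/null and print the average rate.
# $1 is the URL of a large test file; the one in the usage line is hypothetical.
transfer_rate() {
    rate=$(curl -s -o /dev/null -w '%{speed_download}' "$1") || return 1
    bytes_to_mbs "$rate"
}

# Usage (hypothetical host and file name):
#   transfer_rate http://remotehost/1GB.bin
```

A transfer that dies when the driver crashes makes curl exit non-zero, so transfer_rate returns 1 instead of printing a misleading rate.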

In my test harness the 7260 card operates at near what I would call the expected throughput on the 2.4GHz band at both 20MHz and 40MHz channel bandwidths. However, I would characterize the 5.2GHz/80MHz AC throughput as less than stellar, less than half what I would expect. I was hoping for something more akin to the performance of the AX200 client.

The AP that I am currently using is the Linksys ea6350v3, which uses the Qualcomm/Atheros ipq4018 SoC. I'm running OpenWrt on the AP. I like this radio; it is quite consistent, throughput-wise, from one device to another. To clarify, all seven of the ea6350v3's that I have yield similar throughput measurements. For further clarification, I stayed with the geriatric WRT54GL for many years just because I couldn't find a newer device that operated reliably and consistently at a higher level, until the ea6350v3 that is.

root@ea6350f:~# uname -a
Linux ea6350f 4.14.167 #0 SMP Wed Jan 29 16:05:35 2020 armv7l GNU/Linux
root@ea6350f:/etc# cat openwrt_release
DISTRIB_ID='OpenWrt'
DISTRIB_RELEASE='19.07.1'
DISTRIB_REVISION='r10911-c155900f66'
DISTRIB_TARGET='ipq40xx/generic'
DISTRIB_ARCH='arm_cortex-a7_neon-vfpv4'
DISTRIB_DESCRIPTION='OpenWrt 19.07.1 r10911-c155900f66'
DISTRIB_TAINTS=''

The AP kernel log shows the following messages each time the client 7260 crashes:

[436037.379260] ath10k_ahb a000000.wifi: peer-unmap-event: unknown peer id 0
[436086.794546] ath10k_ahb a800000.wifi: 10.4 wmi init: vdevs: 16 peers: 48 tid: 96
[436086.794591] ath10k_ahb a800000.wifi: msdu-desc: 2500 skid: 32
[436086.841915] ath10k_ahb a800000.wifi: wmi print 'P 48/48 V 16 K 144 PH 176 T 186 msdu-desc: 2500 sw-crypt: 0 ct-sta: 0'
[436086.842966] ath10k_ahb a800000.wifi: wmi print 'free: 56528 iram: 23400 sram: 32520'
[436087.141520] ath10k_ahb a800000.wifi: Firmware lacks feature flag indicating a retry limit of > 2 is OK, requeste...

In , paula (paula-linux-kernel-bugs) wrote :

I attempted to work around the driver crash by unloading and reloading iwlmvm.ko. This results in an "Error, can not clear persistence bit" message in the kernel message log upon restart. It appears that following the initial crash the 7260 hardware is no longer responding to the kernel driver. Is there some way, short of a reboot, to reset the 7260 device so that the kernel driver can re-initialize?

Unload and reload sequence:

modprobe -r iwlmvm
modprobe iwlmvm

dmesg output:

[19553.559632] Intel(R) Wireless WiFi driver for Linux
[19553.559633] Copyright(c) 2003- 2015 Intel Corporation
[19553.560176] iwlwifi 0000:03:00.0: firmware: direct-loading firmware iwlwifi-7260-17.ucode
[19553.560300] iwlwifi 0000:03:00.0: loaded firmware version 17.3216344376.0 op_mode iwlmvm
[19553.569183] iwlwifi 0000:03:00.0: Detected Intel(R) Dual Band Wireless AC 7260, REV=0xFFFFFFFF
[19553.569196] iwlwifi 0000:03:00.0: Error, can not clear persistence bit
[19553.587422] ------------[ cut here ]------------
[19553.587423] Timeout waiting for hardware access (CSR_GP_CNTRL 0xffffffff)
[19553.587436] WARNING: CPU: 5 PID: 2212 at drivers/net/wireless/intel/iwlwifi/pcie/trans.c:2033 iwl_trans_pcie_grab_nic_access+0x1e8/0x220 [iwlwifi]
[19553.587437] Modules linked in: iwlmvm(+) iwlwifi mac80211 cfg80211 cpufreq_powersave cpufreq_conservative cpufreq_userspace snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic intel_rapl ccm algif_aead cbc des_generic arc4 algif_skcipher cmac sha512_ssse3 sha512_generic x86_pkg_temp_thermal intel_powerclamp md4 algif_hash coretemp af_alg kvm_intel kvm snd_hda_intel snd_hda_codec btusb irqbypass btrtl crct10dif_pclmul btbcm i915 crc32_pclmul snd_hda_core btintel ghash_clmulni_intel snd_hwdep intel_cstate bluetooth drm_kms_helper snd_pcm intel_uncore drbg snd_timer mei_wdt ansi_cprng snd evdev ppdev drm pcc_cpufreq intel_rapl_perf soundcore pcspkr ecdh_generic sg mei_me iTCO_wdt mei iTCO_vendor_support rfkill i2c_algo_bit parport_pc parport button video nfsd auth_rpcgss nfs_acl lockd grace
[19553.587451] sunrpc ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic fscrypto ecb sd_mod crc32c_intel ahci libahci libata nvme xhci_pci aesni_intel xhci_hcd ehci_pci e1000e ehci_hcd scsi_mod aes_x86_64 crypto_simd usbcore cryptd i2c_i801 glue_helper nvme_core lpc_ich mfd_core usb_common thermal fan [last unloaded: cfg80211]
[19553.587459] CPU: 5 PID: 2212 Comm: modprobe Tainted: G W 4.19.0-10-amd64 #1 Debian 4.19.132-1
[19553.587459] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./IMB-170, BIOS P1.90 04/30/2018
[19553.587463] RIP: 0010:iwl_trans_pcie_grab_nic_access+0x1e8/0x220 [iwlwifi]
[19553.587464] Code: 09 dd 49 8d 56 08 bf 00 02 00 00 e8 a2 de ff db e9 33 ff ff ff 89 c6 48 c7 c7 80 75 eb c0 c6 05 69 46 02 00 01 e8 c2 0a fe db <0f> 0b e9 ee fe ff ff 48 8b 7b 30 48 c7 c1 e8 75 eb c0 31 d2 31 f6
[19553.587465] RSP: 0018:ffff9b5840d57b40 EFLAGS: 00010086
[19553.587466] RAX: 0000000000000000 RBX: ffff8de5b4840018 RCX: 0000000000000006
[19553.587466] RDX: 0000000000000007 RSI: 0000000000000082 RDI: ffff8de5d65566b0
[19553.587467] RBP: 0000000000000000 R08: 00000000000004cc R09: 0000000000000004
[...

In , paula (paula-linux-kernel-bugs) wrote :

I found that the 7260 can be reset without reboot by removing the device at the pci level and then re-scanning the pci bus to bring the device back, thusly:

cd /sys/bus/pci/devices/0000\:03\:00.0
echo 1 | sudo tee remove
cd /sys/bus/pci
echo 1 | sudo tee rescan

According to the kernel message log, an ASPM configuration problem is discovered on pci rescan. This indicates, at least to me, that the failure may therefore be related to ASPM. Perhaps the device is going to a lower ASPM power level without proper coordination with the host.

sudo dmesg

[19453.872457] pci 0000:03:00.0: [8086:08b1] type 00 class 0x028000
[19453.872534] pci 0000:03:00.0: reg 0x10: [mem 0x00000000-0x00001fff 64bit]
[19453.872797] pci 0000:03:00.0: PME# supported from D0 D3hot D3cold
[19453.873017] pcieport 0000:00:1c.2: ASPM: current common clock configuration is broken, reconfiguring
[19453.884494] pci 0000:03:00.0: BAR 0: assigned [mem 0xf7a00000-0xf7a01fff 64bit]
[19453.884594] iwlwifi 0000:03:00.0: enabling device (0100 -> 0102)
[19453.885478] iwlwifi 0000:03:00.0: firmware: direct-loading firmware iwlwifi-7260-17.ucode
[19453.885897] iwlwifi 0000:03:00.0: loaded firmware version 17.3216344376.0 op_mode iwlmvm
[19453.885923] iwlwifi 0000:03:00.0: Detected Intel(R) Dual Band Wireless AC 7260, REV=0x144
[19453.904591] iwlwifi 0000:03:00.0: base HW address: 00:16:6f:e7:16:2a
[19454.102848] ieee80211 phy2: Selected rate control algorithm 'iwl-mvm-rs'
[19454.400222] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[19457.578660] wlan0: authenticate with 60:38:e0:87:3a:41
[19457.581658] wlan0: send auth to 60:38:e0:87:3a:41 (try 1/3)
[19457.599202] wlan0: authenticated
[19457.600444] wlan0: associate with 60:38:e0:87:3a:41 (try 1/3)
[19457.607129] wlan0: RX AssocResp from 60:38:e0:87:3a:41 (capab=0x421 status=0 aid=1)
[19457.608029] wlan0: associated
[19457.608560] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
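
To see which ASPM policy the kernel is using and what the link reports, something like the following can help. It is only a sketch: the function name active_aspm_policy is invented here, the PCI address 03:00.0 is the one from the log above, and /sys/module/pcie_aspm/parameters/policy is present only on kernels built with ASPM support.

```shell
# Print the active ASPM policy. The kernel lists all policies on one line
# and marks the active one with square brackets,
# e.g. "[default] performance powersave powersupersave".
active_aspm_policy() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Usage:
#   active_aspm_policy < /sys/module/pcie_aspm/parameters/policy
#   sudo lspci -vv -s 03:00.0 | grep -i aspm    # LnkCap/LnkCtl ASPM fields
```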

In , paula (paula-linux-kernel-bugs) wrote :

The ASPM message that I previously noted gave me the idea of forcing ASPM via kernel parameter. This had no effect on 7260 connection reliability.
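
The comment does not say which parameter was used. One common way to force ASPM is the pcie_aspm=force kernel parameter (pcie_aspm=off disables it); treating that as the parameter meant here is an assumption. The sketch below, with a made-up function name, reports whether a pcie_aspm setting is present on a kernel command line.

```shell
# Print the value of pcie_aspm= from a kernel command line read on stdin;
# prints nothing if the parameter is absent.
pcie_aspm_setting() {
    tr ' ' '\n' | sed -n 's/^pcie_aspm=//p'
}

# Usage:
#   pcie_aspm_setting < /proc/cmdline
#
# On Debian/Ubuntu the parameter is typically added to
# GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, followed by
# "sudo update-grub" and a reboot.
```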

However, I noticed something that I hadn't before: the 7260 doesn't like suspending and resuming. After an S3 suspend/resume cycle, curl transfer speed is very low, < 1MB/s, and the transfer doesn't last long before the driver crashes as before. Interestingly, this low-bandwidth, unstable condition survives a PCI device removal and rescan. To clarify, once the machine has been suspended and resumed, the 7260 never works well, or at all, until a complete reboot.

One other thing I noted while looking around for 7260 info, at least one person finds that the 7260 is "infamous":

https://askubuntu.com/questions/645506/what-driver-for-intel-ac-7260-adapter

So given my own experience, and finding that the 7260 "17" firmware hasn't been updated for 2.5+ years, I'm thinking now that further effort on the 7260 likely won't be worth anything.

Lastly, the reason that I looked at the 7260 in the first place is that it has the mPCIe form factor and U.FL RF connectors, vs M.2 and MHF4 for newer Intel cards. Though the newer cards and connectors are smaller, using adapters and extra cables actually requires more room than a properly fitting card. Further, I've recently found some newer mPCIe cards with U.FL connectors: Qualcomm/Atheros QCA6174, Broadcom BCM94352, and what look like Chinese repackaged Intel radios, an unbranded 7265AC and an unbranded MPE-AX3000H. I've seen the QCA6174 before in a Samsung Ativ Book 9 notebook and I like it. I've also wanted to look at what Broadcom is up to in the client radio space, and looking at these Chinese knockoffs also looks worthwhile. These radios are all cheap enough that I can just buy them all and see what they can do. So right now, I'm thinking that unless something new happens on the 7260 front, I'm not likely to look at it any more.

In , jagjot.no1 (jagjot.no1-linux-kernel-bugs) wrote :

Hi @Paul, I'd like to thank you for your determination and for sharing your thoughts on the issue. It seems that commenting here won't do anything; you have to raise a ticket with the Intel iwlwifi team. I've already raised such a ticket and I'm actively working with the support team on this. But due to my job, I'm not able to provide them with the info they need quickly. I've been in contact with the support team for 2 months, but only last week did they realize it's actually a driver issue, and I'm still being asked for more information. I'd ask you to raise a support ticket with them on this matter as well. It would also be a huge help to me, as you may be able to provide them with the required info faster than I can. Feel free to contact me via my mail if you want to help.
Thanks,
JJ

In , jagjot.no1 (jagjot.no1-linux-kernel-bugs) wrote :

@Paul I just realized you may not be able to see my mail, you can just comment here, I'll contact you if you are interested.
Thanks,
JJ

Przemek K. (azrael) wrote :
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Przemek K. (azrael) wrote :
Przemek K. (azrael) wrote :
description: updated
description: updated
In , pkx616 (pkx616-linux-kernel-bugs) wrote :

I'm experiencing the same bug on Ubuntu 20.04 with linux 5.4.0-64-generic.
I've raised a bug in Ubuntu's bug tracker about it:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1913350
It contains lots of logs.

Intel Wireless-AC 7260 (rev bb) wifi card crashes randomly.
This happens sometimes after about 1 h of using my laptop, and sometimes after resuming from sleep.
The card does not work again until I reboot my laptop.
The most relevant error message is:
sty 25 14:42:13 leetbook kernel: iwlwifi 0000:25:00.0: Failed to wake NIC for hcmd
sty 25 14:42:13 leetbook kernel: iwlwifi 0000:25:00.0: Error sending STATISTICS_CMD: enqueue_hcmd failed: -5

The card works fine in Windows 10 so it's not a hardware issue.
This is not the stock card for the HP EliteBook 8470w; I replaced the original to gain 802.11ac transfer speeds.

The workaround script from Ubuntu bug 1673344 works to fix the wifi without rebooting.
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1673344/comments/37
It was also posted on the Kernel's bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=191601

In , pkx616 (pkx616-linux-kernel-bugs) wrote :

Created attachment 294873
Journalctl logs - part1

Journalctl logs - part1

In , pkx616 (pkx616-linux-kernel-bugs) wrote :

Created attachment 294875
Journalctl logs - part2

Journalctl logs - part2

Przemek K. (azrael) wrote :
Changed in linux:
importance: Unknown → High
status: Unknown → Confirmed
In , wacossusca34 (wacossusca34-linux-kernel-bugs) wrote :

I can confirm experiencing this bug over a very long period of time on my personal system with this card. Paul Ausbeck's description of the issue is extremely on-point, although I would add that interference from other devices in addition to high bandwidth usage is a much more reliable way to reproduce the crash.

With older kernel versions, this issue used to simply hang the entire system. This can still happen rarely on my machine, but I have not found a way to reproduce that type of crash. I am currently using `5.11.11-arch1-1`.

In , wacossusca34 (wacossusca34-linux-kernel-bugs) wrote :

I also happened to converge on the same hacky solution to reload the driver as Paul did, so for any other users stuck with this hardware while waiting for Intel's driver team to properly support this product, you can adapt this script to your liking:

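# Run as root. Watches the kernel log and cycles the PCI device when the
# driver reports that it can no longer wake the NIC.
dev="0000:06:00.0"   # PCI address of the wifi card; adjust to match your system
echo "Listening for device crashes"
last=$(date +%s)
inst=0
dmesg -W -Lnever | grep --line-buffered "iwlwifi $dev: Failed to wake NIC for hcmd" | while read -r l
do
    inst=$((inst + 1))
    # Act on every second matching message.
    if [ "$inst" -ge 2 ]
    then
        cur=$(date +%s)
        dif=$((cur - last))
        # Skip if the device was cycled less than 5 seconds ago.
        if [ "$dif" -ge 5 ]
        then
            printf "Detected crash, cycling device..."
            # Remove the device at the PCI level, then rescan to bring it back.
            echo 1 > "/sys/bus/pci/devices/$dev/remove"
            sleep 1
            echo 1 > /sys/bus/pci/rescan
            echo " done"
            last=$(date +%s)
        fi
        inst=0
    fi
done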

When the driver crashes, it takes about 5-10 seconds for it to automatically recover if this script is running as superuser.

Przemek K. (azrael) wrote : AlsaInfo.txt

apport information

tags: added: apport-collected
description: updated
description: updated

Przemek K. (azrael) wrote : CRDA.txt

apport information

Przemek K. (azrael) wrote : CurrentDmesg.txt

apport information

Przemek K. (azrael) wrote : IwConfig.txt

apport information

Przemek K. (azrael) wrote : Lspci.txt

apport information

Przemek K. (azrael) wrote : Lspci-vt.txt

apport information

Przemek K. (azrael) wrote : Lsusb.txt

apport information

Przemek K. (azrael) wrote : Lsusb-t.txt

apport information

Przemek K. (azrael) wrote : Lsusb-v.txt

apport information

Przemek K. (azrael) wrote : ProcCpuinfo.txt

apport information

Przemek K. (azrael) wrote : ProcCpuinfoMinimal.txt

apport information

Przemek K. (azrael) wrote : ProcInterrupts.txt

apport information

Przemek K. (azrael) wrote : ProcModules.txt

apport information

Przemek K. (azrael) wrote : RfKill.txt

apport information

Przemek K. (azrael) wrote : UdevDb.txt

apport information

Przemek K. (azrael) wrote : WifiSyslog.txt

apport information

Przemek K. (azrael) wrote : acpidump.txt

apport information

Przemek K. (azrael) wrote :

The problem has gotten worse: I'm unable to use my wifi because it disconnects every 1-2 minutes. It happens mostly after resuming from sleep.
The workaround script helps only for another 1-3 minutes.
