System hangs after iwlwifi firmware crash

Bug #1728651 reported by Bruce Duncan on 2017-10-30
This bug affects 5 people
Affects: linux-firmware (Ubuntu)
Importance: Undecided
Assigned to: Unassigned

Bug Description

Since upgrading to Kubuntu 17.10 my HP EliteBook 820 G3 hangs at unpredictable times. I have experienced probably 20 hangs. Once the system is hung, no mouse movement, NumLock toggle, VT switch, Ctrl+Alt+Del, SysRq key, or SSH attempt has any effect whatsoever. Any audio playing is stuck looping in the hardware buffer. The system is really stuck. Only the keyboard backlight is still responsive.

The only correlation I have noticed is that the system only hangs if WiFi is enabled. This morning I experienced two hangs in ten minutes necessitating reboots. Having disabled WiFi the system has been stable since ~1100 (~5 hours).

Often, I see an iwlwifi hardware reset in the logs before the system dies:

/var/log/syslog:Oct 30 11:49:30 fry kernel: [ 6529.550751] iwlwifi 0000:02:00.0: Microcode SW error detected. Restarting 0x82000000.

(apport has hopefully uploaded the full log, please lmk if not). Sometimes the message doesn't make it into the logs, and ext4 truncates the file at the next mount.
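For anyone who wants to check their own logs for the same signature, here is a minimal Python sketch that scans syslog lines for the "Microcode SW error detected" message shown above. The function names (`iter_log_lines`, `find_microcode_errors`) are hypothetical, and the regular expression assumes the classic syslog line layout seen in this report; other log formats would need a different pattern.

```python
import gzip
import re

# Matches kernel log lines of the form seen in this report, e.g.
#   Oct 30 11:49:30 fry kernel: [ 6529.550751] iwlwifi 0000:02:00.0: Microcode SW error detected. ...
ERROR_RE = re.compile(
    r"(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+\S+\s+kernel:\s+"
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+iwlwifi\s+(?P<pci>[0-9a-f:.]+):\s+"
    r"Microcode SW error detected"
)


def iter_log_lines(path):
    """Yield lines from a plain or gzip-compressed (rotated) log file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", errors="replace") as handle:
        for line in handle:
            yield line


def find_microcode_errors(lines):
    """Return (timestamp, seconds since boot, PCI address) for each crash line."""
    hits = []
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            hits.append(
                (match.group("stamp"), float(match.group("uptime")), match.group("pci"))
            )
    return hits
```

Something like `find_microcode_errors(iter_log_lines("/var/log/syslog"))` would list the crashes that did reach the disk. As noted above, the last message before a hang may never be written, so an empty result does not rule out the bug.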

There seems to be no correlation with any messages immediately before the wifi chip dies. It doesn't seem to matter whether I'm at home or at work. I have wired ethernet at work simultaneously with wifi, but the wifi provides IPv6 so I would imagine most traffic uses wifi, so it's hard to say whether there's any effect of the traffic load on the probability of failure.

Here is a sample of the firmware load:

/var/log/syslog.5.gz:Oct 24 15:37:26 fry kernel: [ 7.604743] iwlwifi 0000:02:00.0: enabling device (0000 -> 0002)
/var/log/syslog.5.gz:Oct 24 15:37:26 fry kernel: [ 7.611790] iwlwifi 0000:02:00.0: Direct firmware load for iwlwifi-8000C-33.ucode failed with error -2
/var/log/syslog.5.gz:Oct 24 15:37:26 fry kernel: [ 7.611930] iwlwifi 0000:02:00.0: Direct firmware load for iwlwifi-8000C-32.ucode failed with error -2
/var/log/syslog.5.gz:Oct 24 15:37:26 fry kernel: [ 7.620028] iwlwifi 0000:02:00.0: loaded firmware version 31.532993.0 op_mode iwlmvm
/var/log/syslog.5.gz:Oct 24 15:37:26 fry kernel: [ 7.650972] iwlwifi 0000:02:00.0: Detected Intel(R) Dual Band Wireless AC 8260, REV=0x208

02:00.0 Network controller: Intel Corporation Wireless 8260 (rev 3a)
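As a reading aid for the firmware load sample: error -2 is -ENOENT ("No such file or directory"), so the two failed lines suggest the driver asked for newer firmware files that are not installed before settling on version 31. A small Python sketch that pulls this out of the log lines, assuming the message wording shown above (the helper names `errno_name` and `summarize_firmware_load` are hypothetical):

```python
import errno
import re

FAILED_RE = re.compile(r"Direct firmware load for (\S+) failed with error (-?\d+)")
LOADED_RE = re.compile(r"loaded firmware version (\S+)")


def errno_name(code):
    """Translate a negative kernel error code such as -2 into its symbolic name."""
    return errno.errorcode.get(-code, "unknown")


def summarize_firmware_load(lines):
    """Collect the firmware files that failed to load and the version finally loaded."""
    failed = []
    loaded = None
    for line in lines:
        match = FAILED_RE.search(line)
        if match:
            failed.append((match.group(1), int(match.group(2))))
            continue
        match = LOADED_RE.search(line)
        if match:
            loaded = match.group(1)
    return {"failed": failed, "loaded": loaded}
```

Run against the five lines above, this reports `iwlwifi-8000C-33.ucode` and `iwlwifi-8000C-32.ucode` as failed with -2 (ENOENT) and `31.532993.0` as the loaded version.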

Any help appreciated.

Thanks,
Bruce

ProblemType: Bug
DistroRelease: Ubuntu 17.10
Package: xorg 1:7.7+19ubuntu3
ProcVersionSignature: Ubuntu 4.13.0-16.19-generic 4.13.4
Uname: Linux 4.13.0-16-generic x86_64
ApportVersion: 2.20.7-0ubuntu3.1
Architecture: amd64
CompositorRunning: None
CurrentDesktop: KDE
Date: Mon Oct 30 16:15:43 2017
DistUpgraded: 2017-10-15 14:47:25,517 DEBUG Running PostInstallScript: './xorg_fix_proprietary.py'
DistroCodename: artful
DistroVariant: kubuntu
GraphicsCard:
 Intel Corporation HD Graphics 520 [8086:1916] (rev 07) (prog-if 00 [VGA controller])
   Subsystem: Hewlett-Packard Company HD Graphics 520 [103c:807c]
InstallationDate: Installed on 2016-09-09 (416 days ago)
InstallationMedia: Kubuntu 16.04.1 LTS "Xenial Xerus" - Release amd64 (20160719)
MachineType: HP HP EliteBook 820 G3
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.13.0-16-generic.efi.signed root=/dev/mapper/kubuntu--vg-root ro
SourcePackage: xorg
Symptom: display
UpgradeStatus: Upgraded to artful on 2017-10-15 (15 days ago)
dmi.bios.date: 11/01/2016
dmi.bios.vendor: HP
dmi.bios.version: N75 Ver. 01.13
dmi.board.name: 807C
dmi.board.vendor: HP
dmi.board.version: KBC Version 85.74
dmi.chassis.asset.tag: 5CG6354JW5
dmi.chassis.type: 10
dmi.chassis.vendor: HP
dmi.modalias: dmi:bvnHP:bvrN75Ver.01.13:bd11/01/2016:svnHP:pnHPEliteBook820G3:pvr:rvnHP:rn807C:rvrKBCVersion85.74:cvnHP:ct10:cvr:
dmi.product.family: 103C_5336AN G=N L=BUS B=HP S=ELI
dmi.product.name: HP EliteBook 820 G3
dmi.sys.vendor: HP
version.compiz: compiz N/A
version.libdrm2: libdrm2 2.4.83-1
version.libgl1-mesa-dri: libgl1-mesa-dri 17.2.2-0ubuntu1
version.libgl1-mesa-glx: libgl1-mesa-glx 17.2.2-0ubuntu1
version.xserver-xorg-core: xserver-xorg-core 2:1.19.5-0ubuntu2
version.xserver-xorg-input-evdev: xserver-xorg-input-evdev 1:2.10.5-1ubuntu1
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:7.10.0-1
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.99.917+git20170309-0ubuntu1
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:1.0.15-2

Bruce Duncan (bwduncan) wrote :

I don't easily see the firmware hang message in the attached files, so I'm pasting it here:

Microcode SW error detected. Restarting 0x82000000.
Start IWL Error Log Dump:
Status: 0x00000200, count: 6
0x00000038 | BAD_COMMAND
0x00000220 | trm_hw_status0
0x00000000 | trm_hw_status1
0x00010040 | branchlink2
0x00028DA6 | interruptlink1
0x00000000 | interruptlink2
0x009A001C | data1
0x0000029B | data2
0x0000009B | data3
0x31818996 | beacon time
0xDA54AE68 | tsf low
0x00000927 | tsf hi
0x00000000 | time gp1
0x34A281CF | time gp2
0x00000001 | uCode revision type
0x0000001F | uCode version major
0x00082201 | uCode version minor
0x00000201 | hw version
0x00489008 | board version
0x009A001C | hcmd
0x24022080 | isr0
0x01000000 | isr1
0x2820180A | isr2
0x00417CC0 | isr3
0x00000000 | isr4
0x0099014E | last cmd Id
0x00000000 | wait_event
0x00004288 | l2p_control
0x00018030 | l2p_duration
0x000003BF | l2p_mhvalid
0x000000E7 | l2p_addr_match
0x0000000D | lmpm_pmg_sel
0x15062149 | timestamp
0x0000E0F0 | flow_handler
Start IWL Error Log Dump:
Status: 0x00000200, count: 7
0x00000070 | ADVANCED_SYSASSERT
0x00000000 | umac branchlink1
0xC0086B70 | umac branchlink2
0xC00844E0 | umac interruptlink1
0xC00844E0 | umac interruptlink2
0x00000800 | umac data1
0xC00844E0 | umac data2
0xDEADBEEF | umac data3
0x0000001F | umac major
0x00082201 | umac minor
0xC088627C | frame pointer
0xC088627C | stack pointer
0x009A001C | last host cmd
0x00000000 | isr status reg
FW Error notification: type 0x00000000 cmd_id 0x1C
FW Error notification: seq 0x009A service 0x0000001C
FW Error notification: timestamp 0x 34A1CCC1

(I trimmed words from the start of each line like: "Oct 23 16:31:59 fry kernel: [29533.463611] iwlwifi 0000:02:00.0:")
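The dump is easier to compare across machines once the syslog prefix is stripped and the hex fields are decoded. Below is a minimal Python sketch that does both. Two things in it are inferences from this report rather than documented behaviour: that "uCode version major/minor" are the hex form of the first two parts of the loaded firmware version, and that `hcmd` packs a sequence number in its upper 16 bits and a command ID in its lower 16 bits (it lines up with "seq 0x009A" and "cmd_id 0x1C" in the FW Error notification lines). All function names are hypothetical.

```python
import re

# Strips everything up to and including "iwlwifi <pci address>: ", if present.
PREFIX_RE = re.compile(r"^.*?iwlwifi\s+[0-9a-f:.]+:\s+")
# Matches dump rows of the form "0x0000001F | uCode version major".
FIELD_RE = re.compile(r"^(0x[0-9A-Fa-f]+)\s+\|\s+(.+)$")


def parse_dump(lines):
    """Map each dump field name to its integer value (first occurrence wins)."""
    fields = {}
    for line in lines:
        body = PREFIX_RE.sub("", line.strip())
        match = FIELD_RE.match(body)
        if match:
            fields.setdefault(match.group(2).strip(), int(match.group(1), 16))
    return fields


def firmware_version(fields):
    """Render major.minor in decimal, as in the 'loaded firmware version' message."""
    return "{}.{}".format(fields["uCode version major"], fields["uCode version minor"])


def split_hcmd(value):
    """Split hcmd into (sequence, command id); an assumption based on this dump."""
    return value >> 16, value & 0xFFFF
```

For the dump above, `firmware_version` gives `31.532993` (0x1F is 31 and 0x00082201 is 532993), which agrees with the "loaded firmware version 31.532993.0" line in the description, and `split_hcmd(0x009A001C)` gives `(0x9A, 0x1C)`.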

Bruce Duncan (bwduncan) on 2017-10-30
affects: xorg (Ubuntu) → linux-firmware (Ubuntu)
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-firmware (Ubuntu):
status: New → Confirmed
Peter Windridge (twowheels) wrote :

I believe I also have this bug on a HP Envy 13-ab002na. It is a vanilla 17.10 install.

As with @bwduncan the laptop becomes completely unresponsive to any inputs and needs hard power off. The last thing in /var/log/kern.log is iwlwifi Microcode SW error detected. It has happened 3 times in the last 30 min. It had previously been less frequent and mostly only when the machine was sleeping.

If I disable WiFi and tether to my phone (via USB) I have not yet experienced such a lockup.

I installed 17.10 from new with this laptop so cannot say whether it is 17.10 specific (previously had a ThinkPad; seems I had a narrow escape with the BIOS write bit!).

[...]
Hardware name: HP HP ENVY Notebook 13-ab0XX/82B9, BIOS F.04 09/29/2016
[...]

Dec 30 21:35:56 schramm kernel: [ 1364.091405] iwlwifi 0000:01:00.0: Microcode SW error detected. Restarting 0x82000000.
Dec 30 21:35:56 schramm kernel: [ 1364.091570] iwlwifi 0000:01:00.0: Start IWL Error Log Dump:
Dec 30 21:35:56 schramm kernel: [ 1364.091577] iwlwifi 0000:01:00.0: Status: 0x00000200, count: 6
Dec 30 21:35:56 schramm kernel: [ 1364.091583] iwlwifi 0000:01:00.0: Loaded firmware version: 29.610311.0
Dec 30 21:35:56 schramm kernel: [ 1364.091589] iwlwifi 0000:01:00.0: 0x00000038 | BAD_COMMAND
Dec 30 21:35:56 schramm kernel: [ 1364.091594] iwlwifi 0000:01:00.0: 0x00000220 | trm_hw_status0
Dec 30 21:35:56 schramm kernel: [ 1364.091599] iwlwifi 0000:01:00.0: 0x00000000 | trm_hw_status1
Dec 30 21:35:56 schramm kernel: [ 1364.091603] iwlwifi 0000:01:00.0: 0x00043D58 | branchlink2
Dec 30 21:35:56 schramm kernel: [ 1364.091608] iwlwifi 0000:01:00.0: 0x0004B016 | interruptlink1
Dec 30 21:35:56 schramm kernel: [ 1364.091612] iwlwifi 0000:01:00.0: 0x00000000 | interruptlink2
Dec 30 21:35:56 schramm kernel: [ 1364.091617] iwlwifi 0000:01:00.0: 0x0072001C | data1
Dec 30 21:35:56 schramm kernel: [ 1364.091621] iwlwifi 0000:01:00.0: 0x00000573 | data2
Dec 30 21:35:56 schramm kernel: [ 1364.091625] iwlwifi 0000:01:00.0: 0x00000073 | data3
Dec 30 21:35:56 schramm kernel: [ 1364.091630] iwlwifi 0000:01:00.0: 0xD4817170 | beacon time
Dec 30 21:35:56 schramm kernel: [ 1364.091634] iwlwifi 0000:01:00.0: 0x6C4BEE92 | tsf low
Dec 30 21:35:56 schramm kernel: [ 1364.091638] iwlwifi 0000:01:00.0: 0x00000090 | tsf hi
Dec 30 21:35:56 schramm kernel: [ 1364.091643] iwlwifi 0000:01:00.0: 0x00000000 | time gp1
Dec 30 21:35:56 schramm kernel: [ 1364.091647] iwlwifi 0000:01:00.0: 0x509414C2 | time gp2
Dec 30 21:35:56 schramm kernel: [ 1364.091651] iwlwifi 0000:01:00.0: 0x00000001 | uCode revision type
Dec 30 21:35:56 schramm kernel: [ 1364.091656] iwlwifi 0000:01:00.0: 0x0000001D | uCode version major
Dec 30 21:35:56 schramm kernel: [ 1364.091660] iwlwifi 0000:01:00.0: 0x00095007 | uCode version minor
Dec 30 21:35:56 schramm kernel: [ 1364.091665] iwlwifi 0000:01:00.0: 0x00000210 | hw version
Dec 30 21:35:56 schramm kernel: [ 1364.091669] iwlwifi 0000:01:00.0: 0x00C89200 | board version
Dec 30 21:35:56 schramm kernel: [ 1364.091674] iwlwifi 0000:01:00.0: 0x0072001C | hcmd
Dec 30 21:35:56 schramm kernel: [ 1364.091678] iwlwifi 0000:01:00.0: 0x24022080 | isr0
Dec 30 21:35:56 schramm kernel: [ 1364.091682] i...


Peter Windridge (twowheels) wrote :

Based on skimming https://bugzilla.kernel.org/show_bug.cgi?id=195299 I think this might be fixed upstream but didn't make it to kernel 4.13. When will Ubuntu move to 4.14? :)

For now I am trying an older microcode (27.541033.0), but I am not optimistic.
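On which firmware file actually gets used: the load log in the bug description shows the driver requesting iwlwifi-8000C-33.ucode, then iwlwifi-8000C-32.ucode, and finally loading version 31, which looks like a downward walk over API numbers until a file is found. A simplified Python model of that probe order follows. It is a sketch only: `pick_firmware`, the default `prefix`, and the `max_api`/`min_api` bounds are hypothetical values for illustration, and the real driver logic may differ.

```python
def pick_firmware(installed, prefix="iwlwifi-8000C-", max_api=33, min_api=0):
    """Return the highest-numbered firmware file present, probing downward.

    `installed` is any collection of file names, for example the result of
    os.listdir("/lib/firmware"). Returns None if nothing matches.
    """
    for api in range(max_api, min_api - 1, -1):
        name = "{}{}.ucode".format(prefix, api)
        if name in installed:
            return name
    return None
```

Under this model, making an older microcode take effect means the newer files must not be found first, for example `pick_firmware({"iwlwifi-8000C-27.ucode"})` returns the 27 file only because 28 to 33 are absent from the set.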

Ulli Dahoam (ulli2k) wrote :

I have the same issue on the NUC7i3BN!
Is there already a bugfix available?


Am suffering from the same bug on a Thinkpad T470s with an Intel 8265 wifi chip running Ubuntu 17.10, Linux kernel 4.13.0-37-generic, and iwlwifi 31.560484.0. There are multiple reports like the errors below in kern.log. After a while, the machine completely locks up. Interestingly, the system only seems to lock up when transferring files using Syncthing or Resilio Sync.

Mar 31 15:36:50 albatross kernel: [66351.160529] iwlwifi 0000:3a:00.0: Microcode SW error detected. Restarting 0x82000000.
Mar 31 15:36:50 albatross kernel: [66351.160659] iwlwifi 0000:3a:00.0: Start IWL Error Log Dump:
Mar 31 15:36:50 albatross kernel: [66351.160661] iwlwifi 0000:3a:00.0: Status: 0x00000200, count: 6
Mar 31 15:36:50 albatross kernel: [66351.160662] iwlwifi 0000:3a:00.0: Loaded firmware version: 31.560484.0
Mar 31 15:36:50 albatross kernel: [66351.160664] iwlwifi 0000:3a:00.0: 0x00000038 | BAD_COMMAND
Mar 31 15:36:50 albatross kernel: [66351.160665] iwlwifi 0000:3a:00.0: 0x00000220 | trm_hw_status0
Mar 31 15:36:50 albatross kernel: [66351.160667] iwlwifi 0000:3a:00.0: 0x00000000 | trm_hw_status1
Mar 31 15:36:50 albatross kernel: [66351.160668] iwlwifi 0000:3a:00.0: 0x0002495C | branchlink2
Mar 31 15:36:50 albatross kernel: [66351.160670] iwlwifi 0000:3a:00.0: 0x0003962E | interruptlink1
Mar 31 15:36:50 albatross kernel: [66351.160671] iwlwifi 0000:3a:00.0: 0x00000000 | interruptlink2
Mar 31 15:36:50 albatross kernel: [66351.160673] iwlwifi 0000:3a:00.0: 0x0065001C | data1
Mar 31 15:36:50 albatross kernel: [66351.160674] iwlwifi 0000:3a:00.0: 0x00000066 | data2
Mar 31 15:36:50 albatross kernel: [66351.160675] iwlwifi 0000:3a:00.0: 0x00000067 | data3
Mar 31 15:36:50 albatross kernel: [66351.160677] iwlwifi 0000:3a:00.0: 0x00016976 | beacon time
Mar 31 15:36:50 albatross kernel: [66351.160678] iwlwifi 0000:3a:00.0: 0x0398C5D9 | tsf low
Mar 31 15:36:50 albatross kernel: [66351.160680] iwlwifi 0000:3a:00.0: 0x00000014 | tsf hi
Mar 31 15:36:50 albatross kernel: [66351.160681] iwlwifi 0000:3a:00.0: 0x00000000 | time gp1
Mar 31 15:36:50 albatross kernel: [66351.160682] iwlwifi 0000:3a:00.0: 0x000C98DD | time gp2
Mar 31 15:36:50 albatross kernel: [66351.160684] iwlwifi 0000:3a:00.0: 0x00000001 | uCode revision type
Mar 31 15:36:50 albatross kernel: [66351.160685] iwlwifi 0000:3a:00.0: 0x0000001F | uCode version major
Mar 31 15:36:50 albatross kernel: [66351.160687] iwlwifi 0000:3a:00.0: 0x00088D64 | uCode version minor
Mar 31 15:36:50 albatross kernel: [66351.160688] iwlwifi 0000:3a:00.0: 0x00000230 | hw version
Mar 31 15:36:50 albatross kernel: [66351.160689] iwlwifi 0000:3a:00.0: 0x00C89000 | board version
Mar 31 15:36:50 albatross kernel: [66351.160691] iwlwifi 0000:3a:00.0: 0x0065001C | hcmd
Mar 31 15:36:50 albatross kernel: [66351.160692] iwlwifi 0000:3a:00.0: 0x24022082 | isr0
Mar 31 15:36:50 albatross kernel: [66351.160694] iwlwifi 0000:3a:00.0: 0x00000000 | isr1
Mar 31 15:36:50 albatross kernel: [66351.160695] iwlwifi 0000:3a:00.0: 0x28201802 | isr2
Mar 31 15:36:50 albatross kernel: [66351.160697] iwlwifi 0000:3a:00.0: 0x004120C0 | isr3
Mar 31 15:36:50 albatross kernel: [66351.160698] iwlwifi 0000:3a:00.0: 0x00000000 | isr4
Mar 31 15:36:50 a...


Ped (ped) wrote :

I switched to mainline kernel 4.15.13 about 10 days ago, and so far not a single freeze has happened.

I followed the instructions on this page: https://wiki.ubuntu.com/Kernel/MainlineBuilds

I'm on the KDE Neon distro, which is basically Ubuntu 16.04 LTS (with the latest KDE packages on top of it).

It was a very annoying period (a full 3 months?) on the 4.13 kernel, with freezes at least 2-3 times per week. I wonder whether enough users are affected to justify accelerating the kernel update for ordinary users?

Upon upgrading to Ubuntu 18.04 beta, which runs kernel 4.15.0-13-generic and iwlwifi 8265 driver firmware 34.0.1, this problem seems to have gone away.

On Ubuntu 18.04, the warnings in kern.log still happen, and a kernel oops is reported, but the system doesn't hang.

Just had a system hang on 18.04 due to this. The hangs are definitely much rarer, but do happen.

Ped (ped) wrote :

I'm now on the 4.16.3 kernel, and while I haven't encountered a freeze since moving to 4.15.13+, the WiFi connection becomes slow/unstable after a few minutes. I'm not sure whether this is connected to the same part of the code or whether my HW has degraded a bit in the meantime (did the 4.4 kernel work without a hitch? I may boot it for a few days to see whether it's a HW issue or still a regression in the kernel and WiFi card driver).

For now this is just a caveat to my post above, so that people don't expect everything to work perfectly after a kernel update. YMMV. For me, at least, it doesn't freeze any more.
