Kernel Call Trace After Disabling an ath10k Wireless Device (Atheros QCA6174 802.11ac (rev 32))

Bug #1670706 reported by Dmitrii Shcherbakov
This bug affects 2 people
Affects: linux (Ubuntu)
Status: Confirmed
Importance: Medium
Assigned to: Unassigned

Bug Description

I disabled the wireless adapter via NetworkManager and got a kernel call trace in dmesg.

3b:00.0 Network controller: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter (rev 32)

uname -r
4.10.0-9-generic

https://paste.ubuntu.com/24131142/

There are also USB hot-plug messages interleaved with the ath10k ones. That is because I plugged in a USB Type-C dock immediately after disabling the wireless adapter, so please ignore those.

ProblemType: Bug
DistroRelease: Ubuntu 17.04
Package: linux-image-4.10.0-9-generic 4.10.0-9.11
ProcVersionSignature: Ubuntu 4.10.0-9.11-generic 4.10.0
Uname: Linux 4.10.0-9-generic x86_64
NonfreeKernelModules: nvidia_drm nvidia_modeset nvidia
ApportVersion: 2.20.4-0ubuntu2
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: dima 3007 F.... pulseaudio
CurrentDesktop: Unity:Unity7
Date: Tue Mar 7 17:28:37 2017
InstallationDate: Installed on 2017-02-27 (8 days ago)
InstallationMedia: Ubuntu 17.04 "Zesty Zapus" - Alpha amd64 (20170227)
MachineType: Razer Blade
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.10.0-9-generic.efi.signed root=UUID=3f515c94-cd91-48b4-80f6-84ec24cb7b8f ro rootflags=subvol=@ quiet button.lid_init_state=open pcie_aspm=off
RelatedPackageVersions:
 linux-restricted-modules-4.10.0-9-generic N/A
 linux-backports-modules-4.10.0-9-generic N/A
 linux-firmware 1.163
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 01/10/2017
dmi.bios.vendor: Razer
dmi.bios.version: 1.00
dmi.board.name: Razer
dmi.board.vendor: Razer
dmi.chassis.type: 9
dmi.chassis.vendor: Razer
dmi.modalias: dmi:bvnRazer:bvr1.00:bd01/10/2017:svnRazer:pnBlade:pvr6.06:rvnRazer:rnRazer:rvr:cvnRazer:ct9:cvr:
dmi.product.name: Blade
dmi.product.version: 6.06
dmi.sys.vendor: Razer

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :
Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Joseph Salisbury (jsalisbury) wrote : Re: Kernel Call Trace After Disabling an ath10k Wireless Device

Did this issue start happening after an update/upgrade? Was there a prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.11 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.11-rc1/

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
Dmitrii Shcherbakov (dmitriis) wrote :

Cannot reliably reproduce it even on the same kernel - will keep trying.

Dmitrii Shcherbakov (dmitriis) wrote :

Reproduced on 4.10.0-11-generic after a long period of laptop inactivity (see traces at the bottom).

https://paste.ubuntu.com/24184748/

Tried to unload and reload modules afterwards:

[99449.709484] ath10k_pci 0000:3b:00.0: failed to read device register, device is gone
[99449.709487] ath10k_pci 0000:3b:00.0: failed to reset chip: -5
[99449.710036] ath10k_pci: probe of 0000:3b:00.0 failed with error -5
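When a module reload fails with "device is gone", another thing that is sometimes tried is removing the PCI function and rescanning the bus through sysfs. The sketch below assumes the 0000:3b:00.0 address from the log above. The `pci_reprobe` name and its optional sysfs-root argument are illustrative only, and this is unlikely to help if the device has really dropped off the bus.

```shell
#!/bin/sh
# pci_reprobe BDF [SYSFS_ROOT]
# Ask the kernel to drop a PCI function and then rescan the bus.
# SYSFS_ROOT defaults to /sys; it is a parameter only so the function
# can be exercised against a scratch directory. Hypothetical helper.
pci_reprobe() {
    bdf=$1
    root=${2:-/sys}
    dev="$root/bus/pci/devices/$bdf"
    # The device node may already be missing if the kernel gave up on it.
    if [ -e "$dev/remove" ]; then
        echo 1 > "$dev/remove" || return 1
    fi
    echo 1 > "$root/bus/pci/rescan"
}

# Typical use (as root), unloading the driver first:
#   modprobe -r ath10k_pci
#   pci_reprobe 0000:3b:00.0
#   modprobe ath10k_pci
#   dmesg | tail
```

If the probe still fails with -5 after a rescan, that points away from driver state and towards the link or the hardware itself.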

Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired
Changed in linux (Ubuntu):
status: Expired → Incomplete
tags: added: kernel-da-key
Changed in linux (Ubuntu):
status: Incomplete → Expired
Dmitrii Shcherbakov (dmitriis) wrote :

Still getting the same on the released 4.12 kernel.

Observations this time:

* Within a good range of an access point
* No power events (haven't closed a lid or anything like that)
* No high load on the wireless card

[12920.203097] usbcore: registered new interface driver snd-usb-audio
[15764.155095] perf: interrupt took too long (2522 > 2500), lowering kernel.perf_event_max_sample_rate to 79250

# no messages at all for a while

[17837.193714] ath10k_pci 0000:3b:00.0: failed to wake target for write32 of 0x00000a0b at 0x0003543c: -110
[17838.346474] ath10k_pci 0000:3b:00.0: failed to wake target for read32 at 0x0003a028: -110
[17840.125795] ath10k_pci 0000:3b:00.0: failed to wake target for read32 at 0x0003a028: -110

[18026.693944] ath10k_pci 0000:3b:00.0: failed to wake target for write32 of 0x00000c19 at 0x0003543c: -110
[18027.359637] ath10k_pci 0000:3b:00.0: failed to start hw scan: -11
[18027.717997] ath10k_pci 0000:3b:00.0: failed to wake target for write32 of 0x00000c1b at 0x0003543c: -110
[18028.741757] ath10k_pci 0000:3b:00.0: failed to wake target for write32 of 0x00000c1d at 0x0003543c: -110
[18029.765882] ath10k_pci 0000:3b:00.0: failed to wake target for write32 of 0x00000c1f at 0x0003543c: -110
[18030.788925] ath10k_pci 0000:3b:00.0: failed to wake target for write32 of 0x00000c21 at 0x0003543c: -110
[18031.455584] ath10k_pci 0000:3b:00.0: failed to start hw scan: -11
[18031.809091] ath10k_pci 0000:3b:00.0: failed to wake target for write32 of 0x00000c23 at 0x0003543c: -110
[18032.837767] ath10k_pci 0000:3b:00.0: failed to wake target for write32 of 0x00000c25 at 0x0003543c: -110
[18033.861726] ath10k_pci 0000:3b:00.0: failed to wake target for write32 of 0x00000c27 at 0x0003543c: -110
[18034.783615] ath10k_pci 0000:3b:00.0: failed to delete WMI vdev 1: -11
[18034.885704] ath10k_pci 0000:3b:00.0: failed to wake target for write32 of 0x00000c29 at 0x0003543c: -110
[18035.909919] ath10k_pci 0000:3b:00.0: failed to wake target for write32 of 0x00000c2b at 0x0003543c: -110
[18036.933988] ath10k_pci 0000:3b:00.0: failed to wake target for write32 of 0x00000c2d at 0x0003543c: -110
[18037.855672] ath10k_pci 0000:3b:00.0: failed to set 2g txpower 20: -11
[18037.855676] ath10k_pci 0000:3b:00.0: failed to setup tx power 20: -11
[18037.855679] ath10k_pci 0000:3b:00.0: failed to recalc tx power: -11
[18037.961916] ath10k_pci 0000:3b:00.0: failed to wake target for write32 of 0x00000c2f at 0x0003543c: -110
[18038.981786] ath10k_pci 0000:3b:00.0: failed to wake target for write32 of 0x00000c31 at 0x0003543c: -110
[18040.005258] ath10k_pci 0000:3b:00.0: failed to wake target for write32 of 0x00000c33 at 0x0003543c: -110
[18040.927892] ath10k_pci 0000:3b:00.0: failed to set inactivity time for vdev 0: -11
[18040.927896] ath10k_pci 0000:3b:00.0: failed to setup powersave: -11
[18040.927921] wlp59s0: deauthenticating from 2c:e6:cc:27:91:68 by local choice (Reason: 3=DEAUTH_LEAVING)
[18041.029995] ath10k_pci 0000:3b:00.0: failed to wake target for write32 of 0x00000c35 at 0x0003543c: -110
[18042.054088] ath10k_pci 0000:3b:00.0: failed to wake target for write32 of 0x00000c37 at 0x0003543c: -110
[18043.077916] ath10k_pci 0000:3b:00.0: failed to...
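To get a quick picture of which failures dominate a log like the one above, the ath10k_pci lines can be tallied by their trailing error code. This is only a sketch: `count_ath10k_errnos` is a made-up helper name, and it assumes every line of interest ends in `: -<number>`. On Linux, -110 is ETIMEDOUT and -11 is EAGAIN.

```shell
#!/bin/sh
# Tally ath10k_pci kernel messages by their trailing errno value.
# count_ath10k_errnos is a hypothetical helper, not part of any tool.
# Reads dmesg-style text on stdin and prints "<count> <errno>" lines,
# most frequent first. Lines without a trailing ": -N" are skipped.
count_ath10k_errnos() {
    grep 'ath10k_pci' \
        | sed -n 's/.*: \(-[0-9][0-9]*\)$/\1/p' \
        | sort \
        | uniq -c \
        | awk '{ print $1, $2 }' \
        | sort -k1,1nr
}

# Typical use (reading dmesg may need root on some systems):
#   dmesg | count_ath10k_errnos
```

A log dominated by -110 with a trail of -11 would fit a device that stopped answering, with later commands failing as a consequence.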

Dmitrii Shcherbakov (dmitriis) wrote :
Changed in linux (Ubuntu):
status: Expired → New
summary: - Kernel Call Trace After Disabling an ath10k Wireless Device
+ Kernel Call Trace After Disabling an ath10k Wireless Device (Atheros
+ QCA6174 802.11ac (rev 32))
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Kai-Heng Feng (kaihengfeng) wrote :

Please add "ath10k_core.debug_mask=0x4041" to the kernel parameters and attach dmesg when the issue happens.
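For anyone following along, one common way to apply this on Ubuntu is through GRUB; the same value can usually also be written at runtime through sysfs. The sketch below assumes a stock /etc/default/grub, and `cmdline_has_param` is a hypothetical helper for checking after reboot that the option actually reached the kernel command line.

```shell
#!/bin/sh
# Persistent: add the option to GRUB_CMDLINE_LINUX_DEFAULT in
# /etc/default/grub, for example
#   GRUB_CMDLINE_LINUX_DEFAULT="quiet ath10k_core.debug_mask=0x4041"
# then run: sudo update-grub && sudo reboot
#
# Runtime (assumes ath10k_core is loaded and the parameter is writable):
#   echo 0x4041 | sudo tee /sys/module/ath10k_core/parameters/debug_mask

# cmdline_has_param PARAM [FILE]
# Succeed if the kernel command line in FILE (default /proc/cmdline)
# contains PARAM as an exact, space-separated word. Hypothetical helper.
cmdline_has_param() {
    param=$1
    file=${2:-/proc/cmdline}
    tr ' ' '\n' < "$file" | grep -qxF -- "$param"
}

# Example check after reboot:
#   cmdline_has_param ath10k_core.debug_mask=0x4041 && echo "debug mask set"
```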

Dmitrii Shcherbakov (dmitriis) wrote :

Got a repro on 4.13.0-041300rc7 with the debug mask above set.

Dmitrii Shcherbakov (dmitriis) wrote :
Kai-Heng Feng (kaihengfeng) wrote :

Seems like the PCI device no longer responds, so there are lots of -ETIMEDOUT.

Does the same thing happen when you press the wireless on/off hotkey?

Dmitrii Shcherbakov (dmitriis) wrote :

I don't have a special wireless off/on button on my keyboard but I'm assuming rfkill <block|unblock> <id> will do (soft kill switch).

The issue is not triggered by the following loop (I may leave it running for several hours to get a better picture, and modify it to wait more intelligently):

➜ ~ while test 1 ; do rfkill block 1 ; sleep 10 ; rfkill unblock 1 ; sleep 10 ; done
сен 05 01:26:13 blade kernel: wlp59s0: deauthenticating from 18:d6:c7:b0:26:ee by local choice (Reason: 3=DEAUTH_LEAVING)
сен 05 01:26:24 blade kernel: IPv6: ADDRCONF(NETDEV_UP): wlp59s0: link is not ready
сен 05 01:26:24 blade kernel: IPv6: ADDRCONF(NETDEV_UP): wlp59s0: link is not ready
сен 05 01:26:29 blade kernel: IPv6: ADDRCONF(NETDEV_UP): wlp59s0: link is not ready
сен 05 01:26:29 blade kernel: wlp59s0: authenticate with 18:d6:c7:b0:26:ee
сен 05 01:26:29 blade kernel: wlp59s0: send auth to 18:d6:c7:b0:26:ee (try 1/3)
сен 05 01:26:29 blade kernel: wlp59s0: authenticated
сен 05 01:26:29 blade kernel: wlp59s0: associate with 18:d6:c7:b0:26:ee (try 1/3)
сен 05 01:26:29 blade kernel: wlp59s0: RX AssocResp from 18:d6:c7:b0:26:ee (capab=0x431 status=0 aid=1)
сен 05 01:26:29 blade kernel: wlp59s0: associated
сен 05 01:26:29 blade kernel: IPv6: ADDRCONF(NETDEV_CHANGE): wlp59s0: link becomes ready
сен 05 01:26:33 blade kernel: wlp59s0: deauthenticating from 18:d6:c7:b0:26:ee by local choice (Reason: 3=DEAUTH_LEAVING)
сен 05 01:26:44 blade kernel: IPv6: ADDRCONF(NETDEV_UP): wlp59s0: link is not ready
сен 05 01:26:44 blade kernel: IPv6: ADDRCONF(NETDEV_UP): wlp59s0: link is not ready
сен 05 01:26:49 blade kernel: IPv6: ADDRCONF(NETDEV_UP): wlp59s0: link is not ready
сен 05 01:26:49 blade kernel: wlp59s0: authenticate with 18:d6:c7:b0:26:ee
сен 05 01:26:49 blade kernel: wlp59s0: send auth to 18:d6:c7:b0:26:ee (try 1/3)
сен 05 01:26:49 blade kernel: wlp59s0: authenticated
сен 05 01:26:49 blade kernel: wlp59s0: associate with 18:d6:c7:b0:26:ee (try 1/3)
сен 05 01:26:49 blade kernel: wlp59s0: RX AssocResp from 18:d6:c7:b0:26:ee (capab=0x431 status=0 aid=1)
сен 05 01:26:49 blade kernel: wlp59s0: associated
сен 05 01:26:49 blade kernel: IPv6: ADDRCONF(NETDEV_CHANGE): wlp59s0: link becomes ready
сен 05 01:26:53 blade kernel: wlp59s0: deauthenticating from 18:d6:c7:b0:26:ee by local choice (Reason: 3=DEAUTH_LEAVING)
сен 05 01:27:04 blade kernel: IPv6: ADDRCONF(NETDEV_UP): wlp59s0: link is not ready
сен 05 01:27:04 blade kernel: IPv6: ADDRCONF(NETDEV_UP): wlp59s0: link is not ready
сен 05 01:27:09 blade kernel: IPv6: ADDRCONF(NETDEV_UP): wlp59s0: link is not ready
сен 05 01:27:09 blade kernel: wlp59s0: authenticate with 18:d6:c7:b0:26:ee
сен 05 01:27:09 blade kernel: wlp59s0: send auth to 18:d6:c7:b0:26:ee (try 1/3)
сен 05 01:27:09 blade kernel: wlp59s0: send auth to 18:d6:c7:b0:26:ee (try 2/3)
сен 05 01:27:09 blade kernel: wlp59s0: send auth to 18:d6:c7:b0:26:ee (try 3/3)
сен 05 01:27:09 blade kernel: wlp59s0: authentication with 18:d6:c7:b0:26:ee timed out
сен 05 01:27:10 blade kernel: wlp59s0: authenticate with d8:74:95:e5:88:38
сен 05 01:27:10 blade kernel: wlp59s0: send auth to d8:74:95:e5:88:38 (try 1/3)
сен 05 01:27:10 blade kernel: wlp59s0: authenticated
сен 05 01:27:10 blade kerne...
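The fixed `sleep 10` in the loop above could be replaced by polling the interface state, which is roughly what "wait more intelligently" might look like. This is a rough sketch, assuming the wlp59s0 interface and rfkill id 1 from the loop above, and assuming the link state is visible in /sys/class/net/<iface>/operstate. `wait_for_operstate` is a made-up helper name.

```shell
#!/bin/sh
# wait_for_operstate FILE WANT TIMEOUT
# Poll FILE (normally /sys/class/net/<iface>/operstate) once per second
# until it contains WANT or TIMEOUT seconds have passed.
# Returns 0 on match, 1 on timeout. Hypothetical helper.
wait_for_operstate() {
    file=$1
    want=$2
    remaining=$3
    while :; do
        state=$(cat "$file" 2>/dev/null)
        [ "$state" = "$want" ] && return 0
        [ "$remaining" -le 0 ] && return 1
        sleep 1
        remaining=$((remaining - 1))
    done
}

# Sketch of the loop (needs root and real hardware, so it is commented out):
#   state=/sys/class/net/wlp59s0/operstate
#   while :; do
#       rfkill block 1
#       wait_for_operstate "$state" down 30 || break
#       rfkill unblock 1
#       wait_for_operstate "$state" up 60 || break
#   done
#   dmesg | grep ath10k_pci | tail
```

Breaking out of the loop on a timeout leaves the system in the failed state, so dmesg can be captured right away.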

Kai-Heng Feng (kaihengfeng) wrote :

Seems like it's a PCI bus issue? Tons of errors happened before the ath10k_pci errors:

мар 07 01:54:20 hostname kernel: ax88179_178a 4-1.1:1.0 enx00056b006ad7: Failed to read reg index 0x0002: -19
мар 07 01:54:20 hostname kernel: ax88179_178a 4-1.1:1.0 enx00056b006ad7: Failed to write reg index 0x0002: -19
мар 07 01:54:21 hostname kernel: ax88179_178a 4-1.1:1.0 enx00056b006ad7 (unregistered): Failed to write reg index 0x0002: -19
мар 07 01:54:21 hostname kernel: ax88179_178a 4-1.1:1.0 enx00056b006ad7 (unregistered): Failed to write reg index 0x0001: -19
мар 07 01:54:21 hostname kernel: ax88179_178a 4-1.1:1.0 enx00056b006ad7 (unregistered): Failed to write reg index 0x0002: -19
мар 07 01:54:21 hostname kernel: xhci_hcd 0000:08:00.0: Host halt failed, -19
мар 07 01:54:21 hostname kernel: xhci_hcd 0000:08:00.0: Host not accessible, reset failed.
..
мар 07 15:10:56 hostname kernel: hid-rmi 0018:06CB:5F41.0005: rmi_read_block: timeout elapsed
мар 07 15:10:56 hostname kernel: hid-rmi 0018:06CB:5F41.0005: rmi_read_block: timeout elapsed
мар 07 15:10:56 hostname kernel: hid-rmi 0018:06CB:5F41.0005: rmi_read_block: timeout elapsed
мар 07 15:10:56 hostname kernel: hid-rmi 0018:06CB:5F41.0005: rmi_read_block: timeout elapsed
мар 07 15:10:56 hostname kernel: hid-rmi 0018:06CB:5F41.0005: rmi_read_block: timeout elapsed
мар 07 15:10:56 hostname kernel: hid-rmi 0018:06CB:5F41.0005: can not read F11 control registers

So the question here is, do other devices also fail to work?

Dmitrii Shcherbakov (dmitriis) wrote :

I did get a lot of corrected PCIe errors previously, which is why I set pcie_aspm=off (otherwise they flooded my kernel log instantly): https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1687714
https://launchpadlibrarian.net/318000853/dmesg_pcie_aspm_rc8.log

I need to think of a good test for other devices but I have not noticed any obvious failures yet.

Dmitrii Shcherbakov (dmitriis) wrote :

This is the most heavily loaded configuration I use with this device and I don't have issues with either an NVMe SSD or GPU. No hardware-related issues with any USB devices (data goes through PCIe in the end).

➜ ~ lspci -v -t
-[0000:00]-+-00.0 Intel Corporation Device 5910
           +-01.0-[01]----00.0 NVIDIA Corporation GP106M [GeForce GTX 1060 Mobile]
           +-02.0 Intel Corporation Device 591b
           +-14.0 Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller
           +-14.2 Intel Corporation Sunrise Point-H Thermal subsystem
           +-15.0 Intel Corporation Sunrise Point-H Serial IO I2C Controller #0
           +-16.0 Intel Corporation Sunrise Point-H CSME HECI #1
           +-1c.0-[02-3a]----00.0-[03-06]--+-00.0-[04]--
           | +-01.0-[05]--
           | \-02.0-[06]----00.0 Intel Corporation Device 15db
           +-1c.5-[3b]----00.0 Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter
           +-1d.0-[3c]----00.0 Samsung Electronics Co Ltd NVMe SSD Controller SM961/PM961
           +-1e.0 Intel Corporation Sunrise Point-H Serial IO UART #0
           +-1f.0 Intel Corporation Sunrise Point-H LPC Controller
           +-1f.2 Intel Corporation Sunrise Point-H PMC
           +-1f.3 Intel Corporation Device a171
           \-1f.4 Intel Corporation Sunrise Point-H SMBus

➜ ~ lsusb -v -t
/: Bus 04.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 5000M
/: Bus 03.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 480M
/: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/8p, 5000M
    |__ Port 1: Dev 2, If 0, Class=Vendor Specific Class, Driver=r8152, 5000M
    |__ Port 5: Dev 3, If 0, Class=Hub, Driver=hub/4p, 5000M
/: Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/16p, 480M
    |__ Port 4: Dev 2, If 1, Class=Wireless, Driver=btusb, 12M
    |__ Port 4: Dev 2, If 0, Class=Wireless, Driver=btusb, 12M
    |__ Port 5: Dev 3, If 0, Class=Hub, Driver=hub/4p, 480M
        |__ Port 1: Dev 13, If 2, Class=Audio, Driver=snd-usb-audio, 480M
        |__ Port 1: Dev 13, If 0, Class=Video, Driver=uvcvideo, 480M
        |__ Port 1: Dev 13, If 3, Class=Audio, Driver=snd-usb-audio, 480M
        |__ Port 1: Dev 13, If 1, Class=Video, Driver=uvcvideo, 480M
        |__ Port 2: Dev 7, If 0, Class=Hub, Driver=hub/4p, 12M
            |__ Port 3: Dev 10, If 1, Class=Human Interface Device, Driver=usbhid, 12M
            |__ Port 3: Dev 10, If 0, Class=Human Interface Device, Driver=usbhid, 12M
            |__ Port 1: Dev 8, If 0, Class=Human Interface Device, Driver=usbhid, 1.5M
            |__ Port 1: Dev 8, If 1, Class=Human Interface Device, Driver=usbhid, 1.5M
            |__ Port 4: Dev 11, If 0, Class=Human Interface Device, Driver=usbhid, 12M
            |__ Port 2: Dev 47, If 2, Class=Audio, Driver=snd-usb-audio, 12M
            |__ Port 2: Dev 47, If 0, Class=Audio, Driver=snd-usb-audio, 12M
            |__ Port 2: Dev 47, If 3, Class=Human Interface Device, Driver=usbhid, 12M
            |__ Port 2: Dev 47, If 1, Class=Audio, Driver=snd-usb-audio, 12M
    |__ Port 7: Dev 4, If 0, Class=Video, Driver=uvcvideo, 480M
    |__ Port 7: ...


Dmitrii Shcherbakov (dmitriis) wrote :

I will try 2 things:

1. re-seat the card

2. get a card from a different manufacturer and test.

It might be a buggy slot in the motherboard and if the problem persists with a new card I will know for sure that the slot is defective.

Kai-Heng Feng (kaihengfeng) wrote :

You mean both ax88179_178a and hid-rmi work?

Dmitrii Shcherbakov (dmitriis) wrote :

Sorry, I should have mentioned this before: in the original test I also had a device plugged in via USB Type-C (Thunderbolt 3). That device has an Ethernet interface, which you can see in the logs. It is a separate problem that I need to debug, because Type-C hot-plug of that device worked once but successive attempts did not (no devices were attached). I will try that on 4.13, but it is outside the scope of this thread.

I think we can safely ignore ax88179_178a for now:

мар 07 17:23:47 blade kernel: ax88179_178a 4-1.1:1.0 eth0: register 'ax88179_178a' at usb-0000:10:00.0-1.1, ASIX AX88179 USB 3.0 Gigabit Ethernet, 00:05:6b:00:6a:d7

Regarding hid-rmi - I have not encountered visible problems with a touchpad or keyboard after restore (or have not noticed them).

The original log messages were on suspend:

[18433.618894] PM: Suspending system (mem)
[18433.618956] Suspending console(s) (use no_console_suspend to debug)
[18434.736072] hid-rmi 0018:06CB:5F41.0005: rmi_read_block: timeout elapsed
[18435.760072] hid-rmi 0018:06CB:5F41.0005: rmi_read_block: timeout elapsed
[18436.784071] hid-rmi 0018:06CB:5F41.0005: rmi_read_block: timeout elapsed
[18437.808071] hid-rmi 0018:06CB:5F41.0005: rmi_read_block: timeout elapsed
[18438.832072] hid-rmi 0018:06CB:5F41.0005: rmi_read_block: timeout elapsed
[18438.832076] hid-rmi 0018:06CB:5F41.0005: can not read F11 control registers
[18439.041909] pcieport 0000:03:02.0: Refused to change power state, currently in D3
[18439.041917] pcieport 0000:03:00.0: Refused to change power state, currently in D3
[18439.168298] ACPI : EC: event blocked
[18439.377161] PM: suspend of devices complete after 5653.226 msecs
[18439.399735] PM: late suspend of devices complete after 22.568 msecs

After systemctl suspend and resume on 4.13:

http://paste.ubuntu.com/25505431/
http://paste.ubuntu.com/25505433/ (full dmesg)
[132638.096488] PM: Suspending system (mem)
[132638.096517] Suspending console(s) (use no_console_suspend to debug)
[132639.666295] PM: suspend of devices complete after 1356.062 msecs
[132639.689669] PM: late suspend of devices complete after 23.366 msecs
[132639.774466] PM: noirq suspend of devices complete after 84.791 msecs
...

---

So, I think both of those may be ignored for now.

Dmitrii Shcherbakov (dmitriis) wrote :

Gathering more evidence in support of a faulty motherboard theory.

At some point I got this (the system wasn't under heavy load, just an open browser plus remote video playback):

https://gist.github.com/dshcherb/06f4e4a0260b6d5313df1594d959849a#file-nvme-failure-razer-dmesg-log-L2073

сен 10 22:23:39 blade kernel: nvme nvme0: I/O 916 QID 3 timeout, aborting
сен 10 22:23:39 blade kernel: nvme nvme0: I/O 917 QID 3 timeout, aborting
сен 10 22:23:39 blade kernel: nvme nvme0: I/O 918 QID 3 timeout, aborting
сен 10 22:23:39 blade kernel: nvme nvme0: I/O 919 QID 3 timeout, aborting
сен 10 22:23:45 blade kernel: nvme nvme0: I/O 920 QID 3 timeout, aborting
сен 10 22:23:55 blade kernel: nvme nvme0: I/O 921 QID 3 timeout, aborting
сен 10 22:24:10 blade kernel: nvme nvme0: I/O 916 QID 3 timeout, reset controller
сен 10 22:24:39 blade kernel: nvme nvme0: I/O 11 QID 0 timeout, reset controller
сен 10 22:25:42 blade kernel: nvme nvme0: Device not ready; aborting reset

So the kernel tried to reset the controller after a timeout, which is normal, but could not do so after 100ms and therefore aborted the reset.

http://elixir.free-electrons.com/linux/v4.13/source/drivers/nvme/host/core.c#L1378

Gold Star (goldstar611) wrote :

I have the same issue with Linux Mint 19 (Ubuntu 18.04 based) using kernel 4.15, but with the QCA9377 chipset.

I compiled the 4.18 kernel using `make localmodconfig` and have the same problem.
I had sources for kernel version 4.9.24 and it exhibits the same behavior (failed to wake target for write32) after going to suspend and then turning back on.

Computer is a Dell Inspiron 14 3473.

I've tried numerous debug steps, such as restarting network-manager, and removing the mac80211, cfg80211, ath10k, ath10k_core and ath10k_pci modules, reinserting them and then restarting network-manager, with no improvement.

I have not played with "Wake on Wireless" settings yet (https://wireless.wiki.kernel.org/en/users/documentation/wowlan)

While I have the QCA9377, I do believe this post is quite relevant: https://patchwork.kernel.org/patch/7548331/ so I'm going to check whether that patch works for the QCA9377 and report back.

Brad Figg (brad-figg)
tags: added: cscc