kernel NULL pointer dereference in iwlmvm iwl_mvm_enable_txq

Bug #1733194 reported by munin on 2017-11-19
This bug affects 4 people
Affects          Status    Importance  Assigned to  Milestone
Linux            Expired   Medium
linux (Debian)   New       Unknown
linux (Ubuntu)             High        Unassigned

Bug Description

After running in AP mode for some time, I get this BUG:

Nov 18 23:21:31 bifrost kernel: [18345.860393] BUG: unable to handle kernel NULL pointer dereference at 0000000000000070
Nov 18 23:21:31 bifrost kernel: [18345.860552] IP: iwl_trans_pcie_txq_enable+0x62/0x440 [iwlwifi]
Nov 18 23:21:31 bifrost kernel: [18345.860644] PGD 0
Nov 18 23:21:31 bifrost kernel: [18345.860646] P4D 0
Nov 18 23:21:31 bifrost kernel: [18345.860682]
Nov 18 23:21:31 bifrost kernel: [18345.860747] Oops: 0002 [#1] SMP
Nov 18 23:21:31 bifrost kernel: [18345.860800] Modules linked in: nfnetlink_queue nfnetlink_log nfnetlink dummy ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs ccm xfrm_user xfrm4_tunnel tunnel4 ipcomp xfrm_ipcomp esp4 ah4 af_key xfrm_algo xt_policy xt_multiport ip6table_filter ip6_tables ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat ipt_REJECT nf_reject_ipv4 xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack iptable_filter nls_iso8859_1 cmdlinepart intel_spi_platform intel_spi spi_nor mtd arc4 intel_rapl intel_soc_dts_thermal intel_soc_dts_iosf intel_powerclamp coretemp kvm_intel bridge kvm stp llc iwlmvm irqbypass punit_atom_debug mac80211 intel_cstate snd_hda_codec_hdmi iwlwifi snd_hda_codec_realtek snd_hda_codec_generic cfg80211 btusb snd_hda_intel lpc_ich btrtl snd_hda_codec snd_intel_sst_acpi
Nov 18 23:21:31 bifrost kernel: [18345.861897] mei_txe snd_hda_core mei snd_hwdep shpchp snd_intel_sst_core snd_soc_sst_atom_hifi2_platform hci_uart snd_soc_sst_match snd_soc_core btbcm serdev snd_compress btqca ac97_bus btintel snd_pcm_dmaengine snd_pcm dw_dmac dw_dmac_core snd_timer bluetooth snd soundcore mac_hid intel_int0002_vgpio ecdh_generic spi_pxa2xx_platform rfkill_gpio pwm_lpss_platform pwm_lpss 8250_dw ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear i915 drm_kms_helper crct10dif_pclmul igb syscopyarea sysfillrect crc32_pclmul sysimgblt fb_sys_fops dca ghash_clmulni_intel i2c_algo_bit cryptd ahci ptp drm pps_core libahci video
Nov 18 23:21:31 bifrost kernel: [18345.862995] i2c_hid hid sdhci_acpi sdhci
Nov 18 23:21:31 bifrost kernel: [18345.863068] CPU: 1 PID: 1202 Comm: kworker/1:2 Tainted: G W 4.13.0-16-generic #19-Ubuntu
Nov 18 23:21:31 bifrost kernel: [18345.863203] Hardware name: NF541 NF541/NF541, BIOS BAR1NA02 02/25/2016
Nov 18 23:21:31 bifrost kernel: [18345.863326] Workqueue: events iwl_mvm_add_new_dqa_stream_wk [iwlmvm]
Nov 18 23:21:31 bifrost kernel: [18345.863428] task: ffff96862c1c5800 task.stack: ffffa98a817c0000
Nov 18 23:21:31 bifrost kernel: [18345.863539] RIP: 0010:iwl_trans_pcie_txq_enable+0x62/0x440 [iwlwifi]
Nov 18 23:21:31 bifrost kernel: [18345.863635] RSP: 0018:ffffa98a817c3be0 EFLAGS: 00010246
Nov 18 23:21:31 bifrost kernel: [18345.863718] RAX: 00000000000009c4 RBX: 000000000000001f RCX: 0000000000000000
Nov 18 23:21:31 bifrost kernel: [18345.863824] RDX: 0000000000000000 RSI: 000000000000001f RDI: 0000000000002710
Nov 18 23:21:31 bifrost kernel: [18345.863932] RBP: ffffa98a817c3c30 R08: 0000000000002710 R09: 0000000000000001
Nov 18 23:21:31 bifrost kernel: [18345.864039] R10: 0000000000000000 R11: ffff9686344ce010 R12: 0000000000000000
Nov 18 23:21:31 bifrost kernel: [18345.864145] R13: ffff96862c9d0018 R14: 0000000000000000 R15: 0000000000000000
Nov 18 23:21:31 bifrost kernel: [18345.864253] FS: 0000000000000000(0000) GS:ffff96863fc80000(0000) knlGS:0000000000000000
Nov 18 23:21:31 bifrost kernel: [18345.864373] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 18 23:21:31 bifrost kernel: [18345.864461] CR2: 0000000000000070 CR3: 000000023068b000 CR4: 00000000001006e0
Nov 18 23:21:31 bifrost kernel: [18345.864568] Call Trace:
Nov 18 23:21:31 bifrost kernel: [18345.864633] iwl_mvm_enable_txq+0x212/0x3a0 [iwlmvm]
Nov 18 23:21:31 bifrost kernel: [18345.864732] iwl_mvm_add_new_dqa_stream_wk+0x7e8/0x15e0 [iwlmvm]
Nov 18 23:21:31 bifrost kernel: [18345.864843] ? iwl_mvm_add_new_dqa_stream_wk+0x7e8/0x15e0 [iwlmvm]
Nov 18 23:21:31 bifrost kernel: [18345.864945] ? __switch_to+0x211/0x520
Nov 18 23:21:31 bifrost kernel: [18345.865008] ? put_prev_entity+0x23/0xf0
Nov 18 23:21:31 bifrost kernel: [18345.865075] process_one_work+0x1e7/0x410
Nov 18 23:21:31 bifrost kernel: [18345.865143] worker_thread+0x4a/0x410
Nov 18 23:21:31 bifrost kernel: [18345.865204] kthread+0x125/0x140
Nov 18 23:21:31 bifrost kernel: [18345.865260] ? process_one_work+0x410/0x410
Nov 18 23:21:31 bifrost kernel: [18345.869337] ? kthread_create_on_node+0x70/0x70
Nov 18 23:21:31 bifrost kernel: [18345.873431] ret_from_fork+0x25/0x30
Nov 18 23:21:31 bifrost kernel: [18345.877527] Code: 4c 8b b4 c7 08 7e 00 00 f0 48 0f ab 87 08 8e 00 00 73 0d 80 3d d6 3b 02 00 00 0f 84 a1 03 00 00 44 89 c7 e8 21 2f 6b dc 4d 85 e4 <49> 89 46 70 0f 84 d9 02 00 00 41 0f b6 04 24 89 45 b8 41 0f b6
Nov 18 23:21:31 bifrost kernel: [18345.886085] RIP: iwl_trans_pcie_txq_enable+0x62/0x440 [iwlwifi] RSP: ffffa98a817c3be0
Nov 18 23:21:31 bifrost kernel: [18345.890363] CR2: 0000000000000070
Nov 18 23:21:31 bifrost kernel: [18345.894670] ---[ end trace 128827eedfd09435 ]---
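
For readers unfamiliar with the "Oops: 0002" line in the trace above: on x86 the number is the page-fault error code in hexadecimal, and its low bits describe the access that faulted. The following is a minimal Python sketch of that decoding. The function name decode_oops_error_code is hypothetical, and only the three lowest bits are handled.

```python
def decode_oops_error_code(code_hex: str) -> dict:
    """Decode the low three bits of an x86 page-fault error code.

    Hypothetical helper for illustration only; higher bits
    (reserved-bit violation, instruction fetch, ...) are ignored.
    """
    code = int(code_hex, 16)
    return {
        "page_present": bool(code & 0x1),  # 0: page not present, 1: protection fault
        "write": bool(code & 0x2),         # 0: read access, 1: write access
        "user_mode": bool(code & 0x4),     # 0: kernel mode, 1: user mode
    }


info = decode_oops_error_code("0002")
print(info)
```

For 0002 this describes a kernel-mode write to a page that is not present, which fits a store through a NULL-based pointer at the address reported in CR2 (0x70).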
---
AlsaVersion: Advanced Linux Sound Architecture Driver Version k4.13.0-16-generic.
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.20.7-0ubuntu3.4
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', '/dev/snd/hwC0D2', '/dev/snd/hwC0D0', '/dev/snd/pcmC0D3p', '/dev/snd/pcmC0D1p', '/dev/snd/pcmC0D0c', '/dev/snd/pcmC0D0p', '/dev/snd/controlC0', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
Card0.Amixer.info: Error: [Errno 2] No such file or directory
Card0.Amixer.values: Error: [Errno 2] No such file or directory
DistroRelease: Ubuntu 17.10
HibernationDevice: RESUME=/dev/mapper/bifrost--vg-swap_1
InstallationDate: Installed on 2017-02-18 (274 days ago)
InstallationMedia: Ubuntu-Server 16.04.2 LTS "Xenial Xerus" - Release amd64 (20170215.8)
Lsusb:
 Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 001 Device 003: ID 05e3:0608 Genesys Logic, Inc. Hub
 Bus 001 Device 002: ID 8087:07dc Intel Corp.
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: NF541 NF541
Package: linux (not installed)
ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.13.0-16-generic.efi.signed root=/dev/mapper/hostname--vg-root ro
ProcVersionSignature: Ubuntu 4.13.0-16.19-generic 4.13.4
RelatedPackageVersions:
 linux-restricted-modules-4.13.0-16-generic N/A
 linux-backports-modules-4.13.0-16-generic N/A
 linux-firmware 1.169
RfKill: Error: [Errno 2] No such file or directory
Tags: artful
Uname: Linux 4.13.0-16-generic x86_64
UnreportableReason: The report belongs to a package that is not installed.
UpgradeStatus: Upgraded to artful on 2017-11-18 (0 days ago)
UserGroups:

_MarkForUpload: False
dmi.bios.date: 02/25/2016
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: BAR1NA02
dmi.board.asset.tag: To be filled by O.E.M.
dmi.board.name: NF541
dmi.board.vendor: NF541
dmi.board.version: 1.0
dmi.chassis.type: 3
dmi.chassis.vendor: NF541
dmi.chassis.version: 1.0
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrBAR1NA02:bd02/25/2016:svnNF541:pnNF541:pvr1.0:rvnNF541:rnNF541:rvr1.0:cvnNF541:ct3:cvr1.0:
dmi.product.family: To be filled by O.E.M.
dmi.product.name: NF541
dmi.product.version: 1.0
dmi.sys.vendor: NF541

affects: linux-meta (Ubuntu) → linux (Ubuntu)

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1733194

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: artful

apport information

tags: added: apport-collected
description: updated
munin (munin-n) wrote : CRDA.txt

munin (munin-n) wrote : IwConfig.txt

munin (munin-n) wrote : Lspci.txt

munin (munin-n) wrote : UdevDb.txt

Joseph Salisbury (jsalisbury) wrote :

Did this issue start happening after an update/upgrade? Was there a prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.14 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.14

Changed in linux (Ubuntu):
importance: Undecided → High
status: Incomplete → Triaged
status: Triaged → Incomplete
munin (munin-n) wrote :

I upgraded from 16.04, where I was not having this issue. I think that was running 4.10?

I tested with the upstream kernel, and the problem persists. Here is a backtrace from dmesg:

[Mon Nov 20 21:22:24 2017] BUG: unable to handle kernel NULL pointer dereference at 0000000000000070
[Mon Nov 20 21:22:24 2017] IP: iwl_trans_pcie_txq_enable+0x62/0x440 [iwlwifi]
[Mon Nov 20 21:22:24 2017] PGD 0 P4D 0
[Mon Nov 20 21:22:24 2017] Oops: 0002 [#1] SMP
[Mon Nov 20 21:22:24 2017] Modules linked in: ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs ccm xfrm_user xfrm4_tunnel tunnel4 ipcomp xfrm_ipcomp esp4 ah4 af_key xfrm_algo xt_policy xt_multiport ip6table_filter ip6_tables ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat ipt_REJECT nf_reject_ipv4 xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack iptable_filter nls_iso8859_1 arc4 cmdlinepart intel_spi_platform intel_spi spi_nor mtd intel_rapl intel_soc_dts_thermal intel_soc_dts_iosf intel_powerclamp coretemp kvm_intel bridge stp llc kvm snd_hda_codec_hdmi iwlmvm irqbypass mac80211 snd_hda_codec_realtek snd_hda_codec_generic punit_atom_debug iwlwifi intel_cstate cfg80211 lpc_ich snd_intel_sst_acpi snd_hda_intel btusb snd_intel_sst_core snd_hda_codec btrtl snd_soc_sst_atom_hifi2_platform
[Mon Nov 20 21:22:24 2017] snd_hda_core hci_uart snd_soc_sst_match snd_hwdep btbcm serdev snd_soc_core mei_txe btqca btintel shpchp mei snd_compress bluetooth ac97_bus snd_pcm_dmaengine snd_pcm dw_dmac ecdh_generic dw_dmac_core snd_timer rfkill_gpio snd intel_int0002_vgpio mac_hid soundcore 8250_dw spi_pxa2xx_platform pwm_lpss_platform pwm_lpss ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear i915 crct10dif_pclmul drm_kms_helper igb crc32_pclmul syscopyarea sysfillrect dca ghash_clmulni_intel sysimgblt fb_sys_fops cryptd ptp pps_core drm i2c_algo_bit ahci libahci video i2c_hid sdhci_acpi hid sdhci
[Mon Nov 20 21:22:24 2017] CPU: 3 PID: 585 Comm: kworker/3:2 Tainted: G W 4.14.0-041400-generic #201711122031
[Mon Nov 20 21:22:24 2017] Hardware name: NF541 NF541/NF541, BIOS BAR1NA02 02/25/2016
[Mon Nov 20 21:22:24 2017] Workqueue: events iwl_mvm_add_new_dqa_stream_wk [iwlmvm]
[Mon Nov 20 21:22:24 2017] task: ffff9939ebd4d700 task.stack: ffffb8ed813c8000
[Mon Nov 20 21:22:24 2017] RIP: 0010:iwl_trans_pcie_txq_enable+0x62/0x440 [iwlwifi]
[Mon Nov 20 21:22:24 2017] RSP: 0018:ffffb8ed813cbc00 EFLAGS: 00010246
[Mon Nov 20 21:22:24 2017] RAX: 00000000000009c4 RBX: 000000000000001f RCX: 0000000000000000
[Mon Nov 20 21:22:24 2017] RDX: 0000000000000000 RSI: 000000000000001f RDI: 0000000000002710
[Mon Nov 20 21:22:24 2017] RBP: ffffb8ed813cbc50 R08: 0000000000002710 R09: 0000000000000001
[Mon Nov 20 21:22:24 2017] R10: 0000000000000000 R11: ffff9939edd4e010 R12: 0000000000000000
[Mon Nov 20 21:22:24 2017] R13: ffff9939f0a80018 R14: 0000000000000000 R15: 0000000000000000
[Mon Nov 20 21:22:24 2017] FS: 0000000000000000(0000) GS:ffff9939ffd80000(0000) knlGS:0000000000000000
...

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
tags: added: kernel-bug-exists-upstream
Bartosz Kwitniewski (zerg2000) wrote :

Just a note, I have the same problem with Intel® Wireless-AC 3168 (firmware version 29.541020.0 op_mode iwlmvm) on Gentoo linux-4.14.8-gentoo-r1 in AP mode (802.11n).

Sometimes the kernel only shows warnings from drivers/net/wireless/intel/iwlwifi/mvm/tx.c:1363:
-----
        case TX_STATUS_FAIL_DEST_PS:
            /* the FW should have stopped the queue and not
             * return this status
             */
            WARN_ON(1);
            info->flags |= IEEE80211_TX_STAT_TX_FILTERED;
            break;
-----
2018-01-03T21:16:51+01:00 [warning] kernel: ------------[ cut here ]------------
2018-01-03T21:16:51+01:00 [warning] kernel: WARNING: CPU: 2 PID: 81 at drivers/net/wireless/intel/iwlwifi/mvm/tx.c:1363 iwl_mvm_rx_tx_cmd+0x361/0x610
2018-01-03T21:16:51+01:00 [warning] kernel: Modules linked in: rtl8812au(O)
2018-01-03T21:16:51+01:00 [warning] kernel: CPU: 2 PID: 81 Comm: irq/123-iwlwifi Tainted: G W O 4.14.8-gentoo-r1 #1
2018-01-03T21:16:51+01:00 [warning] kernel: Hardware name: /NUC6CAYB, BIOS AYAPLCEL.86A.0041.2017.0825.1152 08/25/2017
2018-01-03T21:16:51+01:00 [warning] kernel: task: ffff9ea3f4730000 task.stack: ffffb50340270000
2018-01-03T21:16:51+01:00 [warning] kernel: RIP: 0010:iwl_mvm_rx_tx_cmd+0x361/0x610
2018-01-03T21:16:51+01:00 [warning] kernel: RSP: 0018:ffffb50340273cb8 EFLAGS: 00010246
2018-01-03T21:16:51+01:00 [warning] kernel: RAX: 0000000000000088 RBX: ffff9ea3d6107700 RCX: 0000000000031040
2018-01-03T21:16:51+01:00 [warning] kernel: RDX: 000000000003103f RSI: ffff9ea3ffd20dc0 RDI: ffff9ea3f45ed200
2018-01-03T21:16:51+01:00 [warning] kernel: RBP: ffffb50340273d38 R08: 0000000000020dc0 R09: ffffffffb74e5b5d
2018-01-03T21:16:51+01:00 [warning] kernel: R10: ffff9ea3d6107730 R11: ffffffffb737b110 R12: ffff9ea3f40f9388
2018-01-03T21:16:51+01:00 [warning] kernel: R13: 000000000000d000 R14: ffff9ea3d6107700 R15: ffff9ea3e1e93000
2018-01-03T21:16:51+01:00 [warning] kernel: FS: 0000000000000000(0000) GS:ffff9ea3ffd00000(0000) knlGS:0000000000000000
2018-01-03T21:16:51+01:00 [warning] kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2018-01-03T21:16:51+01:00 [warning] kernel: CR2: 00007ed717e8eee0 CR3: 000000023fe09000 CR4: 00000000003406e0
2018-01-03T21:16:51+01:00 [warning] kernel: Call Trace:
2018-01-03T21:16:51+01:00 [warning] kernel: iwl_mvm_rx_common+0x181/0x2b0
2018-01-03T21:16:51+01:00 [warning] kernel: iwl_mvm_rx+0x6b/0x80
2018-01-03T21:16:51+01:00 [warning] kernel: iwl_pcie_rx_handle+0x2ee/0x860
2018-01-03T21:16:51+01:00 [warning] kernel: iwl_pcie_irq_handler+0x18d/0x6c0
2018-01-03T21:16:51+01:00 [warning] kernel: ? irq_forced_thread_fn+0x80/0x80
2018-01-03T21:16:51+01:00 [warning] kernel: irq_thread_fn+0x2a/0x60
2018-01-03T21:16:51+01:00 [warning] kernel: irq_thread+0x149/0x1b0
2018-01-03T21:16:51+01:00 [warning] kernel: ? __schedule+0x1c6/0x4f0
2018-01-03T21:16:51+01:00 [warning] kernel: ? wake_threads_waitq+0x40/0x40
2018-01-03T21:16:51+01:00 [warning] kernel: kthread+0x106/0x140
2018-01-03T21:16:51+01:00 [warning] kernel: ? irq_thread_dtor+0xb0/0xb0
2018-01-03T21:16:51+01:00 [warning] kernel: ? kthread_create_on_node+0x70/0x70
2018-01-03T21:16:51+01:00 [warning] k...

Niklas Bölter (nboelter) wrote :

I have the same issue with the current Ubuntu 18.04 kernel 4.15.0-23-generic, and also with Ubuntu mainline kernel 4.17.0-041700, using an Intel Wireless 8260 (firmware 34.0.1) in AP mode.

Not sure if this is relevant or not: if I reboot the system, the wifi card vanishes completely (even from lspci). I always have to shut down and press the power button instead.

------

BUG: unable to handle kernel NULL pointer dereference at 0000000000000068
IP: iwl_trans_pcie_txq_enable+0x62/0x460 [iwlwifi]
PGD 0 P4D 0
Oops: 0002 [#1] SMP PTI
Modules linked in: ccm bridge stp llc pppoe pppox nf_conntrack_ipv6 nf_defrag_ipv6 ip6t_rt ip6table_filter ip6_tables nf_log_ipv4 nf_log_common xt_LOG ipt_REJECT nf_reject_ipv4 xt_policy xt_multiport xt_conntrack iptable_filter bnep ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c xt_TCPMSS xt_tcpudp iptable_mangle nls_iso8859_1 snd_soc_skl snd_soc_skl_ipc snd_hda_ext_core intel_rapl snd_soc_sst_dsp intel_telemetry_pltdrv snd_soc_sst_ipc intel_punit_ipc snd_soc_acpi intel_telemetry_core intel_pmc_ipc snd_soc_core x86_pkg_temp_thermal arc4 intel_powerclamp coretemp kvm_intel snd_compress ac97_bus kvm snd_hda_codec_hdmi irqbypass snd_hda_codec_realtek crct10dif_pclmul snd_hda_codec_generic crc32_pclmul ghash_clmulni_intel
 snd_pcm_dmaengine pcbc snd_hda_intel snd_hda_codec snd_hda_core iwlmvm snd_hwdep mac80211 aesni_intel aes_x86_64 crypto_simd glue_helper cryptd intel_cstate intel_rapl_perf iwlwifi snd_pcm btusb btrtl btbcm btintel serio_raw bluetooth snd_seq_midi snd_seq_midi_event ecdh_generic cfg80211 snd_rawmidi lpc_ich snd_seq snd_seq_device snd_timer snd mei_me mei mac_hid soundcore shpchp sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 uas usb_storage i915 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops psmouse sdhci_pci drm sdhci r8169 ahci mii libahci i2c_hid hid video pinctrl_broxton
CPU: 3 PID: 5529 Comm: kworker/3:0 Not tainted 4.15.0-23-generic #25-Ubuntu
Hardware name: NA ZBOX-CI327NANO-GS-01/ZBOX-CI327NANO-GS-01, BIOS 5.12 01/16/2018
Workqueue: events iwl_mvm_add_new_dqa_stream_wk [iwlmvm]
RIP: 0010:iwl_trans_pcie_txq_enable+0x62/0x460 [iwlwifi]
RSP: 0018:ffffa38b41577c10 EFLAGS: 00010246
RAX: 00000000000009c4 RBX: 000000000000001f RCX: 0000000000000000
RDX: 0000000000000000 RSI: 000000000000001f RDI: 0000000000002710
RBP: ffffa38b41577c60 R08: 0000000000002710 R09: 0000000000000001
R10: 0000000000000000 R11: ffff8d6db6399fd0 R12: 0000000000000000
R13: ffff8d6db5aa0018 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8d6dbfd80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000068 CR3: 0000000136e0a000 CR4: 00000000003406e0
Call Trace:
 iwl_mvm_enable_txq+0x21a/0x3b0 [iwlmvm]
 iwl_mvm_add_new_dqa_stream_wk+0x809/0x1690 [iwlmvm]
 ? iwl_mvm_add_new_dqa_stream_wk+0x809/0x1690 [iwlmvm]
 ? __switch_to+0xad/0x500
 ? put_prev_entity+0x25/0x100
 process_one_work+0x1de/0x410
 worker_thread+0x32/0x410
 kthread+0x121/0x140
 ? process_one_work+0x410/0x410
 ? kthread_create_worker_on_cpu+0x70/0x70
 ret_from_fork+0x35/0...
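
The traces posted so far (4.13.0-16, 4.14.0-041400 and 4.15.0-23) can be compared by their faulting location. Below is a small Python sketch, assuming the "RIP: 0010:symbol+0xoffset/0xsize [module]" format seen in the traces above. The regular expression and the name parse_rip are illustrative and not part of any kernel tooling.

```python
import re

# Matches e.g. "RIP: 0010:iwl_trans_pcie_txq_enable+0x62/0x440 [iwlwifi]".
# The trailing "[module]" part is optional (built-in code has none).
RIP_RE = re.compile(
    r"RIP: [0-9a-f]{4}:(?P<symbol>\w+)\+0x(?P<offset>[0-9a-f]+)"
    r"/0x(?P<size>[0-9a-f]+)(?: \[(?P<module>\w+)\])?"
)


def parse_rip(line: str):
    """Return (symbol, offset, size, module) from an oops RIP line, or None."""
    m = RIP_RE.search(line)
    if m is None:
        return None
    return (m["symbol"], int(m["offset"], 16), int(m["size"], 16), m["module"])


lines = [
    "RIP: 0010:iwl_trans_pcie_txq_enable+0x62/0x440 [iwlwifi]",  # 4.13 and 4.14 traces
    "RIP: 0010:iwl_trans_pcie_txq_enable+0x62/0x460 [iwlwifi]",  # 4.15 trace
]
for line in lines:
    print(parse_rip(line))
```

Both lines decode to the same symbol and the same offset (0x62); only the function size differs between builds. That is consistent with the same instruction faulting in each kernel, although matching offsets across different builds are suggestive rather than conclusive.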

Created attachment 277929
dmesg

I see this oops hit every couple of days on my Intel 8265 (in master mode) running a vanilla 4.14.52 kernel (Alpine Linux 3.8); dmesg attached.

Some searching turned up a very similar oops that someone had posted on pastebin, as well as on github (https://gist.github.com/aplund/7ba82370be0388abfa1974d13102ae9a), but I was unable to find a matching issue in the issue tracker.

Created attachment 277931
iwlwifi.ko

Emmanuel Grumbach (egrumbach) wrote :

Please report this NULL pointer exception to bugzilla.kernel.org and CC <email address hidden> on the bug.

So we fail here (last line):
0000000000008e7d <iwl_trans_pcie_txq_enable>:
    8e7d: e8 00 00 00 00 callq 8e82 <iwl_trans_pcie_txq_enable+0x5>
    8e82: 41 57 push %r15
    8e84: 41 56 push %r14
    8e86: 49 89 fe mov %rdi,%r14
    8e89: 41 55 push %r13
    8e8b: 41 54 push %r12
    8e8d: 49 89 cd mov %rcx,%r13
    8e90: 55 push %rbp
    8e91: 53 push %rbx
    8e92: 41 89 d4 mov %edx,%r12d
    8e95: 89 f3 mov %esi,%ebx
    8e97: 48 83 ec 20 sub $0x20,%rsp
    8e9b: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax
    8ea2: 00 00
    8ea4: 48 89 44 24 18 mov %rax,0x18(%rsp)
    8ea9: 31 c0 xor %eax,%eax
    8eab: 48 63 c6 movslq %esi,%rax
    8eae: 66 89 54 24 02 mov %dx,0x2(%rsp)
    8eb3: 4c 8b bc c7 08 7e 00 mov 0x7e08(%rdi,%rax,8),%r15
    8eba: 00
    8ebb: f0 48 0f ab 87 08 8e lock bts %rax,0x8e08(%rdi)
    8ec2: 00 00
    8ec4: 73 28 jae 8eee <iwl_trans_pcie_txq_enable+0x71>
    8ec6: 80 3d 00 00 00 00 00 cmpb $0x0,0x0(%rip) # 8ecd <iwl_trans_pcie_txq_enable+0x50>
    8ecd: 75 1f jne 8eee <iwl_trans_pcie_txq_enable+0x71>
    8ecf: 48 c7 c7 00 00 00 00 mov $0x0,%rdi
    8ed6: 44 89 44 24 04 mov %r8d,0x4(%rsp)
    8edb: c6 05 00 00 00 00 01 movb $0x1,0x0(%rip) # 8ee2 <iwl_trans_pcie_txq_enable+0x65>
    8ee2: e8 00 00 00 00 callq 8ee7 <iwl_trans_pcie_txq_enable+0x6a>
    8ee7: 0f 0b ud2
    8ee9: 44 8b 44 24 04 mov 0x4(%rsp),%r8d
    8eee: 44 89 c7 mov %r8d,%edi
    8ef1: e8 00 00 00 00 callq 8ef6 <iwl_trans_pcie_txq_enable+0x79>
    8ef6: 4d 85 ed test %r13,%r13
    8ef9: 49 89 47 70 mov %rax,0x70(%r15)

Clearly, r15 is 0. r15 is loaded by mov 0x7e08(%rdi,%rax,8),%r15, which tells me that r15 must be the pointer to the txq. rdi is the first parameter to the function (trans), and rax apparently holds txq_id (the second parameter; this does not follow directly from the calling convention, but rax has been assigned the value of txq_id by the earlier movslq %esi,%rax).
The txq assignment is: struct iwl_txq *txq = trans_pcie->txq[txq_id];

Bottom line, txq is NULL...
Note that we tried (and failed) to open AMPDU a bit before the crash and this is clearly not a classic scenario.
I really don't see how trans_pcie->txq[txq_id] could be NULL... If only we knew what was the value of txq_id...
Can you load iwlwifi with debug=0x80000000 ?
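
The address arithmetic in the analysis above can be spelled out. This is a minimal Python sketch, assuming the AT&T operand form disp(base,index,scale) denotes base + index*scale + disp. The trans address and the txq_id used below are made-up values for illustration only; the real values are not known from this thread.

```python
def effective_address(base: int, index: int, scale: int, disp: int) -> int:
    """x86-64 effective address for the AT&T operand disp(base,index,scale)."""
    return (base + index * scale + disp) & 0xFFFFFFFFFFFFFFFF


# mov 0x7e08(%rdi,%rax,8),%r15  -> load of the txq pointer from trans_pcie->txq[txq_id]
trans = 0xFFFF888000100000  # hypothetical value of %rdi (trans), illustration only
txq_id = 3                  # hypothetical value of %rax (txq_id), illustration only
slot = effective_address(trans, txq_id, 8, 0x7E08)
print(hex(slot))            # address of the pointer slot that was read

# mov %rax,0x70(%r15)  -> store through the loaded pointer
r15 = 0                     # the loaded txq pointer was NULL
fault = effective_address(r15, 0, 1, 0x70)
print(hex(fault))           # 0x70, the address reported in CR2
```

With a NULL pointer the store lands at 0x70, matching CR2 in the 4.13 and 4.14 traces. The 4.15 trace reports 0x68, which would fit the same field sitting at a different offset in that build, though that is an inference from the numbers rather than something confirmed in this thread.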

Changed in linux:
importance: Unknown → Medium
status: Unknown → Incomplete
cyphaw (util000) wrote :

I suffer from this bug too, with this card in access point mode:

02:00.0 Network controller [0280]: Intel Corporation Wireless 7260 [8086:08b1] (rev 73)
 Subsystem: Intel Corporation Dual Band Wireless-AC 7260 [8086:4070]

And kernel 4.15.0-33-generic on Ubuntu 16.04 Xenial, though I encountered it with previous versions too, and I think with 4.4 kernels too.

It seems to happen when the load on the network is high.

But luckily, according to https://www.spinics.net/lists/linux-wireless/msg169581.html, it seems to have been fixed by Intel in https://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/backport-iwlwifi.git/commit/?id=b8e2a319df8cc60626ab7dcabbbab64ff56be41e in their release/core35.

I think the resulting bug is probably in iwlwifi.ko (linux-image-extra package) (I don't think it is in iwlwifi.ucode).
Is there any way to track which release/coreXX is in a kernel version?

Ping? Is this still reproducible?

Please re-open if you have the data we asked for.

Changed in linux:
status: Incomplete → Expired
Changed in linux (Debian):
status: Unknown → New