[3.13.0-30.55] rtl8821ae Kernel PANIC due to calling incorrect function

Bug #1354469 reported by TJ on 2014-08-08
This bug affects 4 people
Affects          Importance  Assigned to
linux (Ubuntu)   High        Unassigned
Trusty           Undecided   Tim Gardner
Utopic           High        Unassigned

Bug Description

I had a support incident with a user of an Asus X551MA containing a Realtek RTL8821AE WiFi card. After the kernel update from 3.13.0-24 to 3.13.0-30, the kernel panicked as soon as the WiFi card began scanning (photograph attached).

I investigated the bug in detail and traced the cause to commit 22bf70f, which changes the prototype of a function called by the rtl8821ae driver but does not update the driver to call the replacement function.

Corrective patch attached.

RIP [<ffffffffa042ffe5>] rtl8821ae_rx_query_desc+0x1d5/0xa50 [rtl8821ae]

No changes were introduced in the rtl8821ae module between 3.13.0-24 and 3.13.0-30. The only changes were in mac80211, which rtl8821ae depends on (along with cfg80211):

# check rtl8821ae
$ gitlog Ubuntu-3.13.0-24.47..Ubuntu-3.13.0-30.55 -- drivers/staging/rtl8821ae
# check mac80211
$ gitlog Ubuntu-3.13.0-24.47..Ubuntu-3.13.0-30.55 -- net/mac80211
7049ad3 Mon May 19 18:45:30 2014 +0100 Michael Braun mac80211: fix WPA with VLAN on AP side with ps-sta again
5d31275 Mon May 19 18:45:30 2014 +0100 Johannes Berg mac80211: fix suspend vs. authentication race
56f2ea4 Mon May 19 18:45:29 2014 +0100 Johannes Berg mac80211: fix potential use-after-free
22bf70f Tue Apr 15 15:27:46 2014 +0100 Johannes Berg mac80211: add length check in ieee80211_is_robust_mgmt_frame()
# check cfg80211
$ gitlog Ubuntu-3.13.0-24.47..Ubuntu-3.13.0-30.55 -- net/wireless/
$

The faulting location is in function rtl8821ae_rx_query_desc() at offset 0x1d5.

$ objdump -d /lib/modules/3.13.0-30-generic/kernel/drivers/staging/rtl8821ae/rtl8821ae.ko

0000000000033e40 <rtl8821ae_rx_query_desc>:

The faulting instruction is at 0x33e40 + 0x1d5 = 0x34015.
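
The address arithmetic above can be reproduced with a small helper. This is only a sketch: the function name fault_address() and its parameters are hypothetical, and the inputs are the symbol start from the objdump output plus the offset and size from the RIP line (+0x1d5/0xa50).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: turn the "symbol+offset/size" notation of an oops
 * RIP line into an address within the module's disassembly.
 * Returns false if the offset lies outside the symbol.
 */
static bool fault_address(uint64_t sym_start, uint64_t offset,
			  uint64_t sym_size, uint64_t *addr)
{
	if (offset >= sym_size)
		return false;

	*addr = sym_start + offset;
	return true;
}
```

Calling fault_address(0x33e40, 0x1d5, 0xa50, &addr) stores 0x34015 in addr, which is the address examined in gdb.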

Next I examined the module's debug symbols with:

$ gdb -d drivers/staging/rtl8821ae -d drivers/staging/rtl8821ae/rtl8821ae /usr/lib/debug/modules/3.13.0-30-generic/kernel/drivers/staging/rtl8821ae/rtl8821ae.dbgsym.ko

(gdb) info line rtl8821ae_rx_query_desc
Line 539 of "/build/buildd/linux-3.13.0/drivers/staging/rtl8821ae/rtl8821ae/trx.c" starts at address 0x33e40 <rtl8821ae_rx_query_desc>
    and ends at 0x33e65 <rtl8821ae_rx_query_desc+37>.
(gdb) x/i 0x34015
    0x34015 <rtl8821ae_rx_query_desc+469>: movzwl (%rdi),%esi
(gdb) disas rtl8821ae_rx_query_desc
...
    0x0000000000033ffe <+446>: je 0x34641 <rtl8821ae_rx_query_desc+2049>
    0x0000000000034004 <+452>: cmpl $0x18,0x68(%rdx)
    0x0000000000034008 <+456>: jbe 0x34268 <rtl8821ae_rx_query_desc+1064>
    0x000000000003400e <+462>: mov 0xd8(%rdx),%rdi /* hdr->frame_control */
    0x0000000000034015 <+469>: movzwl (%rdi),%esi /* FAULT %rdi invalid */
    0x0000000000034018 <+472>: mov %esi,%ecx
    0x000000000003401a <+474>: and $0xfc,%cx
    0x000000000003401f <+479>: cmp $0xa0,%cx
    0x0000000000034024 <+484>: je 0x34068 <rtl8821ae_rx_query_desc+552>
...
(gdb) info line *0x34015
Line 2194 of "/build/buildd/linux-3.13.0/include/linux/ieee80211.h" starts at address 0x34015 <rtl8821ae_rx_query_desc+469>
    and ends at 0x34018 <rtl8821ae_rx_query_desc+472>.

---- include/linux/ieee80211.h -----
/**
  * _ieee80211_is_robust_mgmt_frame - check if frame is a robust management frame
  * @hdr: the frame (buffer must include at least the first octet of payload)
  */
static inline bool _ieee80211_is_robust_mgmt_frame(struct ieee80211_hdr *hdr)
{
   if (ieee80211_is_disassoc(hdr->frame_control) || /* LINE 2194 */
       ieee80211_is_deauth(hdr->frame_control))
     return true;

/**
  * ieee80211_is_disassoc - check if IEEE80211_FTYPE_MGMT && IEEE80211_STYPE_DISASSOC
  * @fc: frame control bytes in little-endian byteorder
  */
static inline int ieee80211_is_disassoc(__le16 fc)
{
   return (fc & cpu_to_le16(IEEE80211_FCTL_FTYPE | IEEE80211_FCTL_STYPE)) ==
          cpu_to_le16(IEEE80211_FTYPE_MGMT | IEEE80211_STYPE_DISASSOC);
}

----- drivers/staging/rtl8821ae/rtl8821ae/trx.c::rtl8821ae_rx_query_desc() -----
...
     if ((ieee80211_is_robust_mgmt_frame(hdr)) && /* FAULT LOCATION */
       (ieee80211_has_protected(hdr->frame_control)))
       rx_status->flag &= ~RX_FLAG_DECRYPTED;
     else
       rx_status->flag |= RX_FLAG_DECRYPTED;
   }
...
----- 8-< -----
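
The `and $0xfc,%cx` / `cmp $0xa0,%cx` pair in the disassembly is consistent with the inlined ieee80211_is_disassoc() check quoted above. The following userspace sketch shows the same masking; the shortened macro and function names are hypothetical stand-ins, the constant values are those from include/linux/ieee80211.h, and fc is assumed to be already in CPU byte order (on little-endian x86, cpu_to_le16() does not change the value).

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in include/linux/ieee80211.h (names shortened here). */
#define FCTL_FTYPE	0x000c	/* frame type bits */
#define FCTL_STYPE	0x00f0	/* frame subtype bits */
#define FTYPE_MGMT	0x0000
#define STYPE_DISASSOC	0x00a0
#define STYPE_DEAUTH	0x00c0

/* FCTL_FTYPE | FCTL_STYPE == 0x00fc, the mask seen in the disassembly. */
static bool fc_is_disassoc(uint16_t fc)
{
	return (fc & (FCTL_FTYPE | FCTL_STYPE)) ==
	       (FTYPE_MGMT | STYPE_DISASSOC);
}

static bool fc_is_deauth(uint16_t fc)
{
	return (fc & (FCTL_FTYPE | FCTL_STYPE)) ==
	       (FTYPE_MGMT | STYPE_DEAUTH);
}
```

Only the type and subtype bits take part in the comparison, so other frame-control flags do not affect the result.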

On investigation, it appears that gdb may have an incorrect debug reference for the location of ieee80211_is_robust_mgmt_frame(), since the location it reports belongs to the underscore-prefixed function _ieee80211_is_robust_mgmt_frame(). This may be because both functions are inline.

The changes introduced in commit:

22bf70f Tue Apr 15 15:27:46 2014 +0100 Johannes Berg mac80211: add length check in ieee80211_is_robust_mgmt_frame()

include renaming the existing

ieee80211_is_robust_mgmt_frame(struct ieee80211_hdr *hdr)

to

_ieee80211_is_robust_mgmt_frame(struct ieee80211_hdr *hdr)

and replacing the original function with one taking an skb, not ieee80211_hdr:

+ * ieee80211_is_robust_mgmt_frame - check if skb contains a robust mgmt frame
+ * @skb: the skb containing the frame, length will be checked
+ */
+static inline bool ieee80211_is_robust_mgmt_frame(struct sk_buff *skb)
+{
+	if (skb->len < 25)
+		return false;
+	return _ieee80211_is_robust_mgmt_frame((void *)skb->data);
+}
+
+/**

Because I cannot debug a live kernel with this hardware, I am unable to pursue this much further. However, commit 22bf70f suggests that rtl8821ae now calls the wrong function: unlike all the other rtl* drivers, it was not patched to call the underscore version. If so, the driver passes an ieee80211_hdr pointer to a function that expects an skb.
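
The difference between the two entry points can be sketched in userspace. Everything below is a simplified mock, not kernel code: struct mock_skb and the mock_* function names are hypothetical, only the disassoc/deauth branch quoted earlier is modelled, and the real _ieee80211_is_robust_mgmt_frame() performs further checks that are omitted here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in; the real struct sk_buff has many more fields. */
struct mock_skb {
	unsigned int len;
	const uint8_t *data;
};

/* Header-pointer variant: the caller guarantees the buffer is long enough. */
static bool mock_hdr_is_disassoc_or_deauth(const uint8_t *hdr)
{
	/* frame_control is the first two octets, little-endian on the wire */
	uint16_t fc = (uint16_t)(hdr[0] | (hdr[1] << 8));
	uint16_t type_subtype = fc & 0x00fc;

	return type_subtype == 0x00a0 || type_subtype == 0x00c0;
}

/* skb variant: checks the length first, then follows skb->data. */
static bool mock_skb_is_disassoc_or_deauth(const struct mock_skb *skb)
{
	if (skb->len < 25)
		return false;

	return mock_hdr_is_disassoc_or_deauth(skb->data);
}
```

A caller that holds only a header pointer has to use the header variant. Handing that pointer to the skb variant would make the callee read frame bytes as if they were the len and data fields, which would be consistent with the invalid %rdi seen at the faulting instruction.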

The required change is therefore probably:

$ git diff drivers/staging/rtl8821ae/rtl8821ae/trx.c
diff --git a/drivers/staging/rtl8821ae/rtl8821ae/trx.c b/drivers/staging/rtl8821ae/rtl8821ae/trx.c
index 75ae438..963b55f 100644
--- a/drivers/staging/rtl8821ae/rtl8821ae/trx.c
+++ b/drivers/staging/rtl8821ae/rtl8821ae/trx.c
@@ -616,7 +616,7 @@ bool rtl8821ae_rx_query_desc(struct ieee80211_hw *hw,
                                 return false;
                 }

-                if ((ieee80211_is_robust_mgmt_frame(hdr)) &&
+                if ((_ieee80211_is_robust_mgmt_frame(hdr)) &&
                         (ieee80211_has_protected(hdr->frame_control)))
                         rx_status->flag &= ~RX_FLAG_DECRYPTED;
                 else
---

TJ (tj) wrote :
TJ (tj) on 2014-08-08
description: updated
tags: added: patch
BemNum (bemnum) wrote :

Hello,

I can confirm that the kernel panics after loading the rtl8821ae module while Ubuntu boots on an Asus R510L notebook.
$ sudo lspci|grep -i rtl
02:00.1 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 12)
03:00.0 Network controller: Realtek Semiconductor Co., Ltd. RTL8821AE 802.11ac PCIe Wireless Network Adapter

After blacklisting the rtl8821ae module, the computer starts fine but without networking (WLAN or Ethernet):
$ cat /etc/modprobe.d/blacklist-asus_rtl.conf
blacklist rtl8821ae

02:00.1 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 12)
 Subsystem: ASUSTeK Computer Inc. Device 200f
 Flags: bus master, fast devsel, latency 0, IRQ 65
 I/O ports at e000 [size=256]
 Memory at f7914000 (64-bit, non-prefetchable) [size=4K]
 Memory at f7910000 (64-bit, non-prefetchable) [size=16K]
 Capabilities: [40] Power Management version 3
 Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
 Capabilities: [70] Express Endpoint, MSI 01
 Capabilities: [b0] MSI-X: Enable- Count=4 Masked-
 Capabilities: [d0] Vital Product Data
 Capabilities: [100] Advanced Error Reporting
 Capabilities: [160] Device Serial Number 34-80-75-<removed>
 Capabilities: [170] Latency Tolerance Reporting
 Capabilities: [178] L1 PM Substates
 Kernel driver in use: r8169

03:00.0 Network controller: Realtek Semiconductor Co., Ltd. RTL8821AE 802.11ac PCIe Wireless Network Adapter
 Subsystem: AzureWave Device 2161
 Flags: bus master, fast devsel, latency 0, IRQ 10
 I/O ports at d000 [size=256]
 Memory at f7800000 (64-bit, non-prefetchable) [size=16K]
 Capabilities: [40] Power Management version 3
 Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
 Capabilities: [70] Express Endpoint, MSI 00
 Capabilities: [100] Advanced Error Reporting
 Capabilities: [140] Device Serial Number 00-e0-4c-ff-<removed>
 Capabilities: [150] Latency Tolerance Reporting
 Capabilities: [158] L1 PM Substates

Regards,
BemNum

psl (slansky) wrote :

Ubuntu 14.04.1, amd64
PC: ASUS VIVO PC VM40B (mini PC with Celeron 1007U @ 1.50GHz)

It is not 100% repeatable, but in many cases I get a kernel panic during boot related to WiFi and the rtl8821ae module (rtl8821ae_rx_query_desc).
When I disable WiFi in the BIOS, the PC boots fine.
When I am lucky and boot with WiFi enabled and no kernel panic, WiFi still doesn't work: I cannot connect to my WiFi network.

$ uname -a
Linux vivo 3.13.0-32-generic #57-Ubuntu SMP Tue Jul 15 03:51:08 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

$ lspci | grep RTL8821AE
02:00.0 Network controller: Realtek Semiconductor Co., Ltd. RTL8821AE 802.11ac PCIe Wireless Network Adapter

Tim Gardner (timg-tpi) wrote :
Changed in linux (Ubuntu Utopic):
status: Triaged → Fix Released
Changed in linux (Ubuntu Trusty):
assignee: nobody → Tim Gardner (timg-tpi)
status: New → In Progress
Tim Gardner (timg-tpi) on 2014-08-11
Changed in linux (Ubuntu Trusty):
status: In Progress → Fix Committed
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-trusty' to 'verification-done-trusty'.

If verification is not done within 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-trusty
Brad Figg (brad-figg) wrote :

The one-line fix is obviously the right thing here.

tags: added: verification-done-trusty
removed: verification-needed-trusty
Adam Fischer (the-1-vp) wrote :

Is there anything I can do to apply this patch right away, or is this going to be released fairly soon? I have a new laptop with this exact problem, and would like to get it deployed soon. It appears I am too late to use Proposed.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.13.0-35.62

---------------
linux (3.13.0-35.62) trusty; urgency=low

  [ Joseph Salisbury ]

  * Release Tracking Bug
    - LP: #1357148

  [ Brad Figg ]

  * Start new release

  [ dann frazier ]

  * SAUCE: (no-up) Fix build failure on arm64
    - LP: #1353657
  * [debian] Allow for package revisions condusive for branching

  [ David Henningsson ]

  * SAUCE: Call broadwell specific functions from the hda driver
    - LP: #1317865

  [ Edward Lin ]

  * SAUCE: (no-up) Add use native backlight quirk for Dell Inspiron
    5547/5447
    - LP: #1332437

  [ Imre Deak ]

  * SAUCE: drm/i915: move power domain init earlier during system resume
    - LP: #1353405

  [ Jani Nikula ]

  * SAUCE: drm/i915: use lane count and link rate from VBT as minimums for
    eDP
    - LP: #1338582
  * SAUCE: drm/i915/dp: force eDP lane count to max available lanes on BDW
    - LP: #1338582
  * SAUCE: drm/i915: provide interface for audio driver to query cdclk
    - LP: #1188091
  * SAUCE: drm/i915: demote opregion excessive timeout WARN_ONCE to
    DRM_INFO_ONCE
    - LP: #1351014

  [ Joseph Salisbury ]

  * [Config] updateconfigs after Linux 3.13.11.6 updates

  [ Luis Henriques ]

  * Revert "[Packaging] linux-udeb-flavour -- standardise on linux prefix"

  [ Ming Lei ]

  * Revert "SAUCE: (no-up) ata: Fix the dma state machine lockup for the
    IDENTIFY DEVICE PIO mode command."
    - LP: #1335645

  [ Paulo Zanoni ]

  * SAUCE: drm/i915: consider the source max DP lane count too
    - LP: #1338582

  [ Tim Gardner ]

  * [Config] CONFIG_GPIO_SYSFS=y
    - LP: #1342153
  * [Config] CONFIG_KEYS_DEBUG_PROC_KEYS=y
    - LP: #1344405
  * [Config] updateconfigs
  * [Config] CONFIG_SCSI_IPR_TRACE=y, CONFIG_SCSI_IPR_DUMP=y
    - LP: #1343109
  * [Config] CONFIG_CONTEXT_TRACKING_FORCE=n
    - LP: #1349028

  [ Timo Aaltonen ]

  * SAUCE: Fix a typo in hda i915_bdw support.
    - LP: #1343140

  [ Upstream Kernel Changes ]

  * Revert "net/mlx4_en: Fix bad use of dev_id"
    - LP: #1347012
  * Revert "ACPI / AC: Remove AC's proc directory."
    - LP: #1356913
  * Revert "mac80211: move "bufferable MMPDU" check to fix AP mode scan"
    - LP: #1356913
  * mm, pcp: allow restoring percpu_pagelist_fraction default
    - LP: #1347088
  * net: Fix permission check in netlink_connect()
    - LP: #1312989
  * netlink: Rename netlink_capable netlink_allowed
    - LP: #1312989
  * net: Move the permission check in sock_diag_put_filterinfo to
    packet_diag_dump
    - LP: #1312989
  * net: Add variants of capable for use on on sockets
    - LP: #1312989
  * net: Add variants of capable for use on netlink messages
    - LP: #1312989
  * net: Use netlink_ns_capable to verify the permisions of netlink
    messages
    - LP: #1312989
  * netlink: Only check file credentials for implicit destinations
    - LP: #1312989
  * igb: fix stats for i210 rx_fifo_errors
    - LP: #1338893
  * HID: use multi input quirk for 22b9:2968
    - LP: #1339567
  * crypto/nx: disable NX on little endian builds
    - LP: #1338666
  * ACPI / video: Add Dell Inspiron 5737 to the blacklist
    - LP: #1250401
  * Input: elantech - deal with clickpads reportin...

Changed in linux (Ubuntu Trusty):
status: Fix Committed → Fix Released
BemNum (bemnum) wrote :

The kernel 3.13.0-35.62 fixes the problem with ethernet and wifi on Asus R510L notebook.

Thank you for fixing.

Regards,
BN
