0cf3:7015 [HP Pavilion Media Center a1630n Desktop PC] ath9k_htc 14.04 NetworkManager hangs after 'ath: phy0: Unable to remove station entry for...'

Bug #1301558 reported by Jason Conti
This bug affects 2 people
Affects          Status    Importance    Assigned to    Milestone
linux (Ubuntu)   Expired   Medium        Unassigned

Bug Description

This problem has been happening randomly since I installed 14.04. Today it happened twice while trying to zsync the latest iso.

I use a TP-Link TL-WN821N wireless adapter to connect to the internet, which uses the ath9k_htc module. NetworkManager occasionally locks up so badly that any command I run to kill or restart it hangs as well, and my only way out is REISUB. Twice I have noticed this in dmesg right before it happens:

[ 494.896048] ath: phy0: Unable to remove station entry for: f4:ec:38:ec:42:02

For reference, that MAC address is the default router I connect to daily.

Shortly afterwards the kernel logs a hung-task trace for NetworkManager:

[ 720.752078] INFO: task NetworkManager:850 blocked for more than 120 seconds.
[ 720.752090] Not tainted 3.13.0-21-generic #43-Ubuntu
[ 720.752095] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 720.752100] NetworkManager D ffff88011fd14440 0 850 1 0x00000000
[ 720.752110] ffff8800cd3bd8d0 0000000000000002 ffff8800df0a5fc0 ffff8800cd3bdfd8
[ 720.752119] 0000000000014440 0000000000014440 ffff8800df0a5fc0 ffff8800cedfe8d8
[ 720.752127] ffff8800cedfe8dc ffff8800df0a5fc0 00000000ffffffff ffff8800cedfe8e0
[ 720.752135] Call Trace:
[ 720.752150] [<ffffffff81719259>] schedule_preempt_disabled+0x29/0x70
[ 720.752159] [<ffffffff8171b0c5>] __mutex_lock_slowpath+0x135/0x1b0
[ 720.752167] [<ffffffff8171b15f>] mutex_lock+0x1f/0x2f
[ 720.752224] [<ffffffffa0355e5d>] nl80211_dump_scan+0x5d/0x630 [cfg80211]
[ 720.752233] [<ffffffff8160a4a1>] ? __kmalloc_reserve.isra.25+0x31/0x90
[ 720.752243] [<ffffffff816476fe>] genl_lock_dumpit+0x2e/0x50
[ 720.752249] [<ffffffff81644954>] netlink_dump+0x84/0x240
[ 720.752256] [<ffffffff8164523b>] __netlink_dump_start+0x1ab/0x220
[ 720.752264] [<ffffffff81649166>] genl_family_rcv_msg+0x276/0x370
[ 720.752271] [<ffffffff8160b0be>] ? __alloc_skb+0x4e/0x2b0
[ 720.752278] [<ffffffff816476d0>] ? genl_lock_done+0x50/0x50
[ 720.752284] [<ffffffff81647680>] ? genl_unlock+0x20/0x20
[ 720.752295] [<ffffffff811a2608>] ? __kmalloc_node_track_caller+0x58/0x1e0
[ 720.752302] [<ffffffff81649260>] ? genl_family_rcv_msg+0x370/0x370
[ 720.752309] [<ffffffff816492f1>] genl_rcv_msg+0x91/0xd0
[ 720.752316] [<ffffffff81647379>] netlink_rcv_skb+0xa9/0xc0
[ 720.752322] [<ffffffff81647878>] genl_rcv+0x28/0x40
[ 720.752329] [<ffffffff816469a5>] netlink_unicast+0xd5/0x1b0
[ 720.752336] [<ffffffff81646d7f>] netlink_sendmsg+0x2ff/0x740
[ 720.752343] [<ffffffff81644cb2>] ? netlink_recvmsg+0x1a2/0x3a0
[ 720.752351] [<ffffffff81601f3b>] sock_sendmsg+0x8b/0xc0
[ 720.752359] [<ffffffff81601c2e>] ? move_addr_to_kernel.part.16+0x1e/0x60
[ 720.752366] [<ffffffff816027e1>] ? move_addr_to_kernel+0x21/0x30
[ 720.752373] [<ffffffff816027a9>] ___sys_sendmsg+0x3a9/0x3c0
[ 720.752382] [<ffffffff811cd000>] ? poll_select_copy_remaining+0x130/0x130
[ 720.752390] [<ffffffff811cd000>] ? poll_select_copy_remaining+0x130/0x130
[ 720.752399] [<ffffffff8101b763>] ? native_sched_clock+0x13/0x80
[ 720.752406] [<ffffffff8101b763>] ? native_sched_clock+0x13/0x80
[ 720.752413] [<ffffffff8109d12d>] ? sched_clock_local+0x1d/0x80
[ 720.752423] [<ffffffff811112cc>] ? acct_account_cputime+0x1c/0x20
[ 720.752429] [<ffffffff8109d75b>] ? account_user_time+0x8b/0xa0
[ 720.752437] [<ffffffff81602eb2>] __sys_sendmsg+0x42/0x80
[ 720.752444] [<ffffffff81602f02>] SyS_sendmsg+0x12/0x20
[ 720.752452] [<ffffffff817254ff>] tracesys+0xe1/0xe6

I don't know how to reproduce it, but it happened once on March 28 when I installed and three times on April 2 in a short period of time. I will attach the dmesg I took before the last REISUB to the bug. Please let me know if any more information would be useful; I'll try to get it the next time it happens.
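The sequence described above (a "Unable to remove station entry" warning followed by a hung-task report) can be picked out of a saved dmesg or kern.log with a short script. This is only a minimal sketch: the function name find_hang_events is hypothetical, and the patterns are modelled solely on the log lines quoted in this report, so other kernels may word these messages differently.

```python
import re

# Patterns modelled on the log lines quoted in this report (assumption:
# other kernel versions may format these messages differently).
STATION_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+ath: phy0: Unable to remove station entry for: "
    r"(?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2})"
)
HUNG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<task>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)


def find_hang_events(lines):
    """Return (kind, timestamp, detail) tuples in the order they appear.

    kind is "station" for the ath9k_htc warning (detail = MAC address)
    or "hung" for a hung-task report (detail = task name).
    """
    events = []
    for line in lines:
        m = STATION_RE.search(line)
        if m:
            events.append(("station", float(m["ts"]), m["mac"]))
            continue
        m = HUNG_RE.search(line)
        if m:
            events.append(("hung", float(m["ts"]), m["task"]))
    return events
```

Because the patterns are applied with search rather than match, the same function should also accept syslog-prefixed lines from /var/log/kern.log, for example find_hang_events(open("kern.log")).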

ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: linux-image-3.13.0-21-generic 3.13.0-21.43
ProcVersionSignature: Ubuntu 3.13.0-21.43-generic 3.13.8
Uname: Linux 3.13.0-21-generic x86_64
ApportVersion: 2.14-0ubuntu1
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC3: jconti 2024 F.... pulseaudio
 /dev/snd/controlC0: jconti 2024 F.... pulseaudio
 /dev/snd/controlC2: jconti 2024 F.... pulseaudio
 /dev/snd/controlC1: jconti 2024 F.... pulseaudio
CurrentDesktop: Unity
Date: Wed Apr 2 14:53:42 2014
HibernationDevice: RESUME=UUID=f62b1259-1f84-4a06-9c97-90a5eb41cfd4
InstallationDate: Installed on 2014-03-28 (5 days ago)
InstallationMedia: Ubuntu 14.04 LTS "Trusty Tahr" - Daily amd64 (20140328)
MachineType: HP Pavilion 061 RC658AA-ABA a1630n
ProcFB: 0 radeondrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.13.0-21-generic root=UUID=863ed135-24c6-4cf5-b1b8-b4e76181f621 ro quiet splash vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-3.13.0-21-generic N/A
 linux-backports-modules-3.13.0-21-generic N/A
 linux-firmware 1.127
RfKill:
 0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 08/02/2006
dmi.bios.vendor: Phoenix Technologies, LTD
dmi.bios.version: 3.07
dmi.board.name: NODUSM3
dmi.board.vendor: ASUSTek Computer INC.
dmi.board.version: 1.05
dmi.chassis.type: 3
dmi.chassis.version: 1111
dmi.modalias: dmi:bvnPhoenixTechnologies,LTD:bvr3.07:bd08/02/2006:svnHPPavilion061:pnRC658AA-ABAa1630n:pvr0nx1114RE101NODM300:rvnASUSTekComputerINC.:rnNODUSM3:rvr1.05:cvn:ct3:cvr1111:
dmi.product.name: RC658AA-ABA a1630n
dmi.product.version: 0nx1114RE101NODM300
dmi.sys.vendor: HP Pavilion 061

Jason Conti (jconti) wrote :
description: updated
Jason Conti (jconti) wrote :

Interestingly I just tried to zsync today's amd64 ubuntu iso yet again and it happened a third time. So maybe I can temporarily reproduce it. This time I also had a kernel dump for chrome-sandbox which I tried to start to add a comment to this bug, before realizing how foolish that was.

Joseph Salisbury (jsalisbury) wrote :

Did this issue just start happening after a recent kernel update?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.14 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.14-trusty/

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: New → Incomplete
Jason Conti (jconti) wrote :

I have not been able to reproduce it after a bit of testing on the mainline kernel 3.14.0-031400-generic, but it seems to happen randomly so I'm not sure that means it is fixed. I will continue to test for a couple days and see if it pops up with the 3.14 kernel.

I just got this desktop on Friday and 14.04 was the first system I installed on it, so it is difficult to say when the issue started. I checked the old /var/log/kern.log.1 and it seems it also happened on March 28 with the 3.13.0-19-generic kernel. Today it didn't start until after I installed linux-image-3.13.0-21-generic.

Jason Conti (jconti)
description: updated
Jason Conti (jconti)
tags: added: kernel-bug-exists-upstream
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Jason Conti (jconti) wrote :

Reproduced today with 3.14.0-031400-generic while updating a vm:

[ 4800.756513] INFO: task kworker/0:2:3505 blocked for more than 120 seconds.
[ 4800.756519] Not tainted 3.14.0-031400-generic #201403310035
[ 4800.756522] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4800.756526] kworker/0:2 D ffffffff818118e0 0 3505 2 0x00000000
[ 4800.756541] Workqueue: ipv6_addrconf addrconf_verify_work
[ 4800.756545] ffff880119251d18 0000000000000002 ffff880119bb4800 ffff880119251fd8
[ 4800.756552] 00000000000144c0 00000000000144c0 ffffffff81c144a0 ffff8800df9818f0
[ 4800.756559] ffff880119251cf8 ffffffff81cd24a0 ffffffff81cd24a4 00000000ffffffff
[ 4800.756567] Call Trace:
[ 4800.756575] [<ffffffff81764f89>] schedule+0x29/0x70
[ 4800.756582] [<ffffffff817652ae>] schedule_preempt_disabled+0xe/0x10
[ 4800.756589] [<ffffffff817670e4>] __mutex_lock_slowpath+0x114/0x1b0
[ 4800.756597] [<ffffffff817671a3>] mutex_lock+0x23/0x37
[ 4800.756604] [<ffffffff816616f5>] rtnl_lock+0x15/0x20
[ 4800.756612] [<ffffffff816fc0ae>] addrconf_verify_work+0xe/0x20
[ 4800.756620] [<ffffffff81087e2f>] process_one_work+0x17f/0x4c0
[ 4800.756628] [<ffffffff8108908b>] worker_thread+0x11b/0x3d0
[ 4800.756635] [<ffffffff81088f70>] ? manage_workers.isra.21+0x190/0x190
[ 4800.756642] [<ffffffff8108fe89>] kthread+0xc9/0xe0
[ 4800.756650] [<ffffffff8108fdc0>] ? flush_kthread_worker+0xb0/0xb0
[ 4800.756657] [<ffffffff8177203c>] ret_from_fork+0x7c/0xb0
[ 4800.756664] [<ffffffff8108fdc0>] ? flush_kthread_worker+0xb0/0xb0

This trace is new; it appeared in addition to a similar dump for NetworkManager and the ath9k_htc warning about removing the station.

Jason Conti (jconti) wrote :

I rebuilt 3.14 with some additional debugging enabled and triggered it again today while simply wgetting an iso. This produced some new messages that are kind of interesting:

[ 3501.052975] usb 1-2: ath: firmware panic! exccause: 0x0000000d; pc: 0x0090a641; badvaddr: 0x12345678.
[ 3517.088168] wlan0: moving STA f4:ec:38:ec:42:02 to state 3
[ 3517.088180] wlan0: moving STA f4:ec:38:ec:42:02 to state 2
[ 3519.088060] ath: phy0: Unable to remove station entry for: f4:ec:38:ec:42:02
[ 3519.088078] wlan0: moving STA f4:ec:38:ec:42:02 to state 1
[ 3519.088084] wlan0: Removed STA f4:ec:38:ec:42:02
[ 3519.088240] wlan0: Destroyed STA f4:ec:38:ec:42:02
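The "moving STA ... to state N" lines above show the station being stepped down one state at a time. A minimal sketch for turning those numbers into names follows; the function name sta_transitions is hypothetical, and the number-to-name table is an assumption based on mac80211's enum ieee80211_sta_state, so it should be checked against the kernel source actually in use.

```python
import re

# Assumed mapping, following mac80211's enum ieee80211_sta_state.
# Verify against include/net/mac80211.h for the kernel being debugged.
STA_STATE_NAMES = {
    0: "NOTEXIST",
    1: "NONE",
    2: "AUTH",
    3: "ASSOC",
    4: "AUTHORIZED",
}

# Pattern modelled on the debug lines quoted in this report.
MOVE_RE = re.compile(
    r"moving STA (?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2}) to state (?P<state>\d+)"
)


def sta_transitions(lines, mac):
    """Return the named states the given station was moved to, in log order."""
    states = []
    for line in lines:
        m = MOVE_RE.search(line)
        if m and m["mac"] == mac:
            n = int(m["state"])
            states.append(STA_STATE_NAMES.get(n, f"UNKNOWN({n})"))
    return states
```

Under that assumed mapping, the log above reads as a teardown from an associated station back to an unauthenticated one, with the "Unable to remove station entry" warning appearing between the last two steps.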

penalvch (penalvch)
tags: added: needs-full-computer-model
summary: - [ath9k_htc] 14.04 NetworkManager hangs after 'ath: phy0: Unable to
- remove station entry for...'
+ 0cf3:7015 [ath9k_htc] 14.04 NetworkManager hangs after 'ath: phy0:
+ Unable to remove station entry for...'
penalvch (penalvch)
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Jason Conti (jconti) wrote :

Sure thing, though that is almost exactly it. The bottom decal only has "HP Pavilion a1630n" though there is another sticker with "Microsoft Windows XP Media Center Edition" and it is definitely a desktop.

penalvch (penalvch)
summary: - 0cf3:7015 [ath9k_htc] 14.04 NetworkManager hangs after 'ath: phy0:
- Unable to remove station entry for...'
+ 0cf3:7015 [HP Pavilion Media Center a1630n Desktop PC] ath9k_htc 14.04
+ NetworkManager hangs after 'ath: phy0: Unable to remove station entry
+ for...'
tags: added: bios-outdated-3.10
removed: needs-full-computer-model
Jason Conti (jconti) wrote :

Updated the BIOS to 3.10. Output of sudo dmidecode -s bios-version && sudo dmidecode -s bios-release-date:

 3.10
12/13/2006

It may take several days to find out whether it made any difference; I'll post if/when it happens again.

Jason Conti (jconti) wrote :

That was surprisingly quick. I reproduced it with 3.13.0-24-generic while wgetting the kubuntu iso:

Apr 20 16:53:15 jconti-desktop kernel: [ 4016.152042] ath: phy0: Unable to remove station entry for: f4:ec:38:ec:42:02
Apr 20 16:56:20 jconti-desktop kernel: [ 4200.748073] INFO: task NetworkManager:844 blocked for more than 120 seconds.
Apr 20 16:56:20 jconti-desktop kernel: [ 4200.748086] Not tainted 3.13.0-24-generic #46-Ubuntu
Apr 20 16:56:20 jconti-desktop kernel: [ 4200.748090] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 20 16:56:20 jconti-desktop kernel: [ 4200.748096] NetworkManager D ffff88011fc14440 0 844 1 0x00000000
Apr 20 16:56:20 jconti-desktop kernel: [ 4200.748106] ffff8800de37f8d0 0000000000000002 ffff8800364417f0 ffff8800de37ffd8
Apr 20 16:56:20 jconti-desktop kernel: [ 4200.748115] 0000000000014440 0000000000014440 ffff8800364417f0 ffff8801196128d8
Apr 20 16:56:20 jconti-desktop kernel: [ 4200.748122] ffff8801196128dc ffff8800364417f0 00000000ffffffff ffff8801196128e0
Apr 20 16:56:20 jconti-desktop kernel: [ 4200.748131] Call Trace:
Apr 20 16:56:20 jconti-desktop kernel: [ 4200.748147] [<ffffffff8171a3a9>] schedule_preempt_disabled+0x29/0x70
Apr 20 16:56:20 jconti-desktop kernel: [ 4200.748156] [<ffffffff8171c215>] __mutex_lock_slowpath+0x135/0x1b0
Apr 20 16:56:20 jconti-desktop kernel: [ 4200.748163] [<ffffffff8171c2af>] mutex_lock+0x1f/0x2f
Apr 20 16:56:20 jconti-desktop kernel: [ 4200.748225] [<ffffffffa01cbe5d>] nl80211_dump_scan+0x5d/0x630 [cfg80211]
Apr 20 16:56:20 jconti-desktop kernel: [ 4200.748234] [<ffffffff8160aa71>] ? __kmalloc_reserve.isra.25+0x31/0x90
Apr 20 16:56:20 jconti-desktop kernel: [ 4200.748245] [<ffffffff81647dae>] genl_lock_dumpit+0x2e/0x50
Apr 20 16:56:20 jconti-desktop kernel: [ 4200.748252] [<ffffffff81645004>] netlink_dump+0x84/0x240
Apr 20 16:56:20 jconti-desktop kernel: [ 4200.748259] [<ffffffff816458eb>] __netlink_dump_start+0x1ab/0x220
Apr 20 16:56:20 jconti-desktop kernel: [ 4200.748269] [<ffffffff81649816>] genl_family_rcv_msg+0x276/0x370
Apr 20 16:56:20 jconti-desktop kernel: [ 4200.748276] [<ffffffff81647d80>] ? genl_lock_done+0x50/0x50
Apr 20 16:56:20 jconti-desktop kernel: [ 4200.748283] [<ffffffff81647d30>] ? genl_unlock+0x20/0x20
Apr 20 16:56:20 jconti-desktop kernel: [ 4200.748295] [<ffffffff811a2778>] ? __kmalloc_node_track_caller+0x58/0x1e0
Apr 20 16:56:20 jconti-desktop kernel: [ 4200.748303] [<ffffffff81649910>] ? genl_family_rcv_msg+0x370/0x370
Apr 20 16:56:20 jconti-desktop kernel: [ 4200.748310] [<ffffffff816499a1>] genl_rcv_msg+0x91/0xd0
Apr 20 16:56:20 jconti-desktop kernel: [ 4200.748316] [<ffffffff81647a29>] netlink_rcv_skb+0xa9/0xc0
Apr 20 16:56:20 jconti-desktop kernel: [ 4200.748323] [<ffffffff81647f28>] genl_rcv+0x28/0x40
Apr 20 16:56:20 jconti-desktop kernel: [ 4200.748330] [<ffffffff81647055>] netlink_unicast+0xd5/0x1b0
Apr 20 16:56:20 jconti-desktop kernel: [ 4200.748337] [<ffffffff8164742f>] netlink_sendmsg+0x2ff/0x740
Apr 20 16:56:20 jconti-desktop kernel: [ 4200.748343] [<ffffffff81645362>] ? netlink_recvmsg+0x1a2/0x3a0
Apr 20 16:56:20 jconti-desktop kernel: [ 4200.748352] [<ffffffff816...


penalvch (penalvch) wrote :

Jason Conti, could you please test the latest mainline kernel via http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.15-rc2-trusty/ and advise if this is reproducible?

tags: added: latest-bios-3.10
removed: bios-outdated-3.10
Jason Conti (jconti) wrote :

Reproduced with 3.15.0-031500rc2-generic, though the results are slightly different. I got:

Apr 23 13:07:28 jconti-desktop kernel: [ 291.815888] usb 1-2: ath: firmware panic! exccause: 0x0000000d; pc: 0x0090a641; badvaddr: 0x12345678.

These seem to be the same values as in my earlier comment, but this time I did not get the "Unable to remove station entry" message. A similar thing happened when I was testing -rc1. The network connection does die, and the machine hangs if I try to reboot, but none of the processes seem to hang. I am also getting terrible download speeds with -rc2 compared to 3.13.0-24-generic, so it seems to break other things. Upload speed seems about the same, though; I actually reproduced the problem using the comcast speed test during the upload phase.
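Rather than comparing the firmware panic values by eye, the three hex fields can be extracted and compared directly. This is a minimal sketch, assuming the message format is exactly as quoted in this report; the function name parse_firmware_panic is hypothetical.

```python
import re

# Pattern modelled on the "firmware panic!" lines quoted in this report.
PANIC_RE = re.compile(
    r"ath: firmware panic! exccause: (0x[0-9a-f]+); "
    r"pc: (0x[0-9a-f]+); badvaddr: (0x[0-9a-f]+)\."
)


def parse_firmware_panic(line):
    """Return the panic fields as integers, or None if the line does not match."""
    m = PANIC_RE.search(line)
    if m is None:
        return None
    exccause, pc, badvaddr = (int(value, 16) for value in m.groups())
    return {"exccause": exccause, "pc": pc, "badvaddr": badvaddr}
```

Two panic lines report the same fault location when the dictionaries returned for them compare equal, regardless of timestamps or syslog prefixes.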

penalvch (penalvch) wrote :

Jason Conti, for regression testing purposes, could you please test 12.04.0 via http://old-releases.ubuntu.com/releases/12.04.0/ and advise on the results?

tags: added: kernel-bug-exists-upstream-3.15-rc2
removed: kernel-bug-exists-upstream
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired