Connecting an Osprey 2 Mini (USB cellular wireless router) sometimes causes an endless reset loop or a kernel panic

Bug #1556471 reported by ais523
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Invalid
Importance: Medium
Assigned to: Unassigned
Milestone: (none)

Bug Description

Steps to reproduce (happens > 50% of the time, but not every time):

Turn on an Osprey 2 Mini, and connect it to a USB port via a USB cable while both the device and computer are on (i.e. hotplug).

Observed symptoms:
- sometimes the Osprey 2 Mini goes into an endless reset loop (visible as the LEDs on the front of the device flashing rapidly), being reset approximately every 200ms; on such occasions, the Ubuntu user interface often reacts as though I'd inserted an audio CD containing corrupted data (e.g. via repeatedly adding an "Audio CD" icon to the launcher and then removing it, and putting up dialog boxes complaining about failure to read the data).
- sometimes the kernel panics (often some time after the reset loop starts, perhaps up to 10 seconds); when this happens, there has always or nearly always been a reset loop beforehand.
- sometimes the device resets once or not at all, then works as expected (showing up as a network interface, and allowing me to communicate with the Internet over a wired connection); I'm using an Osprey 2 Mini for this purpose right now.

The kernel panics lock up the entire system (forcing a hard power off). Alt-SysRq commands don't work in this state (even though I have them enabled), and sometimes the Caps Lock light flashes repeatedly. The logs end some time before the panic occurs.

/var/log/syslog:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1556471/+attachment/4598145/+files/syslog

The reset loops also occurred when I was using Ubuntu vivid. The kernel panics only happened after an upgrade to wily. Output of lsusb -v is attached.

The issue was root-caused to the device violating the Mass Storage protocol. Hence, the preferred resolution is for the vendor to provide updated firmware.

WORKAROUND: echo 1bbb:f000:i | sudo tee /sys/module/usb_storage/parameters/quirks

---
ApportVersion: 2.19.1-0ubuntu5
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: ais523 6233 F.... pulseaudio
CurrentDesktop: Unity
DistroRelease: Ubuntu 15.10
HibernationDevice: RESUME=UUID=bfbc4bf8-2e40-4799-bbef-9c5044c87007
InstallationDate: Installed on 2014-06-03 (648 days ago)
InstallationMedia: Ubuntu 14.04 LTS "Trusty Tahr" - Release amd64 (20140417)
MachineType: Hewlett-Packard HP Pavilion 15 Notebook PC
Package: linux (not installed)
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.2.0-32-generic.efi.signed root=UUID=e92d655d-cf36-4d45-90e7-30a0f9d0949e ro quiet splash
ProcVersionSignature: Ubuntu 4.2.0-32.37-generic 4.2.8-ckt4
RelatedPackageVersions:
 linux-restricted-modules-4.2.0-32-generic N/A
 linux-backports-modules-4.2.0-32-generic N/A
 linux-firmware 1.149.3
RfKill:
 0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
Tags: wily
Uname: Linux 4.2.0-32-generic x86_64
UpgradeStatus: Upgraded to wily on 2016-02-18 (22 days ago)
UserGroups: adm audio cdrom dip lpadmin plugdev sambashare sudo wireshark
_MarkForUpload: True
dmi.bios.date: 12/04/2013
dmi.bios.vendor: Insyde
dmi.bios.version: F.42
dmi.board.asset.tag: Type2 - Board Asset Tag
dmi.board.name: 2186
dmi.board.vendor: Hewlett-Packard
dmi.board.version: 35.12
dmi.chassis.type: 10
dmi.chassis.vendor: Hewlett-Packard
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnInsyde:bvrF.42:bd12/04/2013:svnHewlett-Packard:pnHPPavilion15NotebookPC:pvr098B110000404100000620180:rvnHewlett-Packard:rn2186:rvr35.12:cvnHewlett-Packard:ct10:cvrChassisVersion:
dmi.product.name: HP Pavilion 15 Notebook PC
dmi.product.version: 098B110000404100000620180
dmi.sys.vendor: Hewlett-Packard

ais523 (ais523) wrote :
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1556471

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
ais523 (ais523) wrote : AlsaInfo.txt

apport information

tags: added: apport-collected wily
description: updated
ais523 (ais523) wrote : CRDA.txt

apport information

ais523 (ais523) wrote : CurrentDmesg.txt

apport information

ais523 (ais523) wrote : IwConfig.txt

apport information

ais523 (ais523) wrote : JournalErrors.txt

apport information

ais523 (ais523) wrote : Lspci.txt

apport information

ais523 (ais523) wrote : Lsusb.txt

apport information

ais523 (ais523) wrote : ProcCpuinfo.txt

apport information

ais523 (ais523) wrote : ProcEnviron.txt

apport information

ais523 (ais523) wrote : ProcInterrupts.txt

apport information

ais523 (ais523) wrote : ProcModules.txt

apport information

ais523 (ais523) wrote : PulseList.txt

apport information

ais523 (ais523) wrote : UdevDb.txt

apport information

ais523 (ais523) wrote : UdevLog.txt

apport information

ais523 (ais523) wrote : WifiSyslog.txt

apport information

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
ais523 (ais523) wrote :

apport-collect has attached the requested files.

ais523 (ais523) wrote :

Some more log files I got experimenting.

Here's a kern.log excerpt for a successful (i.e. intended behaviour, no visible bug reproduction) connection of the device (I included every line from the relevant time period because I'm not 100% sure which are relevant):

Mar 13 03:09:27 tundra kernel: [ 91.132481] usb 3-1.2: new high-speed USB device number 3 using ehci-pci
Mar 13 03:09:27 tundra kernel: [ 91.245683] usb 3-1.2: New USB device found, idVendor=1bbb, idProduct=f000
Mar 13 03:09:27 tundra kernel: [ 91.245689] usb 3-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
Mar 13 03:09:27 tundra kernel: [ 91.245691] usb 3-1.2: Product: MobileBroadBand
Mar 13 03:09:27 tundra kernel: [ 91.245693] usb 3-1.2: Manufacturer: Alcatel
Mar 13 03:09:27 tundra kernel: [ 91.245695] usb 3-1.2: SerialNumber: 0123456789ABCDEF
Mar 13 03:09:27 tundra kernel: [ 91.338218] usb-storage 3-1.2:1.0: USB Mass Storage device detected
Mar 13 03:09:27 tundra kernel: [ 91.338719] scsi host6: usb-storage 3-1.2:1.0
Mar 13 03:09:27 tundra kernel: [ 91.338900] usbcore: registered new interface driver usb-storage
Mar 13 03:09:27 tundra kernel: [ 91.354544] usbcore: registered new interface driver uas
Mar 13 03:09:27 tundra kernel: [ 91.377716] [UFW BLOCK] IN=wlan0 OUT= MAC= SRC=fe80:0000:0000:0000:9ed2:1eff:fe03:ad17 DST=ff02:0000:0000:0000:0000:0000:0000:0001 LEN=64 TC=0 HOPLIMIT=1 FLOWLBL=0 PROTO=UDP SPT=8612 DPT=8612 LEN=24
Mar 13 03:09:27 tundra kernel: [ 91.377752] [UFW BLOCK] IN=wlan0 OUT= MAC= SRC=fe80:0000:0000:0000:9ed2:1eff:fe03:ad17 DST=ff02:0000:0000:0000:0000:0000:0000:0001 LEN=64 TC=0 HOPLIMIT=1 FLOWLBL=0 PROTO=UDP SPT=8612 DPT=8610 LEN=24
Mar 13 03:09:27 tundra kernel: [ 91.388020] [UFW BLOCK] IN=wlan0 OUT= MAC= SRC=fe80:0000:0000:0000:9ed2:1eff:fe03:ad17 DST=ff02:0000:0000:0000:0000:0000:0000:0001 LEN=64 TC=0 HOPLIMIT=1 FLOWLBL=0 PROTO=UDP SPT=8612 DPT=8612 LEN=24
Mar 13 03:09:27 tundra kernel: [ 91.388070] [UFW BLOCK] IN=wlan0 OUT= MAC= SRC=fe80:0000:0000:0000:9ed2:1eff:fe03:ad17 DST=ff02:0000:0000:0000:0000:0000:0000:0001 LEN=64 TC=0 HOPLIMIT=1 FLOWLBL=0 PROTO=UDP SPT=8612 DPT=8610 LEN=24
Mar 13 03:09:28 tundra kernel: [ 92.338460] scsi 6:0:0:0: Direct-Access ONETOUCH LINK4 2.31 PQ: 0 ANSI: 2
Mar 13 03:09:28 tundra kernel: [ 92.339441] scsi 6:0:0:1: CD-ROM ONETOUCH LINK4 2.31 PQ: 0 ANSI: 2
Mar 13 03:09:28 tundra kernel: [ 92.340452] sd 6:0:0:0: Attached scsi generic sg2 type 0
Mar 13 03:09:28 tundra kernel: [ 92.350362] sr 6:0:0:1: [sr1] scsi-1 drive
Mar 13 03:09:28 tundra kernel: [ 92.350592] sr 6:0:0:1: Attached scsi CD-ROM sr1
Mar 13 03:09:28 tundra kernel: [ 92.350733] sr 6:0:0:1: Attached scsi generic sg3 type 5
Mar 13 03:09:28 tundra kernel: [ 92.363363] sd 6:0:0:0: [sdb] Attached SCSI removable disk
Mar 13 03:09:28 tundra kernel: [ 92.444043] usb 3-1.2: reset high-speed USB device number 3 using ehci-pci
Mar 13 03:09:28 tundra kernel: [ 92.643995] usb 3-1.2: reset high-speed USB device number 3 using ehci-pci
Mar 13 03:09:35 tundra kernel: [ 99.016607] usb 3-1.2: USB disconnect, device number 3
Mar 13 03:09:35 tundra kernel: [ 99.245827] usb 3-1.2: ne...
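
A minimal sketch for spotting a reset loop in a log like the one above, assuming the kernel messages keep the "usb <port>: reset ... USB device number N" wording seen in this excerpt. The function name count_usb_resets is only illustrative. It counts reset messages per USB port path; the successful connection above shows two resets, so a much larger count over a few seconds would be consistent with the reset loop described in this report.

```shell
#!/bin/sh
# Count "reset ... USB device" kernel messages per USB port path.
# Reads syslog/kern.log-style lines on stdin and prints "<port> <count>".
# The message wording is assumed to match the excerpt in this report.
count_usb_resets() {
    awk '
        / usb [0-9.-]+: reset .* USB device number / {
            # The port path is the field right after the literal "usb".
            for (i = 1; i <= NF; i++) {
                if ($i == "usb") { port = $(i + 1); sub(/:$/, "", port); break }
            }
            resets[port]++
        }
        END { for (p in resets) print p, resets[p] }
    '
}

# Example use (log path may differ on other systems):
#   count_usb_resets < /var/log/kern.log
```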

penalvch (penalvch) wrote :

Cut from Description.

description: updated
Changed in linux (Ubuntu):
importance: Undecided → Low
status: Confirmed → Incomplete
tags: added: vivid
ais523 (ais523) wrote :

The sticker on the computer itself is reasonably cryptic:

Product: F9E45EA#ABU
Model: 15-n298sa

I also checked the box the computer came in. It contains the same product and model numbers, and additionally the description "HP Pavilion 15 Notebook PC".

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
penalvch (penalvch)
tags: added: bios-outdated-f.68
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
ais523 (ais523) wrote :

The most recent BIOS offered via the page you linked was F.66. I updated my BIOS to that version:

$ sudo dmidecode -s bios-version && sudo dmidecode -s bios-release-date
F.66
07/01/2014

After updating, I connected and disconnected the Osprey 2 Mini four times, with no crashes. Then I rebooted, connected it and got a reset loop, and disconnected it and got a system hang (Caps Lock was not flashing but the system did not respond to any input, not even Alt-SysRq-B, and I had to hold down the power button to shut it off).

As such, I believe the BIOS update did nothing to prevent the bug occurring, and that the bug is still present.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
penalvch (penalvch)
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
ais523 (ais523) wrote :

Thanks. I found the update in question. (It seems that some HP BIOS updates install on Linux, and others require Windows; luckily I had a dual-boot setup available to install it.)

$ sudo dmidecode -s bios-version && sudo dmidecode -s bios-release-date
F.68
09/21/2015

I can confirm that the issue still occurs. I also determined some further information:

- The hangs are definitely triggered upon disconnecting the device (or turning it off) during a reset loop. This has a 100% correlation so far; if it's not in a reset loop when I disconnect it (even if there was a loop earlier), the kernel is fine, and if it's in a reset loop but I don't disconnect it, the kernel is also fine.

- The kernel/device have, on occasion, recovered from a reset loop "spontaneously" after several seconds. (This has happened twice now.) My current boot is one such occasion; I attached the device while the system was booting (reasonably late in the boot sequence), and the reset loop stopped after a while. I've attached the syslog from this boot session, in case it helps.

- I managed to catch one of the non-panic hangs (i.e. computer responds to no input, not even Alt-SysRq combinations, but the Caps Lock light does not flash) while on the Ctrl-Alt-F1 text terminal. It happened when I disconnected the Osprey 2 Mini during a reset loop, and started (as happens in other cases) with text scrolling far too fast to read. Then it stabilized into displaying messages about CPUs being stuck every 20s or so. One of them was along the lines of "CPU #2 HARD lockup" (I forget the exact phrasing, and didn't have time to write it down or take a photograph); all the others mentioned other CPUs (mostly CPU #3). The error message for CPU #3 was always nearly the same:

NMI watchdog: BUG: CPU #3 stuck for 22s! [apache2:4544]

(the time shown changed on occasion, sometimes it was 21s)

The information only stayed onscreen for a limited time, so I couldn't write much down; I decided that the top of the stack trace would probably be the most helpful piece of information for debugging:

queue_read_lock_slowpath+0x92/0xa0
_raw_read_lock+0x1c/0x20
no_wait [I didn't write down the offsets from here on so that I could get more of the stack]
SyS_wait4
? _task_stopped_code

Additionally, after a while of this, the CPU fan settled on a speed that it normally reaches if 1 of my 4 cores is running at 100% and the others are all idle. It thus seems most likely that CPU #2 was in a tight loop, and the other CPUs were blocking on locks that CPU #2 held, thus preventing the system from making any progress.

- The error message in the panic case is not always the same. I've seen a few different stacktraces, and although I can't remember the details, they were definitely different each time.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
ais523 (ais523) wrote :

Oh, and recovering from a reset loop spontaneously doesn't always happen: on an earlier occasion, I was stuck in a reset loop for several minutes before I decided to disconnect the device to see if, when and how the system would crash.

penalvch (penalvch) wrote :

ais523, in order to allow additional upstream developers to examine the issue, at your earliest convenience, could you please test the latest upstream kernel available from http://kernel.ubuntu.com/~kernel-ppa/mainline/?C=N;O=D ? Please keep in mind the following:
1) The one to test is on the topmost line of the page (not the daily folder).
2) The release names are irrelevant.
3) The folder time stamps aren't indicative of when the kernel actually was released upstream.
4) Install instructions are available at https://wiki.ubuntu.com/Kernel/MainlineBuilds .

If testing on your main install would be inconvenient, one may:
1) Install Ubuntu to a different partition and then test this there.
2) Back up or clone the primary install.

If the latest kernel did not allow you to test for the issue (e.g. you couldn't boot into the OS), please make a comment in your report about this, and continue with the next most recent kernel version until you can test for the issue. Once you've tested the upstream kernel, please comment on which kernel version specifically you tested. If this issue is fixed in the mainline kernel, please add the following tags by clicking on the yellow circle with a black pencil icon, next to the word Tags, located at the bottom of the report description:
kernel-fixed-upstream
kernel-fixed-upstream-X.Y-rcZ

Where X, and Y are the first two numbers of the kernel version, and Z is the release candidate number if it exists.

If the mainline kernel does not fix the issue, please add the following tags:
kernel-bug-exists-upstream
kernel-bug-exists-upstream-X.Y-rcZ
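
As an illustration of the tag naming scheme above, here is a small sketch that builds the versioned tag from an upstream version string. The helper name upstream_tag and its argument order are hypothetical, and it assumes the version is given in upstream form such as "4.5.0" or "4.5-rc7" (at least two dot-separated numbers, optionally followed by "-rcZ").

```shell
#!/bin/sh
# Build a versioned triage tag from an upstream kernel version.
# $1: "fixed" or "bug-exists" (hypothetical argument convention)
# $2: upstream version such as "4.5.0" or "4.5-rc7"
upstream_tag() {
    state=$1
    version=$2
    rc=''
    case $version in
        # Split off a release-candidate suffix if one is present.
        *-rc*) rc="-rc${version##*-rc}"; version=${version%%-rc*} ;;
    esac
    major=${version%%.*}   # first number (X)
    rest=${version#*.}
    minor=${rest%%.*}      # second number (Y)
    printf 'kernel-%s-upstream-%s.%s%s\n' "$state" "$major" "$minor" "$rc"
}

# Example use:
#   upstream_tag bug-exists 4.5.0
#   upstream_tag fixed 4.5-rc7
```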

Please note, a failure to install the kernel does not fit the criteria of kernel-bug-exists-upstream.

Also, you don't need to apport-collect further unless specifically requested to do so.

Once testing of the latest upstream kernel is complete, please mark this report Status Confirmed. Please let us know your results.

Thank you for your understanding.

tags: added: latest-bios-f.68
removed: bios-outdated-f.68
Changed in linux (Ubuntu):
importance: Low → Medium
status: Confirmed → Incomplete
ais523 (ais523) wrote :

Thanks for your advice. I will test the recent mainline kernels in a few days, but I am not able to do so immediately. I will leave the bug as incomplete until then.

ais523 (ais523) wrote :

OK, testing on the most recent mainline kernel, 4.5.0, is now complete. The Ubuntu and kernel version:

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 15.10
Release: 15.10
Codename: wily
$ uname -a
Linux tundra 4.5.0-040500-generic #201603140130 SMP Mon Mar 14 05:32:22 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

The bug is partially fixed, and partially present:
- The reset loops still occur intermittently when the device is connected.
- However, unplugging the device during a reset loop no longer causes a kernel panic or hang. (There isn't any observable "action that happens instead"; the device stops resetting but that isn't surprising, given that it's no longer physically connected to the computer.)

I have used both fixed-upstream and exists-upstream tags because there is still a bug with recognising the device upstream, but the worst symptom (the panic) appears to have been fixed, leaving only a more minor symptom. (This is the same behaviour that I observed when running Ubuntu Vivid.) If this is incorrect, feel free to correct the tags.

The syslog from this boot, on which I plugged and unplugged the device repeatedly, is attached.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
tags: added: kernel-fixed-upstream kernel-fixed-upstream-3.5
tags: added: kernel-bug-exists-upstream kernel-bug-exists-upstream-3.5
penalvch (penalvch)
tags: added: kernel-bug-exists-upstream-4.5
removed: kernel-bug-exists-upstream-3.5 kernel-fixed-upstream kernel-fixed-upstream-3.5
penalvch (penalvch) wrote :

ais523, given the root issue of resets is still not addressed, and the panics appear collateral damage from this, let the scope of this report be the resets.

To clarify, could you please post the results of the following terminal command when the device is working fine:
usb-devices

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
ais523 (ais523) wrote :

$ usb-devices

I've attached the output of the command.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
penalvch (penalvch) wrote :

ais523, the issue you are reporting is an upstream one. Could you please report this problem following the instructions verbatim at https://wiki.ubuntu.com/Bugs/Upstream/kernel to the appropriate mailing list (TO Greg Kroah-Hartman CC linux-usb)?

Please provide a direct URL to your post to the mailing list when it becomes available so that it may be tracked.

Thank you for your understanding.

Changed in linux (Ubuntu):
status: Confirmed → Triaged
ais523 (ais523) wrote :

I have sent the email. I will check the mailing list archives on occasion, and provide a link when it becomes available.

ais523 (ais523) wrote :

My email to the mailing list is archived here: http://marc.info/?l=linux-usb&m=145812520514664&w=2

ais523 (ais523) wrote :

OK, after discussion on the upstream mailing lists, we've concluded that the bug causing the reset loop is in the device itself (that it is failing to respect the USB protocols), but that the buggy part of the Osprey 2 Mini is related to installing the Windows and Mac OS X drivers, which aren't needed by Linux. It's possible to blacklist the buggy part (which does not work under Linux even when the reset loops don't happen) via changing a module parameter; this causes the buggy part (the driver installation) to fail, thus allowing the part that most users are using the device for (the Internet connection) to work.

While the system is running, this can be achieved via the following command:

$ echo 1bbb:f000:i | sudo tee /sys/module/usb_storage/parameters/quirks

(The 1bbb:f000 is specific to the device in question, so it shouldn't have effects on other devices.)
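
One caveat with the command above: as far as I understand, the quirks parameter holds a comma-separated list of VID:PID:Flags entries, and writing to it replaces the whole list, so any entry already set there would be lost. The following is a minimal sketch of merging the new entry into an existing list instead. The helper name merge_quirk is hypothetical, and the validation assumes 4-digit hex IDs and lowercase flag letters.

```shell
#!/bin/sh
# Merge one VID:PID:Flags entry into a comma-separated quirks list.
# $1: current list (may be empty); $2: entry to add, e.g. 1bbb:f000:i
# Prints the merged list; returns 1 if the entry looks malformed.
merge_quirk() {
    current=$1
    entry=$2
    # Assumed format: 4 hex digits, colon, 4 hex digits, colon, flag letters.
    if ! printf '%s\n' "$entry" | grep -Eq '^[0-9a-fA-F]{4}:[0-9a-fA-F]{4}:[a-z]+$'; then
        echo "invalid quirk entry: $entry" >&2
        return 1
    fi
    case ",$current," in
        *",$entry,"*) printf '%s\n' "$current" ;;          # already present
        ",,")         printf '%s\n' "$entry" ;;            # list was empty
        *)            printf '%s\n' "$current,$entry" ;;   # append
    esac
}

# Example use (only while the usb_storage module is loaded):
#   param=/sys/module/usb_storage/parameters/quirks
#   new=$(merge_quirk "$(cat "$param")" 1bbb:f000:i) && echo "$new" | sudo tee "$param"
```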

I've tested this on the mainline kernel, and it seems to prevent the bug from occurring. I suspect that the same would occur on Ubuntu kernels, but have not tested this yet (I can test it if you think that would be helpful).

It might also be possible to set the parameter in question (usb_storage.quirks=1bbb:f000:i) via the kernel command line or configuration; I don't know of a method other than telling the bootloader to set it, which would be far too intrusive a solution for a problem that won't affect most users, but there might be a less intrusive one. This would have the advantage of preventing a crash (on an Ubuntu kernel) if the device were attached during boot. Eventually the Ubuntu kernel is likely to catch up to upstream, meaning that the impact would be much lower and people could just disconnect and reconnect the device to work around this situation.

I think it might be worthwhile to get Ubuntu to set the parameter in question automatically in order to fix the bug. However, I'm not sure where the best place to set it would be. The obvious thing to do is to use /etc/sysctl.d, but unfortunately it only seems to affect /proc files rather than /sys files (and the /proc/sys tree and /sys trees are different). Alternatively, if it's possible to set default module arguments in the kernel configuration, that would be another good place.

penalvch (penalvch)
description: updated
ais523 (ais523) wrote :

Just as a followup: the workaround command I gave only works if the usb_storage module is already loaded.

If it isn't loaded, you can write it like this:

sudo modprobe usb_storage quirks=1bbb:f000:i

Probably it's best to set a default parameter for the module, though, if that's possible.
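
One way a default parameter might be set is an "options" line in a file under /etc/modprobe.d, which modprobe applies whenever it loads the module. This is a sketch only and has not been tested against this device. The file name is an arbitrary choice, the helper name write_quirk_conf is hypothetical, and it assumes usb_storage is built as a module rather than into the kernel.

```shell
#!/bin/sh
# Write a modprobe "options" line so that usb_storage receives the quirk
# whenever modprobe loads it.
# $1: target directory (defaults to /etc/modprobe.d, which needs root).
# The file name osprey2mini-quirk.conf is an arbitrary, hypothetical choice.
write_quirk_conf() {
    dir=${1:-/etc/modprobe.d}
    printf 'options usb_storage quirks=1bbb:f000:i\n' > "$dir/osprey2mini-quirk.conf"
}

# Example use, run as root:
#   write_quirk_conf
# If the module is loaded from the initramfs, regenerating it
# (update-initramfs -u on Ubuntu) may also be needed.
```

This only takes effect the next time the module is loaded, so an already-loaded usb_storage would still need the /sys write shown earlier.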

ais523 (ais523) wrote :

I can no longer reproduce this bug. I assume it got fixed at some point, either intentionally or as a side effect of some other upstream change.

Marking it as invalid – I assume that's the right status for bugs that can no longer be reproduced?

Changed in linux (Ubuntu):
status: Triaged → Invalid