USB stops working after a while

Bug #228746 reported by Beata Graff av dalHagen
This bug affects 6 people
Affects: linux (Ubuntu)
Status: Expired
Importance: High
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

I have a desktop machine (MSI system board, Athlon XP, Via KT600, USB 2.0) whose USB ports have stopped working after a while ever since I upgraded to Feisty. There does not seem to be any information in syslog/dmesg, and 'lsusb' appears to hang.
However, the USB keyboard and mouse that are already connected continue to work.
I also have a laptop (Compaq Armada E500, P2, USB 1.0) which does not experience any USB failures.

Revision history for this message
Leann Ogasawara (leannogasawara) wrote :

The Ubuntu Kernel Team is planning to move to the 2.6.27 kernel for the upcoming Intrepid Ibex 8.10 release. As a result, the kernel team would appreciate it if you could please test this newer 2.6.27 Ubuntu kernel. There are two ways you should be able to test:

1) If you are comfortable installing packages on your own, the linux-image-2.6.27-* package is currently available for you to install and test.

--or--

2) The upcoming Alpha5 for Intrepid Ibex 8.10 will contain this newer 2.6.27 Ubuntu kernel. Alpha5 is set to be released Thursday Sept 4. Please watch http://www.ubuntu.com/testing for Alpha5 to be announced. You should then be able to test via a LiveCD.

Please let us know immediately if this newer 2.6.27 kernel resolves the bug reported here or if the issue remains. More importantly, please open a new bug report for each new bug/regression introduced by the 2.6.27 kernel and tag the bug report with 'linux-2.6.27'. Also, please specifically note if the issue does or does not appear in the 2.6.26 kernel. Thanks again, we really appreciate your help and feedback.

Revision history for this message
Brunellus (luigi12081) wrote :

I can confirm that this still exists as of kernel 2.6.27-9-generic, on AMD64. I am attaching dmesg and lspci output.

Other symptoms:

* lsusb hangs, and there is no way to kill existing lsusb processes

*
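
A process that cannot be killed is usually stuck in uninterruptible sleep (state D) inside the kernel, which would fit an lsusb blocked on a wedged USB stack. As a rough diagnostic sketch, assuming a Linux /proc filesystem, the following Python lists such processes. The function names are illustrative only and are not part of any existing tool.

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' rather than on whitespace.
    left, right = line.rsplit(")", 1)
    pid, comm = left.split("(", 1)
    return int(pid), comm, right.split()[0]


def blocked_processes(proc_root="/proc"):
    """List (pid, comm) for every process in state 'D' (uninterruptible sleep)."""
    blocked = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return blocked  # no procfs available at this path
    for entry in entries:
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_line(f.read())
        except (OSError, ValueError):
            continue  # process exited, or the line was not parseable
        if state == "D":
            blocked.append((pid, comm))
    return blocked
```

If a hung lsusb shows up in the result of blocked_processes(), it is waiting inside the kernel and signals (including SIGKILL) will not be delivered until that wait ends.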

Revision history for this message
Brunellus (luigi12081) wrote :

lspci output attached.

Changed in linux:
importance: Undecided → High
status: New → Triaged
Revision history for this message
Andy Whitcroft (apw) wrote :

From Brunellus' attachment:

[ 82.204035] usb 1-6: reset high speed USB device using ehci_hcd and address 8
[ 82.260071] scsi 10:0:0:1: Device offlined - not ready after error recovery
[ 82.260175] usb 1-6: USB disconnect, address 8
[ 82.262702] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[ 82.262708] IP: [<ffffffff804346d9>] device_pm_remove+0x39/0x60
[ 82.262717] PGD 0
[ 82.262720] Oops: 0002 [1] SMP
[ 82.262723] CPU 1
[ 82.262725] Modules linked in: af_packet binfmt_misc bridge stp bnep sco rfcomm l2cap bluetooth ppdev cpufreq_powersave cpufreq_ondemand cpufreq_stats cpufreq_userspace freq_table cpufreq_conservative container video output wmi sbs pci_slot sbshc battery iptable_filter ip_tables x_tables ac sbp2 lp k8temp pcspkr evdev nvidia(P) gspca_spca561 gspca_main compat_ioctl32 videodev v4l1_compat shpchp pci_hotplug snd_ice1724 snd_ice17xx_ak4xxx snd_ac97_codec ac97_bus snd_ak4xxx_adda snd_ak4114 snd_pcm_oss snd_mixer_oss snd_pcm snd_page_alloc snd_pt2258 snd_i2c snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device parport_pc parport snd soundcore button i2c_nforce2 i2c_core ext3 jbd mbcache sr_mod cdrom sd_mod crc_t10dif sg usbhid hid usb_storage libusual ata_generic pata_amd sata_nv pata_acpi ohci1394 ohci_hcd ieee1394 libata scsi_mod forcedeth ehci_hcd dock usbcore thermal processor fan fbcon tileblit font bitblit softcursor fuse
[ 82.262789] Pid: 1319, comm: khubd Tainted: P 2.6.27-9-generic #1
[ 82.262792] RIP: 0010:[<ffffffff804346d9>] [<ffffffff804346d9>] device_pm_remove+0x39/0x60
[ 82.262796] RSP: 0018:ffff880037555aa0 EFLAGS: 00010286
[ 82.262799] RAX: 0000000000000000 RBX: ffff88007d403368 RCX: ffff88007d403510
[ 82.262801] RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffffffff80673da0
[ 82.262803] RBP: ffff880037555ab0 R08: 00000000ffffffff R09: 000000168d25073e
[ 82.262805] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88007d403368
[ 82.262808] R13: 0000000000000246 R14: ffff88007d403120 R15: ffffffffa0173168
[ 82.262811] FS: 00007f296006d770(0000) GS:ffff88007f802880(0000) knlGS:0000000000000000
[ 82.262813] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[ 82.262815] CR2: 0000000000000008 CR3: 0000000000201000 CR4: 00000000000006e0
[ 82.262818] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 82.262820] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 82.262823] Process khubd (pid: 1319, threadinfo ffff880037554000, task ffff8800375159c0)
[ 82.262825] Stack: ffff88007d403864 ffff88007d403368 ffff880037555ae0 ffffffff8042d8df
[ 82.262830] ffff88007d403368 ffff88007d403120 0000000000000246 ffff88007d403800
[ 82.262834] ffff880037555b00 ffffffff8042dac6 ffff88007d403800 ffff88007d403000
[ 82.262838] Call Trace:
[ 82.262843] [<ffffffff8042d8df>] device_del+0x1f/0x1f0
[ 82.262846] [<ffffffff8042dac6>] device_unregister+0x16/0x30
[ 82.262867] [<ffffffffa00af60b>] __scsi_remove_device+0x4b/0xa0 [scsi_mod]
[ 82.262880] [<ffffffffa00abc92>] scsi_forget_host+0x72/0x80 [scsi_mod]
[ 82.262894] [<ffffffffa00a3dd1>] scsi...


Revision history for this message
Jim Lieb (lieb) wrote :

From this stack trace and the dmesg, it appears that this happens while a CF device is being scanned through a USB-to-CF adaptor. The oops is due to an attempt to list_del_init dev->power.entry when it is null.

What was going on at this time? Was there a CF card in the slot? If there was
a CF card, was it there through the boot and/or was it inserted/removed
later? Is this repeatable?
What happens if the system is booted with maxcpus=0 or 1?

Jim Lieb (lieb)
Changed in linux:
assignee: nobody → lieb
status: Triaged → In Progress
Revision history for this message
Jim Lieb (lieb) wrote :

There is a series of USB bugs on LKML that all boil down to recursive calls to device_del. Some show up as null-dereference oopses in sysfs, and others show up like this one, for USB storage. The problem is that the usb disable path calls device_del, which eventually gets around to releasing the scsi layer, which very politely does a device_del on its layered device, namely usb. All these null dereferences come from the previous pass, because the first thing device_del does is clip the device from all its lists, the one here being the power list.
If you look at these various stack traces going all the way back, you see two calls to device_del. For stacked devs like usb over/under scsi or serial, I don't (yet) know how this works at all. There must be some escape in the chain that occasionally doesn't happen. In the scsi case, however, the final, fatal call is deterministic. One solution is simply to null-test before removing from lists, a good thing anyway, but that still leaves this logic lurking about for other innocent damsels to ravish.

One thing for certain is that my questions in the previous comment are no longer relevant. I can crash it on my own box with the same trace.
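
The idea of null-testing before removing from a list can be modelled in userspace. The sketch below is a Python stand-in for a doubly linked list head, not kernel code: ListHead and guarded_del_init are hypothetical names, and a Python AttributeError plays the role of the NULL pointer dereference. It only illustrates why unlinking an entry whose links were never set up faults, while a guarded removal does not.

```python
class ListHead:
    """Userspace stand-in for a kernel-style doubly linked list node."""

    def __init__(self):
        # Models zeroed memory: both links are NULL until init() runs.
        self.next = None
        self.prev = None

    def init(self):
        # An empty list head points at itself in both directions.
        self.next = self
        self.prev = self


def list_add(new, head):
    """Link 'new' in right after 'head'."""
    new.next = head.next
    new.prev = head
    head.next.prev = new
    head.next = new


def list_del_init(entry):
    """Unlink 'entry' and reinitialise it; faults if its links are None."""
    entry.next.prev = entry.prev
    entry.prev.next = entry.next
    entry.init()


def guarded_del_init(entry):
    """Hypothetical guard: skip entries that were never linked."""
    if entry.next is None or entry.prev is None:
        return False
    list_del_init(entry)
    return True
```

In this model, removing a properly linked entry twice is harmless because the first removal leaves it pointing at itself. Only an entry that was never initialised or added trips the fault, and the guard hides that symptom without explaining why the entry was never added.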

Revision history for this message
Jim Lieb (lieb) wrote :

This bug can be reproduced with a USB to SD/MMC/MS/CF adaptor that cannot handle SDHC. The adaptor's failure to recognize/connect the SDHC card, which uses a slightly different protocol, triggers the error by looping through probe/reset/probe long enough to get messages and state out of sync.

After further investigation, what appeared to be a recursive call that attempted device_del twice turns out to be something else: the device_add never happened, because the adaptor did a reset (and sent a message back) before the initialization was complete. It seems that the device_add in question, for the top scsi device, comes last, and the disconnect/reset occurs before we get that far. I cleaned up the power structures' init only to fall into the sysfs cleanup of its uninit'ed structures (this moves the null deref from device_del->device_pm_remove to device_del->dpm_sysfs_remove, essentially the next line in device_del).

The problem (and its solution) is that usb_disable_device has to be held off until the initialization is complete. There is a race here in that the failure->reset->reconnect sequence loops around a number of times before it eventually fails. If the setup completes (even in error), the disconnect "does the right thing". There are a number of other open bugs that show a similar signature even though they may be serial or other usb type devs.
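
The "hold the disconnect off until initialization is complete" idea can be sketched in userspace with a completion flag. This is only an illustration in Python threads under stated assumptions: the Device class, its probe/disconnect methods and the log strings are hypothetical, and the real kernel code paths (usb_disable_device, device_add, device_del) are merely named in comments as the things being modelled.

```python
import threading


class Device:
    """Hypothetical device whose teardown waits for setup to finish."""

    def __init__(self):
        self.init_done = threading.Event()
        self.registered = False
        self.log = []

    def probe(self, succeed=True):
        try:
            if succeed:
                self.registered = True  # stands in for device_add
                self.log.append("add")
            else:
                self.log.append("probe failed")
        finally:
            # Signal completion even on error so disconnect never waits forever.
            self.init_done.set()

    def disconnect(self, timeout=5.0):
        # Hold teardown off until probe has finished, one way or the other.
        if not self.init_done.wait(timeout):
            raise TimeoutError("probe never completed")
        if self.registered:
            self.registered = False  # stands in for device_del
            self.log.append("del")
        else:
            self.log.append("nothing to delete")
```

With this ordering, a disconnect that arrives early simply blocks until probe() has either registered the device or given up, so teardown never runs against half-built state.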

Revision history for this message
ser (seanerussell) wrote :

I get this error, too. In my case, the laptop does have an MMC/SD card reader, but no cards are being inserted when this event occurs. Additionally, this has *only* started happening since I performed a fresh Intrepid install on a new hard drive. With a different hard drive that has Intrepid installed -- but installed via an upgrade from Hardy -- I do not see this problem.

The last time this happened to me, USB was working when I went to bed and was hung when I woke up.

I do see that lsusb hangs; I do see that USB stops working; I do not see the kernel stack traces.

--- SER

Revision history for this message
ser (seanerussell) wrote :

After cloning my old hard drive (Intrepid upgraded from Hardy) to the new hard drive (Intrepid from live CD) the problems went away, and USB is now stable.

--- SER

Revision history for this message
Simon Holm (odie-cs) wrote :

Jim,

it almost sounded like you had a fix to this problem, any updates? Perhaps your findings should simply be sent to linux-usb-devel or filed on bugzilla.kernel.org?

Revision history for this message
Bryn Hughes (linux-nashira) wrote :

I can duplicate this behavior on my machine quite easily by connecting a cell phone via USB. I appear to have a very similar traceback:

[ 100.949036] usb 3-2: new full speed USB device using ohci_hcd and address 3
[ 101.089059] usb 3-2: device descriptor read/64, error -62
[ 101.333068] usb 3-2: device descriptor read/64, error -62
[ 101.573059] usb 3-2: new full speed USB device using ohci_hcd and address 4
[ 101.713068] usb 3-2: device descriptor read/64, error -62
[ 101.993238] usb 3-2: configuration #1 chosen from 1 choice
[ 102.136454] BUG: unable to handle kernel NULL pointer dereference at 00000003
[ 102.136465] IP: [<f82e05f5>] wdm_probe+0x165/0x410 [cdc_wdm]
[ 102.136477] *pde = 00000000
[ 102.136485] Oops: 0000 [#1] SMP
[ 102.136491] last sysfs file: /sys/devices/pci0000:00/0000:00:12.0/host0/target0:0:0/0:0:0:0/block/sda/sda5/stat
[ 102.136498] Dumping ftrace buffer:
[ 102.136503] (ftrace buffer empty)
[ 102.136507] Modules linked in: cdc_wdm(+) binfmt_misc bridge stp bnep input_polldev video output lp snd_cmipci gameport snd_opl3_lib snd_hda_intel snd_hwdep snd_mpu401_uart snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device psmouse snd k8temp serio_raw pcspkr soundcore snd_page_alloc i2c_piix4 ppdev parport_pc parport fglrx(P) ati_agp agpgart usbhid tg3 floppy fbcon tileblit font bitblit softcursor
[ 102.136567]
[ 102.136574] Pid: 3636, comm: modprobe Tainted: P (2.6.28-11-generic #42-Ubuntu) HP Compaq dc5750 Microtower
[ 102.136581] EIP: 0060:[<f82e05f5>] EFLAGS: 00010286 CPU: 1
[ 102.136588] EIP is at wdm_probe+0x165/0x410 [cdc_wdm]
[ 102.136593] EAX: 00000000 EBX: 00000800 ECX: f82e4600 EDX: efde6a00
[ 102.136598] ESI: f098be25 EDI: efb10600 EBP: efb3bcf4 ESP: efb3bca8
[ 102.136602] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[ 102.136607] Process modprobe (pid: 3636, ti=efb3a000 task=f0ae57f0 task.ti=efb3a000)
[ 102.136611] Stack:
[ 102.136614] efb3bcb4 c01cf830 f07db428 efb3bcd8 c020c036 efde6a00 efde6a1c c020b87b
[ 102.136626] 00000000 c0501a4b efd48000 fffffff4 0800b2f4 efd4805c efde6ab8 f82e2394
[ 102.136638] efde6a00 00000000 efde6a1c efb3bd20 c03bb942 efde6a94 efb3bd0c c020c692
[ 102.136650] Call Trace:
[ 102.136654] [<c01cf830>] ? iput+0x20/0x60
[ 102.136663] [<c020c036>] ? sysfs_addrm_finish+0x36/0xf0
[ 102.136673] [<c020b87b>] ? sysfs_addrm_start+0x5b/0xa0
[ 102.136680] [<c0501a4b>] ? mutex_lock+0xb/0x20
[ 102.136689] [<c03bb942>] ? usb_probe_interface+0xa2/0x130
[ 102.136699] [<c020c692>] ? sysfs_create_link+0x12/0x20
[ 102.136708] [<c034f096>] ? really_probe+0xe6/0x180
[ 102.136716] [<c03bada1>] ? usb_match_id+0x41/0x60
[ 102.136724] [<c034f16e>] ? driver_probe_device+0x3e/0x50
[ 102.136732] [<c034f209>] ? __driver_attach+0x89/0x90
[ 102.136739] [<c034e943>] ? bus_for_each_dev+0x53/0x80
[ 102.136746] [<c034eec9>] ? driver_attach+0x19/0x20
[ 102.136752] [<c034f180>] ? __driver_attach+0x0/0x90
[ 102.136759] [<c034e317>] ? bus_add_driver+0x1c7/0x240
[ 102.136768] [<c034f3a9>] ? driver_register+0x69/0x140
[ 102.136776] [<c03bbc0c>] ? usb_register_driver+0x...


Revision history for this message
Bryn Hughes (linux-nashira) wrote :

Forgot to mention, I also have an Athlon CPU.

Revision history for this message
Simon Holm (odie-cs) wrote :

d3mia7: Though I only had a quick glance at your oops, I fail to see how your problem is similar to the one reported by Beata. You should file a new bug if it is a different problem.

Revision history for this message
Andy Whitcroft (apw) wrote :

It would be helpful to test the latest mainline kernel to see if that is affected also. I would recommend trying the latest v2.6.28.x kernel (currently 2.6.28.10). It is also worth trying the latest mainline kernel, currently 2.6.30. It is possible to install these kernels in parallel with your regular kernels. See the URL below for information on where to get the kernels and how to install them. Please report back here!

    https://wiki.ubuntu.com/KernelMainlineBuilds
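
When reporting back, it helps to state exactly which kernel was booted. A minimal sketch, assuming release strings shaped like the ones quoted in this report (for example 2.6.27-9-generic or 2.6.28.10), compares a release against one of the versions named above. The helper names are hypothetical, and the Ubuntu ABI suffix after the dash is deliberately ignored.

```python
import re


def kernel_version(release):
    """Turn '2.6.27-9-generic' or '2.6.28.10' into a comparable tuple."""
    m = re.match(r"(\d+(?:\.\d+)*)", release)
    if not m:
        raise ValueError("unrecognised kernel release: %r" % release)
    # Only the leading dotted number is kept; '-9-generic' is dropped.
    return tuple(int(part) for part in m.group(1).split("."))


def is_older_than(release, target):
    """True if 'release' sorts before 'target' by upstream version."""
    return kernel_version(release) < kernel_version(target)
```

For the running system, the release string can be taken from platform.release() in the standard library, e.g. is_older_than(platform.release(), "2.6.30").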

Changed in linux (Ubuntu):
status: In Progress → Incomplete
Revision history for this message
Wasim (wasimahmed) wrote :

I've had at least a similar problem (if not exactly the same) with at least intrepid and jaunty. I have an Asus M2NPV-VM motherboard, and after connecting some devices (eg my phone or pda) lsusb would completely hang.

Since upgrading to karmic (alpha 2) I've found that it now appears to be fixed :) So I guess anyone with this problem would probably find that 2.6.30 works for them.

Thanks!

Revision history for this message
Boon Hian Tek (btek) wrote :

Hi, I am having a very similar problem with my Ubuntu 9.04 - the Jaunty Jackalope (Linux btek-phenom 2.6.28-13-generic #45-Ubuntu SMP Tue Jun 30 22:12:12 UTC 2009 x86_64 GNU/Linux).

I have a Logitech Cordless Internet Pro (mouse + keyboard), and a Logitech Bluetooth USB Click mouse.

I usually do not have the Bluetooth mouse plugged-in.

After a while (ranging from a few minutes after reboot to almost a day after), one of the USB devices (keyboard / mouse, though most of the time the mouse) will stop responding. However, when the mouse is dead, I can still plug in the other USB wireless mouse and have it working. It will then work for only a few seconds (most of the time) before it dies as well.

Any attempt to run lsusb or rmmod will just hang the process.

Oh, and when all keyboards and mice are dead, I can still plug in a PS2 keyboard and it will work (indefinitely).

This started about a week ago and is driving me nuts; I will need to go look for a USB->PS2 converter or a PS2 mouse soon :(

Revision history for this message
Kasper Peeters (kasper-peeters) wrote :

Confirmed with jaunty: at random moments, the usb mouse stops working, and plugging it back in, or plugging it back into a different port, or connecting a different mouse, makes no difference. The usb keyboard still works at that stage. It's not my machine, but will try to get a dmesg output.

Revision history for this message
Kasper Peeters (kasper-peeters) wrote :

In my case, a BIOS upgrade solved the problem.

For anyone else desperately googling for this problem:
it's a Dell Dimension C521 with an nVidia card, and I upgraded the BIOS from 1.0.3 to 1.1.1.
Changing kernels, disabling nvidia-driver, removing xorg-driver-synaptic and various other attempts at solving it did not make any difference, but the BIOS upgrade seems to have done the trick.

Revision history for this message
Matt Scholz (scholzilla) wrote :

Confirmed in Karmic: Apparently, I have the same or a similar problem on a Gateway MT6452 Notebook (AMD Turion 64 TL-52, but running 32 bit karmic). The problem either started with ibex or jaunty and seems to have gotten worse with karmic. I have 4 usb ports. Sometimes one will just die and the problem can be remedied by switching to another; other times, all four will die or all four will be dead upon reboot. A new problem has emerged with karmic, which is that wireless via usb will be unstable. I'm happy to post any outputs, but will need instruction as I'm a fairly naive user. Upgrading my BIOS isn't really an option for me, or at least one that I'm willing to try. As with other posters, lsusb will hang on me.

Revision history for this message
Elaine Logan (culross7) wrote :

I have a Sony Vaio, and the USB ports stopped working in Karmic. On boot up, the system tells me that it's failing to enumerate USB ports 1 and 2.

Revision history for this message
john (no2498) wrote :

I need to ask: could this bug make my webcam's frame rate drop to under 1 fps (about 0.7)?
It just started doing that.
I'm on Hardy 8.04, Linux kernel 2.6.24-26 server, GNOME 2.22.3.

Revision history for this message
jegpad (jane-padley) wrote :

I have just installed Ubuntu Linux on an IBM Thinkpad T43.
The USB corded optical mouse stops working after a few seconds to a few minutes every time.
It has been suggested I get a new mouse, but this mouse works fine if I reboot into Windows.
I was going to get a Logitech mouse advertised on their website as Linux compatible, but I am yet to be convinced it is not the software that is causing the problem.

Revision history for this message
Tuomas Lähteenmäki (lahtis) wrote :

I have a similar problem. Sometimes the USB ports stop working after a while.
I use the 2.6.32-33-generic kernel and the processor is an AMD Athlon XP 1800+.

kernel.log says:
 [ 2.723602] usbcore: registered new interface driver hiddev
 [ 2.740064] input: Compaq Compaq Internet Keyboard as /devices/pci0000:00/0000:00:11.2/usb7/7-1/7-1:1.0/input/input4
 [ 2.740548] generic-usb 0003:049F:000E.0001: input,hidraw0: USB HID v1.00 Keyboard [Compaq Compaq Internet Keyboard] on usb-0000:00:11.2-1/input0
 [ 2.740808] irq 17: nobody cared (try booting with the "irqpoll" option)
 [ 2.740816] Pid: 61, comm: udevd Not tainted 2.6.32-33-generic #72-Ubuntu
 [ 2.740820] Call Trace:
 [ 2.740839] [<c058960e>] ? printk+0x1d/0x1f
 [ 2.740851] [<c01a246c>] __report_bad_irq+0x2c/0x90
 [ 2.740856] [<c01a0b84>] ? handle_IRQ_event+0x54/0x150
 [ 2.740861] [<c01a2620>] note_interrupt+0x150/0x190
 [ 2.740867] [<c01a2c2c>] handle_fasteoi_irq+0xac/0xd0
 [ 2.740877] [<c01059ed>] handle_irq+0x1d/0x30
 [ 2.740887] [<c05901dc>] do_IRQ+0x4c/0xc0
 [ 2.740896] [<c021a389>] ? sys_poll+0x59/0xc0
 [ 2.740901] [<c0103a30>] common_interrupt+0x30/0x40
 [ 2.740904] handlers:
 [ 2.740907] [<c0447db0>] (usb_hcd_irq+0x0/0x80)
 [ 2.740922] [<c0447db0>] (usb_hcd_irq+0x0/0x80)
 [ 2.740928] Disabling IRQ #17
 [ 2.801023] usb 7-2: configuration #1 chosen from 1 choice
 [ 2.849279] input: Compaq Compaq Internet Keyboard as /devices/pci0000:00/0000:00:11.2/usb7/7-1/7-1:1.1/input/input5
 [ 2.850037] generic-usb 0003:049F:000E.0002: input,hiddev96,hidraw1: USB HID v1.00 Device [Compaq Compaq Internet Keyboard] on usb-0000:00:11.2-1/input1
 [ 2.863643] input: B16_b_02 USB-PS/2 Optical Mouse as /devices/pci0000:00/0000:00:11.2/usb7/7-2/7-2:1.0/input/input6
 [ 2.864104] generic-usb 0003:046D:C024.0003: input,hidraw2: USB HID v1.10 Mouse [B16_b_02 USB-PS/2 Optical Mouse] on usb-0000:00:11.2-2/input0
 [ 2.864182] usbcore: registered new interface driver usbhid
 [ 2.864433] usbhid: v2.6:USB HID core driver
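
Several distinct failure signatures have been pasted in this report, and they do not necessarily share a cause. As a rough triage aid, the following Python sketch scans kernel log text for three message shapes copied from the logs above. The SIGNATURES table and scan_kernel_log are hypothetical names, and the patterns are only assumed to match messages formatted like the ones quoted here.

```python
import re

# Patterns modelled on messages quoted in this report; labels are illustrative.
SIGNATURES = [
    ("null_deref",
     re.compile(r"BUG: unable to handle kernel NULL pointer dereference at (\S+)")),
    ("irq_disabled",
     re.compile(r"irq (\d+): nobody cared")),
    ("descriptor_error",
     re.compile(r"usb (\S+): device descriptor read/64, error (-\d+)")),
]


def scan_kernel_log(text):
    """Return a list of (label, captured_groups) for each matching line."""
    hits = []
    for line in text.splitlines():
        for label, pattern in SIGNATURES:
            m = pattern.search(line)
            if m:
                hits.append((label, m.groups()))
    return hits
```

Feeding it the output of dmesg (or the contents of kern.log) gives a quick way to tell whether a machine shows the NULL dereference, the disabled IRQ, or the descriptor read errors before adding a "me too" comment.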

Brad Figg (brad-figg)
Changed in linux (Ubuntu):
assignee: Jim Lieb (lieb) → nobody
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired