System hangs due to cascade of smsc95xx errors

Bug #664477 reported by Dave Martin
This bug affects 2 people
Affects      | Status       | Importance | Assigned to | Milestone
Linaro Linux | Fix Released | Medium     | Unassigned  |

Bug Description

Observed in:
linux-image-2.6.35-1006-linaro-omap (2.6.35-1006.12)
linaro alip images
 * http://snapshots.linaro.org/10.11-daily/linaro-alip/20101021/0/images/tar/linaro-m-alip-tar-20101021-0.tar.gz
 * http://jameswestby.net:8080/job/linaro-omap3%20hwpack/58/artifact/hwpack_linaro-omap3_20101021-58_armel_supported.tar.gz

Sometimes, the kernel prints out messages like this repeatedly to the console. This can continue for minutes at a time, during which the rest of the system appears to hang completely:

[ 1500.360473] smsc95xx 1-2.1:1.0: usb0: kevent 4 may have been dropped

The system is totally unusable while this is going on; the mouse pointer won't move etc.

After a while, the problem seems to resolve itself, and the network continues to work OK.

Tags: armel omap3
Changed in linux-linaro:
importance: Undecided → Medium
Revision history for this message
Dr. David Alan Gilbert (davidgil-uk) wrote :

Possibly related; on 2.6.37-1002-linaro-omap #5-Ubuntu on a Panda I'm seeing an occasional stream of:

[122686.343780] smsc95xx 1-1.1:1.0: usb0: kevent 2 may have been dropped

(This may be related to some page allocation failures I'm seeing associated with the ether).

Dave

Revision history for this message
Dr. David Alan Gilbert (davidgil-uk) wrote :

'4' appears to be EVENT_LINK_RESET while 2 is EVENT_RX_MEMORY.
(from linux/usb/usbnet.h) so it appears we have different causes

Dave
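
To make the mapping above easier to apply to other logs, here is a minimal sketch that pulls the event number out of a "kevent N may have been dropped" line and names it. The table is an assumption based on the bit numbers in linux/usb/usbnet.h for kernels of this era; it lists only events 0 to 5, and newer kernels define more. The helper name decode_kevent is hypothetical.

```python
import re

# Assumed event bit numbers, taken from include/linux/usb/usbnet.h.
# Only 0-5 are listed; check the header of the kernel you are running.
USBNET_EVENTS = {
    0: "EVENT_TX_HALT",
    1: "EVENT_RX_HALT",
    2: "EVENT_RX_MEMORY",
    3: "EVENT_STS_SPLIT",
    4: "EVENT_LINK_RESET",
    5: "EVENT_RX_PAUSED",
}

KEVENT_RE = re.compile(r"kevent (\d+) may have been dropped")


def decode_kevent(line):
    """Return (number, name) for a dropped-kevent log line, or None if no match."""
    match = KEVENT_RE.search(line)
    if match is None:
        return None
    number = int(match.group(1))
    # Numbers missing from the table are reported as "unknown" rather than guessed.
    return number, USBNET_EVENTS.get(number, "unknown")


# Sample lines copied from this bug report.
samples = [
    "[ 1500.360473] smsc95xx 1-2.1:1.0: usb0: kevent 4 may have been dropped",
    "[122686.343780] smsc95xx 1-1.1:1.0: usb0: kevent 2 may have been dropped",
]
for sample in samples:
    print(decode_kevent(sample))
```

Feeding it the output of dmesg line by line should show at a glance whether a board is hitting the link-reset case or the RX-memory case.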

Revision history for this message
warmcat (andy-warmcat) wrote :

FWIW I have seen the occasional

[ 334.310791] smsc95xx 1-1.1:1.0: usb0: kevent 4 may have been dropped

On Panda, it doesn't block the thing noticeably and occurs once now and again. Maybe it's something to do with, eg, IRQ latency.

Revision history for this message
Dave Martin (dave-martin-arm) wrote :

Note that I've not been able to reproduce the original catastrophic error originally seen on xM.

Revision history for this message
Mounir Bsaibes (mounir-bsaibes) wrote :

warmcat, are you still seeing this error occasionally?

Revision history for this message
Mounir Bsaibes (mounir-bsaibes) wrote :

Per the minutes of the kernel meeting of 3/7/2011, this bug can be closed.
I changed the status to Fix Released, but no particular fix is known. It is just not reproducible any more.

Changed in linux-linaro:
status: New → Fix Released
Revision history for this message
Josef Raschen (josef-raschen) wrote :

[ 282.992797] smsc95xx 1-1.1:1.0: usb0: kevent 4 may have been dropped

On my Pandaboard Rev A2 this line appears when an ethernet cable is plugged into the board. When the cable is removed, the line is no longer written. Tested with the latest builds of ubuntu desktop and netbook.

Revision history for this message
Dave Martin (dave-martin-arm) wrote :

@Josef

Can you still reproduce this? If you have a reliable test case that specifies the exact images, it would be really useful, since we've had trouble reproducing this problem so far.

Revision history for this message
Josef Raschen (josef-raschen) wrote :

The SD-Card I used for linaro has been formatted. If I have enough time (and find another 4G SD-Card), I'll try to reproduce the error the next weekend.

Revision history for this message
Dave Martin (dave-martin-arm) wrote :

@Josef

Thanks, if you get an opportunity for that it would be great.

Revision history for this message
Josef Raschen (josef-raschen) wrote :

This is the output of the serial console I get from my Pandaboard (rev A2) using linaro daily build (hwpack: 20110409, ubuntu desktop 20110409):
http://pastebin.com/uyLa30P2

Revision history for this message
Dave Martin (dave-martin-arm) wrote :

@Josef

Thanks

[attaching the pastebin content from the above post so we can still see it after the pastebin post expires]

Revision history for this message
Dave Martin (dave-martin-arm) wrote :

@Josef

Can you describe what you did to trigger the error?

Since this issue appears to be tricky to reproduce, it would be useful to have a description of exactly what you were doing that might have caused it to happen.

Revision history for this message
Josef Raschen (josef-raschen) wrote :

I just boot and when I plug the ethernet cable into the board I get these errors. In ubuntu I cannot get a working network connection. When I remove the cable, the error messages stop.

But I just found out something interesting: I changed my network switch (from SMC 2804WBR EU to some noname-device) and now I get no more error messages and the network connection is working.

Revision history for this message
Ben Gamari (bgamari) wrote :

I can confirm that this bug still exists on a BeagleBoard XM running a 3.1-rc9 kernel (no Linaro patches). The problem occurs often when the adapter is under heavy-ish load (>1MB/s). I've attached the output of dmesg after such an incident.

Revision history for this message
Ricardo Salveti (rsalveti) wrote :

The workaround we have in the official Ubuntu images:
# The driver smsc95xx, used by both Beagleboard XM and Pandaboard uses
# turbo mode by default, that enables multiple frames per Rx transaction,
# increasing performance but consuming more kernel memory.
# To avoid page allocation failures and smsc kevent drops we need to
# increase the minimum free system memory in the kernel to a higher value.
# If you're still having page allocation failures, try to increase this
# value to 12288 or even higher. You could also disable the driver's turbo
# mode, but decreasing the ethernet performance.
# If you encounter problems due to the settings please file a bug
# against the jasper-initramfs ubuntu package.
# For more details please check http://bugs.launchpad.net/bugs/746137
vm.min_free_kbytes = 8192
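
For anyone who wants to check whether a running board already has this setting, here is a minimal sketch that reads the current value from /proc/sys/vm/min_free_kbytes and compares it with the 8192 suggested above. The function names and the default target are illustrative assumptions, not part of the Ubuntu images; the script only reads the value and does not change it.

```python
def read_min_free_kbytes(path="/proc/sys/vm/min_free_kbytes"):
    """Read the kernel's current vm.min_free_kbytes value (Linux only)."""
    with open(path) as handle:
        return int(handle.read().strip())


def needs_raise(current_kb, target_kb=8192):
    """True when the current reserve is below the suggested target."""
    return current_kb < target_kb


try:
    current = read_min_free_kbytes()
except (OSError, ValueError):
    # The file is missing on non-Linux systems or unreadable in some sandboxes.
    print("could not read vm.min_free_kbytes on this system")
else:
    if needs_raise(current):
        print("vm.min_free_kbytes is %d, below the suggested 8192" % current)
    else:
        print("vm.min_free_kbytes is %d, at or above the suggested 8192" % current)
```

If the value is too low, it can be raised as root with sysctl -w vm.min_free_kbytes=8192, or made persistent by putting the line above in a sysctl configuration file.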

Revision history for this message
Ben Gamari (bgamari) wrote :

Thanks for the explanation!
