[lucid] system locks up (system_call_fastpath+0x16/0x1b)

Bug #671001 reported by Peter Matulis
This bug affects 5 people
Affects | Status | Importance | Assigned to | Milestone
linux (Ubuntu) | Invalid | High | Unassigned |
Lucid | Won't Fix | High | Unassigned |

Bug Description

Ubuntu 10.04.1 LTS

Linux hostname10 2.6.32-24-generic #42-Ubuntu SMP Fri Aug 20 14:21:58 UTC 2010 x86_64 GNU/Linux

This system is experiencing regular lockups and is only accessible via IPMI (screenshots attached). It is a physical host under heavy load (couchdb) with some swapping occurring. The I/O is taking place on the XFS filesystem.

The output of 'lspci -vvnnk' is also attached (lspci.txt), and a full report (apport-collect) should follow.

Peter Matulis (petermatulis) wrote :
Peter Matulis (petermatulis) wrote :
Peter Matulis (petermatulis) wrote :
Peter Matulis (petermatulis) wrote :
Peter Matulis (petermatulis) wrote :
Joseph Salisbury (jsalisbury) wrote :
Joseph Salisbury (jsalisbury) wrote :
Joseph Salisbury (jsalisbury) wrote :
Joseph Salisbury (jsalisbury) wrote :

Latest screenshot from the customer. In every soft lockup captured so far, system_call_fastpath is the common entry in the trace.

Joseph Salisbury (jsalisbury) wrote :

Requested the following from Customer:
apport-collect -p linux 671001

Joseph Salisbury (jsalisbury) wrote :

We are unable to attach the output of apport due to bug: https://bugs.launchpad.net/ubuntu/+source/apport/+bug/626215

Is there an alternative way to collect this data?

Joseph Salisbury (jsalisbury) wrote :
Joseph Salisbury (jsalisbury) wrote :
Joseph Salisbury (jsalisbury) wrote :

The bug in comment 12 is not related. It was resolved by disabling XFS debugging, which is not enabled on these machines.

Stefan Bader (smb)
tags: added: kernel-server
Stefan Bader (smb) wrote :

With lockups like these it helps to have the first trace, and to have the traces in order (I personally find it hard to get the full picture from multiple screenshots). system_call_fastpath just means that the trace started from a system call. In one of the recent screenshots that was a futex call, but I think there have been other calls too.

Maybe it is possible to run tail -f on /var/log/syslog over ssh and capture at least some of the early messages before the system locks up completely?
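
A minimal sketch of such a capture, assuming SSH access to the affected host still works until the lockup and that the login user may read /var/log/syslog. The host name, local file name, and the helper name capture_syslog are placeholders for illustration, not anything taken from this report:

```shell
#!/bin/sh
# Hypothetical helper: stream the remote syslog and keep a local copy,
# so the first lockup messages survive even if the host stops responding.
capture_syslog() {
    host="$1"                       # assumed: user@hostname of the affected machine
    out="${2:-lockup-syslog.log}"   # assumed: local file to append to
    # tail runs on the remote side; tee appends each line locally as it arrives.
    ssh "$host" 'tail -f /var/log/syslog' | tee -a "$out"
}

# Usage (not run here):
#   capture_syslog admin@hostname10 lockup-syslog.log
```

Because the copy is written on the machine running ssh, whatever arrived before the connection dies stays available, in order, as plain text.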

Joseph Salisbury (jsalisbury) wrote :

Thanks for the feedback, Stefan. We will try running tail -f on /var/log/syslog over ssh as you suggest.

Another thought was to set up crashdump and set kernel.softlockup_panic = 1. This causes the system to panic when a soft lockup happens, which may let us collect further debug information. However, we will try the tail -f first.
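
For reference, a sketch of making that setting persistent. kernel.softlockup_panic is the sysctl named above; the helper name and the choice of appending to a sysctl configuration file are assumptions for illustration. A dump is only written if a crash-dump mechanism is already working on the host:

```shell
# Hypothetical helper: add the setting to a sysctl config file once.
# The default path is an assumption; pass another file to try it safely.
enable_softlockup_panic() {
    conf="${1:-/etc/sysctl.conf}"
    # Skip if the key is already present, so repeated runs do not duplicate it.
    if ! grep -q '^kernel\.softlockup_panic' "$conf" 2>/dev/null; then
        echo 'kernel.softlockup_panic = 1' >> "$conf"
    fi
}

# Usage as root (not run here):
#   enable_softlockup_panic && sysctl -p
```

To apply it only until the next reboot, `sysctl -w kernel.softlockup_panic=1` as root should be enough.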

tags: added: kj-triage
Joseph Salisbury (jsalisbury) wrote :

An attempt was made to collect crashdump data. However, we ran into the following bug:
https://bugs.launchpad.net/ubuntu/+source/linux-meta/+bug/710733

Stefan Bader (smb)
Changed in linux (Ubuntu):
importance: Undecided → High
status: New → Incomplete
Changed in linux (Ubuntu Lucid):
assignee: nobody → Stefan Bader (stefan-bader-canonical)
status: New → Confirmed
importance: Undecided → High
Nhomar - Vauxoo (nhomar) wrote :

I have the same problem after just installing Ubuntu 10.04 64-bit.
Processor: i3
Motherboard: ASRock H55M-LE

Nhomar - Vauxoo (nhomar) wrote :

Screenshot attached.

Stefan Bader (smb) wrote :

Nhomar, unfortunately the screenshot is only partially useful because it is missing the top of the trace. It would also help if you could provide more details in general. Did you install from the release CD or from a later point-release CD? Have you updated the system (in other words, which kernel version are you using)? Does this happen randomly, or can you trigger it? If so, which steps cause it?

Joseph, is this problem still happening on your side? If yes (and after getting a bit more hands-on IPMI experience), maybe it would be possible to use SOL instead of the iKVM. For example, when SOL is set to be the second serial port, "console=ttyS1,115200n8" (if the port runs at that speed) redirects all console output there. That might allow a better capture.
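
A sketch of what that could look like, assuming GRUB 2 with /etc/default/grub and that the BMC really exposes SOL as ttyS1 at 115200 baud (both need checking on the actual hardware). The BMC address and user in the comment are placeholders:

```shell
# /etc/default/grub (fragment): send console output to both the local
# display and the second serial port. The last console= entry becomes
# /dev/console, so ttyS1 is listed last.
GRUB_CMDLINE_LINUX="console=tty0 console=ttyS1,115200n8"

# Afterwards, as root: run update-grub, then reboot.
# From another machine, attach to the serial-over-LAN session, e.g.:
#   ipmitool -I lanplus -H <bmc-address> -U <user> sol activate
```

With the SOL session logged to a file on the remote side, the top of the trace is no longer lost to a scrolling screen.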

Joseph Salisbury (jsalisbury) wrote :

@Stefan

We no longer have access to the environment where this issue was happening. If I come across another environment where this happens, I will update the bug.

Michael Hansen (mica) wrote :

I seem to be having the same problem on Ubuntu 11.04 64-bit. My system often freezes when booting. I have attached a picture of the screen in recovery boot mode. My system was installed as an upgrade from 10.04. The computer did not have the problem before the upgrade.

Stefan Bader (smb) wrote :

@Epeedk, no, I am pretty certain it is not the same problem. Unfortunately the generic title of this report will lead many people to believe it is. The last line of a stack trace is where the chain of function calls began, and in most cases (maybe even always) that will be system_call_fastpath, because it marks user space calling into the kernel. The interesting part is the top of the trace, and unfortunately that is so often off the screen.
In your case this looks like a problem with the interrupt handler of the ene_ir module. Since it did not happen before, this is either a new driver or a new bug in it. But please open a new report for that. Before you do, maybe check whether ene_ir is always present in the trace when boot fails. To open a new bug, just run "ubuntu-bug linux" from the running system. Thanks.
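
One way to do that check, sketched under the assumption that the console output of each failed boot has been saved to a text file (for example from a serial console capture). The helper name and the file names in the usage line are hypothetical:

```shell
# Hypothetical helper: for each saved capture, say whether a module name
# appears anywhere in it.
check_module_in_traces() {
    module="$1"     # module name to look for, e.g. ene_ir
    shift           # remaining arguments: capture files to inspect
    for f in "$@"; do
        # -F treats the name as a fixed string, -q suppresses the matches.
        if grep -qF -- "$module" "$f"; then
            echo "$f: present"
        else
            echo "$f: absent"
        fi
    done
}

# Usage (not run here):
#   check_module_in_traces ene_ir boot-fail-1.txt boot-fail-2.txt
```

If every failed boot reports "present", that supports filing the new report against that driver.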

Stefan Bader (smb) wrote :

This report is now really old and has had no update for a long time (I just stumbled over it while doing a bit of housekeeping). Unfortunately it was never resolved. Or, as there have never been more queries about it, it got resolved in some other way. I am marking it Won't Fix so that it is in a closed state.

Changed in linux (Ubuntu Lucid):
assignee: Stefan Bader (stefan-bader-canonical) → nobody
status: Confirmed → Won't Fix
dino99 (9d9) wrote :

This version is no longer supported.

Changed in linux (Ubuntu):
status: Incomplete → Invalid