rcu_sched_state detected stalls on CPUs/tasks: { 20} (detected by 38, t=xxxxxx jiffies)

Bug #939676 reported by C de-Avillez
This bug affects 3 people
Affects | Status | Importance | Assigned to | Milestone
linux (Ubuntu) | Confirmed | High | Unassigned |

Bug Description

The message in the title was repeated, with the CPU numbers and the jiffies count changing each time. The end result was that the system either did not boot (more probable, since nothing was visible in the syslog after a later successful reboot) or failed completely soon after boot.
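Since the message repeats with different CPU numbers and jiffies counts, it can help to pull those fields out of each captured line and compare them. The following is a minimal sketch, assuming the line has the same shape as the one quoted in the title; the function name `parse_stall` and the regular expression are illustrative, and other kernel versions may word the message differently.

```python
import re

# Pattern modelled on the message quoted in this report's title.
# Assumption: other kernel versions may format the line differently.
STALL_RE = re.compile(
    r"detected stalls on CPUs/tasks: \{(?P<cpus>[^}]*)\} "
    r"\(detected by (?P<detector>\d+), t=(?P<jiffies>\d+) jiffies\)"
)


def parse_stall(line):
    """Return the stalled CPUs, detecting CPU and jiffies count, or None."""
    match = STALL_RE.search(line)
    if match is None:
        return None
    # Keep only purely numeric entries; anything else inside the braces
    # is ignored by this sketch.
    cpus = [int(tok) for tok in match.group("cpus").split() if tok.isdigit()]
    return {
        "stalled_cpus": cpus,
        "detected_by": int(match.group("detector")),
        "jiffies": int(match.group("jiffies")),
    }
```

The jiffies value was not recorded in this report (it appears as `xxxxxx`), so any concrete number fed to the function is a stand-in.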

Sequence (times in UTC):

* on 2012-02-22 at 17:55 the system was dist-upgraded (see below for the packages upgraded), but not rebooted.

* at around 19:09 the same day, the system became unresponsive. Due to a series of other issues, access to the console (via KVM; the system is at a remote, lights-out site) was not possible;

* the system was rebooted via the power control interface at least twice before 02:00 on 2012-02-23, but still did not become reachable over SSH;

* at around 02:00 remote KVM access was successful, and the message in the bug title was collected;

* another reboot was attempted, now successful.

A brief search on Google shows many hits for this message. It is unknown if the dist-upgrade has any causal relation to the error.

UPGRADED PACKAGES:

Start-Date: 2012-02-22 17:55:17
Commandline: apt-get dist-upgrade
Install: python-pycurl:amd64 (7.19.0-4ubuntu2, automatic), linux-headers-3.0.0-16:amd64 (3.0.0-16.28, automatic), linux-headers-3.0.0-16-server:amd64 (3.0.0-16.28, automatic), linux-image-3.0.0-16-server:amd64 (3.0.0-16.28)
Upgrade: bsdtar:amd64 (2.8.4-1, 2.8.4-1ubuntu0.11.10.1), language-pack-gnome-en-base:amd64 (11.10+20111025, 11.10+20120103), libarchive1:amd64 (2.8.4-1, 2.8.4-1ubuntu0.11.10.1), sysvinit-utils:amd64 (2.88dsf-13.10ubuntu4, 2.88dsf-13.10ubuntu4.1), libicu44:amd64 (4.4.2-2, 4.4.2-2ubuntu0.11.10.1), linux-server:amd64 (3.0.0.14.16, 3.0.0.16.19), libxml2-utils:amd64 (2.7.8.dfsg-4, 2.7.8.dfsg-4ubuntu0.1), language-pack-gnome-en:amd64 (11.10+20111121, 11.10+20120103), accountsservice:amd64 (0.6.14-1git1ubuntu1, 0.6.14-1git1ubuntu1.1), python-software-properties:amd64 (0.81.13.1, 0.81.13.3), python-launchpadlib:amd64 (1.9.8-2, 1.9.8-2ubuntu0.1), icedtea-6-jre-cacao:amd64 (6b23~pre11-0ubuntu1.11.10, 6b23~pre11-0ubuntu1.11.10.1), language-selector-common:amd64 (0.56, 0.56.1), jenkins-cli:amd64 (1.409.1-0ubuntu4.1, 1.409.1-0ubuntu4.2), openjdk-6-jre-lib:amd64 (6b23~pre11-0ubuntu1.11.10, 6b23~pre11-0ubuntu1.11.10.1), libaccountsservice0:amd64 (0.6.14-1git1ubuntu1, 0.6.14-1git1ubuntu1.1), linux-headers-server:amd64 (3.0.0.14.16, 3.0.0.16.19), update-manager-core:amd64 (0.152.25.5, 0.152.25.8), openjdk-6-jre-headless:amd64 (6b23~pre11-0ubuntu1.11.10, 6b23~pre11-0ubuntu1.11.10.1), udev:amd64 (173-0ubuntu4, 173-0ubuntu4.1), libvirt0:amd64 (0.9.2-4ubuntu15.1, 0.9.2-4ubuntu15.2), qemu-kvm:amd64 (0.14.1+noroms-0ubuntu6, 0.14.1+noroms-0ubuntu6.2), libcurl3:amd64 (7.21.6-3ubuntu3, 7.21.6-3ubuntu3.2), ifupdown:amd64 (0.7~alpha5.1ubuntu5, 0.7~alpha5.1ubuntu5.1), icedtea-6-jre-jamvm:amd64 (6b23~pre11-0ubuntu1.11.10, 6b23~pre11-0ubuntu1.11.10.1), devscripts:amd64 (2.11.1ubuntu3, 2.11.1ubuntu3.1), libxml2:amd64 (2.7.8.dfsg-4, 2.7.8.dfsg-4ubuntu0.1), curl:amd64 (7.21.6-3ubuntu3, 7.21.6-3ubuntu3.2), isc-dhcp-client:amd64 (4.1.1-P1-17ubuntu10, 4.1.1-P1-17ubuntu10.1), libvorbisenc2:amd64 (1.3.2-1ubuntu2, 1.3.2-1ubuntu2.1), libpng12-0:amd64 (1.2.46-3ubuntu1, 1.2.46-3ubuntu1.1), libudev0:amd64 (173-0ubuntu4, 173-0ubuntu4.1), x11-common:amd64 (7.6+7ubuntu7, 7.6+7ubuntu7.1), libpq5:amd64 (9.1.1-1, 
9.1.2-0ubuntu0.11.10.2), jenkins-slave:amd64 (1.409.1-0ubuntu4.1, 1.409.1-0ubuntu4.2), linux-image-server:amd64 (3.0.0.14.16, 3.0.0.16.19), openssl:amd64 (1.0.0e-2ubuntu4, 1.0.0e-2ubuntu4.2), libcurl3-gnutls:amd64 (7.21.6-3ubuntu3, 7.21.6-3ubuntu3.2), python-libvirt:amd64 (0.9.2-4ubuntu15.1, 0.9.2-4ubuntu15.2), linux-libc-dev:amd64 (3.0.0-14.23, 3.0.0-16.28), sysv-rc:amd64 (2.88dsf-13.10ubuntu4, 2.88dsf-13.10ubuntu4.1), bridge-utils:amd64 (1.5-2ubuntu1, 1.5-2ubuntu1.1), isc-dhcp-common:amd64 (4.1.1-P1-17ubuntu10, 4.1.1-P1-17ubuntu10.1), libvorbis0a:amd64 (1.3.2-1ubuntu2, 1.3.2-1ubuntu2.1), libvirt-bin:amd64 (0.9.2-4ubuntu15.1, 0.9.2-4ubuntu15.2), ubuntu-iso-testing:amd64 (1.3+255~oneiric1, 1.3+256~oneiric1), language-pack-en-base:amd64 (11.10+20111025, 11.10+20120103), binutils:amd64 (2.21.53.20110810-0ubuntu5, 2.21.53.20110810-0ubuntu5.1), openjdk-6-jre:amd64 (6b23~pre11-0ubuntu1.11.10, 6b23~pre11-0ubuntu1.11.10.1), initscripts:amd64 (2.88dsf-13.10ubuntu4, 2.88dsf-13.10ubuntu4.1), language-pack-en:amd64 (11.10+20111121, 11.10+20120103), python-libxml2:amd64 (2.7.8.dfsg-4, 2.7.8.dfsg-4ubuntu0.1), libssl1.0.0:amd64 (1.0.0e-2ubuntu4, 1.0.0e-2ubuntu4.2), qemu-common:amd64 (0.14.1+noroms-0ubuntu6, 0.14.1+noroms-0ubuntu6.2)
End-Date: 2012-02-22 17:56:32
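
To narrow down whether the dist-upgrade is related, it may help to list only the kernel-related entries from the log above. This is a minimal sketch, assuming the `name:arch (field, field)` layout shown in the Install and Upgrade lines; the helper names `parse_apt_entries` and `kernel_entries` are illustrative and not part of apt.

```python
import re

# One entry looks like: "udev:amd64 (173-0ubuntu4, 173-0ubuntu4.1)".
# Assumption: every entry follows this layout, as in the log above.
ENTRY_RE = re.compile(r"(?P<name>[^\s,:]+):(?P<arch>\w+) \((?P<fields>[^)]*)\)")


def parse_apt_entries(line):
    """Split an Install:/Upgrade: line into (name, arch, fields) tuples."""
    return [
        (m.group("name"), m.group("arch"), m.group("fields").split(", "))
        for m in ENTRY_RE.finditer(line)
    ]


def kernel_entries(line):
    """Keep only entries whose package name starts with 'linux-'."""
    return [entry for entry in parse_apt_entries(line) if entry[0].startswith("linux-")]
```

For Upgrade lines the two fields are the old and new versions; for Install lines the second field, when present, is the `automatic` marker.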

ProblemType: Bug
DistroRelease: Ubuntu 11.10
Package: linux-image-3.0.0-16-server 3.0.0-16.28
ProcVersionSignature: Ubuntu 3.0.0-16.28-server 3.0.17
Uname: Linux 3.0.0-16-server x86_64
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 2012-02-23 02:40 seq
 crw-rw---- 1 root audio 116, 33 2012-02-23 02:40 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 1.23-0ubuntu4
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory
Date: Thu Feb 23 16:59:46 2012
HibernationDevice: RESUME=UUID=a1472143-3b38-4e18-8f18-905eb5218958
IwConfig: Error: [Errno 2] No such file or directory
MachineType: QCI QSSC-S4R
PciMultimedia:

ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.0.0-16-server root=UUID=63807f6f-2707-4337-9825-4982025129ce ro quiet
RelatedPackageVersions:
 linux-restricted-modules-3.0.0-16-server N/A
 linux-backports-modules-3.0.0-16-server N/A
 linux-firmware 1.60
RfKill: Error: [Errno 2] No such file or directory
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 11/15/2010
dmi.bios.vendor: Intel Corp.
dmi.bios.version: QSSC-S4R.QCI.01.00.S008.111520101457
dmi.board.asset.tag: ....................
dmi.board.name: QSSC-S4R
dmi.board.vendor: QCI
dmi.board.version: 31S4RMB00A0
dmi.chassis.asset.tag: ....................
dmi.chassis.type: 17
dmi.chassis.vendor: ..............................
dmi.chassis.version: 32S4RCS0010
dmi.modalias: dmi:bvnIntelCorp.:bvrQSSC-S4R.QCI.01.00.S008.111520101457:bd11/15/2010:svnQCI:pnQSSC-S4R:pvr....................:rvnQCI:rnQSSC-S4R:rvr31S4RMB00A0:cvn..............................:ct17:cvr32S4RCS0010:
dmi.product.name: QSSC-S4R
dmi.product.version: ....................
dmi.sys.vendor: QCI

C de-Avillez (hggdh2) wrote :
description: updated
Brad Figg (brad-figg)
Changed in linux (Ubuntu):
status: New → Confirmed
C de-Avillez (hggdh2) wrote :

I understand there would probably be a kernel OOPS somewhere during this, but we could not access KVM in time to see it, and it was not logged.

Joseph Salisbury (jsalisbury) wrote :

Hi Carlos,

Is this a new issue? Do you know if it was also happening in 3.0.0-15?

tags: added: kernel-da-key
Changed in linux (Ubuntu):
importance: Undecided → High
tags: added: kernel-key
Andy Whitcroft (apw) wrote :

Yes, this message tends to indicate that something got stuck earlier. The system is waiting for it to release resources, and that has not happened. With luck there would have been an oops at the start of this, so if we suspect it will occur again we want to get the console logged, if possible, to catch that in full. Also, when it does occur again, it would be worth checking for stuck or spinning processes via sysrq:

It's worth trying to get the output of the following sysrqs: l (all active CPUs), w (all blocked processes), d (all held locks).
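
If a root shell is still reachable when the stall happens, the three sysrqs above can also be sent without a keyboard by writing each key to `/proc/sysrq-trigger`. The following is a minimal sketch under that assumption; the helper name `send_sysrq` and the allow-list are illustrative, and the output lands in the kernel log (console or `dmesg`), not in the return value.

```python
SYSRQ_TRIGGER = "/proc/sysrq-trigger"

# Only the keys suggested in the comment above are allowed, so that a typo
# cannot send a destructive key such as 'b' (immediate reboot).
ALLOWED_KEYS = {
    "l": "backtrace of all active CPUs",
    "w": "tasks in blocked (uninterruptible) state",
    "d": "all held locks",
}


def send_sysrq(keys, trigger_path=SYSRQ_TRIGGER):
    """Write each sysrq key to the trigger file, one write per key."""
    for key in keys:
        if key not in ALLOWED_KEYS:
            raise ValueError("refusing to send sysrq key %r" % key)
    sent = []
    for key in keys:
        # The kernel acts on each write separately, so reopen per key.
        with open(trigger_path, "w") as trigger:
            trigger.write(key)
        sent.append(key)
    return sent
```

Run as root, `send_sysrq("lwd")` requests all three dumps. The `d` dump is only produced by kernels built with lock debugging, so it may print nothing useful on a stock kernel.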

C de-Avillez (hggdh2) wrote :

@Joseph: to my knowledge and memory, this is the first time we have seen it. The syslogs only go back to Feb 18, and I see no messages containing 'rcu' in them.
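
The check described above can be repeated across the current and rotated syslogs with a short script. This is a minimal sketch, assuming the logs live under `/var/log/syslog*` and that rotated files may be gzip-compressed; the function name `find_rcu_lines` and its default arguments are illustrative.

```python
import glob
import gzip


def find_rcu_lines(pattern="/var/log/syslog*", needle="rcu"):
    """Return (path, line_number, text) for every line containing needle."""
    hits = []
    for path in sorted(glob.glob(pattern)):
        # Rotated logs are often compressed; pick the matching opener.
        opener = gzip.open if path.endswith(".gz") else open
        with opener(path, "rt", errors="replace") as log:
            for number, line in enumerate(log, 1):
                if needle in line.lower():
                    hits.append((path, number, line.rstrip("\n")))
    return hits
```

An empty result only shows that nothing was logged in the files still on disk; as noted earlier in this report, the messages seen on the console never reached the syslog.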

tags: removed: kernel-key
penalvch (penalvch)
tags: added: bios-outdated-r0035 needs-upstream-testing regression-potential
penalvch (penalvch)
tags: added: bios-outdated-r0036
removed: bios-outdated-r0035