System is wedged: new ssh connections are closed instantly, and the console is black with a mouse cursor but does not accept keyboard input

Bug #672793 reported by Greg Dahlman
This bug affects 2 people
Affects: linux (Ubuntu)
Status: Expired
Importance: Medium
Assigned to: Unassigned

Bug Description

This started happening on Thursday November 4th.

This only seems to happen when the host is idle at the console, although a KVM instance was actively being used on one of the affected hosts.

The system will hang and will be unresponsive to any keyboard input although you can see a mouse cursor.

ssh connections are closed as if they were filtered.
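
For anyone trying to tell "closed instantly" apart from "refused" or "filtered" while a box is wedged, a small probe run from another machine may help. This is a minimal Python sketch, not something from the original report: `probe_ssh` is a hypothetical helper, and it assumes only that a healthy SSH server sends an identification string starting with `SSH-` right after the TCP connection opens.

```python
import socket

def probe_ssh(host, port=22, timeout=5.0):
    """Classify how a TCP connection attempt to an SSH port succeeds or fails."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            try:
                # A healthy sshd sends its banner first, without being asked.
                banner = conn.recv(256)
            except socket.timeout:
                return "connected, but no banner before timeout"
            except ConnectionResetError:
                return "connected, then reset by peer"
            if not banner:
                # recv() returning b"" means the peer closed the connection.
                return "connected, then closed without a banner"
            if banner.startswith(b"SSH-"):
                return "banner received: " + banner.decode("ascii", "replace").strip()
            return "connected, unexpected data instead of a banner"
    except ConnectionRefusedError:
        return "refused (nothing listening or actively rejected)"
    except socket.timeout:
        return "timed out (possibly filtered or host hung)"
    except OSError as exc:
        return f"failed: {exc}"
```

Calling `probe_ssh("gdahlmpc")` from a second machine during a hang would show which of these cases applies. A result of "connected, then closed without a banner" would suggest the kernel is still accepting connections while userspace is stuck, but that interpretation is a guess rather than a diagnosis.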

Ctrl-Alt-F* does not work.

Oddly enough, a Windows KVM host was still responsive.

A coworker is also having this issue with his machine, which is running VMs in VirtualBox.

Both systems are Dell Optiplex 960s

The system did not respond to a soft shutdown with the power button

We were forced to reboot by using the NMI shutdown by holding the power button down.

Unfortunately it appears that syslog just quit logging and there are no errors in any log files.
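
One way to pin down when logging stopped is to look for the longest silence between consecutive syslog timestamps. The following is a minimal Python sketch under stated assumptions: the log uses the classic `Mon D HH:MM:SS` prefix (which carries no year, so the caller supplies one), month names are in English, and the log does not cross a year boundary. The names `parse_stamp` and `largest_gap` are hypothetical.

```python
from datetime import datetime

def parse_stamp(line, year):
    """Parse the leading 'Nov 8 14:20:04' timestamp of a classic syslog line."""
    parts = line.split()
    if len(parts) < 3:
        return None
    try:
        return datetime.strptime(
            f"{year} {parts[0]} {parts[1]} {parts[2]}", "%Y %b %d %H:%M:%S"
        )
    except ValueError:
        # Not a timestamped line (continuation text, garbage, and so on).
        return None

def largest_gap(lines, year):
    """Return (gap, last_line_before_gap) for the longest silence, or None."""
    best = None
    prev_time, prev_line = None, None
    for line in lines:
        stamp = parse_stamp(line, year)
        if stamp is None:
            continue
        if prev_time is not None:
            gap = stamp - prev_time
            if best is None or gap > best[0]:
                best = (gap, prev_line)
        prev_time, prev_line = stamp, line
    return best
```

Feeding it the lines of `/var/log/syslog` with `year=2010` would return the longest gap and the last message written before it, which is a reasonable place to start reading.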

We both ran updates; here is the entry for what was updated that day on my machine.

Start-Date: 2010-11-04 12:25:04
Upgrade: gvfs-fuse:amd64 (1.6.4-0ubuntu1, 1.6.4-0ubuntu1.1), ubufox:amd64 (0.9~rc2-0ubuntu5, 0.9~rc2-0ubuntu5.1), gvfs-backends:amd64 (1.6.4-0ubuntu1, 1.6.4-0ubuntu1.1), chromium-browser:amd64 (6.0.472.63~r59945-0ubuntu2, 7.0.517.41~r62167-0ubuntu0.10.10.1), lib32asound2:amd64 (1.0.23-1ubuntu2, 1.0.23-1ubuntu2.1), simple-scan:amd64 (2.32.0-0ubuntu3, 2.32.0-0ubuntu4), python-aptdaemon:amd64 (0.31+bzr506-0ubuntu2, 0.31+bzr506-0ubuntu4), alsa-utils:amd64 (1.0.23-2ubuntu3, 1.0.23-2ubuntu3.4), aptdaemon:amd64 (0.31+bzr506-0ubuntu2, 0.31+bzr506-0ubuntu4), libasound2:amd64 (1.0.23-1ubuntu2, 1.0.23-1ubuntu2.1), python-aptdaemon-gtk:amd64 (0.31+bzr506-0ubuntu2, 0.31+bzr506-0ubuntu4), xul-ext-ubufox:amd64 (0.9~rc2-0ubuntu5, 0.9~rc2-0ubuntu5.1), chromium-browser-inspector:amd64 (6.0.472.63~r59945-0ubuntu2, 7.0.517.41~r62167-0ubuntu0.10.10.1), libgvfscommon0:amd64 (1.6.4-0ubuntu1, 1.6.4-0ubuntu1.1), gvfs:amd64 (1.6.4-0ubuntu1, 1.6.4-0ubuntu1.1)
End-Date: 2010-11-04 12:25:11
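
To compare what each affected machine upgraded, the `Upgrade:` line of an apt history entry can be split into package, architecture, old version and new version. A rough Python sketch, assuming the `name:arch (old, new)` layout shown in the entry above; `parse_upgrades` is a hypothetical helper name, not part of apt.

```python
import re

# Matches one "name:arch (old, new)" item from an apt history "Upgrade:" line.
ITEM = re.compile(
    r"(?P<name>[^\s:,()]+):(?P<arch>[^\s()]+) "
    r"\((?P<old>[^,()]+), (?P<new>[^()]+)\)"
)

def parse_upgrades(line):
    """Return a list of (package, arch, old_version, new_version) tuples."""
    if not line.startswith("Upgrade:"):
        return []
    body = line[len("Upgrade:"):]
    return [(m["name"], m["arch"], m["old"], m["new"]) for m in ITEM.finditer(body)]
```

Running this over the history entries from both machines and intersecting the package names would narrow the list of upgrades the two systems have in common.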

root@gdahlmpc:/var/log# lsb_release -rd
Description: Ubuntu 10.10

root@gdahlmpc:/var/log# uname -a
Linux gdahlmpc 2.6.35-22-generic #35-Ubuntu SMP Sat Oct 16 20:45:36 UTC 2010 x86_64 GNU/Linux

ProblemType: Bug
DistroRelease: Ubuntu 10.10
Package: linux-image-2.6.35-22-generic 2.6.35-22.35
Regression: Yes
Reproducible: No
ProcVersionSignature: Ubuntu 2.6.35-22.35-generic 2.6.35.4
Uname: Linux 2.6.35-22-generic x86_64
NonfreeKernelModules: fglrx
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.23.
Architecture: amd64
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: AD198x Analog [AD198x Analog]
   Subdevices: 2/2
   Subdevice #0: subdevice #0
   Subdevice #1: subdevice #1
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: gdahlm 2289 F.... pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xfebdc000 irq 50'
   Mixer name : 'Analog Devices AD1984A'
   Components : 'HDA:11d4194a,10280276,00100400'
   Controls : 34
   Simple ctrls : 20
Date: Mon Nov 8 14:21:49 2010
Frequency: Once every few days.
HibernationDevice: RESUME=UUID=f98434f0-ec69-4b2a-bf55-57d8bcb570bb
InstallationMedia: Ubuntu 10.10 "Maverick Meerkat" - Release Candidate amd64 (20100928)
MachineType: Dell Inc. OptiPlex 960
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.35-22-generic root=UUID=4a9a32c5-a491-454e-b7fd-83b69dd0b45c ro quiet splash
ProcEnviron:
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
RelatedPackageVersions: linux-firmware 1.38
RfKill:

SourcePackage: linux
WifiSyslog:
 Nov 8 11:51:26 gdahlmpc kernel: [ 3744.524589] lo: Disabled Privacy Extensions
 Nov 8 13:10:08 gdahlmpc kernel: [ 8466.786534] python[4072]: segfault at 100000000 ip 00007f9c9ba139ee sp 00007fff50fe77e0 error 4 in libdbusmenu-glib.so.1.0.17[7f9c9ba0f000+10000]
 Nov 8 14:20:04 gdahlmpc kernel: [12663.099345] ACPI Warning: Incorrect checksum in table [TCPA] - 0x00, should be 0x7F (20100428/tbutils-314)
 Nov 8 14:20:04 gdahlmpc kernel: [12663.099430] ACPI Warning: Incorrect checksum in table [TCPA] - 0x00, should be 0x7F (20100428/tbutils-314)
dmi.bios.date: 02/17/2009
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A03
dmi.board.name: 0Y958C
dmi.board.vendor: Dell Inc.
dmi.board.version: A00
dmi.chassis.type: 6
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvrA03:bd02/17/2009:svnDellInc.:pnOptiPlex960:pvr:rvnDellInc.:rn0Y958C:rvrA00:cvnDellInc.:ct6:cvr:
dmi.product.name: OptiPlex 960
dmi.sys.vendor: Dell Inc.

rooijan (rrossouw) wrote :

Same issue. Mine has locked up completely for the 3rd time in 5 days. I suspect it happens after inactivity and/or screen saving. Nothing in syslog. Started 5 Nov. Dell Optiplex 960 with Ubuntu 10.10 64-bit, plus an ATI Radeon HD 3450. No compiz enabled. Never had any issues on 10.04 64-bit.

rooijan (rrossouw) wrote :

More information: another lockup. I can ping the host but cannot ssh in. This time Ctrl-Alt-F1 gave me a console. I could not log in, but there was an ext4_find_entry error.

Also, this system is configured with / on an ADATA SSD S599 64GB disk. It is possible these problems started occurring after an update, but they could also be related to the ext4 file system on an SSD. I see a somewhat related bug 453579 about ext4. In addition, I see there are some ext4 patches in 2.6.37-rc1 called "ext4: Add batched discard support for ext4".

rooijan (rrossouw) wrote :

Another update. After a few more lockups I have gone back to not using the SSD for my root file system. So far no lockups. The OS is identical to before (10.10 64-bit), except that I have not enabled the ATI fglrx X driver yet. I will give it a few days before trying the ATI fglrx driver again.

Current X driver:
root@jamaica:~# lshw -C display | grep -i driver
       configuration: driver=radeon latency=0

Greg Dahlman (gdahlm) wrote :

This seems to be an ATI fglrx issue.

I will install an nvidia based card on Monday and see if that makes the issue go away.

Greg Dahlman (gdahlm) wrote :

I installed an nVidia-based card and so far things are pretty good. At the least, it has extended the time between crashes.

root@gdahlmpc:~# lspci | grep VGA
01:00.0 VGA compatible controller: nVidia Corporation G84 [GeForce 8600 GT] (rev a1)

If it makes it without dying for a little longer I will send it to AMD.

Greg Dahlman (gdahlm)
affects: linux → fglrx
rooijan (rrossouw) wrote :

So far so good for me with no SSD and no fglrx.

rhr396 (rhr396-) wrote :

Dell Dimension 4600 w/Pentium 4 dual processors running Xubuntu 10.04 latest update through an Airlink 101 KVM switch hubbed to a second Airlink 101 KVM switch.

As soon as the screen exits grub to boot, the screen goes blank with a no signal message and the system locks. The only way to shut it down is to hold the power button until it turns off. I can get a diagnostic report after starting with 2.6.24-28 generic (recovery mode). At the end of the report I have an unresponsive cursor. The system boots fine using the live CD or bypassing the KVM.

Following is the diagnostic as best as I could get off the screen. A few characters might be missing due to being off screen.

libudev: udev_monitor_new_from_netlink: error getting socket: Invalid argument
[ 80.496823] wait-for-root[864]: segfault at 00000030 eip b7722f2b esp bfda41
0 error 4
Segmentation fault
Gave up waiting for root device. Common problems:
 - Boot args (cat /proc/cmdline)
  - Check rootdelay= (did the system wait long enough?)
  - Check root= (did the system wait for the right device?)
 - Missing modules (cat /proc/modules; ls /dev)
ALERT! /dev/disk/by-uuid/6cbffd1d-0d30-4272-84f8-a77f4ff7f3c6 does not exist. Dropping to a shell!

BusyBox v1.13.3 (Ubuntu 1.13.3-1ubuntu11) built-in shell (ash)

Enter 'help' for a list of built-in commands.

(initramfs) _

As stated, there is no response to input, and the screen will eventually go blank, although there is no loss of video signal like there is during a normal boot.
I do not know coding but I'm not totally inept. As best I can tell, there seems to be an issue with the system on this box dealing with the KVM switch. I have several other machines of varying makes and models running the same system without any problems. This machine did fine with Xubuntu 8.04 and with 10.04 at initial upgrade.

Thanks for any help.

rhr396 (rhr396-) wrote :

I forgot to mention that I am using the onboard video.

Brad Figg (brad-figg)
Changed in linux (Ubuntu):
status: New → Confirmed
penalvch (penalvch)
tags: added: bios-outdated-f11
no longer affects: linux (Ubuntu)
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

penalvch (penalvch) wrote :

Greg Dahlman, this bug was reported a while ago and there hasn't been any activity in it recently. We were wondering if this is still an issue? If so, could you please test for this with the latest development release of Ubuntu? ISO images are available from http://cdimage.ubuntu.com/daily-live/current/ .

If it remains an issue, could you please run the following command in the development release from a Terminal (Applications->Accessories->Terminal), as it will automatically gather and attach updated debug information to this report:

apport-collect -p linux <replace-with-bug-number>

Also, could you please test the latest upstream kernel available (not the daily folder) following https://wiki.ubuntu.com/KernelMainlineBuilds ? It will allow additional upstream developers to examine the issue. Once you've tested the upstream kernel, please comment on which kernel version specifically you tested. If this bug is fixed in the mainline kernel, please add the following tags:
kernel-fixed-upstream
kernel-fixed-upstream-VERSION-NUMBER

where VERSION-NUMBER is the version number of the kernel you tested. For example:
kernel-fixed-upstream-v3.13-rc5

This can be done by clicking on the yellow circle with a black pencil icon next to the word Tags located at the bottom of the bug description. As well, please remove the tag:
needs-upstream-testing

If the mainline kernel does not fix this bug, please add the following tags:
kernel-bug-exists-upstream
kernel-bug-exists-upstream-VERSION-NUMBER

As well, please remove the tag:
needs-upstream-testing

Once testing of the upstream kernel is complete, please mark this bug's Status as Confirmed. Please let us know your results. Thank you for your understanding.

Changed in linux (Ubuntu):
status: New → Confirmed
penalvch (penalvch)
affects: fglrx → linux (Ubuntu)
Changed in linux (Ubuntu):
status: New → Incomplete
importance: Undecided → Medium
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired