[K8M800] Xorg freeze / Kernel 2.6.32-10-generic freeze

Bug #508699 reported by Thomas Schweikle
This bug affects 3 people
Affects | Status | Importance | Assigned to | Milestone
linux (Ubuntu) | Fix Released | Medium | Chase Douglas |
xserver-xorg-video-openchrome (Ubuntu) | Invalid | Undecided | Unassigned |

Bug Description

Binary package hint: xorg

With kernel 2.6.32-9-generic the logon screen does not freeze. With kernel 2.6.32-10-generic it does.

The freeze starts when the logon screen is displayed. No mouse, no keyboard. Alt-Ctrl-F1 through F4 do not work, and neither does Alt-Ctrl-Backspace. Alt-Ctrl-Del has no effect. Logging on remotely does not work either: the connection times out. Pinging the system gives the same result.

Only a hard reset or powering off helps out.

I have tried to disable splash, remove "quiet splash" from the kernel commandline, but with no result: after the kernel is loaded it reinitializes graphics. Only seeing a black screen after that. This with both kernels 2.6.32-9 and 2.6.32-10.
Furthermore: trying to switch back to a text console with Alt-Ctrl-F1 only shows a black screen without any cursor. Adding "vga=normal" to the kernel commandline leads to output: "vga=normal is deprecated ...".

Under some circumstances it is necessary to boot Windows in between to make the display work again with Linux. It looks to me a lot like Linux is not initializing graphics completely, leaving things as found from previous boots and messing things up so that nothing works any more. This state is noticeable while booting: grub will not display its menu, just some random graphics that look like modern art.

ProblemType: Bug
Architecture: amd64
Date: Sun Jan 17 11:24:12 2010
DistroRelease: Ubuntu 10.04
InstallationMedia: Error: [Errno 13] Permission denied: '/var/log/installer/media-info'
MachineType: System manufacturer System Product Name
Package: xorg 1:7.5+1ubuntu1
ProcCmdLine: BOOT_IMAGE=//vmlinuz-2.6.32-9-generic root=UUID=cd4782ce-cbe5-46bb-a1ca-0ef613d65be1 ro quiet splash
ProcEnviron:
 LANG=de_DE.utf8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.32-9.13-generic
RelatedPackageVersions:
 xserver-xorg 1:7.5+1ubuntu1
 libgl1-mesa-glx 7.7-0ubuntu5
 libdrm2 2.4.17-0ubuntu1
 xserver-xorg-video-intel 2:2.9.1-1ubuntu1
 xserver-xorg-video-ati 1:6.12.99+git20091125.0061c4db-0ubuntu2
SourcePackage: xorg
Symptom: display
Tags: freeze lucid
Title: Xorg freeze
Uname: Linux 2.6.32-9-generic x86_64
XorgConf: Error: [Errno 2] No such file or directory: '/etc/X11/xorg.conf'
XorgLogOld:

dmi.bios.date: 03/20/2006
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 0211
dmi.board.name: K8V-MX
dmi.board.vendor: ASUSTeK Computer Inc.
dmi.chassis.asset.tag: Asset-1234567890
dmi.chassis.type: 6
dmi.chassis.vendor: Chassis Manufacture
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr0211:bd03/20/2006:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKComputerInc.:rnK8V-MX:rvr:cvnChassisManufacture:ct6:cvrChassisVersion:
dmi.product.name: System Product Name
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer
fglrx: Not loaded
glxinfo: Error: [Errno 2] No such file or directory
system:
 distro: Ubuntu
 architecture: x86_64
 kernel: 2.6.32-9-generic

Revision history for this message
Thomas Schweikle (tps) wrote :
Bryce Harrington (bryce)
affects: xorg (Ubuntu) → xserver-xorg-video-openchrome (Ubuntu)
Thomas Schweikle (tps) wrote :

Since the last updates it is absolutely necessary to boot Windows before rebooting into Linux. X will not work otherwise.

Thomas Schweikle (tps) wrote :

Since the last updates it is absolutely necessary to boot Windows before rebooting into Linux. X will not work otherwise. If Windows wasn't booted before, the whole system freezes instantly, showing some graphical garbage. No mouse, no keyboard, no remote shell. Even most logs are not written (or whatever had to be written wasn't committed and was rolled back on the next boot). Only a hard reset helps.

Booting again without having started Windows before leads to exactly the same result. Trying "recovery mode" has the same (!) result. It isn't possible to enter anything. Same garbled screen.

If Windows was booted before, all is OK. I can work, with X running.

This is all with kernel 2.6.32-9. Booting with kernel 2.6.32-10 never leads to a working system, regardless of whether Windows was booted before or not. The difference: no garbage on screen, only a black screen with nothing on it except a mouse cursor, which I cannot move.

Versions of xorg at Jan, 19th:
ii python-xkit 0.4.2 library for the manipulation of the xorg.con
ii xorg 1:7.5+1ubuntu1 X.Org X Window System
ii xorg-docs-core 1:1.4-5 Core documentation for the X.org X Window Sy
ii xserver-xorg 1:7.5+1ubuntu1 the X.Org X server
ii xserver-xorg-core 2:1.7.3.902-1ubuntu8 Xorg X server - core server
ii xserver-xorg-input-all 1:7.5+1ubuntu1 the X.Org X server -- input driver metapacka
ii xserver-xorg-input-evdev 1:2.3.2-3 X.Org X server -- evdev input driver
ii xserver-xorg-input-mouse 1:1.5.0-1 X.Org X server -- mouse input driver
ii xserver-xorg-input-synaptics 1.2.0-3ubuntu2 Synaptics TouchPad driver for X.Org server
ii xserver-xorg-input-vmmouse 1:12.6.5-2ubuntu1 X.Org X server -- VMMouse input driver to us
ii xserver-xorg-video-all 1:7.5+1ubuntu1 the X.Org X server -- output driver metapack
ii xserver-xorg-video-apm 1:1.2.2-1 X.Org X server -- APM display driver
ii xserver-xorg-video-ark 1:0.7.2-1 X.Org X server -- ark display driver
ii xserver-xorg-video-ati 1:6.12.99+git20091125.0061c4db-0ubuntu2 X.Org X server -- ATI display driver wrapper
ii xserver-xorg-video-chips 1:1.2.2-1 X.Org X server -- Chips display driver
ii xserver-xorg-video-cirrus 1:1.3.2-1ubuntu1 X.Org X server -- Cirrus display driver
ii xserver-xorg-video-fbdev 1:0.4.1-1 X.Org X server -- fbdev display driver
ii xserver-xorg-video-i128 1:1.3.3-1 ...


Thomas Schweikle (tps) wrote :

Some updates for xorg, but no change.

ii python-xkit 0.4.2 library for the manipulation of the xorg.con
ii xorg 1:7.5+1ubuntu1 X.Org X Window System
ii xorg-docs-core 1:1.5-1 Core documentation for the X.org X Window Sy
ii xserver-xorg 1:7.5+1ubuntu1 the X.Org X server
ii xserver-xorg-core 2:1.7.3.902-1ubuntu9 Xorg X server - core server
ii xserver-xorg-input-all 1:7.5+1ubuntu1 the X.Org X server -- input driver metapacka
ii xserver-xorg-input-evdev 1:2.3.2-3 X.Org X server -- evdev input driver
ii xserver-xorg-input-mouse 1:1.5.0-1 X.Org X server -- mouse input driver
ii xserver-xorg-input-synaptics 1.2.0-3ubuntu2 Synaptics TouchPad driver for X.Org server
ii xserver-xorg-input-vmmouse 1:12.6.5-2ubuntu1 X.Org X server -- VMMouse input driver to us
ii xserver-xorg-video-all 1:7.5+1ubuntu1 the X.Org X server -- output driver metapack
ii xserver-xorg-video-apm 1:1.2.2-1 X.Org X server -- APM display driver
ii xserver-xorg-video-ark 1:0.7.2-1 X.Org X server -- ark display driver
ii xserver-xorg-video-ati 1:6.12.99+git20091125.0061c4db-0ubuntu2 X.Org X server -- ATI display driver wrapper
ii xserver-xorg-video-chips 1:1.2.2-1 X.Org X server -- Chips display driver
ii xserver-xorg-video-cirrus 1:1.3.2-1ubuntu1 X.Org X server -- Cirrus display driver
ii xserver-xorg-video-fbdev 1:0.4.1-1 X.Org X server -- fbdev display driver
ii xserver-xorg-video-i128 1:1.3.3-1 X.Org X server -- i128 display driver
ii xserver-xorg-video-intel 2:2.9.1-1ubuntu1 X.Org X server -- Intel i8xx, i9xx display d
ii xserver-xorg-video-mach64 6.8.2-2 X.Org X server -- ATI Mach64 display driver
ii xserver-xorg-video-mga 1:1.4.11.dfsg-2 X.Org X server -- MGA display driver
ii xserver-xorg-video-neomagic 1:1.2.4-1 X.Org X server -- Neomagic display driver
ii xserver-xorg-video-nv 1:2.1.15-1ubuntu3 X.Org X server -- NV display driver
ii xserver-xorg-video-openchrome 1:0.2.904+svn812-1ubuntu1 X.Org X server -- VIA display driver
ii xserver-xorg-video-r128 6.8.1-2 X.Org X ...


Bartosz Kosiorek (gang65) wrote :

Attach your Xorg.0.log file after Xorg freeze (or Xorg.0.log.old),
and after Windows reboot.

Thomas Schweikle (tps) wrote :

Without booting Windows before: no log written.
With booting Windows before: log written.

Thomas Schweikle (tps) wrote :

Comment #6 was for kernel 2.6.32-9

For kernel 2.6.32-10 Xorg.0.log is written, but empty. In both cases: booting with or without having booted windows before.

Thomas Schweikle (tps) wrote :

Tested the new kernel 2.6.32-11: same as with 2.6.32-10. Xorg.0.log is written, but empty, in both cases: booting with or without having booted Windows before.

Thomas Schweikle (tps) wrote :

Tested the new kernel 2.6.32-12: same as with 2.6.32-10. Xorg.0.log is written, but empty, in all cases: booting with or without having booted Windows before.

But there is one big change: I can boot into the rescue system (this did not work with kernel 2.6.32-{10,11}), then su to a user and run "startx". This works. The keyboard is German, but there is no key repeat, and some other things I'd expect to work do not. It looks a lot like a race condition while starting X, gdm and background services, leading to an endless loop or a crash within X or gdm or even both. I think it is X, because the log file is created but nothing is written to it, whereas it is created and written to when X is started from the command line.

Thomas Schweikle (tps) wrote :
Thomas Schweikle (tps) wrote :

gdm does start if started from the rescue system. Logons are possible. Xorg starts in this case without errors. It would be nice if there were a way to start the whole system without starting X (and gdm), and then test again whether it works!

Bartosz Kosiorek (gang65) wrote :

Please build and install the latest openchrome driver and enable debug mode.
Detailed build instructions for Ubuntu/Debian are available at:
https://help.ubuntu.com/community/OpenChrome

Attach your /var/log/Xorg.0.log file (with debug mode) after X freeze.

summary: - Xorg freeze / Kernel 2.6.32-10-generic freeze
+ [K8M800] Xorg freeze / Kernel 2.6.32-10-generic freeze
Thomas Schweikle (tps) wrote :

OK. Did it. Output is completely distorted. Screen updates are not done while scrolling. I am typing this blindly most of the time, because characters are not drawn; only the cursor is moving. I have to minimize the window, then restore it again to see what I was typing...

Context menus are drawn transparent ...

Thomas Schweikle (tps) wrote :
Thomas Schweikle (tps) wrote :
Thomas Schweikle (tps) wrote :

After a few minutes the screen froze. I rebooted, but no way: the system freezes while gdm starts. With every boot the time I had until it froze was cut down a bit.

tags: added: regression-potential
Changed in linux (Ubuntu):
status: New → Triaged
importance: Undecided → Medium
Chase Douglas (chasedouglas) wrote :

@Thomas Schweikle:

Please test the mainline linux kernel for 2.6.32.6. The Ubuntu 2.6.32-10 kernel is based on 2.6.32.2, while the 2.6.32-9 kernel is based on 2.6.32.1. If the .2 kernel is broken, but the .1 kernel is not, then we can try to figure out what broke between them. If the .2 kernel works but the Ubuntu 2.6.32-10 kernel does not, then we can try to figure out what broke between 2.6.32-9 and 2.6.32-10.
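
The reasoning in this request can be sketched as a small Python function. This is only an illustration of the decision logic, not a tool used in this bug; the function name and its boolean parameters are made up for the example.

```python
def next_step(mainline_1_works: bool, mainline_2_works: bool, ubuntu_10_works: bool) -> str:
    """Suggest where to look next, following the reasoning in the comment above.

    All three inputs are hypothetical test outcomes reported by the tester.
    """
    # Mainline .1 works but .2 is broken: the regression is upstream.
    if mainline_1_works and not mainline_2_works:
        return "compare mainline 2.6.32.1 and 2.6.32.2"
    # Mainline .2 works but the Ubuntu kernel based on it does not:
    # the regression is in the Ubuntu-specific changes.
    if mainline_2_works and not ubuntu_10_works:
        return "compare Ubuntu 2.6.32-9 and 2.6.32-10"
    # Any other combination needs more testing before drawing a conclusion.
    return "no conclusion from these results"


print(next_step(mainline_1_works=True, mainline_2_works=True, ubuntu_10_works=False))
```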

You can find the mainline kernel packages at http://kernel.ubuntu.com/~kernel-ppa/mainline/.

Thanks

Changed in linux (Ubuntu):
status: Triaged → Incomplete
Bryce Harrington (bryce)
Changed in xserver-xorg-video-openchrome (Ubuntu):
status: New → Confirmed
Thomas Schweikle (tps) wrote :

I had time to test the mainline Linux kernel (Feb. 20th and 21st). Same result. I tried switching to the vesa driver: same! It is impossible to start X11, even if I start the rescue system and then use startx. The whole thing freezes immediately.

Switched back to ubuntu kernel. Freezes too. No difference.

I had the chance to try the same on different hardware with i915 graphics. I tested first with no xorg.conf at all: graphics start and the mouse pointer can be moved, but keyboard and mouse keys are not seen by X11. I tried the same with an xorg.conf and tested various parameters. No change.

Same with vesa drivers. Desktop shown, but no mouse keys and keyboard.

At some point I noticed the keyboard really working, but if X11 started from startx the underlying shell received the keypresses. Mouse keys seem to go to nirvana.

If I start X11 from xdm (or [gk]dm) it hangs as soon as there is one mouse movement or keypress. Sometimes it is possible to press more than one key. Sometimes it is possible to move the mouse a bit further, but then all hangs. It may be possible to login remotely using ssh, but not always.
If you do not touch the mouse the system does hang after a while too, but it takes far longer until

I could not find any of these using VMware or VirtualBox. Everything is fine using these emulators. But installing ubuntu on real hardware ...

Chase Douglas (chasedouglas) wrote :

@Thomas:

The first thing I would try is removing plymouth. There are some issues with plymouth that are causing gdm freezes and crashes. By removing that we can figure out how much of your problems are related to it. After you have tested without plymouth, if you still have issues we could perform a git bisect of the ubuntu kernel between -9 and -10 to determine what changes cause the failure for you.

Bartosz Kosiorek (gang65) wrote :

In your Xorg.0.log.old there are several errors related to the mouse:

(EE) ioctl EVIOCGNAME failed: Inappropriate ioctl for device
(II) UnloadModule: "evdev"
(EE) PreInit returned NULL for ""Logitech USB-PS/2 Optical Mouse""

...

(EE) ioctl EVIOCGNAME failed: Inappropriate ioctl for device
(II) UnloadModule: "evdev"
(EE) PreInit returned NULL for ""Macintosh mouse button emulation""
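
Lines like these can be pulled out of a log with a few lines of Python. This is a minimal sketch, assuming the usual Xorg convention that error lines start with the `(EE)` marker; the helper name `error_lines` is made up for illustration, and the sample text is copied from the excerpt above.

```python
def error_lines(log_text: str) -> list[str]:
    """Return all lines of an Xorg log that carry the (EE) error marker."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if line.lstrip().startswith("(EE)")
    ]


# Sample taken from the log excerpt quoted in this comment.
sample = '''(EE) ioctl EVIOCGNAME failed: Inappropriate ioctl for device
(II) UnloadModule: "evdev"
(EE) PreInit returned NULL for ""Logitech USB-PS/2 Optical Mouse""
'''

for line in error_lines(sample):
    print(line)
```

To run it against a real log, read the file first, for example `error_lines(open("/var/log/Xorg.0.log").read())`.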

Chase Douglas (chasedouglas) wrote :

@Bartosz:

Those error messages are new in Lucid, but I see them on installations that work perfectly fine. They seem to be harmless error messages.

Thomas Schweikle (tps) wrote :

Since the last updates "startx" does not block the starting process any more. But the screen's backlight is turned off and the screen itself is blanked.

This happens with i915, vesa, and fbdev. It doesn't matter whether I use a crafted xorg.conf or the one built on the fly by xinit. I have attached Xorg.0.log with no xorg.conf at all.

Thomas Schweikle (tps) wrote :
Thomas Schweikle (tps) wrote :
Thomas Schweikle (tps) wrote :
Thomas Schweikle (tps) wrote :
Thomas Schweikle (tps) wrote :

The crash is for both drivers vesa and fbdev the same:

*** glibc detected *** ksplashx: double free or corruption (out): 0x0849f158 ***
======= Backtrace: =========
/lib/tls/i686/cmov/libc.so.6(+0x6b611)[0x270611]
/lib/tls/i686/cmov/libc.so.6(+0x6ce68)[0x271e68]
/lib/tls/i686/cmov/libc.so.6(cfree+0x6d)[0x274f2d]
/usr/lib/libX11.so.6(+0x22899)[0x8e2899]
ksplashx[0x8057841]
ksplashx[0x805a9cf]
ksplashx[0x804a805]
/lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe6)[0x21bbd6]
ksplashx[0x804a641]
======= Memory map: ========
00110000-001f9000 r-xp 00000000 08:05 446 /usr/lib/libstdc++.so.6.0.13
001f9000-001fd000 r--p 000e9000 08:05 446 /usr/lib/libstdc++.so.6.0.13
001fd000-001fe000 rw-p 000ed000 08:05 446 /usr/lib/libstdc++.so.6.0.13
001fe000-00205000 rw-p 00000000 00:00 0
00205000-00352000 r-xp 00000000 08:05 159291 /lib/tls/i686/cmov/libc-2.11.1.so
00352000-00353000 ---p 0014d000 08:05 159291 /lib/tls/i686/cmov/libc-2.11.1.so
00353000-00355000 r--p 0014d000 08:05 159291 /lib/tls/i686/cmov/libc-2.11.1.so
00355000-00356000 rw-p 0014f000 08:05 159291 /lib/tls/i686/cmov/libc-2.11.1.so
00356000-00359000 rw-p 00000000 00:00 0
00359000-00375000 r-xp 00000000 08:05 1060 /lib/libgcc_s.so.1
00375000-00376000 r--p 0001b000 08:05 1060 /lib/libgcc_s.so.1
00376000-00377000 rw-p 0001c000 08:05 1060 /lib/libgcc_s.so.1
00377000-00379000 r-xp 00000000 08:05 1570 /usr/lib/libXau.so.6.0.0
00379000-0037a000 r--p 00001000 08:05 1570 /usr/lib/libXau.so.6.0.0
0037a000-0037b000 rw-p 00002000 08:05 1570 /usr/lib/libXau.so.6.0.0
00470000-00493000 r-xp 00000000 08:05 1057 /lib/libpng12.so.0.42.0
00493000-00494000 r--p 00022000 08:05 1057 /lib/libpng12.so.0.42.0
00494000-00495000 rw-p 00023000 08:05 1057 /lib/libpng12.so.0.42.0
0050b000-0052f000 r-xp 00000000 08:05 159484 /lib/tls/i686/cmov/libm-2.11.1.so
0052f000-00530000 r--p 00023000 08:05 159484 /lib/tls/i686/cmov/libm-2.11.1.so
00530000-00531000 rw-p 00024000 08:05 159484 /lib/tls/i686/cmov/libm-2.11.1.so
00610000-00612000 r-xp 00000000 08:05 159483 /lib/tls/i686/cmov/libdl-2.11.1.so
00612000-00613000 r--p 00001000 08:05 159483 /lib/tls/i686/cmov/libdl-2.11.1.so
00613000-00614000 rw-p 00002000 08:05 159483 /lib/tls/i686/cmov/libdl-2.11.1.so
008c0000-009d9000 r-xp 00000000 08:05 1607 /usr/lib/libX11.so.6.3.0
009d9000-009da000 r--p 00118000 08:05 1607 /usr/lib/libX11.so.6.3.0
009da000-009dc000 rw-p 00119000 08:05 1607 /usr/lib/libX11.so.6.3.0
009dc000-009dd000 rw-p 00000000 00:00 0
00a38000-00a3c000 r-xp 00000000 08:05 1603 /usr/lib/libXdmcp.so.6.0.0
00a3c000-00a3d000 r--p 00003000 08:05 1603 /usr/lib/libXdmcp.so.6.0.0
00a3d000-00a3e000 rw-p 00004000 08:05 1603 /usr/lib/libXdmcp.so.6.0.0
00aa3000-00abe000 r-xp 00000000 08:05 662 /lib/ld-2.11.1.so
00abe000-00abf000 r--p 0001a000 08:05 662 /lib/ld-2.11.1.so
00abf000-00ac0000 rw-p 0001b000 08:05 662 /lib/ld-2.11.1.so
00ad9000-00af1000 r-xp 00000000 08:05 1605 /usr/lib/libxcb.so.1.1.0
00af1000-00af2000 r--p 00017000 08:05 1605 /usr/lib/libxcb.s...


Chase Douglas (chasedouglas) wrote :

@Thomas:

Have you tried removing plymouth yet? Because of the instability of plymouth, we really need to know if that's the cause of this issue or if there's some other cause.

Thanks

Thomas Schweikle (tps) wrote :

plymouth is installed, but not started. If it is necessary I'll remove it entirely.

Thomas Schweikle (tps) wrote :

plymouth removed: no change.

Chase Douglas (chasedouglas) wrote :

@Thomas:

Can you try a kernel from the "red" ppa: https://launchpad.net/~apw/+archive/red. It has backported graphics drivers from 2.6.33 that may make a difference for you. We're currently evaluating whether to ship the current kernel or this kernel.

Thanks

Juraj (juraj-hrcka) wrote :

Just did following exercise on my "ancient" IBM Thinkpad 570 machine (see lshw/lspci -a attachments):
Used livecd from 10th Mar (uname -a attached)
In the boot menu removed quiet splash from the boot line

Assuming by default fbdev used got "distorted log in" screen (see picture attached)
Used xforcevesa and during the boot got "Console: switching to colour frame buffer device 80x30" message. After logged in from commandline, retrieved following (Xorg.0.log,xorg.conf) from the system after tried startx

Juraj (juraj-hrcka) wrote :
Juraj (juraj-hrcka) wrote :
Juraj (juraj-hrcka) wrote :
Juraj (juraj-hrcka) wrote :
Juraj (juraj-hrcka) wrote :
Juraj (juraj-hrcka) wrote :
Thomas Schweikle (tps) wrote :

With the most recent kernel -- 2.6.32.17 the problem still exists.

The system seems to boot normally, then crashes while initializing X11. One big problem: if you are fast enough, you may log in on the console screen, but after a short while the system will stop working. This happens both when starting the system normally and in recovery mode! The only difference: in recovery mode the time between the console login prompt appearing and the crash is longer than when booting Ubuntu normally.

Thomas Schweikle (tps) wrote :

@Chase Douglas: the kernel found at https://launchpad.net/~apw/+archive/red seems to work.

Thomas Schweikle (tps) wrote :

The latest kernel 2.6.32.18 (uname -a: Linux aesculap 2.6.32-18-generic #27-Ubuntu SMP Fri Mar 26 21:13:44 UTC 2010 x86_64 GNU/Linux) seems to work also.

Thomas Schweikle (tps) wrote :

That wasn't correct. I have to start Windows at least once and log in to make X11 work.

An alternative is: start systemrescuecd 1.5.0, start X11 using startx, then reboot.

Starting Ubuntu right after powering on only gives a distorted screen with no keyboard and mouse. It doesn't matter which kernel you use. It's the same with any of the available ones, including those from the "red" repository.

Chase Douglas (chasedouglas) wrote :

I've read through this bug some more and comment #27 seems to confirm that this is an X issue. In the bug summary is:

"With kernel 2.6.32-9-generic the logon screen does not freeze. With kernel 2.6.32-10-generic it does."
<snip>
"I have tried to disable splash, remove "quiet splash" from the kernel commandline, but with no result: after the kernel is loaded it reinitializes graphics. Only seeing a black screen after that. This with both kernels 2.6.32-9 and 2.6.32-10."

Sometimes bugs are reported like this where it appears that things were working with one kernel and broken with another, when in reality another package was updated at the same time and caused the issue.

To determine if the kernel is at least partially culpable, can someone please retest the 2.6.32-9-generic kernel on beta 1 to see how it behaves?

P.S.: Hopefully you still have the kernel around. I'm not sure where you can find the kernel now, but I'll try to find out.

Thanks

Thomas Schweikle (tps) wrote :

Had the old kernel available:

2.6.32.9: console login possible (X11 not started!);
   system does not freeze;
   remote login possible (ssh).
   X11 does start right after powering on. No problems so far.
   Kernel seems to be incompatible with some installed tools from 10.04 beta 1

2.6.32.10: console login not possible (X11 not started!);
   system freezes as soon as I touch the keyboard;
   remote login possible (ssh) but may freeze after some time.
   X11 does not start right after powering on.
   X11 starts if Windows ran before (you'll have to log on,
     only starting until the logon screen is shown does not help!)

2.6.32.15: console login not possible (X11 not started!);
   system freezes after login prompt shown;
   remote login impossible.
   X11 does not start right after powering on.
   X11 does not start after starting and logging in to Windows.

2.6.32.16 (from "red"): console login not possible (X11 not started!);
   system freezes after login prompt shown;
   remote login impossible.
   X11 does not start right after powering on.
   X11 does start after logging in to Windows, but may freeze later.

2.6.32.18: console login possible (X11 not started!);
   system freezes after some time;
   remote login possible (ssh) but will freeze after some time;
     screen shows random text or graphics.
   X11 does not start right after powering on.
   X11 does start after logging in to Windows.

The tests took a lot of time, because /boot is not big enough to hold all kernels. It was a lot of install, uninstall, install, uninstall, until all kernels were tested again.

Chase Douglas (chasedouglas) wrote :

@Thomas:

Thanks a lot for the testing, I know it was difficult. At this point, I think the best way forward is to figure out what changes caused things to fail from -9 to -10. If you know how to build your own ubuntu kernels and bisect them using git to find where the first change causes a failure, then that would be a great way to start. If you don't know how to do that I can help you out. In that case, email me privately (<email address hidden>), and we can work through testing kernels. It will probably take around 8 kernels to figure out where the culprit lies.
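
The estimate of around 8 kernels follows from how bisection works: each test halves the remaining range, so roughly log2(N) tests are needed for N candidate commits. Below is a minimal Python sketch of that idea. It assumes a hypothetical range of 256 commits (the real number of commits between -9 and -10 is not stated in this thread), and the `is_bad` check is a made-up stand-in for building and booting a test kernel.

```python
def first_bad(commits, is_bad):
    """Find the first bad commit by bisection and count the tests needed.

    Assumes the commits are ordered, that everything before the culprit is
    good, that everything from the culprit onwards is bad, and that the
    last commit is known to be bad.
    """
    lo, hi = 0, len(commits) - 1
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1                      # one kernel build and boot test
        if is_bad(commits[mid]):
            hi = mid                    # culprit is at mid or earlier
        else:
            lo = mid + 1                # culprit is after mid
    return commits[lo], tests


# Hypothetical example: 256 commits, the regression enters at commit 100.
commits = list(range(256))
culprit, tests = first_bad(commits, lambda c: c >= 100)
print(culprit, tests)  # prints: 100 8
```

In practice `git bisect` does this bookkeeping itself; the sketch only shows why the number of test kernels stays small even for a few hundred commits.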

Thanks again for all the thorough testing!

Changed in linux (Ubuntu):
status: Incomplete → Triaged
assignee: nobody → Chase Douglas (chasedouglas)
Changed in linux (Ubuntu):
status: Triaged → Incomplete
Thomas Schweikle (tps) wrote :

While testing various kernels (mainly vanilla ones) I found:

booting Windows, then logging on installs the Windows Graphics driver. This driver installs and initializes a (irq-handler?). Rebooting the system leaves this installed handler intact. Linux (Ubuntu 10.04beta) uses this installed handler. Linux does not seem to reinstall or at least reinitialize the handler pointer again (I am not sure about this: maybe it does, but just overwrites it with the same value). All is OK then. X11 is running fine.

Switching the system off or suspending it (this removes the handler pointer, the handler itself seems to stay intact) removes the handler installed. As soon, as the system wants to call it it crashes ...

Chase Douglas (chasedouglas) wrote :

@Thomas:

Can you be a little more specific about this handler you are talking about? Are you speculating as to what is going on, or do you have some more details?

What we really need is to find out what changes caused this behavior. Can you help me perform a git bisect as I requested in comment #45?

Thanks

Thomas Schweikle (tps) wrote :

Quite difficult, because I have no way to debug from another system, and at the moment the system crashes my debugger crashes too :(
But as far as I got: Windows loads, VGA graphics are initialized, the user logs on, then further graphics initialization is performed. Part of this is installing an IRQ handler for the graphics adapter. Finally the graphics resolution is set to what the user configured, and then IRQs are enabled again.

Booting Linux from the powered-off state loads the kernel and initdisk, then control is transferred to the kernel. There is *no* IRQ handler installed. Starting X11 then enables the IRQ for the graphics adapter. As soon as one IRQ is generated, the whole system freezes, because the IRQ handler points to useless code.

If I've booted Windows (or systemrescuecd) before, the handler, installed and initialized, handles the IRQ, as it did with Windows (or systemrescuecd) running. Since both handlers do not call any other code they work as expected even if the system changes.

I could not find any way to deactivate IRQ allocation for the graphics adapter. But enabling memory testing wiped the installed handler. With memory testing enabled it wasn't possible to boot Ubuntu 10.04 beta! The system froze as soon as it tried to access graphics, generating an IRQ.

With a vanilla kernel straight from kernel.org I could make the whole thing work too. This leads to the conclusion that some Ubuntu patch must have side effects that remove the IRQ handler installation for the graphics adapter.

Thomas Schweikle (tps) wrote :

Here are the results from testing various kernels:
Plain vanilla Kernel 2.6.32.9 does work!
Plain vanilla Kernel 2.6.32.10 does not work! But the one used by gentoo (patched) works!
Plain vanilla Kernel 2.6.32.11 does not work!
Plain vanilla Kernel 2.6.33.2 does work!
Plain vanilla Kernel 2.6.34-rc3 does work!

Chase Douglas (chasedouglas) wrote :

@Thomas:

I assume you are referring to -stable kernels from kernel.org?

Thanks

Thomas Schweikle (tps) wrote :

@Chase: Yes. Stable (2.6.32.{11,10,9}, 2.6.33.2) and mainline (2.6.34-rc3). The Gentoo one is a patched 2.6.32.10 from systemrescuecd 1.5.0 (uname -a reports "Linux sysresccd 2.6.32.10-std150-amd64 #1 SMP Mon Mar 15 18:55:52 UTC 2010 x86_64 AMD Sempron(tm) Processor 3000+ AuthenticAMD GNU/Linux"). You can find the ISO image at "http://www.sysresccd.org/".

I took the stable kernels from kernel.org because I had problems checking out the Ubuntu kernels. I'd like to test them too, but it seems I need a hint on how to get the sources ...

Chase Douglas (chasedouglas) wrote :

@Thomas:

I'm going to upload test kernels for you to try out. Each one will help us narrow down which change caused your problems. This process will require about 7 or 8 kernel tests. When you have tried each one out, please reply in this bug with whether it worked or not. The kernels can be found at http://people.canonical.com/~cndougla/508699/. Please test the v1 kernel first.

Thanks

Thomas Schweikle (tps) wrote :

@Chase:
Kernel 2.6.32-10-generic
Power on: hangs. Distorted screen.
Restart after successful boot: works.
Restart after successful boot, wiping memory: hangs. Distorted screen.

Thomas Schweikle (tps) wrote :

@Chase:
Kernel 2.6.31-21-generic
Power on: works.
Restart after successful boot: works.
Restart after successful boot, wiping memory: works.

Thomas Schweikle (tps) wrote :

@Chase:
Kernel 2.6.32-15-generic
Kernel 2.6.32-16-generic
Power on: hangs. Distorted screen.
Restart after successful boot: works.
Restart after successful boot, wiping memory: hangs. Distorted screen.

Chase Douglas (chasedouglas) wrote :

@Thomas:

Thanks. I've just uploaded v2. Please test it out. I only need the results of booting from power on since that's what we are really trying to fix here.

Chase Douglas (chasedouglas) wrote :

@Thomas:

FYI, you don't need to do any more kernel testing outside of the kernels I provide. My kernels will help us narrow down to the individual commit that causes your issue.

Thomas Schweikle (tps) wrote :

@Chase:
Kernel 2.6.32-10.14-generic (v2)
Power on: hangs. Distorted screen.

Thomas Schweikle (tps) wrote :

I did a lot of additional tests with "plain vanilla" kernels from kernel.org: none of the 2.6.32 kernels work (2.6.32, 2.6.32.1, 2.6.32.2, ..., 2.6.32.11). All of them are broken on certain hardware. All show the same behavior: if the system was booted once with a working kernel, rebooting with one of the 2.6.32 series makes X11 work, as long as memory isn't wiped in between.
All Ubuntu kernels I could find in the archives show the same behavior: they work as long as a working kernel was booted before (linux-image-2.6.32-15-generic, linux-image-2.6.32-16-generic, linux-image-2.6.32-19-generic).

Kernel 2.6.31.13 works, as does 2.6.31.12. The Ubuntu-kernel linux-image-2.6.31-21-generic (linux-image-2.6.31-21-generic_2.6.31-21.59_amd64.deb) works too, regardless what state the system was in before.

The 2.6.33 kernel series shows some really critical behavior: when booted after switching from a working kernel, these kernels do not allow changing the resolution! The only resolution seen is the one the system ran at before! If powered on, only resolutions up to 800x600 are available. From time to time the system just stops responding, but only if powered on. A hard reset is necessary to make it work again. I was not able to find out what exactly caused these "hangs", because there are no log or dmesg entries that could shed some light on it.

Kernel 2.6.34-rc3 works again as 2.6.31.13, 2.6.31.12 and linux-image-2.6.31-21-generic did. Regardless of the state the system was in before.

I am using kernel 2.6.34-rc3 from kernel.org at the moment.

Chase Douglas (chasedouglas) wrote :

@Thomas:

Please test v3.

Thanks

Thomas Schweikle (tps) wrote :

@Chase:
Kernel 2.6.32-10.14-generic (v3)
Power on: hangs. Distorted screen.

Thomas Schweikle (tps) wrote :

Kernel 2.6.32-20-generic:
Power on: hangs. Distorted screen.

Thomas Schweikle (tps) wrote :

Kernel 2.6.32-21-generic:
Power on: works.

But I'll test again tomorrow ...! If there is no change then :-)

Chase Douglas (chasedouglas) wrote :

That's fantastic! I'm sorry I hadn't sent out a new image to test yet. I was at a conference this past week. I hope things are all fixed for you. I'll mark the bug as Fixed. If you find that it isn't, you can change it back to Triaged.

Thanks

Changed in linux (Ubuntu):
status: Incomplete → Fix Released
Changed in xserver-xorg-video-openchrome (Ubuntu):
status: Confirmed → Invalid
Jean-Marc Le Peuvedic (lepeuvedic) wrote :

Dear All,

I operate a server (no keyboard, no mouse, integrated Intel video i82G33/G31). When I switched from 8.04 LTS to 10.04 LTS the server started to freeze after a few minutes of operation. Normal access to the command line is via SSH. On learning from the web about the Intel driver/X problem I shut gdm down, intending to finish the configuration of other services (this is a SAMBA and LAMP server). The server still froze after a few minutes or hours. I only observed one case of desktop freeze since I only ran one interactive graphic session.

I downgraded the kernel from (lucid-updates 2.6.32-22) to (lucid 2.6.32-21), the same kernel mentioned above (i386 generic-pae in my case). The situation improved: the server stayed online for several hours, but never lasted a day. I re-enabled KMS, and generally applied all the recipes listed in the Ubuntu.fr forum which do not degrade 3D performance. I installed more recent Intel 2.9.1 drivers to no effect. With KMS the i915 driver stays loaded all the time and cannot be unloaded, but with gdm stopped and no interactive session it shouldn't have been exercised significantly.

The solution came from the installation of lucid's 2.6.31-rt kernel, which works for me, as it worked for Thomas (post #54).

Other symptoms are similar: no oops recoverable, no crash file, nothing relating to a serious kernel problem in any log file. I did not try to store log file remotely though. The freeze seems to be instantaneous, but it is not related directly to graphical session activity.

Chase Douglas (chasedouglas) wrote :

Jean-Marc Le Peuvedic,

Your hardware is completely different than the hardware for this bug. Please open a new bug for your issues.

Thanks
