Can't boot with last linux kernels from 2.6.32.10

Bug #505808 reported by shankao
This bug affects 4 people
Affects          Status         Importance   Assigned to     Milestone
Linux            Fix Released   High
linux (Ubuntu)   Fix Released   Medium       Chase Douglas

Bug Description

I can't boot with the just-updated kernel (2.6.32.10) and have to select the previous version. The computer freezes after a video mode switch and I see no further activity, only a black screen.

ProblemType: Bug
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.21.
Architecture: i386
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: SB [HDA ATI SB], device 0: ALC262 Analog [ALC262 Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: shankao 2441 F.... pulseaudio
 /dev/snd/pcmC0D0p: shankao 2441 F...m pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'SB'/'HDA ATI SB at 0xf0000000 irq 16'
   Mixer name : 'Realtek ALC262'
   Components : 'HDA:10ec0262,1854011b,00100002'
   Controls : 16
   Simple ctrls : 11
Card1.Amixer.info:
 Card hw:1 'HDMI'/'HDA ATI HDMI at 0xcfeec000 irq 19'
   Mixer name : 'ATI RS600 HDMI'
   Components : 'HDA:1002793c,00793c00,00100000'
   Controls : 4
   Simple ctrls : 1
Card1.Amixer.values:
 Simple mixer control 'IEC958',0
   Capabilities: pswitch pswitch-joined penum
   Playback channels: Mono
   Mono: Playback [off]
CheckboxSubmission: 56f6196d6c59b6b05a94181cc3955a09
CheckboxSystem: c4d84cd56dc0e1c868ed7fe87c94fd31
Date: Mon Jan 11 10:28:17 2010
DistroRelease: Ubuntu 10.04
HibernationDevice: RESUME=UUID=0789e953-c722-4dd4-958a-0bf9c5d8b5a0
MachineType: LG Electronics E200-A.CP29B
Package: linux-image-2.6.32-9-generic 2.6.32-9.13
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.32-9-generic root=UUID=f1e83727-9d0c-4969-8f3c-f5dd6fa01dbc ro quiet splash
ProcEnviron:
 LANGUAGE=en_US.UTF-8
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.32-9.13-generic
Regression: Yes
RelatedPackageVersions: linux-firmware 1.28
Reproducible: Yes
RfKill:
 0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
SourcePackage: linux
Tags: lucid needs-upstream-testing regression-potential
TestedUpstream: No
Uname: Linux 2.6.32-9-generic i686
dmi.bios.date: 10/02/2007
dmi.bios.vendor: Phoenix Technologies LTD
dmi.bios.version: ELBRSF0C
dmi.board.name: Lhotse-II
dmi.board.vendor: LG Electronics Inc.
dmi.board.version: Rev0.4b
dmi.chassis.asset.tag: No Asset Tag
dmi.chassis.type: 1
dmi.chassis.vendor: LG Electronics Inc.
dmi.chassis.version: N/A
dmi.modalias: dmi:bvnPhoenixTechnologiesLTD:bvrELBRSF0C:bd10/02/2007:svnLGElectronics:pnE200-A.CP29B:pvr0100:rvnLGElectronicsInc.:rnLhotse-II:rvrRev0.4b:cvnLGElectronicsInc.:ct1:cvrN/A:
dmi.product.name: E200-A.CP29B
dmi.product.version: 0100
dmi.sys.vendor: LG Electronics

Revision history for this message
In , Dariem Pérez Herrera (dariemp) wrote :

Created an attachment (id=31689)
Xorg configuration

Revision history for this message
In , Dariem Pérez Herrera (dariemp) wrote :

Created an attachment (id=31690)
Kernel logs

Revision history for this message
In , agd5f (agd5f) wrote :

You aren't getting KMS due to a race in the way ubuntu loads the drm for kms. X tries to load the drm when it starts, but the drm doesn't finish loading by the time the X driver checks for it so X comes up in non-kms mode, and the drm is still trying to load, but fails because X has already started messing with the hardware. You need to make sure the drm is loaded before starting X.
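
One way to act on this advice is to have whatever launches X wait until the DRM device node exists. A minimal Python sketch follows, assuming the KMS driver exposes the card as /dev/dri/card0 (that path is an assumption about this system, and `wait_for_path` is a hypothetical helper, not part of any distro's init scripts). The node's existence only shows that a DRM driver registered a device, not that KMS initialized correctly.

```python
import os
import time


def wait_for_path(path, timeout=10.0, interval=0.1):
    """Return True once `path` exists, or False if `timeout` seconds pass first.

    Hypothetical helper: polls the filesystem instead of listening for udev
    events, which keeps the sketch dependency-free.
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A display-manager wrapper could call `wait_for_path("/dev/dri/card0")` and start X only when it returns True.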

Revision history for this message
In , Dariem Pérez Herrera (dariemp) wrote :

Firstly, I don't use Ubuntu, I use Gentoo. Secondly, I'm very sure DRM is loaded before X starts. My kernel does it even before starting init.

Revision history for this message
In , Rafał Miłecki (zajec5) wrote :

(In reply to comment #4)
> Firstly, I don't use Ubuntu, I use Gentoo. Secondly, I'm very sure DRM is
> loaded before X starts. My kernel does it even before starting init.

Same issue happens in openSUSE.

drm is one thing, but you also have to load radeon before starting X. Please check whether this helps.

Revision history for this message
In , Dariem Pérez Herrera (dariemp) wrote :

(In reply to comment #5)
> Same issue happens in openSUSE.
>
> drm is one thing but you also have to load radeon before starting X. Please try
> if this help.
>

All modules related to drm, ttm, the KMS helpers, radeon, and so on are autoloaded by my kernel at boot time, before init is called. The problem is not the loading of the module; it's the module itself. Please refer to the attachments for details.

Revision history for this message
In , agd5f (agd5f) wrote :

Created an attachment (id=31713)
fix vram setup on rs600

This patch should fix your issue.

Revision history for this message
In , Dariem Pérez Herrera (dariemp) wrote :

I must correct something I said: the module radeon and all its dependencies are not loaded before init; my system loads them through udev. When udev is triggered to populate /dev, it talks to the kernel through uevents and when it finishes, all required modules are loaded including radeon and all deps.

About the patch to fix vram setup: the system now starts OK, but when udevd starts, my screen goes black and the system freezes. Only the power button lets me shut down and reboot.

Revision history for this message
In , agd5f (agd5f) wrote :

(In reply to comment #9)
> I must correct something I said: the module radeon and all its dependencies are
> not loaded before init; my system loads them through udev. When udev is
> triggered to populate /dev, it talks to the kernel through uevents and when it
> finishes, all required modules are loaded including radeon and all deps.
>
> About the patch to fix vram setup: now the system starts ok, but when udevd
> starts, I obtain a blackout on my screen and the system freezes. Only power
> button allows me to shutdown for rebooting.
>

Do you have fbcon loaded? Can you provide a kernel log with the patch applied?

Revision history for this message
In , agd5f (agd5f) wrote :

Created an attachment (id=31770)
fixup rs600 gart setup

Does this patch help?

Revision history for this message
In , Dariem Pérez Herrera (dariemp) wrote :

(In reply to comment #10)
> Do you have fbcon loaded? Can you provide a kernel log with the patch applied?
>

I have no idea how to tell whether fbcon is loaded. Any hint?
About the kernel log: I can't obtain one because my system freezes too soon to write the log. Is there any known way to force the system to start logging before udev starts?

Applying gart patch right now....
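
For the fbcon question, one possible check is to look for the module name in /proc/modules. The Python sketch below assumes fbcon was built as a module; a driver compiled into the kernel is not listed there, so a missing entry does not prove that fbcon is absent. The function names and the sample text are made up for illustration.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text.

    Each line starts with the module name, followed by size, use count, etc.
    """
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def read_loaded_modules(path="/proc/modules"):
    """Read the live module list (Linux only)."""
    with open(path) as f:
        return loaded_modules(f.read())


# Made-up sample in the /proc/modules layout, not taken from the reporter's system.
SAMPLE = """\
fbcon 35102 71 - Live 0xf8c2a000
radeon 566543 2 - Live 0xf8a3d000
ttm 40123 1 radeon, Live 0xf89f1000
drm 158920 3 radeon,ttm, Live 0xf8940000
"""
```

On a running system, `"fbcon" in read_loaded_modules()` would answer the question for the modular case.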

Revision history for this message
In , Dariem Pérez Herrera (dariemp) wrote :

gart fix + vram fix = blackout at udev starting....

Let's try only the gart fix....

Revision history for this message
In , Milan Plzik (emempi) wrote :

I experience the same problems: after radeon.ko is loaded and initialized, the screen turns black; starting the X server after that probably locks up the whole computer.

 I use 2.6.32 + drm-radeon-next; dmesg from boot log is at http://www.pastebin.ca/1704782 . I can be reached either by e-mail, or on IRC #radeon, handle 'mmp'

Revision history for this message
In , agd5f (agd5f) wrote :

(In reply to comment #13)
> gart fix + vram fix = blackout at udev starting....
>
> Let's try only the gart fix....
>

The gart fix alone won't work.

Revision history for this message
In , Dariem Pérez Herrera (dariemp) wrote :

(In reply to comment #15)
>
> The gart fix alone won't work.
>

As you said, it didn't work: X started but slow as before. What else can I do?

Revision history for this message
In , agd5f (agd5f) wrote :

Created an attachment (id=31792)
fix vram location

This patch along with the other two should do the trick.

Revision history for this message
In , Dariem Pérez Herrera (dariemp) wrote :

Created an attachment (id=31793)
Rejection applying the patch 31792: fix vram location

One hunk is rejected when applying the last patch. I'm attaching the resulting file from the rejection (.rej). Should I modify by hand the line causing the rejection?

Revision history for this message
In , agd5f (agd5f) wrote :

(In reply to comment #18)
> Created an attachment (id=31793) [details]
> Rejection applying the patch 31792: fix vram location
>
>
> One hunk is rejected when applying the last patch. I'm attaching the resulting
> file from the rejection (.rej). Should I modify by hand the line causing the
> rejection?
>

You can ignore that part.

Revision history for this message
In , Dariem Pérez Herrera (dariemp) wrote :

Well, things are definitely getting better after applying the three patches. Now X starts, but with some quirks. Fonts are distorted from the moment GDM appears. I manage to log into my session (the font distortion continues, totally unreadable). After a few seconds, the keyboard and mouse become unresponsive and the screen slowly begins to turn white. After a minute or two it is almost completely white. I'll attach new logs.

Revision history for this message
In , Dariem Pérez Herrera (dariemp) wrote :

Created an attachment (id=31796)
dmesg after applying the three patches

Revision history for this message
In , Dariem Pérez Herrera (dariemp) wrote :

Created an attachment (id=31797)
Xorg log after applying the three patches

Revision history for this message
In , Dariem Pérez Herrera (dariemp) wrote :

When I said things are getting better, I was referring to KMS. Seems it is active and Xorg now seems to detect it, but things still need some adjustments to work properly.

Revision history for this message
In , agd5f (agd5f) wrote :

Created an attachment (id=31798)
force 32 mb gart

Can you try this on top of the others?

Revision history for this message
In , agd5f (agd5f) wrote :

Created an attachment (id=31799)
setup gart like pre-kms

If the patch in comment 24 doesn't help, try adding this one.

Revision history for this message
In , Dariem Pérez Herrera (dariemp) wrote :

Well, the results of various experiments:

First (and this is before applying patches 31798 and 31799), the white screen does not always occur, but the font corruption and the freeze always do.

After applying patch 31798 (force 32 mb gart), nothing significantly different happened. Applying patch 31799 (setup gart like pre-kms) on top of the other four aggravated the problem: more corruption, not only in fonts but also in other areas such as window titles and the desktop background (plus the freeze). Applying patch 31799 without 31798, on top of the previous three, gives the same symptoms as before the last two patches: font corruption and a freeze.

I also noticed that with both patches applied, GDM showed some rectangles in the background and corruption similar to that in the fonts, but in the form of random thick, noisy horizontal bars in many colors.

Revision history for this message
In , Dariem Pérez Herrera (dariemp) wrote :

Any news about this issue?

Revision history for this message
shankao (shankao) wrote : Can't boot with last linux kernel 2.6.32.10

I can't boot with the just updated kernel (2.6.32.10) and have to select the previous version. The computer freezes and I can't see any activity, black screen included.

Revision history for this message
Knatchwa (rl-marr) wrote :

Same problem with the AMD64 version: unable to boot into 2.6.32.10. It gets as far as the login screen, and once even reached the desktop, but then freezes. The only workaround is restarting the computer and running the previous kernel.

Revision history for this message
shankao (shankao) wrote : Re: Can't boot with last linux kernels 2.6.32.10 and 2.6.32.11

@Knatchwa: I think that's a different bug that has to be filed on its own, as I can't even get to the login screen (gdm). My system freezes after a resolution change; the last activity that can be seen is the screen flip.

Where can I find other information / logs that early in the boot process?

summary: - Can't boot with last linux kernel 2.6.32.10
+ Can't boot with last linux kernels 2.6.32.10 and 2.6.32.11
Surbhi Palande (csurbhi)
Changed in linux (Ubuntu):
importance: Undecided → Medium
Surbhi Palande (csurbhi)
Changed in linux (Ubuntu):
status: New → Triaged
shankao (shankao)
description: updated
summary: - Can't boot with last linux kernels 2.6.32.10 and 2.6.32.11
+ Can't boot with last linux kernels from 2.6.32.10 to 2.6.32.12
Revision history for this message
Hew (hew) wrote : Re: Can't boot with last linux kernels from 2.6.32.10 to 2.6.32.12

I can't boot with > 2.6.32-9-generic either. I get the following error message which may or may not be related:

mount: mount point /proc/bus/usb does not exist
mountall: mount /proc/bus/usb [703] terminated with status 32
mountall: Filesystem could not be mounted: /proc/bus/usb

Revision history for this message
Hew (hew) wrote :

My problem turned out to be bug 507881

Revision history for this message
Chase Douglas (chasedouglas) wrote :

@Hew,

I don't think your issue is related.

Revision history for this message
Chase Douglas (chasedouglas) wrote :

@shankao:

Your issue seems to be related to bug 510524. Can you try upgrading your plymouth package to see if it helps?

Changed in linux (Ubuntu):
status: Triaged → Incomplete
Revision history for this message
shankao (shankao) wrote :

Upgraded to 0.8.0~-9ubuntu1 and still no change. I'm stuck with the .9 kernel :(

Changed in linux (Ubuntu):
status: Incomplete → New
Revision history for this message
Chase Douglas (chasedouglas) wrote :

@shankao:

Can you try removing plymouth and seeing if that helps?

Changed in linux (Ubuntu):
status: New → Incomplete
Revision history for this message
shankao (shankao) wrote :

Done, and nothing changes, apart from some extra complaints about the missing plymouth.
We can rule plymouth out as a possible cause.

Changed in linux (Ubuntu):
status: Incomplete → New
Revision history for this message
Chase Douglas (chasedouglas) wrote :

@shankao:

The 2.6.32-9 kernel is based on upstream 2.6.32.1. The -10 kernel is based on upstream 2.6.32.2. Can you try out the upstream kernels? You can find prebuilt kernel packages at:

http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.32.1/
http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.32.2/

Finding out whether the kernel boots for both of these kernels will help us determine where the problem came from.

Changed in linux (Ubuntu):
status: New → Incomplete
Revision history for this message
shankao (shankao) wrote :

I can boot with upstream 2.6.32.2

$ uname -a
Linux clotilde 2.6.32-02063202-generic #02063202 SMP Sat Dec 19 11:00:49 UTC 2009 i686 GNU/Linux
$

Does this mean that the problem is in the Ubuntu changes? Should I also test the 2.6.32.1 one (which I should be able to boot if Ubuntu's .9 is based on it), or is this test enough to spot the problem?
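
When comparing results it helps to tell from `uname -r` which kind of kernel was actually booted. The Python sketch below separates the two release strings that appear in this report. It assumes the mainline builds encode the upstream version as four zero-padded two-digit fields (inferred only from the single `2.6.32-02063202-generic` string above, so treat it as a guess); `classify_release` is a hypothetical helper.

```python
import re

# base version, middle field (Ubuntu ABI number or encoded mainline version), flavour
_RELEASE_RE = re.compile(r"^(\d+\.\d+\.\d+)-(\w+)-(\w+)$")


def classify_release(release):
    """Classify a `uname -r` string as an Ubuntu or a mainline test build.

    Assumption: an 8-digit middle field such as 02063202 means 2.6.32.2.
    """
    m = _RELEASE_RE.match(release)
    if not m:
        raise ValueError("unrecognized release string: %r" % release)
    base, middle, _flavour = m.groups()
    if len(middle) == 8 and middle.isdigit():
        parts = [int(middle[i:i + 2]) for i in range(0, 8, 2)]
        return ("mainline", ".".join(str(p) for p in parts))
    return ("ubuntu", "%s-%s" % (base, middle))
```

With the strings from this report, the reporter's usual kernel classifies as an Ubuntu build and the test kernel as mainline 2.6.32.2.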

Changed in linux (Ubuntu):
status: Incomplete → New
Revision history for this message
Chase Douglas (chasedouglas) wrote :

@shankao:

I think this is enough information for now. I will look into things a bit further.

Changed in linux (Ubuntu):
status: New → Triaged
tags: removed: needs-upstream-testing
tags: removed: i386
Revision history for this message
Chase Douglas (chasedouglas) wrote :

@shankao:

There are 173 changes between the -9 and the -10 kernels. Are you able to build and git bisect your way through a few Ubuntu test kernels to determine which change is giving you issues?
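
To give a rough idea of the effort involved: bisecting halves the candidate range with every test kernel, so 173 changes need about log2(173), i.e. at most eight, builds. The Python sketch below models this on a simplified linear history in which every commit after the first bad one is also bad. Real `git bisect` works on the commit graph and may take a slightly different number of steps; `first_bad` is a hypothetical stand-in for the build-and-boot loop.

```python
import math


def first_bad(n_commits, is_bad):
    """Binary-search the first bad commit among indices 0..n_commits-1.

    Assumes the last commit is known bad and that badness is monotonic.
    Returns (index_of_first_bad, number_of_tests_run).
    """
    lo, hi = 0, n_commits - 1  # invariant: the first bad index is in [lo, hi]
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1  # one kernel build + boot attempt
        if is_bad(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo, tests


# Upper bound on test kernels for the 173 changes mentioned above.
MAX_TESTS = math.ceil(math.log2(173))
```

Each `is_bad` call here corresponds to building one test kernel and trying to boot it.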

Changed in linux (Ubuntu):
status: Triaged → Incomplete
Revision history for this message
Chase Douglas (chasedouglas) wrote :

@shankao:

I think I just received an email from you saying your hardware has changed and that may be making a difference for you. If this is the case, are you still interested in following up on this issue with the same hardware?

Revision history for this message
shankao (shankao) wrote :

@chase: em... I think that's not me...

Yes, I'm still interested in getting the latest kernel working. I can run some tests as you suggest, but I need some guidance. Can you please point me to the information or procedures that I should follow?
Maybe this bug report is not the best place to continue this conversation, so we can move to e-mail if you want, with the permission of that other "shankao" who is contacting you :)

Changed in linux (Ubuntu):
status: Incomplete → New
Revision history for this message
Chase Douglas (chasedouglas) wrote :

I am working with shankao through email to proceed with a git bisect run. When we have results they will get posted here.

Changed in linux (Ubuntu):
assignee: nobody → Chase Douglas (chasedouglas)
status: New → Triaged
Changed in linux (Ubuntu):
status: Triaged → In Progress
Revision history for this message
Chase Douglas (chasedouglas) wrote :

I worked with shankao through private emails to bisect to the first commit that causes issues for him. Here's the output of the bisection (note that the tree is the ubuntu-lucid git tree hosted at kernel.ubuntu.com):

1369d982cea341524032a248b1f31832bb8c3eb6 is first bad commit
commit 1369d982cea341524032a248b1f31832bb8c3eb6
Author: Alex Deucher <email address hidden>
Date: Thu Dec 3 16:18:19 2009 -0500

    drm/radeon/kms: fix vram setup on rs600

    commit 722f29434e72188b2d20f9b41f4b5952073ed568 upstream.

    also fix up rs690 mem width.

    should fix fdo bug 25408

    Signed-off-by: Alex Deucher <email address hidden>
    Signed-off-by: Dave Airlie <email address hidden>
    Signed-off-by: Greg Kroah-Hartman <email address hidden>
    Signed-off-by: Andy Whitcroft <email address hidden>

:040000 040000 9bf615ab33dfcdcbd7aff53c51389705be954ba7 0f1789164ca98d79a64a7afbbbe843fa5803f02d M drivers

--

At first inspection, this commit does modify the driver for the graphics card present in shankao's system, so the bisection does seem to have found the culprit.

However, this change comes from the -stable tree upstream. Thus, it is odd that the 2.6.32.2 mainline kernel booted, but not our 2.6.32-10 kernel. The most likely scenario is some odd interaction between one of our patches and this change.

I am going to do some further investigating to try to figure out where to go from here.
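
For anyone collecting several bisection logs, the two interesting hashes (the first bad commit in the Ubuntu tree and the upstream commit it was cherry-picked from) can be pulled out mechanically. A small Python sketch, assuming the output has the same shape as the log pasted above; `parse_bisect_result` is a hypothetical helper, and the sample is copied from this comment.

```python
import re


def parse_bisect_result(text):
    """Extract the first bad commit and the referenced upstream commit, if any."""
    first_bad = re.search(
        r"^([0-9a-f]{40}) is (?:the )?first bad commit$", text, re.MULTILINE
    )
    upstream = re.search(
        r"^[ \t]*commit ([0-9a-f]{40}) upstream\.?$", text, re.MULTILINE
    )
    return {
        "first_bad": first_bad.group(1) if first_bad else None,
        "upstream": upstream.group(1) if upstream else None,
    }


# Excerpt of the bisection output quoted in this comment.
SAMPLE = """\
1369d982cea341524032a248b1f31832bb8c3eb6 is first bad commit
commit 1369d982cea341524032a248b1f31832bb8c3eb6
    drm/radeon/kms: fix vram setup on rs600

    commit 722f29434e72188b2d20f9b41f4b5952073ed568 upstream.
"""
```

The upstream hash is the one to quote when commenting on the freedesktop.org bug, since the Ubuntu tree hash means nothing there.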

Changed in linux:
status: Unknown → Confirmed
Revision history for this message
Chase Douglas (chasedouglas) wrote :

@shankao:

Can you test the latest kernel I've built: http://people.canonical.com/~cndougla/505808/vram_cocktail/linux-image-2.6.32-13-generic_2.6.32-13.19_i386.deb. It contains three new patches:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=4f15d24adb39803ba7b9363d0bb5dd714a6706f6
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=64bffd03756249e11b8651ccf33ac3a50a93ed4c
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=22dd50133ab7548adb23e86c302d6e8b75817e8c

The first and third patches are mentioned in the upstream freedesktop.org bug. The second patch is needed to cleanly apply the third, and is directly related to the third. These patches do not completely fix all of the upstream bug reporter's issues, but it is unclear if those issues are caused by the same bug or some other unrelated bug. Thus, I think testing these three patches is a good first step.

shankao (shankao)
summary: - Can't boot with last linux kernels from 2.6.32.10 to 2.6.32.12
+ Can't boot with last linux kernels from 2.6.32.10 to 2.6.32.13
Revision history for this message
shankao (shankao) wrote : Re: Can't boot with last linux kernels from 2.6.32.10 to 2.6.32.13

I made 3 tries with this kernel with mixed success:

-> In the first one, I got to the login screen, but with visible corruption (the mouse cursor is a square of random dots, and the bottom options are also really bad). I can log in; X freezes, but I can still switch to another VT.

-> In the second, the mouse pointer had completely disappeared. I could log in, but the system froze after that. This time I couldn't switch to another VT.

-> The third time I got an oops after reaching the login screen :(
Where is this oops output located? Would it help you if I sent it?

Revision history for this message
Chase Douglas (chasedouglas) wrote :

@shankao:

That's rather unfortunate. Can you try the latest mainline build of 2.6.33-rc*? If it fixes your issue, then maybe there's just a few more patches we need. If not, then you'll likely have to comment on the upstream bug report to try to get it resolved.

We can also look into removing the commit where things start to go bad, but I'm afraid of breaking things for others. I think we'll cross that bridge if needed when we find out the results of the latest mainline kernel.

Revision history for this message
In , shankao (shankao) wrote :

Hi, I'm affected by this bug too. I have been experiencing black screens for several kernel versions now (kernels based on this code) in my current distro, and after trying the patches presented here I get mixed results.

-> In the first try, I got to the login screen, but with visible corruption (the mouse cursor is a square of random dots, and the bottom options are also really bad). I can log in; X freezes, but I can still switch to another VT.

-> In the second, the mouse pointer had completely disappeared. I could log in, but the system froze after that. This time I couldn't switch to another VT.

-> The third time I got an oops after reaching the login screen :(
Where is this oops output located? Would it help you if I sent it?

My original bug report is in Ubuntu's launchpad at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/505808

Revision history for this message
In , Pauli (paniemin) wrote :

Chase Douglas wrote on 2010-02-18:
"I worked with shankao through private emails to bisect to the first commit that causes issues for him. Here's the output of the bisection (note that the tree is the ubuntu-lucid git tree hosted at kernel.ubuntu.com):

1369d982cea341524032a248b1f31832bb8c3eb6 is first bad commit
commit 1369d982cea341524032a248b1f31832bb8c3eb6
Author: Alex Deucher <email address hidden>
Date: Thu Dec 3 16:18:19 2009 -0500

    drm/radeon/kms: fix vram setup on rs600

    commit 722f29434e72188b2d20f9b41f4b5952073ed568 upstream.

    also fix up rs690 mem width.

    should fix fdo bug 25408

    Signed-off-by: Alex Deucher <email address hidden>
    Signed-off-by: Dave Airlie <email address hidden>
    Signed-off-by: Greg Kroah-Hartman <email address hidden>
    Signed-off-by: Andy Whitcroft <email address hidden>

:040000 040000 9bf615ab33dfcdcbd7aff53c51389705be954ba7 0f1789164ca98d79a64a7afbbbe843fa5803f02d M drivers

--

At first inspection, this commit does modify the driver for the graphics card present in shankao's system, so the bisection does seem to have found the culprit.

However, this change comes from the -stable tree upstream. Thus, it is odd that the 2.6.32.2 mainline kernel booted, but not our 2.6.32-10 kernel. The most likely scenario is some odd interaction between one of our patches and this change.

I am going to do some further investigating to try to figure out where to go from here."

Revision history for this message
shankao (shankao) wrote : Re: Can't boot with last linux kernels from 2.6.32.10 to 2.6.32.13

Sure, I'm going to try the kernel in http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.33-rc8/ and will report any results here.

summary: - Can't boot with last linux kernels from 2.6.32.10 to 2.6.32.13
+ Can't boot with last linux kernels from 2.6.32.10 to 2.6.32.14
Revision history for this message
shankao (shankao) wrote : Re: Can't boot with last linux kernels from 2.6.32.10 to 2.6.32.14

I have tried the v2.6.33-rc8 kernel and I still get a black screen. :(

Revision history for this message
Chase Douglas (chasedouglas) wrote :

@shankao:

Unfortunately this seems to be a still present bug. You will need to continue working with upstream through the freedesktop.org bug report to find a resolution.

Changed in linux (Ubuntu):
assignee: Chase Douglas (chasedouglas) → nobody
status: In Progress → Triaged
Revision history for this message
In , Dariem Pérez Herrera (dariemp) wrote :

I've recently tried kernel 2.6.33. I must say that I've been seeing constant regressions in RS600 support over the last several kernels; each release gets worse. KMS still blocks graphics with a black screen (so it seems none of these patches have made it into the mainline kernel), and non-KMS keeps getting slower (I'm continuously losing FPS: at some point I had 320 fps, with kernel 2.6.32 I have 220 fps, and now with kernel 2.6.33 I get 190 fps). Please, is anybody working on this? Some time ago I offered to test any new patch. I reaffirm that: I'm willing to help in any way I can.

Revision history for this message
In , Dariem Pérez Herrera (dariemp) wrote :

As additional info I must say I'm using latest libdrm, latest xf86-video-ati and latest mesa from git repositories.

Revision history for this message
HAL (enterprise-nx) wrote : Re: Can't boot with last linux kernels from 2.6.32.10 to 2.6.32.14

This may be a problem related to some Mobility ATI video cards. I am also affected.

Revision history for this message
In , agd5f (agd5f) wrote :

I wonder if this is an MSI chipset/BIOS issue. For reference:
http://marc.info/?l=dri-devel&m=126926011226719&w=2
https://bugzilla.kernel.org/show_bug.cgi?id=15287

Perhaps your system needs a quirk for the graphics bridge. Does booting with pci=nomsi help?
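
When testing a boot parameter like this, it is worth confirming from /proc/cmdline that the option really reached the kernel. A Python sketch of such a check, assuming the usual space-separated `key=value` layout; `parse_cmdline` and `has_pci_option` are hypothetical helpers, and the sample line is the ProcCmdLine from the Launchpad report with `pci=nomsi` appended for illustration. If a key is repeated, this sketch keeps only the last value.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a dict; bare flags map to True."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")  # split on the first '=' only
        params[key] = value if sep else True
    return params


def has_pci_option(cmdline, option):
    """True if `option` appears in the comma-separated pci= parameter."""
    value = parse_cmdline(cmdline).get("pci")
    return isinstance(value, str) and option in value.split(",")


# ProcCmdLine from the report, with pci=nomsi added by hand for this example.
SAMPLE = (
    "BOOT_IMAGE=/boot/vmlinuz-2.6.32-9-generic "
    "root=UUID=f1e83727-9d0c-4969-8f3c-f5dd6fa01dbc ro quiet splash pci=nomsi"
)
```

On a live system the string to check would come from reading /proc/cmdline.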

Revision history for this message
In , shankao (shankao) wrote :

I have just tried the pci=nomsi parameter in the boot options but it still doesn't work :(

If there's something else that I can do in order to debug this problem, I'm willing to help.

Revision history for this message
In , Glisse (glisse) wrote :

Created an attachment (id=35157)
Fix rs600 gart tlb flush

Can you confirm that the attached patch fixes the issue for you? With it applied, rs600 seems to work properly here.

Revision history for this message
In , Dariem Pérez Herrera (dariemp) wrote :

(In reply to comment #34)
> Created an attachment (id=35157) [details]
> Fix rs600 gart tlb flush
>
> Can you confirm that the attached patch fix the issue for you, here with it
> rs600 seems to work properly.

Here are the results:

I applied only this last patch to kernel 2.6.33 and got a black screen just before gdm started. Then I tried to apply all the available patches published here on bugzilla; almost all of them failed to apply, except 0005-setup_gart_like_pre_kms.patch and yours, but again I got the black screen at the same moment. Then I went back to kernel 2.6.32, applied all the patches, and it worked. I'm using libdrm, mesa and xf86-video-ati from git head.

Which kernel are you using? I would like to test your same environment, because after all, KMS is working but I only get 175 fps with glxgears, and 2D acceleration seems quite slow.

Revision history for this message
In , Dariem Pérez Herrera (dariemp) wrote :

Created an attachment (id=35202)
Xorg log with KMS working

Here my Xorg log if you want to check it

Revision history for this message
In , Dariem Pérez Herrera (dariemp) wrote :

Created an attachment (id=35204)
dmesg with KMS working

Also you can check my dmesg if you want.

Revision history for this message
In , Dariem Pérez Herrera (dariemp) wrote :

I forgot to mention that using only the last patch on kernel 2.6.32 didn't work either, I tried that before applying all other patches.

Revision history for this message
In , Dariem Pérez Herrera (dariemp) wrote :

Ok, tuxonice variant of kernel 2.6.34-rc5 was already patched (probably the vanilla too) and it worked just fine. Issue with speed persists but it's not so critical for me, at least right now (175 fps). I hope it will be improved in the future. I think we can declare this bug fixed.

Revision history for this message
In , Chase Douglas (chasedouglas) wrote :

Can someone detail exactly which patches are needed to fix this issue? I would like to ensure they are backported into Ubuntu 10.04 LTS. It would be great if they were sent to the stable queue as well.

Thanks

Revision history for this message
In , Dariem Pérez Herrera (dariemp) wrote :

(In reply to comment #41)
> Can someone detail exactly which patches are needed to fix this issue? I would
> like to ensure they are backported into Ubuntu 10.04 LTS. It would be great if
> they were sent to the stable queue as well.
>
> Thanks

I think *maybe* your best choice is to replace the file drivers/gpu/drm/radeon/rs600.c in the current Ubuntu kernel with the one that "just works" from 2.6.34-rc5. Somebody correct me if I'm saying something monstrous.

Revision history for this message
In , agd5f (agd5f) wrote :

(In reply to comment #41)
> Can someone detail exactly which patches are needed to fix this issue? I would
> like to ensure they are backported into Ubuntu 10.04 LTS. It would be great if
> they were sent to the stable queue as well.

The patch from comment 34 should be all that's needed for 10.04, as 10.04 uses a 2.6.33 drm for the most part. That patch is in 2.6.34 already and should hit stable as well:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=30f69f3fb20bd719b5e1bf879339914063d38f47

In , Chase Douglas (chasedouglas) wrote :

(In reply to comment #43)
> (In reply to comment #41)
> > Can someone detail exactly which patches are needed to fix this issue? I would
> > like to ensure they are backported into Ubuntu 10.04 LTS. It would be great if
> > they were sent to the stable queue as well.
>
> The patch from comment 34 should be all that's needed for 10.04 as 10.04 uses a
> 2.6.33 drm for the most part. That patch is in 2.6.34 already and should
> hit stable as well:
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=30f69f3fb20bd719b5e1bf879339914063d38f47

Thanks! I'll make a test kernel for a user who is seeing this issue on Ubuntu so we can verify it is all that is needed.

shankao (shankao) wrote :

There have been some advancements recently regarding this bug at freedesktop, and some patches are available.

Unfortunately, they seem to be only for the 2.6.34 kernel. Could they be backported so that they can be used in lucid?

summary: - Can't boot with last linux kernels from 2.6.32.10 to 2.6.32.14
+ Can't boot with last linux kernels from 2.6.32.10
Chase Douglas (chasedouglas) wrote :

It's been suggested in the upstream bug that this patch will fix things:

http://git.kernel.org/?p=linux/kernel/git/stable/stable-queue.git;a=blob;f=queue-2.6.33/drm-radeon-kms-fix-rs600-tlb-flush.patch;h=ae28c673b414ac50a93a9d4e41d35e787b0e6a94;hb=HEAD

Note that the patch is already in the stable queue. When the next stable drop occurs it will include this patch. That will likely be included in the next Ubuntu kernel release as well.

I've uploaded a test kernel to http://people.canonical.com/~cndougla/505808/tlb-flush/. Please test it out to ensure it fixes this issue.

Thanks

Changed in linux (Ubuntu):
assignee: nobody → Chase Douglas (chasedouglas)
status: Triaged → In Progress
Chase Douglas (chasedouglas) wrote :

Also, please test booting with radeon.modeset=0 on a stock kernel. Since this patch won't get into the release image, if we find out that this allows booting we can add it to the release notes.

Thanks
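
For anyone trying the suggestion above: `radeon.modeset=0` is a kernel boot parameter that turns off kernel mode setting for the radeon driver. Below is a minimal Python sketch of what the edited command line should look like, starting from the ProcCmdLine value in this bug's description. The helper name `with_kernel_param` is hypothetical, and the code only builds a string; it does not touch the boot loader.

```python
def with_kernel_param(cmdline, param):
    """Return cmdline with param set, replacing any existing value for the same key."""
    key = param.split("=", 1)[0]
    # Drop any token that already sets this key, then append the new value.
    kept = [tok for tok in cmdline.split() if tok.split("=", 1)[0] != key]
    return " ".join(kept + [param])


# Command line taken from the ProcCmdLine field of this bug report.
stock = ("BOOT_IMAGE=/boot/vmlinuz-2.6.32-9-generic "
         "root=UUID=f1e83727-9d0c-4969-8f3c-f5dd6fa01dbc ro quiet splash")

print(with_kernel_param(stock, "radeon.modeset=0"))
```

For a one-off test the parameter is normally typed onto the kernel line in the boot loader's edit screen. After booting, `cat /proc/cmdline` shows whether it was picked up.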

shankao (shankao) wrote :

I have tested that patched kernel but it still doesn't boot. I tried the radeon.modeset=0 setting with similar results.

I'm currently using an updated lucid with the last karmic kernel (as I reinstalled karmic and updated from there).

Are there any logs or other information that we can use to address this problem?

Chase Douglas (chasedouglas) wrote :

@shankao:

Can you test the latest Ubuntu mainline kernels? They can be found at:

http://kernel.ubuntu.com/~kernel-ppa/mainline/

Thanks

shankao (shankao) wrote :

I have just tested two of the latest kernels on that page:

v2.6.33.3-lucid/ - Does not work. I got stuck again at a black screen.

v2.6.34-rc5-lucid/ - Works and, as far as I can tell, without any graphical problems, either during boot or in normal usage (I'm typing this while using that kernel).
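
The two results above bracket where the behaviour changes. Below is a minimal Python sketch, under the assumption that the directory names can be ordered by their version numbers, that picks out the newest failing and the oldest working kernel. The names `version_key`, `newest_bad` and `oldest_good` are hypothetical. It only orders version names and says nothing about which commit is responsible, particularly since 2.6.33.3 is a stable-branch release and 2.6.34-rc5 is a mainline release candidate.

```python
import re


def version_key(name):
    """Sort key for names like 'v2.6.33.3-lucid' or 'v2.6.34-rc5-lucid'."""
    match = re.match(r"v?(\d+(?:\.\d+)*)(?:-rc(\d+))?", name)
    if match is None:
        raise ValueError(f"unrecognised kernel version: {name!r}")
    numbers = tuple(int(part) for part in match.group(1).split("."))
    # A release candidate sorts before the final release of the same version.
    rc = int(match.group(2)) if match.group(2) else float("inf")
    return (numbers, rc)


# Results reported in this comment: True means the kernel booted.
results = {
    "v2.6.33.3-lucid": False,
    "v2.6.34-rc5-lucid": True,
}

newest_bad = max((k for k, ok in results.items() if not ok), key=version_key)
oldest_good = min((k for k, ok in results.items() if ok), key=version_key)
print(f"last failing: {newest_bad}, first working: {oldest_good}")
```

Adding further test results to `results` narrows the window in the same way.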

Chase Douglas (chasedouglas) wrote :

@shankao:

I'm really sorry it's been a while since I've gotten back to you. Been busy!

I've uploaded a new test kernel to http://kernel.ubuntu.com/~cndougla/505808/vram2. It contains three patches that affect the vram locations and tlb flushing. I *think* it will resolve your issue. Please test it out and report back.

Thanks

shankao (shankao) wrote :

Hi! Don't worry, I suppose that finishing a release is not an easy task.

Unfortunately, that kernel still doesn't boot correctly. It keeps showing only a black screen (doesn't show the ubuntu logo / other graphical stuff), although I can see hard disk activity for some moments.

I tried switching to another tty, but with no visible results. I could maybe try connecting over SSH, if that would help with some kind of testing or logging.
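
If the machine does stay reachable over SSH while the screen is black, the kernel log is a natural first thing to capture. Below is a minimal Python sketch, assuming the log has already been saved as text (for example the output of `dmesg`), that keeps only the lines mentioning the radeon or drm drivers. The function name `gpu_lines` is hypothetical, and the pattern is an assumption about which messages are relevant, not something specified in this thread.

```python
import re

# Assumed pattern: messages from the radeon driver or the DRM subsystem.
GPU_PATTERN = re.compile(r"radeon|drm", re.IGNORECASE)


def gpu_lines(log_text):
    """Return the lines of log_text that mention radeon or drm."""
    return [line for line in log_text.splitlines() if GPU_PATTERN.search(line)]
```

Attaching the filtered lines from a failing boot, alongside the same output from a working kernel, would give the developers something to compare.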

shankao (shankao) wrote :

The last kernel for maverick in the ubuntu repositories works! (Linux clotilde 2.6.34-2-generic #9-Ubuntu SMP Thu May 13 23:47:25 UTC 2010 i686 GNU/Linux)
I suppose that we can mark this bug as fixed but, as it's the new 2.6.34, it may still be missing some Ubuntu-specific patches that could make it fail to boot when applied in a later package release. Can anybody confirm this?

Revision history for this message
Chase Douglas (chasedouglas) wrote :

@shankao:

It's generally frowned upon to use anything other than the official release kernels. We won't be able to support you if you encounter any other issues. That said, we have no problem with you running the kernel as long as you understand the risks associated with it.

Unfortunately, I'm not really sure where to go from here. I've tried throwing in every last change that made sense to me. We would need to do quite a bit more testing to figure out exactly where the resolution lies, and I would understand if you would rather spend your time running your system, even with an unsupported kernel. What do you think?

Thanks

shankao (shankao) wrote :

I'm not sure if we have understood each other this time. What I said in comment #34 is that I have tried the new ubuntu kernel (the official one) that came in maverick some days ago (Linux clotilde 2.6.34-2-generic #9-Ubuntu SMP Thu May 13 23:47:25 UTC 2010 i686 GNU/Linux) and it is working correctly.
So I'm using a supported kernel and the bug can be marked as fixed :)

Chase Douglas (chasedouglas) wrote :

Until 10.10 is released, the Maverick kernel isn't "supported". Of course, we love people who test out kernels for the next release :), it's how we find and try to resolve bugs. But if you need official support, we can't provide any until the kernel is officially released as part of Ubuntu 10.10.

shankao (shankao) wrote :

I really love using the development releases and reporting bugs; it's the way to make Ubuntu a better OS, and I can run it without problems on this computer, so there is no need for a stable and supported release.
I'm marking the bug as fix released. Thanks Chase, it's been a pleasure to have worked on the bug with you!

Changed in linux (Ubuntu):
status: In Progress → Fix Released
shankao (shankao) wrote :

Is this bugfix going to be backported to the kernel in lucid? Should I ask explicitly for it?

Chase Douglas (chasedouglas) wrote :

shankao,

The issue is identifying the fix. We've gone through many iterations of what looked like fixes for this kernel without any success, and I haven't seen any duplicate bugs from others encountering this issue. Thus, I haven't been working on this bug, in favor of higher-priority issues.

If we can find a patch or series of patches that resolves this issue, we certainly can backport the fix to Lucid.

Thanks

Changed in linux:
importance: Unknown → High
status: Confirmed → Fix Released
Changed in linux:
importance: High → Unknown
Changed in linux:
importance: Unknown → High