[Radeon HD 5650 and 5470] Kernel BUG during recovery boot and in normal boot (Hybrid graphics)

Bug #727620 reported by afoglia
This bug affects 59 people
Affects                          Status        Importance  Assigned to   Milestone
Linux                            Confirmed     Medium
xserver-xorg-driver-ati          Fix Released  Critical
linux (Ubuntu)                   Fix Released  High        Seth Forshee
xserver-xorg-video-ati (Ubuntu)  Invalid       Wishlist    Unassigned

Bug Description

[Problem]
On hybrid graphics hardware with this ATI chip and another (e.g. Intel), a failure occurs resulting in a black screen and errors from the radeon kernel module, as shown below.

[Cause]
From upstream developer:

"The switcheroo code needs more work to switch properly on some systems it seems. There are a set acpi methods required to activate/deactivate the respective gpus. The drivers need to load and initialize active hw. If the hw is not active when the driver loads, then the hw is not set up properly and it won't work. Probably some ordering issues in how the switcheroo acpi methods are called."

[Workarounds]
Several options:

1. If your BIOS includes functionality to disable the Intel card, use BIOS settings to select which chip to load.

2. Disable KMS by adding `radeon.modeset=0` in the boot line. Note that the default radeon gallium driver only works with KMS, so YMMV.
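
To make workaround 2 persistent, the parameter is normally added to the GRUB defaults file. A minimal Python sketch, assuming a stock `/etc/default/grub` where `GRUB_CMDLINE_LINUX_DEFAULT` is set on a single double-quoted line; the helper name `add_kernel_param` is hypothetical, and `sudo update-grub` still has to be run after writing the file back:

```python
import re


def add_kernel_param(grub_text, param="radeon.modeset=0"):
    """Return grub_text with param appended to GRUB_CMDLINE_LINUX_DEFAULT.

    Assumes the variable sits on one double-quoted line, as in a stock
    /etc/default/grub. The text is returned unchanged if the parameter
    is already present or the variable is not found.
    """
    pattern = re.compile(r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")$', re.M)

    def repl(match):
        params = match.group(2).split()
        if param not in params:
            params.append(param)
        return match.group(1) + " ".join(params) + match.group(3)

    return pattern.sub(repl, grub_text)


if __name__ == "__main__":
    sample = 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n'
    print(add_kernel_param(sample))
```

The function only transforms text, so it can be tried on a copy of the file before touching the real one.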

[Original Report]
I'm running natty, and ever since the upgrade to 6.14.0 I've been unable to boot consistently. After some discussion in the forums, I tried repeatedly to boot into recovery mode. In most cases, I got a black screen. One time though, when I was able to successfully increase the brightness, I saw some errors from the radeon module. I took a photo (available at http://i.imgur.com/P0bQ0.jpg), and here's the stack and call trace, as best as I can read it:

Stack:
 ffff880149eb8000 ffff880149eb8000 0000000000000011 0000000000000911
 00000000fffffff4 ffff88014b6c7800 ffff88014b0f7b58 ffffffffa022aba0
 ffff8801460f7b58 ffff880149eb8000 0000000000000000 0000000000410028
Call Trace:
 [<ffffffffa022aba0>] evergreen_cp_resume+0x3a0/0x630 [radeon]
 [<ffffffffa022c8b7>] evergreen_startup+0x157/0x260 [radeon]
 [<ffffffffa01fe8a0>] ? r600_pcie_gart_init+0x60/0x70 [radeon]
 [<ffffffffa022dbec>] evergreen_init+0x1ac/0x2d0 [radeon]
 [<ffffffffa01a5a69>] radeon_device_init+0x409/0x490 [radeon]
 [<ffffffffa01a7142>] radeon_driver_load_kms+0xb2/0x1a0 [radeon]
 [<ffffffffa007fb2e>] drm_get_pci_dev+0x18e/0x300 [drm]
 [<ffffffff8115426f>] ? kmem_cache_alloc_trace+0xff/0x120
 [<ffffffffa023790e>] radeon_pci_probe+0xb2/0xba [radeon]
 [<ffffffff812fea7f>] local_pci_probe+0x5f/0xd0
 [<ffffffff81300369>] pci_device_probe+0x119/0x120
 [<ffffffff813b8eca>] ? driver_sysfs_add+0x7a/0xb0
 [<ffffffff813b8ff8>] really_probe+0x68/0x190
 [<ffffffff813b9305>] driver_probe_device+0x45/0x70
 [<ffffffff813b93db>] __driver_attach+0xab/0xb0
 [<ffffffff813b9330>] ? __driver_attach+0x0/0xb0
 [<ffffffff813b817e>] bus_for_each_dev+0x5e/0x90
 [<ffffffff813b8e4e>] driver_attach+0x1e/0x20
 [<ffffffff813b89b5>] bus_add_driver+0xc5/0x280
 [<ffffffffa0013000>] ? radeon_init+0x0/0x1000 [radeon]
 [<ffffffff813b9676>] driver_register+0x76/0x140
 [<ffffffffa0013000>] ? radeon_init+0x0/0x1000 [radeon]
 [<ffffffff812ff126>] __pci_register_driver+0x56/0xd0
 [<ffffffffa0080044>] drm_pci_init+0xe4/0xf0 [drm]
 [<ffffffff815bf36e>] ? mutex_lock+0x1e/0x50
 [<ffffffffa0013000>] ? radeon_init+0x0/0x1000 [radeon]
 [<ffffffffa0077688>] drm_init+0x58/0x70 [drm]
 [<ffffffffa00130c4>] radeon_init+0xc4/0x1000 [radeon]
 [<ffffffff81002195>] do_one_initcall+0x45/0x190
 [<ffffffff810a4573>] sys_init_module+0x103/0x260
 [<ffffffff8100c002>] system_call_fastpath+0x16/0x1b
Code: 00 45 8b 84 24 e4 0a 00 00 45 85 c0 0f 8e c7 09 00 00 41 8b 84 24 d4 0a 00 00 89 c2 83 c0 01 40 c1 e2 02 49 03 94 24 c8 0a 00 00 <c7> 02 00 44 05 c0 41 8b 94 24 e4 0a 00 00 41 23 84 24 f4 0a 00
RIP [<ffffffffa0227ad7>] evergreen_cp_start+0x57/0xc80 [radeon]
 RSP <ffff88014b0f7af8>
CR2: ffffc90411ce1ffc
---[ end trace 37702c56f2e23247 ]---
udevd-work[94]: '/sbin/modprobe -bv pci:v00001002d000068C1sv0000103Csd00001436bc03sc00i00' unexpected exit with status 0x0009
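
The argument udev passed to modprobe in the last line is a PCI modalias, which identifies the device the driver failed on. A small Python sketch that splits it into its fields, assuming the usual `pci:v…d…sv…sd…bc…sc…i…` layout with upper-case hex digits; the function name `decode_pci_modalias` is hypothetical:

```python
import re

# Fixed-width hex fields: vendor, device, subsystem vendor/device,
# base class, sub class and programming interface.
MODALIAS_RE = re.compile(
    r"pci:v(?P<vendor>[0-9A-F]{8})d(?P<device>[0-9A-F]{8})"
    r"sv(?P<subvendor>[0-9A-F]{8})sd(?P<subdevice>[0-9A-F]{8})"
    r"bc(?P<base_class>[0-9A-F]{2})sc(?P<sub_class>[0-9A-F]{2})"
    r"i(?P<prog_if>[0-9A-F]{2})"
)


def decode_pci_modalias(alias):
    """Return the modalias fields as a dict of integers."""
    match = MODALIAS_RE.fullmatch(alias)
    if match is None:
        raise ValueError("not a PCI modalias: %r" % alias)
    return {name: int(value, 16) for name, value in match.groupdict().items()}


if __name__ == "__main__":
    alias = "pci:v00001002d000068C1sv0000103Csd00001436bc03sc00i00"
    for name, value in decode_pci_modalias(alias).items():
        print("%-10s 0x%04X" % (name, value))
```

For the alias above this gives vendor 0x1002 (ATI/AMD), subsystem vendor 0x103C (Hewlett-Packard) and base class 0x03 (display controller), which matches the discrete Radeon in an HP laptop.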

There is also some register info dumped at the top of the screen visible in the photo, that I didn't bother to write, as I'd most certainly get something wrong.

afoglia (afoglia)
tags: added: natty
afoglia (afoglia) wrote :

I forgot to mention, my computer is an HP Envy 14, so I have the discrete ATI card, and also integrated graphics from the core i5 (which uses the i915 driver). Just in case it's some interaction between the two that causes the crash.

Vangel Ajanovski (ajanovski) wrote :

I also have the same problem. Sometimes it takes just 1-2 resets to be able to boot, but this time I reset the computer 8 times (2 with full power off) before it finally booted. I think that it fails right before showing the Ubuntu logo and progress bar, when switching from console to graphics mode.

My computer is an HP Pavilion dm4t-1100 with ATI 5470HD and Intel.

summary: - [Radeon HD 5650] Driver crash during recovery boot
+ [Radeon HD 5650 and 5470] Driver crash during recovery boot and in
+ normal boot
Bryce Harrington (bryce) wrote : Re: [Radeon HD 5650 and 5470] Driver crash during recovery boot and in normal boot

Hi afoglia,

Does it resolve if you downgrade to an older version of -ati?

You can get older .deb files of the driver from Launchpad here:

https://launchpad.net/ubuntu/+source/xserver-xorg-video-ati/+publishinghistory

Click on the link under Version of the version you want to test, then under Builds click the link for your hardware architecture, then grab the -ati and -radeon .debs and install them.

If that doesn't do it, then next guess would be you are having a kernel issue - if you still have a prior kernel you can try booting it (hold down the left shift key during boot to bring up the menu.)

Changed in xserver-xorg-video-ati (Ubuntu):
status: New → Incomplete
Vangel Ajanovski (ajanovski) wrote :

In my situation this is something I found in the logs.
I analyzed the logs and compared them to a normal log. Besides the similar stack dump, I see one significant difference in the problematic log:

[drm] radeon: 3584M of VRAM memory ready
[drm] radeon: 512M of GTT memory ready.

whereas in the normal log:
[drm] radeon: 512M of VRAM memory ready
[drm] radeon: 512M of GTT memory ready.

The laptop has only 4GB RAM and ATI is supposed to have only 512MB.

I have attached the relevant part of the log
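
A quick way to check other logs for the same symptom is to extract the VRAM size the driver reports. A minimal Python sketch, assuming the message format quoted above and taking the expected size (512 MiB here) as a parameter; both function names are hypothetical:

```python
import re

# Matches lines such as "[drm] radeon: 3584M of VRAM memory ready".
VRAM_RE = re.compile(r"\[drm\] radeon: (\d+)M of VRAM memory ready")


def reported_vram_mib(log_text):
    """Return every VRAM size (in MiB) that radeon reported in the log."""
    return [int(size) for size in VRAM_RE.findall(log_text)]


def vram_looks_wrong(log_text, expected_mib=512):
    """True if any reported VRAM size differs from the expected one."""
    return any(size != expected_mib for size in reported_vram_mib(log_text))


if __name__ == "__main__":
    bad = "[drm] radeon: 3584M of VRAM memory ready\n[drm] radeon: 512M of GTT memory ready.\n"
    good = "[drm] radeon: 512M of VRAM memory ready\n[drm] radeon: 512M of GTT memory ready.\n"
    print(vram_looks_wrong(bad), vram_looks_wrong(good))
```

Whether the wrong size is a cause or just another symptom of the card being initialized while inactive is not established by this check.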

Johan Fornander (fornander-johan) wrote :

This is my complete kern.log file showing the crash being identical to that of afoglia and Vangel Ajanovski.

afoglia (afoglia) wrote :

How do I install the old versions? I tried installing 6.13.2+git20110124.fadee040-0ubuntu4 and ...ubuntu3 but I got dpkg dependency problems stating that they depend on xorg-video-abi-9.0. apt-get can't find that package. (It can find xorg-video-abi-9, but selects xserver-xorg-core instead, and that's at the newest version in natty.)

I also tried the maverick version on that page (6.13.1-1ubuntu5) and again, dpkg has dependency issues, this time the required package is xorg-video-abi-8.0, and that this version of xserver-xorg-video-(ati|radeon) provides xserver-xorg-video-8 which xserver-xorg-core breaks.

If this helps, I did not have these problems under maverick, and while I had minor problems in natty a few weeks ago, they got noticeably, drastically worse when the 6.14 drivers were released.

afoglia (afoglia) wrote :

I tried Bryce's second suggestion of using old kernels. I have two previous versions of 2.6.38 installed, 2.6.38-3-generic and 2.6.38-4-generic. I booted each into recovery and normal mode 4 times, for a total of 16 boots. Here's the number of times the boot was a success, where I either got to the recovery boot menu or gdm, (regardless of whether the screen brightness had to be manually increased from 0, or if the plymouth boot screen displayed).

2.6.38-4-generic, normal: 1 success, 3 failures
2.6.38-4-generic, recovery: 4 successes
2.6.38-3-generic, normal: 4 successes
2.6.38-3-generic, recovery: 3 successes, 1 failure
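
The counts above can be tallied per kernel. A small Python sketch using exactly those numbers; the helper name is hypothetical, and with only 8 boots per kernel the rates are indicative at best:

```python
# (kernel, boot mode) -> (successes, failures), copied from the list above.
RESULTS = {
    ("2.6.38-4-generic", "normal"): (1, 3),
    ("2.6.38-4-generic", "recovery"): (4, 0),
    ("2.6.38-3-generic", "normal"): (4, 0),
    ("2.6.38-3-generic", "recovery"): (3, 1),
}


def success_rate_by_kernel(results):
    """Combine both boot modes and return kernel -> fraction of good boots."""
    totals = {}
    for (kernel, _mode), (ok, failed) in results.items():
        good, boots = totals.get(kernel, (0, 0))
        totals[kernel] = (good + ok, boots + ok + failed)
    return {kernel: good / boots for kernel, (good, boots) in totals.items()}


if __name__ == "__main__":
    for kernel, rate in sorted(success_rate_by_kernel(RESULTS).items()):
        print("%s: %.1f%% successful" % (kernel, 100 * rate))
```

This works out to 7 of 8 successful boots on 2.6.38-3 and 5 of 8 on 2.6.38-4.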

At no time did I see a stack trace like the one I posted, but I've only seen that in recovery mode. (Would it be written somewhere persistent between boots? It's not in /var/log/syslog.)

I took more notes on the failures. They're pretty vague and qualitative, but have slightly more detail of what each boot was like.

Bryce Harrington (bryce) wrote :

Okay, thanks for the testing. That suggests a regression in the kernel between 2.6.38-3 and -4 (the one failure with -3 may be a random outlier).

Bryce Harrington (bryce) wrote : Re: [Radeon HD 5650 and 5470] Driver crash during recovery boot and in normal boot (Regression from 2.6.38-3 to -4)

Even though this seems to be pinpointed to the kernel, I'll leave the X task open for now so we can keep track of the bug's progress from the X end.

summary: [Radeon HD 5650 and 5470] Driver crash during recovery boot and in
- normal boot
+ normal boot (Regression from 2.6.38-3 to -4)
Changed in xserver-xorg-video-ati (Ubuntu):
importance: Undecided → High
status: Incomplete → Triaged
kolio kostadinov (koliokostadinov) wrote :

Hi guys,
I have the same problem. I like this OS, Ubuntu, very much, but why has such a good system not resolved this problem for such a long time? I see posts from the last 2 or 3 years. It's strange to me. I don't want to experiment with my PC. I tried, but it's always a crash, black screen, red screen... I'll wait for a new, better release, or please help me with something that really works. Thank you very much.

bugbot (bugbot)
tags: added: crash
Bryce Harrington (bryce) wrote :

kolio, not sure what you're talking about. The Radeon HD 5650 came on the market on Jan 7, 2010, so it did not exist 2 or 3 years ago. Whatever posts you're looking at are unrelated to this problem.

Bryce Harrington (bryce) wrote :

afoglia, just to confirm - you still seeing this crash with the current kernel?

Bryce Harrington (bryce)
Changed in xserver-xorg-video-ati (Ubuntu):
status: Triaged → Confirmed
afoglia (afoglia) wrote :

Yes and no. I did six normal boots with 2.6.38-7.35, then realized there was an update, and booted that both normally and in recovery and here's what I saw

2.6.38-7.36 normal, 6 boots, 5 reached gdm login screen, 1 gdm started but hung before login window appeared (only one of the 5 successful boots showed the plymouth boot screen)
2.6.38-7.36 recovery mode, 5 boots, all hung with the monitor off, no plymouth, brightness key did nothing.
2.6.38-7.35 normal, 6 boots, 3 hung with monitor off, 3 reached gdm

Since I still can't boot in recovery (and I don't see anything in the changelog for -7.36 obviously related), I'd say the bug is still there.

Guillaume Modard (guillaumemodard) wrote :

I confirm that the bug is still there. I installed the latest update this morning (I saw xserver-xorg-video-ati, -radeon... updates, so I was expecting the bug to be solved).

When I restart, I need to reboot more than 5 times before I get a desktop. And now, when the boot crashes, I don't get any shell; the screen stays black (as if it were off).

I really hope this bug will be solved before the final release. If not, Ubuntu won't work on plenty of the latest HP Pavilion laptops.

Guillaume Modard (guillaumemodard) wrote :

Note : Here is the result of
lspci -v | grep -A 12 VGA :

guillaume@guillaume-HP-Notebook:~$ lspci -v | grep -A 12 VGA
00:02.0 VGA compatible controller: Intel Corporation Core Processor Integrated Graphics Controller (rev 02) (prog-if 00 [VGA controller])
 Subsystem: Hewlett-Packard Company Device 163c
 Flags: bus master, fast devsel, latency 0, IRQ 44
 Memory at c0000000 (64-bit, non-prefetchable) [size=4M]
 Memory at b0000000 (64-bit, prefetchable) [size=256M]
 I/O ports at 5050 [size=8]
 Expansion ROM at <unassigned> [disabled]
 Capabilities: <access denied>
 Kernel driver in use: i915
 Kernel modules: i915

00:16.0 Communication controller: Intel Corporation 5 Series/3400 Series Chipset HECI Controller (rev 06)
 Subsystem: Hewlett-Packard Company Device 163c
--
01:00.0 VGA compatible controller: ATI Technologies Inc Robson CE [AMD Radeon HD 6300 Series] (prog-if 00 [VGA controller])
 Subsystem: Hewlett-Packard Company Device 163c
 Flags: bus master, fast devsel, latency 0, IRQ 43
 Memory at a0000000 (64-bit, prefetchable) [size=256M]
 Memory at c4400000 (64-bit, non-prefetchable) [size=128K]
 I/O ports at 4000 [size=256]
 Expansion ROM at c4440000 [disabled] [size=128K]
 Capabilities: <access denied>
 Kernel driver in use: radeon
 Kernel modules: radeon

01:00.1 Audio device: ATI Technologies Inc Manhattan HDMI Audio [Mobility Radeon HD 5000 Series]
 Subsystem: Hewlett-Packard Company Device 163c
guillaume@guillaume-HP-Notebook:~$
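
Output like the above can be scanned automatically to see whether a machine has two VGA controllers bound to different kernel drivers. A Python sketch, assuming `lspci -v` output in the format shown (slot numbers without a PCI domain prefix); the function names `vga_drivers` and `is_hybrid` are hypothetical:

```python
import re

# Device header, e.g. "01:00.0 VGA compatible controller: ATI ..."
HEADER_RE = re.compile(r"([0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]) (.+?):")
DRIVER_RE = re.compile(r"\s+Kernel driver in use: (\S+)")


def vga_drivers(lspci_text):
    """Map PCI slot -> kernel driver (or None) for each VGA controller."""
    drivers = {}
    current = None
    for line in lspci_text.splitlines():
        header = HEADER_RE.match(line)
        if header:
            is_vga = header.group(2) == "VGA compatible controller"
            current = header.group(1) if is_vga else None
            if current is not None:
                drivers[current] = None
            continue
        driver = DRIVER_RE.match(line)
        if driver and current is not None:
            drivers[current] = driver.group(1)
    return drivers


def is_hybrid(lspci_text):
    """True if at least two VGA controllers use different drivers."""
    in_use = {d for d in vga_drivers(lspci_text).values() if d is not None}
    return len(in_use) > 1
```

On the output above this would report `i915` for slot 00:02.0 and `radeon` for slot 01:00.0, which is the configuration this bug is about.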

Johan Fornander (fornander-johan) wrote :

I have a theory... Maybe there is a race condition between the intel and ati driver involved here? My notebook starts up in two seemingly random configurations, or three including the radeon crash:

1. X server is on VT8 -> unable to unload radeon module because it is in use (by some framebuffer I guess). I am also unable to switch to consoles VT1-7. If I use vga_switcheroo to switch to integrated gpu in this mode then radeon crashes.

2. X server is on VT7 -> I can unload the radeon module and use the consoles VT1-6. vga_switcheroo works and I can also use acpi calls to turn off the gpu.

3. The radeon driver crashes. Forcing reboot through REISUB.

This makes it difficult to control the temperature since I cannot know if the radeon module is in use or not (i.e. I might or might not be able to use the vga_switcheroo, or unload the module and use a specific acpi call to shut off the gpu).

Johan Fornander (fornander-johan) wrote :

When I have booted into an environment where both the X server and the framebuffer use Intel, the radeon module sometimes crashes like this when I try to unload it:

[ 346.860598] radeon 0000:02:00.0: ffff88014b362000 unpin not necessary
[ 346.860619] BUG: unable to handle kernel paging request at ffffc90022680000
[ 346.861758] IP: [<ffffffffa01f00bc>] rs600_gart_set_page+0x3c/0x50 [radeon]
[ 346.863120] PGD 157818067 PUD 157819067 PMD 14959b067 PTE 0
[ 346.864707] Oops: 0002 [#1] SMP
[ 346.866297] last sysfs file: /sys/devices/pci0000:00/0000:00:02.0/drm/card0/card0-VGA-1/status
[ 346.867941] CPU 3
[ 346.867963] Modules linked in: cryptd aes_x86_64 aes_generic binfmt_misc parport_pc ppdev dm_crypt wl(P) lib80211 snd_hda_codec_hdmi snd_hda_codec_realtek arc4 snd_hda_intel snd_hda_codec snd_hwdep snd_pcm joydev snd_seq_midi brcm80211(C) snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device sparse_keymap mac80211 cfg80211 uvcvideo videodev psmouse snd v4l2_compat_ioctl32 intel_ips serio_raw lp soundcore snd_page_alloc parport radeon(-) i915 ttm ahci atl1c libahci drm_kms_helper drm i2c_algo_bit video
[ 346.877408]
[ 346.879334] Pid: 2518, comm: rmmod Tainted: P C 2.6.38-7-generic #38-Ubuntu Acer Aspire 3820/JM31_CP
[ 346.881377] RIP: 0010:[<ffffffffa01f00bc>] [<ffffffffa01f00bc>] rs600_gart_set_page+0x3c/0x50 [radeon]
[ 346.883456] RSP: 0018:ffff8801259e7c68 EFLAGS: 00010286
[ 346.885512] RAX: 00000000ffffffea RBX: ffff880149e00000 RCX: ffffc90022680000
[ 346.887607] RDX: 0000000036822067 RSI: 0000000000000000 RDI: ffff880149e00000
[ 346.889724] RBP: ffff8801259e7c68 R08: 0000000000000000 R09: ffff88014adf7748
[ 346.891854] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000111
[ 346.893993] R13: 0000000000000111 R14: 0000000000000888 R15: 0000000000000001
[ 346.896139] FS: 00007f8e31ee8720(0000) GS:ffff880093180000(0000) knlGS:0000000000000000
[ 346.898319] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 346.900511] CR2: ffffc90022680000 CR3: 0000000125af6000 CR4: 00000000000006e0
[ 346.902754] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 346.905025] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 346.907314] Process rmmod (pid: 2518, threadinfo ffff8801259e6000, task ffff88013d67c440)
[ 346.909638] Stack:
[ 346.911920] ffff8801259e7cb8 ffffffffa01bf146 ffff880149e01338 0002000000000000
[ 346.914277] ffff8801259e7cb8 ffff880149e00000 ffff88014f0c6000 ffffffffa025a590
[ 346.916600] ffff88014f80e000 0000000000000001 ffff8801259e7cd8 ffffffffa01bf49d
[ 346.918952] Call Trace:
[ 346.921281] [<ffffffffa01bf146>] radeon_gart_unbind+0xb6/0x160 [radeon]
[ 346.923664] [<ffffffffa01bf49d>] radeon_gart_fini+0x7d/0x80 [radeon]
[ 346.926056] [<ffffffffa022a146>] evergreen_pcie_gart_fini+0x26/0x30 [radeon]
[ 346.928466] [<ffffffffa022dc8e>] evergreen_fini+0x3e/0x90 [radeon]
[ 346.930871] [<ffffffffa01a5b0b>] radeon_device_fini+0x3b/0xa0 [radeon]
[ 346.933291] [<ffffffffa01a7045>] radeon_driver_unload_kms+0x35/0x60 [radeon]
[ 346.935709] [<ffffffffa0020b16>] drm_put_dev+0xc6/0x1d0 [drm]
[ 346.938125] [<ffffffffa018b11d>] radeon_pci_remove...
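
When comparing traces like this one with the trace in the original report, it helps to reduce each to its list of frames. A Python sketch, assuming the `[<address>] symbol+0xoffset/0xsize [module]` frame format seen in these logs; the names `Frame`, `parse_call_trace` and `frames_in_module` are hypothetical:

```python
import re
from collections import namedtuple

Frame = namedtuple("Frame", "address symbol offset size module uncertain")

# A leading "? " marks an address found on the stack that the unwinder
# could not verify; frames of built-in kernel code carry no [module] suffix.
FRAME_RE = re.compile(
    r"\[<([0-9a-f]+)>\]\s+(\?\s+)?([\w.]+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
    r"(?:\s+\[(\w+)\])?"
)


def parse_call_trace(text):
    """Return one Frame per log line that contains a stack frame."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            address, maybe, symbol, offset, size, module = match.groups()
            frames.append(Frame(int(address, 16), symbol, int(offset, 16),
                                int(size, 16), module, maybe is not None))
    return frames


def frames_in_module(text, module):
    """Return only the frames that belong to the given kernel module."""
    return [frame for frame in parse_call_trace(text) if frame.module == module]
```

For example, `frames_in_module(log, "radeon")` on the excerpt above would list `radeon_gart_unbind` through `radeon_driver_unload_kms`, which makes it easy to see that this crash is on the unload path rather than in `evergreen_cp_resume`.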


Guillaume Modard (guillaumemodard) wrote :

Is anybody working on this bug?

Dweia (dweia) wrote :

Bryce Harrington wrote on 2011-03-04:[...] That suggests a regression in the kernel between 2.6.38-3 and -4

The error must have been introduced a lot earlier. I tried a bunch of different kernels, each at its most recent version at the moment (highest number after the dash):

2.6.35-25 - works
2.6.36-1 - works
2.6.37-12 - crashes most of the time
2.6.38-7 - crashes most of the time

I also tried some other versions, including the mentioned 2.6.38-3, but no luck there for me.

There's another bug, which may or may not be related, and which apparently got fixed in kernel 2.6.37, maybe thereby introducing this problem with the crashes? This older problem causes entries like the following in the kernel log when shutting down or rebooting the system.

kernel: [ 36.068256] [drm:atom_op_jump] *ERROR* atombios stuck in loop for more than 5secs aborting
kernel: [ 36.068719] [drm:atom_execute_table_locked] *ERROR* atombios stuck executing CF42 (len 72, WS 0, PS 0) @ 0xCF71
kernel: [ 41.070113] [drm:atom_op_jump] *ERROR* atombios stuck in loop for more than 5secs aborting
kernel: [ 41.070654] [drm:atom_execute_table_locked] *ERROR* atombios stuck executing CB20 (len 62, WS 0, PS 0) @ 0xCB3C
kernel: [ 46.271694] [drm:atom_op_jump] *ERROR* atombios stuck in loop for more than 5secs aborting
kernel: [ 46.272333] [drm:atom_execute_table_locked] *ERROR* atombios stuck executing CB20 (len 62, WS 0, PS 0) @ 0xCB3C

These entries occur once per second in kernel 2.6.35 and every 5 seconds in 2.6.36, and disappear altogether in 2.6.37.

P.S. I don't think the bug has to do with xserver-xorg at all - it's most probably the radeon kernel-module, since the errors occur long before X is starting. Also the bugs go away when I blacklist the radeon module. Unfortunately I cannot switch off the Radeon-graphics card, when the module for it isn't loaded. :-(

Dweia (dweia) wrote :

Unfortunately I discovered yesterday that I lied. When used without the battery (this is an Aspire 3820TG laptop) and with the power cord connected, the crash also occurs with kernel 2.6.36-1. Probably the BIOS does something to/with the graphics cards when external power is connected. All the previous tests had been run on battery power.

I couldn't test the 2.6.35 kernel yet, since I had already removed it; I need to reinstall it and see what happens.

Bryce Harrington (bryce) wrote :

It's starting to sound like this is due to confusion (maybe a regression) in the plumbing layer between X and the kernel, such as module-init-tools or one of the related packages.

I think either apw or cjwatson need to look into this issue. apw's on vacation though.

Chris Halse Rogers (raof) wrote :

This does look a lot like some bad interaction between i915/radeon(/vesafb?). Although we don't seem to have a full dmesg of a -7 kernel it seems like it's not vesafb-based.

It would be useful to have logs - both of good and bad boots - with the “drm.debug=0x0e” kernel argument added to the boot line.
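
The value `0x0e` is a bitmask selecting categories of DRM debug output. A small Python sketch of how it decomposes; bits 0x01, 0x02 and 0x04 select core, driver and KMS messages, while the meaning of 0x08 has differed between kernel versions and is therefore left unnamed here (the function name is hypothetical):

```python
# Known categories of the drm.debug bitmask. Bit 0x08 is intentionally
# given a neutral label because its meaning depends on the kernel version.
DRM_DEBUG_BITS = {
    0x01: "core",
    0x02: "driver",
    0x04: "kms",
    0x08: "bit 0x08 (version-dependent)",
}


def decode_drm_debug(mask):
    """Return the names of all categories enabled by the given mask."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if mask & bit]


if __name__ == "__main__":
    print(decode_drm_debug(0x0E))
```

So `drm.debug=0x0e` enables driver and KMS messages plus bit 0x08, but leaves out the very verbose core messages.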

Johan Fornander (fornander-johan) wrote :

I have taken logs from a set of ubuntu kernels starting up using the requested kernel boot argument "drm.debug=0x0e":

2.6.38-7 ---> fb0: radeondrmfb frame buffer device
2.6.38-8 ---> fb0: inteldrmfb frame buffer device
2.6.39-rc1--> kernel oops (pointing to evergreen something), not caught in the logs. Seems to be at a different offset than the one reported above

Please see the attached files containing dmesg and kern.log for each kernel. There are some other interesting things in the logs like invalid DSDT and stuff that I will look into further.

Bryce Harrington (bryce) wrote :

afoglia - I've forwarded this bug upstream to http://bugs.freedesktop.org/show_bug.cgi?id=36003 - please subscribe yourself to this bug, in case they need further information or wish you to test something. Thanks ahead of time!

Johan, I attached your logs to the upstream bug report. Generally upstream prefers that the logs come from the original reporter, so I'm not sure if they will accept the bug report.

Changed in xserver-xorg-video-ati (Ubuntu):
status: Confirmed → Triaged
Bryce Harrington (bryce) wrote :

Upstream would like to see if setting the video to the radeon/discrete setting in the BIOS configuration makes it function properly.

If so, this may be a known issue in the new vga_switcheroo functionality, ala bug https://bugzilla.kernel.org/show_bug.cgi?id=30052

Changed in xserver-xorg-video-ati (Ubuntu):
status: Triaged → Incomplete
Dweia (dweia) wrote :

Chris Halse Rogers wrote in #22: "This does look a lot like some bad interaction between i915/radeon"

I agree - I did some more testing and placed an entry for the radeon module into /etc/initramfs-tools/modules, and voilà - no crash! (After booting into X I can't get back to the text console, but that's possibly another issue.)

What proves the "bad interaction" even more is: when also placing "i915" into /etc/initramfs-tools/modules BEFORE "radeon", booting isn't possible at all any more, but when placing i915 AFTER radeon, booting is possible, but I got a black screen (as in totally black, that is: no backlight) until X starts up.

I'll try to get logs of four different initrd-configurations (though I doubt I'll be able to record the one where the crash occurs already during running of initrd...)
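
The configurations described above differ only in which modules are listed in /etc/initramfs-tools/modules and in what order. A Python sketch that reads the file's text and says which of the reported cases it corresponds to; the outcome strings simply restate the observations from this comment on one machine, and the function names are hypothetical:

```python
def forced_modules(modules_text):
    """Return module names in file order, ignoring comments and blank lines."""
    names = []
    for line in modules_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            names.append(line.split()[0])  # drop any module options
    return names


def reported_outcome(modules_text):
    """Map the module order to the behaviour reported in this comment."""
    names = forced_modules(modules_text)
    if "radeon" not in names:
        return "radeon not forced: default behaviour (intermittent crash)"
    if "i915" not in names:
        return "radeon only: no crash reported"
    if names.index("i915") < names.index("radeon"):
        return "i915 before radeon: boot not possible"
    return "radeon before i915: boots, black screen until X starts"


if __name__ == "__main__":
    print(reported_outcome("# forced modules\nradeon\ni915\n"))
```

Changes to that file only take effect after the initramfs is regenerated with `sudo update-initramfs -u`.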

P.S. off-topic, I knew a cjwatson once - hi Kamion ;-)

Dweia (dweia) wrote :

Bryce Harrington wrote in #25 "Upstream would like to see if setting the video to the radeon/discrete setting in the BIOS configuration makes it function properly."

Answer: Yes it does. I tried that a while ago already, but can't (don't want to) use that for regular running, because the radeon card makes the laptop battery run out too fast. I also read that there's a patched/hacked BIOS somewhere that allows switching off the radeon card via the BIOS, but if it can be solved in software I'd prefer that ;)

Changed in xserver-xorg-driver-ati:
importance: Unknown → High
status: Unknown → Confirmed
Johan Fornander (fornander-johan) wrote :

@Bryce: Yes, booting with only discrete or integrated enabled does solve the problem for me as well.

@Dweia: I patched the CMOS for my 3820TG and unlocked the Intel menu. Now I can choose to only have the IGD activated and PEG (radeon) completely shut off drawing zero power. Same thing as going through vga_switcheroo or using the acpi calls but less hassle.

Johan Fornander (fornander-johan) wrote :

Btw, should we work on the bug here on Launchpad or keep the discussion on freedesktop, working directly with the AMD devs?

Dweia (dweia) wrote :

Sorry, I got sidetracked while getting a set of logs. However, some (yet slightly vague) findings may be useful - even if debugging gets maybe even harder:

Firstly: the computer (BIOS or whatever) behaves differently when external power is connected or only battery used, and secondly: it behaves differently depending on the last state of the vgaswitcheroo BEFORE the reboot. I need to do more testing regarding the former (probably frequency of crashes higher with external power), but the latter seemed to me pretty consistently only crashing after "echo OFF > /sys/kernel/debug/vgaswitcheroo/switch".
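
To test the second observation systematically, the switch state can be set explicitly before each reboot. A Python sketch equivalent to the `echo` command quoted above, assuming debugfs is mounted at /sys/kernel/debug and the script runs as root; the function name is hypothetical, and the command list is limited to the basic vga_switcheroo commands:

```python
SWITCH_PATH = "/sys/kernel/debug/vgaswitcheroo/switch"

# ON/OFF power the inactive card up or down; IGD/DIS switch immediately;
# DIGD/DDIS request a delayed switch to the integrated/discrete card.
VALID_COMMANDS = {"ON", "OFF", "IGD", "DIS", "DIGD", "DDIS"}


def send_switch_command(command, path=SWITCH_PATH):
    """Write one vga_switcheroo command to the switch file."""
    if command not in VALID_COMMANDS:
        raise ValueError("unknown vga_switcheroo command: %r" % command)
    with open(path, "w") as switch:
        switch.write(command + "\n")


if __name__ == "__main__":
    # Example: power the inactive card back on before rebooting.
    send_switch_command("ON")
```

The `path` parameter exists only so the function can be tried against an ordinary file first; whether leaving the card ON before reboot avoids the crash is exactly what would need testing.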

I did yesterday a kernel-update to 2.6.38-8, I'll try to reproduce the findings and will try to see if anything changed in the behaviour.

Vangel Ajanovski (ajanovski) wrote :

If I add
radeon.modeset=0
in the boot line when starting, the crash does not happen and the system continues using the integrated Intel graphics.

Timo Aaltonen (tjaalton) wrote :

marking the bug as confirmed

Changed in xserver-xorg-video-ati (Ubuntu):
status: Incomplete → Confirmed
Bryce Harrington (bryce) wrote : Re: [Radeon HD 5650 and 5470] Driver crash during recovery boot and in normal boot (Hybrid graphics)

I've updated the title and description based on recent findings.

Hybrid graphics switching support is still fairly embryonic upstream and I don't feel it is stable or reliable enough yet for us to support in Ubuntu, so I am setting the importance of the X task here to Wishlist.

However, even aside from switching graphics, the kernel should not be failing with this particular hardware configuration, even if it is not able to properly switch; it should pick one driver or the other and not load both, even if it just has to pick at random. So I'm leaving the kernel task here open, in hopes that some fix can at least paper over the crash.

description: updated
summary: [Radeon HD 5650 and 5470] Driver crash during recovery boot and in
- normal boot (Regression from 2.6.38-3 to -4)
+ normal boot (Hybrid graphics)
Changed in xserver-xorg-video-ati (Ubuntu):
importance: High → Wishlist
status: Confirmed → Triaged
summary: - [Radeon HD 5650 and 5470] Driver crash during recovery boot and in
- normal boot (Hybrid graphics)
+ [Radeon HD 5650 and 5470] Kernel BUG during recovery boot and in normal
+ boot (Hybrid graphics)
Bryce Harrington (bryce) wrote :

@JFo, this hardware results in the kernel triggering a BUG. Please add this to the kernel team's list of bugs to investigate.

Changed in linux (Ubuntu):
status: New → Triaged
importance: Undecided → High
status: Triaged → New
assignee: nobody → Jeremy Foshee (jeremyfoshee)
wedens (frigid20) wrote :

i have same problems on radeon 5650
log attached

Jeremy Foshee (jeremyfoshee) wrote :

added to the hot bugs listing for team review.

~JFo

tags: added: kernel-key
Changed in linux (Ubuntu):
assignee: Jeremy Foshee (jeremyfoshee) → nobody
status: New → Triaged
Bryce Harrington (bryce)
tags: added: oneiric
Bryce Harrington (bryce) wrote :

[I've marked this bug for inclusion in our oneiric bug queue. While technically this bug has not been re-confirmed against oneiric, I feel it is worth continued development attention. We will need to ask that it be re-confirmed once oneiric is further along, perhaps once we get closer to alpha.]

Cabalbl4 (i-vohmin) wrote :

can confirm this bug on hybrid radeon 6550 and intel card on acer aspire 3820TG
natty kernel 2.6.38-8-generic.

Cabalbl4 (i-vohmin) wrote :

issue seems to be gone with kernel 2.6.38-9-generic from proposed.

Cabalbl4 (i-vohmin) wrote :

No, it still exists on 2.6.38-9-generic when the power cable is unplugged :(

Seth Forshee (sforshee) wrote :

@Cabalbl4, do you mean that it happens when you boot with the power cable unplugged? Or that it crashes when you unplug the power cable after successfully booting? Either way it would be useful to capture dmesg after it happens, if possible.

It might also be useful to have the DSDT from machines affected by this bug. To collect the DSDT, open a terminal and execute the following commands:

  sudo apt-get install fwts
  sudo fwts --disassemble-aml

Then attach the DSDT.dsl file generated by fwts to this bug. Thanks!

Cabalbl4 (i-vohmin) wrote :

Well, as far as I understood now, it is somehow related to cable plug, but not all the times. Maybe it affects module load or something.

Here is a piece of syslog that I captured earlier by tailing its output to a file before everything hung on kernel 2.6.38-8.
I will try to use fwts soon.

Cabalbl4 (i-vohmin) wrote :

Well, after an update from the xorg-edgers PPA and some messing with initramfs (I tried to put the radeon module before intel, but later reverted it), the bug seems to happen very rarely, but at random :( As a drawback, my ttys are gone again, but that is another bug in the intel driver. And here is my DSDT.dsl

Cabalbl4 (i-vohmin) wrote :

@Seth Forshee to get things clear about the cable. I never tried to plug/unplug cable in the boot process. I have only booted with cable already plugged in or unplugged.

Cabalbl4 (i-vohmin) wrote :

The bug never happens after a successful boot, even when I switch cards with vgaswitcheroo.

Seth Forshee (sforshee) wrote :

@Cabalbl4, thanks for the clarification. Can you also attach the SSDT.dsl file that fwts generated? I'm not finding the relevant fields in the DSDT.

Jean Demange (jea-demange) wrote :

Same issue for me, especially when my computer is disconnected from power at startup. I'm attaching the files from fwts. Hope it helps.

Cabalbl4 (i-vohmin) wrote :

@Seth Forshee maybe it is because my ATI card is switched off by script via vgaswitcheroo after boot?
There are two SSDT.dsl files, attaching them.

Cabalbl4 (i-vohmin) wrote :

Second one

Johan Fornander (fornander-johan) wrote :

Confirmed using 2.6.39-020639rc5-generic (v2.6.39-rc5-oneiric). Radeon module crashes in function evergreen_cp_resume().

Johan Fornander (fornander-johan) wrote :

Attaching my DSDT.dsl as well. Acer Aspire 3820TG equipped with a Radeon HD 5650.

belltown (sea-av80r) wrote :

I also have an HP ENVY 14 with switchable graphics (ATI HD5650/Intel Integrated). I just did a clean install of Ubuntu 11.04 and installed all the recommended updates from the update manager. I get a black screen when I boot unless I put radeon.modeset=0 in the boot command line.

I've attached a copy of /var/log/kern.log created when using the default boot parameters.

belltown (sea-av80r) wrote :

Here's a copy of /var/log/kern.log created using radeon.modeset=0 on the boot parameter line.

belltown (sea-av80r) wrote :

Sorry, last post was Xorg.0.log with radeon.modeset=0.

Here is /var/log/kern.log with radeon.modeset=0

Revision history for this message
riyasmp (riyasmp) wrote :

Hi guys

I have a similar issue with my HP Pavilion dv6-3150 SA.

I have been using 10.10 so far, which had the same problem with switchable graphics; the ATI card never worked.

I made a fresh install of 11.04 on the same laptop recently, and the hot boot gave me a blank screen when I chose 11.04. I did a cold boot and it took me to the 11.04 Unity interface. I logged out, chose the classic Ubuntu desktop and tried that as well.

The interesting thing is that the live CD worked all right on this laptop. At the moment I am using 10.10, as 11.04 is not working. I can provide any file from 10.10 if it helps.

Since then, X crashes whenever I restart and I can't use 11.04. I tried to get some help from the #ubuntu-uk channel, and someone asked me to run this command in rescue mode: sudo mv /etc/X11/xorg.conf /etc/X11/xorg.conf_backup_20110508

I did that and it returned the output: /etc/X11/xorg.conf no such file or directory.

The output of cat /var/log/Xorg.0.log | pastebinit is http://paste.ubuntu.com/604907/

Please refer to https://bugs.launchpad.net/ubuntu/+source/fglrx-installer/+bug/698274 for details about the bug that I filed last year on 10.10 as well.

with regards

Revision history for this message
Guillaume Modard (guillaumemodard) wrote :

I can confirm that:

- The bug is due to switchable ATI + Intel graphics
- Neither the free nor the proprietary driver works correctly:
--> Free driver: boots correctly once every 5 to 10 attempts, without complete support for the ATI graphics card (it gets very hot at times...)
--> Proprietary driver: boots correctly every time, but with the Intel integrated graphics card (no Unity, only the 2D GNOME panel without effects)
- Most recent HP laptops have this bug (= Ubuntu does not work on most recent HP laptops)

Is there any deadline for fixing this critical bug?
Is any team really working on it?

Revision history for this message
Cabalbl4 (i-vohmin) wrote :

For now, my bug is completely gone. But it triggers Bug #571573 (tty loss on intel driver). The radeon and intel drivers now manage to boot correctly, however, the TTYs are gone until I switch off intel card (to radeon) and then turn it back on.

Revision history for this message
Jean Demange (jea-demange) wrote :

For me the bug is still present. It is almost impossible to boot on battery; with the power plugged in, boot succeeds after two or three attempts. I think it still needs some adjustments. Also, is there any way to deactivate this module as long as it doesn't work correctly?

Revision history for this message
Jan-Åke Larsson (jalar) wrote :

This is erratic. I have blacklisted "radeon" from autoloading by adding it in /etc/modprobe.d/. I then load it manually in rc.local to be able to turn the card off. I now can boot on battery fine. Or could. Lately I've had boot trouble again, but this might be for Other Reasons.

Revision history for this message
Seth Forshee (sforshee) wrote :

I've been poring over the logs attached here, but there doesn't seem to be enough information to piece together what's different between a good and a bad boot. I'd like to reiterate the previous request for kernel logs of _both_ good and bad boots with the “drm.debug=0x0e” kernel argument added to the boot line, with both the good and bad logs collected using the same hardware and the same kernel version.

Thanks in advance!

Changed in linux (Ubuntu):
assignee: nobody → Seth Forshee (sforshee)
status: Triaged → Incomplete
Revision history for this message
Bryce Harrington (bryce) wrote : Re: [Bug 727620] Re: [Radeon HD 5650 and 5470] Kernel BUG during recovery boot and in normal boot (Hybrid graphics)

On Wed, May 18, 2011 at 09:13:15PM -0000, Seth Forshee wrote:
> I've been poring over the logs attached here, but there doesn't seem to
> be enough information to piece together what's different between a good
> and a bad boot. I'd like to reiterate the previous request for kernel
> logs of _both_ good and bad boots with the “drm.debug=0x0e” kernel
> argument added to the boot line, with both the good and bad logs
> collected using the same hardware and the same kernel version.
>
> Thanks in advance!

Btw, the xdiagnose utility can be used to add "drm.debug=0x0e" to the
kernel. Install it, then run 'sudo xdiagnose'; it's the first checkbox
in the dialog.
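
For anyone who prefers not to use xdiagnose, the argument can also be added by hand. On a stock Ubuntu install the default kernel arguments normally live in the GRUB_CMDLINE_LINUX_DEFAULT line of /etc/default/grub, and running 'sudo update-grub' afterwards applies the change. The Python sketch below only illustrates the string edit on that one line; the helper name add_kernel_arg is made up for this example, and it assumes the value is wrapped in double quotes.

```python
import re


def add_kernel_arg(grub_line: str, arg: str) -> str:
    """Return the GRUB_CMDLINE_LINUX_DEFAULT line with `arg` present exactly once.

    Hypothetical helper for illustration; it does not touch any file.
    """
    match = re.fullmatch(r'(GRUB_CMDLINE_LINUX_DEFAULT=)"(.*)"\s*', grub_line)
    if match is None:
        raise ValueError("not a double-quoted GRUB_CMDLINE_LINUX_DEFAULT line")
    prefix, current = match.groups()
    args = current.split()
    if arg not in args:  # keep the edit idempotent
        args.append(arg)
    return prefix + '"' + " ".join(args) + '"'


# Example with the usual Ubuntu default value (an assumption about your file).
print(add_kernel_arg('GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"', "drm.debug=0x0e"))
```

Calling the helper a second time on its own output leaves the line unchanged, so it is safe to re-run.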

Revision history for this message
belltown (sea-av80r) wrote :

@Seth Forshee
Is this what you're looking for?

I'm using an HP ENVY 14 with switchable graphics (Radeon HD5650 and integrated Intel HD graphics).

I have attached a kern.log file with 3 boots:
1st boot - Bad. Black screen occurs on boot when trying to use Radeon driver without radeon.modeset=0
2nd boot - Good. Boot with radeon.modeset=0
3rd boot - Good. Blacklisted radeon driver. Booted using integrated intel driver
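
When one kern.log holds several boots like this, it can help to cut it into one chunk per boot before comparing good and bad runs. The following Python sketch is one rough way to do that. It assumes each boot begins with the kernel banner line containing "] Linux version " and that a bad boot contains a "BUG:" line like the one in the bug description; split_boots and has_oops are names invented for this example.

```python
def split_boots(lines):
    """Group kernel log lines into one list per boot.

    Heuristic (an assumption): a boot starts at the kernel banner line.
    Lines that appear before the first banner are dropped.
    """
    boots = []
    for line in lines:
        if "] Linux version " in line:
            boots.append([])  # a new boot begins here
        if boots:
            boots[-1].append(line)
    return boots


def has_oops(boot_lines):
    """Return True if the boot's log contains a kernel BUG line."""
    return any("] BUG:" in line for line in boot_lines)


# Tiny sample built from lines of the kind quoted in this thread.
sample = [
    "kernel: [    0.000000] Linux version 2.6.38-8-generic",
    "kernel: [   29.206269] BUG: unable to handle kernel paging request at ffffc9041b591ffc",
    "kernel: [    0.000000] Linux version 2.6.38-8-generic",
    "kernel: [   25.434108] [drm] ring test succeeded in 1 usecs",
]
for number, boot in enumerate(split_boots(sample), start=1):
    print(number, "bad" if has_oops(boot) else "no BUG line found")
```

A boot that hangs with a black screen but never logs a BUG line would show up as "no BUG line found", so the result still needs to be checked against what was seen on screen.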

Revision history for this message
belltown (sea-av80r) wrote :

Here's the output from lspci -vv for comment #62

P.S. How do I add multiple attachments to the same comment in Launchpad?

Revision history for this message
belltown (sea-av80r) wrote :

@Seth Forshee

This log might be a better one to look at. I tried booting several times, each boot was EXACTLY the same, no blacklisting of the radeon driver, no use of modeset=0 or other boot parameters. The only variable was the time delay between getting the GRUB menu and pressing the enter key on the menu to get the boot to occur.

The very last boot (time entry 20:50:16) resulted in a successful boot. I believe all the others resulted in a black screen.

I used drm.debug=0x0e for all boots.

Revision history for this message
Ap. Syvertsen (taperkatt) wrote :

I have an Acer Aspire TimelineX 4820TG with Radeon HD5650 and integrated Intel graphics.

This bug used to bugger me about every second boot, but after installing tons of different kernels it's actually been more sporadic. I don't know whether that is because I spend more time in the GRUB-menu, clicking to Previous versions and then choosing 2.6.38-8.

Anyways, I added the drm.debug=0x0e for all kernels and kept booting till I got some good and bad boots. Chronologically I had 1 good, 3 bad and 1 good boot, but in the attached kern.log it only shows 1 good, 1 bad, 1 good. In all the bad boots it stopped with a completely black screen (no backlight) but I was able to switch xserver (or something) by pressing Alt+F1, Alt+F2 and Alt+F7. The system did not respond to Ctrl-Alt-Del but printed some SAK line when pressing AltGr+SysRq+K without doing anything.

The last line in one terminal on all bad boots was this:
[drm:intel_prepare_page_flip], preparing flip with no unpin work?

The last thing in the other terminal was the call trace ending with this:
[ 27.218815] RIP [<ffffffffa0569b16>] evergreen_cp_start+0x56/0xc80 [radeon]

Hope this helps, let me know if you need more info.

Revision history for this message
Ap. Syvertsen (taperkatt) wrote :

Okay, first of all, sorry for the long comment but here are my observations from scanning the kern.log:

If this block:
[ 18.460593] i915 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[ 18.460617] i915 0000:00:02.0: setting latency timer to 64

comes before this block:
[ 18.559570] [drm] radeon defaulting to kernel modesetting.
[ 18.559574] [drm] radeon kernel modesetting enabled.
[ 18.559599] VGA switcheroo: detected switching method \_SB_.PCI0.GFX0.ATPX handle
[ 18.559652] radeon 0000:01:00.0: enabling device (0000 -> 0003)
[ 18.559660] radeon 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[ 18.559666] radeon 0000:01:00.0: setting latency timer to 64

We will get:
(...)
[ 18.560350] i915 0000:00:02.0: irq 42 for MSI/MSI-X
(...)
[ 18.630999] [drm:intel_dsm_platform_mux_info], MUX info connectors: 7
(...)(mux load)
[drm:intel-stuff]x30
[ 18.827202] vga_switcheroo: enabled
[ 18.827287] radeon atpx: version is 1
[ 18.842578] HDA Intel 0000:00:1b.0: BAR 0: set to [mem 0xdc500000-0xdc503fff 64bit] (PCI address [0xdc500000-0xdc503fff])
[ 18.842592] HDA Intel 0000:00:1b.0: enabling device (0000 -> 0002)
[ 18.842620] HDA Intel 0000:00:1b.0: PCI INT A -> GSI 22 (level, low) -> IRQ 22
[ 18.842757] HDA Intel 0000:00:1b.0: irq 43 for MSI/MSI-X
[ 18.842793] HDA Intel 0000:00:1b.0: setting latency timer to 64
(...)
[ 28.717369] ATOM BIOS: Acer
[ 28.717382] radeon 0000:01:00.0: GPU softreset
(...) (GPU reset stack)
[ 28.864335] radeon 0000:01:00.0: irq 45 for MSI/MSI-X
[ 28.864343] radeon 0000:01:00.0: radeon: using MSI.
[ 28.864356] radeon 0000:01:00.0: IH ring buffer overflow (0xFFFFFFFF, 0, 15)
[ 28.864386] [drm] radeon: irq initialized.
[ 28.864388] [drm] GART: num cpu pages 131072, num gpu pages 131072
[ 28.864892] [drm] Loading REDWOOD Microcode
[ 29.042576] radeon 0000:01:00.0: Wait for MC idle timedout !
[ 29.187699] radeon 0000:01:00.0: Wait for MC idle timedout !
[ 29.189702] radeon 0000:01:00.0: WB enabled
[ 29.206269] BUG: unable to handle kernel paging request at ffffc9041b591ffc

-> CRASH

if however the blocks on top are switched round we get:

(...)
[ 25.383004] ATOM BIOS: Acer
(...) (no GPU reset )
[ 25.383379] radeon 0000:01:00.0: irq 43 for MSI/MSI-X
[ 25.383386] radeon 0000:01:00.0: radeon: using MSI.
[ 25.383423] [drm] radeon: irq initialized.
[ 25.383426] [drm] GART: num cpu pages 131072, num gpu pages 131072
[ 25.383942] [drm] Loading REDWOOD Microcode
[ 25.417531] radeon 0000:01:00.0: WB enabled
[ 25.434108] [drm] ring test succeeded in 1 usecs
(...)
[ 26.836096] i915 0000:00:02.0: irq 44 for MSI/MSI-X
(...)
[ 26.905062] [drm:intel_dsm_platform_mux_info], MUX info connectors: 7
(....) (mux load)
[ 26.905137] vga_switcheroo: enabled
[drm:intel-stuff] x30
[ 27.090241] HDA Intel 0000:00:1b.0: BAR 0: set to [mem 0xdc500000-0xdc503fff 64bit] (PCI address [0xdc500000-0xdc503fff])
[ 27.090256] HDA Intel 0000:00:1b.0: enabling device (0000 -> 0002)
[ 27.090290] HDA Intel 0000:00:1b.0: PCI INT A -> GSI 22 (level, low) -> IRQ 22
[ 27.090462] HDA Intel 0000:00:1b.0: irq 45 for MSI/MSI-X
[ 27.090495] HDA Intel 0000:00:1b.0: setting la...
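
The ordering observation above can be checked mechanically across many logs. The Python sketch below reports which driver's "PCI INT A" line appears first in the log of a single boot. It encodes only the pattern described in this comment for this one machine, not a confirmed rule, and the function name probe_order is invented for the example.

```python
def probe_order(lines):
    """Return "i915" or "radeon", whichever logs its PCI INT A line first.

    Returns None when one of the two lines is missing from the boot log.
    """
    first_seen = {}
    for index, line in enumerate(lines):
        if "PCI INT A" not in line:
            continue
        for driver in ("i915", "radeon"):
            # Match e.g. "] i915 0000:00:02.0: PCI INT A ..." but not "HDA Intel".
            if driver not in first_seen and f"] {driver} 0000:" in line:
                first_seen[driver] = index
    if len(first_seen) < 2:
        return None
    return min(first_seen, key=first_seen.get)


crashing_boot = [
    "[ 18.460593] i915 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16",
    "[ 18.559660] radeon 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16",
    "[ 18.842620] HDA Intel 0000:00:1b.0: PCI INT A -> GSI 22 (level, low) -> IRQ 22",
]
print(probe_order(crashing_boot))
```

On the crashing excerpt the function returns "i915", matching the first case described above; whether the same correlation holds on other hardware is exactly what more logs would have to show.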


Revision history for this message
Zentai Andras (andras-zentai) wrote :

Dear All,

I experienced the same problem with another switchable graphics configuration:
an ATI Radeon HD 3650 discrete card and some Intel card (Lenovo ThinkPad T500).

Boot hangs and I got the same error message, just the pci id was different:
udevd-work[86]: '/sbin/modprobe -bv unexpected exit with status 0x0009

Disabling the switchable graphics option in the BIOS resulted in a good boot using either the Radeon or the Intel video card.

I suggest adding the HD 3650 model to the title of the bug.

Revision history for this message
Jean Demange (jea-demange) wrote :

Hey,
After seven failed boots, I managed to get a good boot; during this good boot, the disk check started.
When the boot was unsuccessful, I couldn't access a tty, but the magic SysRq keys worked.

I'm attaching the successful part of the kern.log, followed by the bad part, and also the entire kern.log because I'm not really sure of my cuts.

The option “drm.debug=0x0e” was activated.

Revision history for this message
Jean Demange (jea-demange) wrote :
Revision history for this message
Jean Demange (jea-demange) wrote :
Changed in linux:
importance: Unknown → Medium
status: Unknown → Confirmed
Revision history for this message
Seth Forshee (sforshee) wrote :

Thanks for the logs.

I've posted a test build that I'd like to receive some feedback on. It uses the ACPI method to try and enable power to the card before trying to do any hardware initialization. If it works I'll run it by the upstream developers to see whether this is an appropriate solution to the problem. You can get the build at

http://people.canonical.com/~sforshee/lp727620/linux-2.6.38-9.43~lp727620v201105271601/

Please post your feedback here after testing. Thanks!

Changed in linux (Ubuntu):
status: Incomplete → In Progress
status: In Progress → Incomplete
Revision history for this message
belltown (sea-av80r) wrote :

@Seth Forshee

I'd like to try out this patch. My kernel version is 2.6.38-8-generic and I'm running an x64 system.

I assume I need your linux-headers-*_amd64.deb and linux-image-*_amd64.deb files. How do I install them, and do I need to do anything with the .patch file?

Thanks.

Revision history for this message
Seth Forshee (sforshee) wrote :

To install, download the two .deb files that match your installation (*_i386.deb for 32-bit and *_amd64.deb for 64-bit) and the linux-headers-*_all.deb file into a new directory. Then open a terminal, navigate to that directory, and run 'sudo dpkg -i *.deb'.

If you're unsure whether you have a 32-bit or 64-bit installation, run 'uname -m' in a terminal. If it outputs i686 get the *_i386.deb files and if it outputs x86_64 get the *_amd64.deb files.
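
The same choice can be written down as a small lookup. This Python sketch covers only the two machine types named above; the names DEB_SUFFIX and deb_suffix are made up for the example, and anything else is rejected rather than guessed.

```python
# Mapping taken from the instructions above: `uname -m` output -> package suffix.
DEB_SUFFIX = {"i686": "i386", "x86_64": "amd64"}


def deb_suffix(machine: str) -> str:
    """Return the .deb architecture suffix for a `uname -m` value."""
    try:
        return DEB_SUFFIX[machine]
    except KeyError:
        raise ValueError(f"no test build listed for machine type {machine!r}") from None


print(deb_suffix("x86_64"))  # the *_amd64.deb files
print(deb_suffix("i686"))    # the *_i386.deb files
```

On Linux, Python's platform.machine() reports the same string as 'uname -m', so deb_suffix(platform.machine()) gives the answer for the running system.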

Revision history for this message
Jean Demange (jea-demange) wrote :

Hello,
I've tested it. The boot starts correctly and I can get to gdm. But 5 seconds after the login screen appears, when I try to log in, I am returned to a black screen with white text. I still have a mouse pointer, and I can log in by entering my password blindly. I can hear the Ubuntu startup sound, but X11 doesn't work.

I attach the kern.log.

Revision history for this message
Ap. Syvertsen (taperkatt) wrote :

Hi Seth,
Thanks a lot for working on this. Unfortunately I was unable to reach Unity at all with the new kernel, but it crashes in a distinctly new way. Earlier the computer stopped without any backlight, but I was able to reach VT1, which says "Preparing flip with no unpin work?" and VT7, which displayed the kernel BUG from the driver loading.

Now, out of five times, two times the computer stopped with a black screen with backlight, but I was unable to reach any VT.
The following three times, the computer wanted to check the disk, which, after the disk check, resulted in the loading logo (Ubuntu with dots underneath) being displayed on my computer for the first time since the upgrade to 11.04 (usually the complete loading of Linux is black).

The computer froze after this, two times with the Ubuntu logo being displayed and once with the call trace from the BUG being displayed. I was unable to use Alt-F1/F7.

Attached is the kern.log

Revision history for this message
aproposnix (aproposnix) wrote :

I just wanted to add that with the HD5650 on an Acer Aspire I have the same issues. One thing that doesn't seem to get mentioned here much is the vesafb. More than half the time I boot I get an error stating that there was an error inserting vesafb. I sometimes also receive an error stating that the module vesafb.ko was not found. Either way, the system fails to boot.

I have no idea if this information is being output to a log somewhere as it seems to occur even before the filesystems are mounted. Can someone suggest to me which log to look for?

Revision history for this message
Marco Trevisan (Treviño) (3v1n0) wrote :

Same here. Testing these new kernel .debs doesn't fix the issue. On the contrary, in my case I was never able to boot using this kernel, while booting works with a non-patched kernel.

Revision history for this message
Seth Forshee (sforshee) wrote :

Thanks to everyone who tested. I've passed this information along to upstream.

@harry, are you getting the vesafb messages with natty? The natty kernel has the vesafb driver built into the kernel, so there isn't any vesafb.ko. Do you have a log with the messages you're talking about?

Revision history for this message
aproposnix (aproposnix) wrote :

Seth, yeah I'm on natty. Can you help me identify the log you need? I'm not sure which is relevant.

Revision history for this message
Seth Forshee (sforshee) wrote :

harry, let's start with /var/log/kern.log. Grab it from a boot when you've seen the vesafb messages. Thanks!

Revision history for this message
aproposnix (aproposnix) wrote :

@Seth, Does it matter that the error occurs before the drives are mounted? I'm not sure it's actually saving the log when the error occurs.

Either way, on the next occurrence, I'll send this log.

Revision history for this message
Seth Forshee (sforshee) wrote :

harry, the whole of the kernel log will get saved to kern.log as long as the system is booting that far. If it's not booting that far and you're able to get a terminal then you can try to collect the output of the dmesg command. Failing that, you could try booting into recovery mode, and if you get the errors then try to collect dmesg. Otherwise the best you can do is probably to supply the exact text of whatever messages you see when this happens (taking a picture of the screen is one option).

Seth Forshee (sforshee)
Changed in linux (Ubuntu):
status: Incomplete → In Progress
Revision history for this message
z06gal (z06gal) wrote :

I am running Mint 11 32bit and am experiencing this bug. I upgraded yesterday to the 2.6.39.1 kernel and fortunately it resolved the power regression issue I was having but I continue to get this message during boot. The first message that comes up is "could not start bootsplash = could not access a shared library" and this is followed by the error being discussed here. After those 2 lines come up, my computer will boot right up and there are no more issues. It boots the same whether I use battery or not. I have no idea if this is a part of this issue but when I run powertop, I see at the top i915 <interrupt> always. Here is the info on my dell xps:

robin@robin-Dell-System-XPS-L702X ~ $ lspci
00:00.0 Host bridge: Intel Corporation 2nd Generation Core Processor Family DRAM Controller (rev 09)
00:01.0 PCI bridge: Intel Corporation 2nd Generation Core Processor Family PCI Express Root Port (rev 09)
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)
00:16.0 Communication controller: Intel Corporation 6 Series Chipset Family MEI Controller #1 (rev 04)
00:1a.0 USB Controller: Intel Corporation 6 Series Chipset Family USB Enhanced Host Controller #2 (rev 05)
00:1b.0 Audio device: Intel Corporation 6 Series Chipset Family High Definition Audio Controller (rev 05)
00:1c.0 PCI bridge: Intel Corporation 6 Series Chipset Family PCI Express Root Port 1 (rev b5)
00:1c.1 PCI bridge: Intel Corporation 6 Series Chipset Family PCI Express Root Port 2 (rev b5)
00:1c.3 PCI bridge: Intel Corporation 6 Series Chipset Family PCI Express Root Port 4 (rev b5)
00:1c.4 PCI bridge: Intel Corporation 6 Series Chipset Family PCI Express Root Port 5 (rev b5)
00:1c.5 PCI bridge: Intel Corporation 6 Series Chipset Family PCI Express Root Port 6 (rev b5)
00:1d.0 USB Controller: Intel Corporation 6 Series Chipset Family USB Enhanced Host Controller #1 (rev 05)
00:1f.0 ISA bridge: Intel Corporation HM67 Express Chipset Family LPC Controller (rev 05)
00:1f.2 SATA controller: Intel Corporation 6 Series Chipset Family 6 port SATA AHCI Controller (rev 05)
00:1f.3 SMBus: Intel Corporation 6 Series Chipset Family SMBus Controller (rev 05)
01:00.0 VGA compatible controller: nVidia Corporation Device 0dcd (rev a1)
03:00.0 Network controller: Intel Corporation Centrino Wireless-N 1030 (rev 34)
04:00.0 USB Controller: NEC Corporation uPD720200 USB 3.0 Host Controller (rev 04)
05:00.0 System peripheral: JMicron Technology Corp. SD/MMC Host Controller (rev 30)
05:00.2 SD Host controller: JMicron Technology Corp. Standard SD Host Controller (rev 30)
05:00.3 System peripheral: JMicron Technology Corp. MS Host Controller (rev 30)
05:00.4 System peripheral: JMicron Technology Corp. xD Host Controller (rev 30)
0a:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 06)

Revision history for this message
Vangel Ajanovski (ajanovski) wrote :

The behaviour has changed a bit after latest updates.
I have a HP Pavilion dm4t
uname -a
2.6.38-10-generic #44-Ubuntu SMP Thu Jun 2 21:32:22 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

I have the feeling that a month ago I needed 6-7 reboots until working OK, now I only need 3-4.

Another change is that after it starts OK, the screen is dimmed so much that it is effectively turned off (my laptop has a feature that turns off the backlight). So in order to log in I have to increase the brightness (switch the backlight on).

With previous kernels the screen was backlit after X loaded successfully.

Revision history for this message
Vangel Ajanovski (ajanovski) wrote :

I forgot to say that I also have included the ubuntu-x-swat/x-updates ppa.

Revision history for this message
aproposnix (aproposnix) wrote :

@Seth I still can't seem to find the error message that I see on boot freeze in the logs. I tried taking a photo of it. Maybe it'll help?

uname -a
Linux ClarifyUbuntu 2.6.38-8-generic #42-Ubuntu SMP Mon Apr 11 03:31:24 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

Revision history for this message
Seth Forshee (sforshee) wrote :

@harry, what that message probably means is that the bootloader is using something other than VESA for graphics, VGA perhaps. It's probably nothing to worry about, and I don't think it has anything to do with the problem during the radeon driver probe.

Revision history for this message
aproposnix (aproposnix) wrote :

@Seth You wouldn't know how to fix this would you? :)

I do worry about it though, as I am constantly having to reboot many times until I finally get into Ubuntu.

It's annoying as well as embarrassing around my Mac- and Windows-using colleagues. They think I'm an idiot for using Ubuntu... maybe I am?

Revision history for this message
Marco Trevisan (Treviño) (3v1n0) wrote :

According to https://bugzilla.kernel.org/show_bug.cgi?id=30052#c17 this seems to be fixed in recent kernel releases. I think that the patches fixing it should be backported!

> I can't reproduce this bug on 2.6.39-git19 and 3.0-rc3.
> Seems bug fixed.
> Thanks!

Revision history for this message
Seth Forshee (sforshee) wrote :

No one has identified what patches fix the issue, if it is indeed fixed at all. The thing to do now is to test these kernels on more affected machines, and if it fixes the problem universally we can try to find the fix and backport it. You can get mainline kernels for Ubuntu from the following link:

  http://kernel.ubuntu.com/~kernel-ppa/mainline/

I don't know exactly what version 2.6.39-git19 corresponds to, but a 3.0-rc3 is definitely available. If you find a build that does fix the problem it would also be useful to move backward one version at a time until you find the last version that is still broken.

I'll also look to see if I can find out what changes might have fixed the issue.

Thanks!

Changed in linux (Ubuntu):
status: In Progress → Incomplete
Revision history for this message
Jean Demange (jea-demange) wrote :

Hi every body,

I've tried some kernels and found that kernel 3.0-rc2 seems to work properly, but 2.36.1 doesn't work. I didn't try 3.0-rc1.

Revision history for this message
Seth Forshee (sforshee) wrote :

@Jean, thanks for testing.

I don't see anything in the radeon driver changes that looks like it obviously fixes the issue, but I've identified a handful that could be responsible. These commits were introduced between 2.6.39-rc5 and -rc7. What would be most helpful right now would be to be able to state something like, "version x doesn't work but version x+1 does." At this point I'd suggest starting with 2.6.39-rc5 through -rc7.

Revision history for this message
Ap. Syvertsen (taperkatt) wrote :

Hi,
Just a quick question. Isn't v2.6.39.1-oneiric/ after v2.6.39-rc7-oneiric/ ?

Because for me, 2.6.39-1 (which I'm guessing Jean is also referring to, since 11.04 comes with 2.6.38-8) doesn't work while 3.0-rc3 works. Testing rc1 now.

Revision history for this message
Ap. Syvertsen (taperkatt) wrote :

Yup,

My linux crashes occasionally during startup with kernel 2.6.39-1 with this kernel BUG, while I'm unable to produce this on 3.0-rc1.

It is also worth noting that my rc.local script that turns off the power to the radeon card works on 3.0-rc1 while it was not run on earlier kernels even the times it booted. (I would have to run it manually after startup).

Seth Forshee (sforshee)
Changed in linux (Ubuntu):
status: Incomplete → In Progress
Revision history for this message
Seth Forshee (sforshee) wrote :

Yes, 2.6.39-1 is after 2.6.39-rc7.

I've been looking between 2.6.39 and 3.0-rc1 now, but nothing stands out as the change that potentially fixes this issue. The next step is to perform a bisection to try and locate the commit that fixes the problem. I'll provide a series of kernels, please test each one and let me know whether or not it contains the issue.

Everyone who is able to test please do so, as it can't hurt to have multiple people testing.

Bisect build #000 is available at:

  http://people.canonical.com/~sforshee/lp727620/bisect/

Bisect log so far (note that since we're hunting for the commit that fixes the problem, the meanings of 'good' and 'bad' are swapped in the log).

# bad: [55922c9d1b84b89cb946c777fddccb3247e7df2c] Linux 3.0-rc1
# good: [61c4f2c81c61f73549928dfd9f3e8f26aa36a8cf] Linux 2.6.39
git bisect start 'v3.0-rc1' 'v2.6.39'

Changed in linux (Ubuntu):
status: In Progress → Incomplete
Revision history for this message
Ap. Syvertsen (taperkatt) wrote :

Are these bisects between 2.6.39-1 and 3.0-rc1 or between 2.6.39 and 3.0-rc1?

I'm more than willing to try the bisects, but as I've already outlined, 2.6.39-1 is already "good", it crashes, and I don't want to try too many bisects.

Revision history for this message
Seth Forshee (sforshee) wrote :

@Ap. Syvertsen: Let's try to lay out everything clearly to make sure there isn't any confusion. The swapping of the meanings of "good" and "bad" is bound to cause confusion, so let's use the usual meanings for any discussion with the understanding that the meanings are swapped *only in the bisect logs.*

You said, "My linux crashes occasionally during startup with kernel 2.6.39-1 with this kernel BUG, while I'm unable to produce this on 3.0-rc1." I took that to mean 2.6.39.1 is bad, i.e. exhibits an oops message similar to that in the bug description. And version 3.0-rc1 does not exhibit the oops. Is that understanding correct?

It's a little problematic to bisect between 2.6.39.1 and 3.0-rc1. The stable kernels (anything with the fourth component of the version number, i.e. 2.6.39.y) are somewhat of a branch off of normal kernel development, and the bisection will go more quickly if we start with 2.6.39. This version should still have the bug if 2.6.39.1 does, unless it was fixed and then broken again, but that's unlikely. It can't hurt to verify that 2.6.39 really does have the problem however.

So I started the bisection by marking 2.6.39 as bad (i.e. "good" in the bisect log) and 3.0-rc1 as good (i.e. "bad" in the bisect logs). git picked an intermediate version for testing, and that is what the bisect000 build represents. That is correct so far as my understanding of the situation as communicated above is correct. Does that make sense?
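
The swapped labels can be pictured with a small model. The Python sketch below treats the history as a straight line of commits, which is a simplification of the real commit graph that git walks, and searches for the first commit without the bug. It assumes the oldest commit has the bug, the newest does not, and the bug never returns once fixed; git_label and find_fix are names invented for this illustration.

```python
def git_label(has_bug: bool) -> str:
    """Label recorded in the bisect log when hunting for the FIXING commit.

    Meanings are swapped: a kernel that still has the bug is "good".
    """
    return "good" if has_bug else "bad"


def find_fix(commits, has_bug):
    """Binary search for the first commit in `commits` without the bug.

    `commits` is ordered oldest to newest; `has_bug(commit)` is the test result.
    """
    low, high = 0, len(commits) - 1  # low: known buggy, high: known fixed
    while high - low > 1:
        mid = (low + high) // 2
        if has_bug(commits[mid]):
            low = mid   # still broken here, the fix comes later
        else:
            high = mid  # already fixed here, the fix is this commit or earlier
    return commits[high]


# Ten pretend commits; the (made-up) fix lands in commit 6.
history = list(range(10))
print(find_fix(history, lambda commit: commit < 6))
print(git_label(has_bug=True))
```

Each test build corresponds to one pass through the loop, so every round of feedback halves the range that still has to be searched.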

Revision history for this message
Jean Demange (jea-demange) wrote :

For me bisect #000 does not work. The bug affects it.

Revision history for this message
Seth Forshee (sforshee) wrote :

Second build (bisect001) is now available in the same location.

Bisect log:

# bad: [55922c9d1b84b89cb946c777fddccb3247e7df2c] Linux 3.0-rc1
# good: [61c4f2c81c61f73549928dfd9f3e8f26aa36a8cf] Linux 2.6.39
git bisect start 'v3.0-rc1' 'v2.6.39'
# good: [c44dead70a841d90ddc01968012f323c33217c9e] Merge branch 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
git bisect good c44dead70a841d90ddc01968012f323c33217c9e

Revision history for this message
Klaus Reichl (klaus-reichl) wrote : Re: [Bug 727620] Re: [Radeon HD 5650 and 5470] Kernel BUG during recovery boot and in normal boot (Hybrid graphics)

Unfortunately I don't have the laptop around where the bug showed up
originally for me.
I'll be there again on one of the next weekends.

But be warned, I thought the bug had gone after updating natty two
weeks ago.
I happily installed packages, got googleearth running, everything
looked great, and then ...
after the next reboot - lose lose lose.

Switched back to poor graphics with radeon blacklisted to have at least a
base system working.

Klaus
--
Klaus Reichl                                    <email address hidden>
Danhausergasse 8/16                           +43 6991 84 137 94
1040 Wien


Revision history for this message
Jean Demange (jea-demange) wrote :

Hello every body,

Bisect #001 works for me. The bug does not show up. With this kernel my CPU is always at 100%, but that seems to have nothing to do with the bug.

Revision history for this message
Seth Forshee (sforshee) wrote :

Hrm, I've got a lurking suspicion that this issue has a timing component
(and that could actually be why it works in 3.0: not that it's really
fixed, but that some change in timing makes the problem much harder to
trigger). If your CPU is being pegged then that could be
affecting the results.

We'll mark this one as good for now, but if the bisect goes bad we may
come back to this point.

Third build (bisect002) is now available.

# bad: [a09ed5e00084448453c8bada4dcd31e5fbfc2f21] vmscan: change shrink_slab() interfaces by passing shrink_control
git bisect bad a09ed5e00084448453c8bada4dcd31e5fbfc2f21

Revision history for this message
Jean Demange (jea-demange) wrote :

Same behavior with this bisect. No X bug but CPU at 100% all the time.

Revision history for this message
Ap. Syvertsen (taperkatt) wrote :

Hi Seth, thanks a lot for clarifying.

I've now tested the three bisects: bisect000 crashes, while bisect001 boots every time. I have no problems with CPU use either.

For me however, bisect002 also crashes during start-up (two out of three times).

As a side note, I had to install the headers of the bisects to make it crash, if I didn't install the headers I could boot all the kernels.

Revision history for this message
Seth Forshee (sforshee) wrote :

@Ap. Syvertsen: That's interesting about the headers. With the next
build that tests bad for you, can you compare the lsmod output with and
without the headers packages installed?

bisect003 is now available.

# good: [9461702d2a54cd4d9da09b7755c96815791a9d07] m68k: let Makefile sort out compiling mmu and non-mmu lib/checksum.c
git bisect good 9461702d2a54cd4d9da09b7755c96815791a9d07

Revision history for this message
Ap. Syvertsen (taperkatt) wrote :

First of all, bisect003 crashes as well, with or without headers.

It looks like I kinda jumped to a conclusion too fast, as I am unable to reproduce the no-header-boot link.

The reason I was led to believe this is that I installed the bisect001 image without headers and suddenly I could boot 2.6.39-1. I'm keeping this image and will also try it to check whether something else has changed the boot behavior. I've now tried for an hour to reproduce the no-header-boot link with no luck, so I would regard my previous statement as false.

Revision history for this message
Jean Demange (jea-demange) wrote :

So, I've tried the last bisect, and it doesn't work: only 2 boots out of 10 succeed. So I wondered whether the earlier ones really worked: bisect 002 crashes every time too, whereas bisect 001 worked every time.
For information, I have to add the option --force-depends for every kernel.

Revision history for this message
Seth Forshee (sforshee) wrote :

bisect004 is now available.

# good: [2b030bda66b0a59f8ebf0ce2117088256a5f9f97] mach-ux500: set proper I2C platform data from MOP500s
git bisect good 2b030bda66b0a59f8ebf0ce2117088256a5f9f97

Revision history for this message
Jean Demange (jea-demange) wrote :

Bisect 004 works like bisect 001: no X bug, but CPU at 100% all the time.

Revision history for this message
Ap. Syvertsen (taperkatt) wrote :

Yup, same with me. bisect004 boots fine.

Revision history for this message
Seth Forshee (sforshee) wrote :

bisect005 is now available.

# bad: [931474c4c30633400ff0dff8fb452ae20e01d067] Merge branch 'drm-radeon-next' of /ssd/git/drm-radeon-next into drm-core-next
git bisect bad 931474c4c30633400ff0dff8fb452ae20e01d067

Revision history for this message
Jean Demange (jea-demange) wrote :

Bisect 005 works fine : no X bug and no CPU issue.

Revision history for this message
Seth Forshee (sforshee) wrote :

bisect006 is now available.

# bad: [69f7876b2ab61e8114675d6092ad0b482e233612] Merge remote branch 'keithp/drm-intel-next' of /ssd/git/drm-next into drm-core-next
git bisect bad 69f7876b2ab61e8114675d6092ad0b482e233612

Revision history for this message
Jean Demange (jea-demange) wrote :

Bisect 006 works fine too : no X bug and no CPU issue.

Revision history for this message
Seth Forshee (sforshee) wrote :

bisect007 is now available.

# bad: [4b65177b27ede9dee3186bc3a58c737997ee4749] drm/i915: add IS_IVYBRIDGE macro for checks
git bisect bad 4b65177b27ede9dee3186bc3a58c737997ee4749

Revision history for this message
Jean Demange (jea-demange) wrote :

Bisect 007 does not work as well as the others. Half of the boots end badly. But when it fails there is no text on the screen, just some dots scattered everywhere. It is impossible to change tty.

Revision history for this message
Seth Forshee (sforshee) wrote :

@Jean: So as I understand it you were unable to tell whether you were
getting the oops or not? Did you check your kern.log to see whether or
not it appears when booting with the bisect007 kernel?

For now I'm going to wait and see if we get some more testing on this
one.

Revision history for this message
bruise lee (workformydream) wrote :

confirm: 2.6.39-2 from mainline does not fix it. 50% chance for a good boot

HP DV6tqe with dual GPU of intel/ati(good boot reports 2G NVRAM, bad boot report 3G)

Revision history for this message
Jean Demange (jea-demange) wrote :

I think it is the same bug, but the symptoms are slightly different. I don't know how to recognize this bug in the logs, though. I've attached the kern.log from a failed boot.

Revision history for this message
Seth Forshee (sforshee) wrote :

@Jean, that log does display the bug.

bisect008 is now available. We're getting close to the end, should be 3 builds after this one. It's looking like it will end up identifying a change to the i915 driver.

# good: [2c34b850ee1e9f86b41706149d0954eee58757a3] drm/i915: fix ilk rc6 teardown locking
git bisect good 2c34b850ee1e9f86b41706149d0954eee58757a3

Revision history for this message
DLHDavidLH (dlhdavidlh-yahoo) wrote :

this bug also affects

AMD Radeon HD 3300

Revision history for this message
DLHDavidLH (dlhdavidlh-yahoo) wrote :

this bug affects

Ubuntu 11.04 -and- 11.10

Revision history for this message
Jean Demange (jea-demange) wrote :

Same issue as the previous one.

Revision history for this message
Seth Forshee (sforshee) wrote :

bisect009 is now available.

# good: [fcca7926299944841569515da321bef9655b7703] drm/i915: reference counted forcewake
git bisect good fcca7926299944841569515da321bef9655b7703

Revision history for this message
Viktor Pal (deere) wrote :

Can confirm this bug with AMD Radeon HD 6470M

Revision history for this message
adrianszwej (adrian-szwej) wrote :

Yet another confirmation with AMD Radeon HD 6470M on Dell Vostro 3350

lspci | grep -i radeon
01:00.0 VGA compatible controller: ATI Technologies Inc NI Seymour [AMD Radeon HD 6470M] (rev ff)

uname -a
2.6.38-8-generic #42-Ubuntu SMP Mon Apr 11 03:31:24 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

bin$ cat /etc/issue
Ubuntu 11.04 \n \l

I noticed that switching between mains power and battery mode triggers this more frequently.
But having had this laptop for only one week, I don't really see the pattern.
Does the radeon remember that it should sort of resume the card??

Now I have blacklisted both radeon and fglrx and I seem to be able to boot with less struggle.
In /etc/rc.local I modprobe radeon and run switcheroo. Most of the time the fan goes off, but sometimes I have to run the switcheroo command in my gnome-session.

/Adrian

Revision history for this message
adrianszwej (adrian-szwej) wrote :

kern.log from ubuntu 11.04.

Revision history for this message
aproposnix (aproposnix) wrote :

Sorry to sound stupid, but I think there are many people affected by this.... in layman terms, what's the status of this issue?

Revision history for this message
Seth Forshee (sforshee) wrote :

On Sat, Jul 09, 2011 at 09:34:42PM -0000, harry wrote:
> Sorry to sound stupid, but I think there are many people affected by
> this.... in layman terms, what's the status of this issue?

The status is that there are multiple reports of this being fixed in
kernel version 3.0, and if this is true then the issue is fixed for
oneiric. We're currently undergoing a process known as bisection to
identify what changes fixed the problem in 3.0 so that the problem can
be fixed in earlier versions. Anyone experiencing this issue can help
with the bisection process by testing the bisect builds that I am
posting and reporting whether or not the crash is occurring for each of
the builds.
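
As a rough illustration of the bisection idea described above, the Python sketch below searches an ordered list of commits for the first one where the bug no longer appears. The commit names and the `has_bug` callback are hypothetical stand-ins, not part of the actual bisect builds, and the sketch assumes each test gives a reliable answer, which an intermittent oops does not guarantee.

```python
def first_fixed(commits, has_bug):
    """Return the first commit for which has_bug(commit) is False.

    Assumptions (hypothetical setup for illustration):
      * commits are ordered oldest to newest and there are at least two,
      * the oldest commit shows the bug and the newest does not,
      * there is exactly one transition from "buggy" to "fixed".
    """
    lo, hi = 0, len(commits) - 1  # lo: known buggy, hi: known fixed
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if has_bug(commits[mid]):
            lo = mid  # fix must be later than mid
        else:
            hi = mid  # fix is at mid or earlier
    return commits[hi]


# Eight made-up commits; the bug disappears from "c5" onwards.
commits = ["c%d" % i for i in range(8)]
print(first_fixed(commits, lambda c: int(c[1:]) < 5))
```

Each step halves the remaining range, which is why a few thousand commits between two releases need only a dozen or so test builds. With an intermittent failure, `has_bug` would in practice have to wrap many boots of the same build.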

Revision history for this message
Martin Stjernholm (msub) wrote :

[Side note: Had to install module-init-tools from oneiric to satisfy deps in these bisect builds.]

Just to double check a little, I've tried the three latest bisects 007-009, and I've seen the oops in evergreen_cp_resume at least once in each of them.

However, the trigger rate is fairly low for me: bisect 009 gave the oops only 2 times out of 10, which means that concluding its absence with a reasonable degree of certainty would require 50 boots or so.

I also consistently get the no backlight bug with all builds (is there a separate report for that?). If one isn't aware of that issue, a successful start (wrt this bug) may be confused with a hang with no tty output.

Btw, bisect 007 also gave an oops in radeon_gart_unbind. It was not inside any pci probe stuff but rather in some sort of cleanup code inside drm_release. I can attach the log if it's interesting.
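
The estimate above can be checked with a little arithmetic. If an affected build oopses on each boot independently with probability p, the chance of seeing n clean boots in a row is (1 - p)^n. The sketch below is only an illustration: the 20% rate comes from the 2-in-10 figure above, independence between boots is an assumption, and the function names are made up for this example.

```python
import math


def miss_chance(trigger_rate, boots):
    """Chance that an affected build survives `boots` boots without an oops."""
    return (1.0 - trigger_rate) ** boots


def boots_needed(trigger_rate, max_miss_chance):
    """Smallest number of clean boots that pushes the miss chance below the limit."""
    if not 0.0 < trigger_rate < 1.0:
        raise ValueError("trigger_rate must be strictly between 0 and 1")
    if not 0.0 < max_miss_chance < 1.0:
        raise ValueError("max_miss_chance must be strictly between 0 and 1")
    return math.ceil(math.log(max_miss_chance) / math.log(1.0 - trigger_rate))


print(boots_needed(0.2, 0.05))  # boots for a 5% chance of a false "good"
print(boots_needed(0.2, 0.01))  # boots for a 1% chance of a false "good"
print(miss_chance(0.2, 50))     # residual chance after 50 clean boots
```

Under these assumptions 14 clean boots leave about a 5% chance of wrongly passing an affected build and 21 boots about 1%, so 50 boots is on the cautious side. The 20% rate is itself a rough estimate from only 10 boots, which argues for keeping some margin.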

Revision history for this message
Seth Forshee (sforshee) wrote :

Martin, thanks for testing. Your results at least match those reported
for the builds you tested. I don't know whether there's a bug for your
backlight issue. If you can't find one then you should open a new one.

bisect010 is now available.

# good: [8eb572942ca02890f590d9251233038e27dd3842] drm/i915: forcewake debugfs fix
git bisect good 8eb572942ca02890f590d9251233038e27dd3842

Revision history for this message
Martin Stjernholm (msub) wrote :

I have now tried bisect 010 and got the oops in evergreen_cp_resume on the third boot.

Revision history for this message
Seth Forshee (sforshee) wrote :

bisect011 is now available.

# good: [4697995b98417c6da9ab2708a36f5e2bc926c8ac] drm/i915: split irq handling into per-chipset functions
git bisect good 4697995b98417c6da9ab2708a36f5e2bc926c8ac

Revision history for this message
Bryce Harrington (bryce) wrote :

[Closing out the -ati task; the issue has been pretty definitively narrowed to the kernel and the kernel team is active on it; nothing else we should do at the X end for this bug.]

Changed in xserver-xorg-video-ati (Ubuntu):
status: Triaged → Invalid
Revision history for this message
Martin Stjernholm (msub) wrote :

Can confirm the evergreen_cp_resume oops in bisect 11 (on fifth boot).

Revision history for this message
Martin Stjernholm (msub) wrote :

Since I've always gotten the bug in all the bisects I've tried, I thought I better reverify bisect 006 which has been reported as working by others. Unfortunately it didn't work for me, so we're getting different results here. :(

I didn't get the oops in the log file, but I checked on screen that the top 6 frame names were the same as in the oops at the top of this ticket. If it's of any use, I can try to trigger it again (in bisect 006, that is) and write it down.

Revision history for this message
Seth Forshee (sforshee) wrote :

Thanks for checking, Martin. I was afraid that might happen with this bug. Can you also try the other builds before 006 that were reported as good (001, 004, 005) and see if any of those fail?

It's best if we can get as many testers as possible for each build. It's pretty clear when we get a bad result, but with the good results it's a lot less clear whether or not it's really good. More testers means a better chance of hitting the bug for each build.

Revision history for this message
josejuan05 (josejuan05) wrote :

I have a HP dm4-1100 (sometimes notated dm4t) with the 5470 switchable graphics. I tested bisects 001, 004, 005, 006, and 011.

001, 004, and 006 each booted properly 10 times out of 10.

005 failed the first three times before I gave up on it.

011 booted 10 out of 13 times. The last two times I booted 011 I didn't get the evergreen error, and my cursor appeared (and would move properly), but I couldn't do anything else. I took some pictures with my phone of the error on the last two times, and if you think it would be helpful I can transcribe them, but I'd rather not if you do not think it would be of any help.

Revision history for this message
josejuan05 (josejuan05) wrote :

Here's what I got as an error with bisect 011:

[ 29.400514] Stack:
[ 29.400524] ffff88014d4dfc38 ffffffffa01c7303 ffff88014d4dfc08 00000631815dd48e
[ 29.400561] ffff88014d4dfc38 ffff88013f2b3240 0000000000000002 ffff880143a60968
[ 29.400595] ffff880143a60560 ffff88014db00148 ffff88014d4dfc58 ffffffffa01c4e41
[ 29.400630] Call Trace:
[ 29.400653] [<ffffffffa01c7303>] radeon_gart_unbind+0xf3/0x160 [radeon]
[ 29.400687] [<ffffffffa01c4e41>] radeon_ttm_backend_unbind+0x21/0x30 [radeon]
[ 29.400717] [<ffffffffa016591f>] ttm_tt_unbind+0x2f/0x50 [ttm]
[ 29.400741] [<ffffffffa0165c52>] ttm_bo_cleanup_memtype_use+0x22/0x90 [ttm]
[ 29.400769] [<ffffffffa0166f21>] ttm_bo_cleanup_refs_or_queue+0x190/0x190
[ 29.400798] [<ffffffffa0166fdb>] ttm_bo_release+0x9b/0xd0 [ttm]
[ 29.400822] [<ffffffffa0166f40>] ? ttm_bo_cleanup_refs_or_queue+0x190/0x190
[ 29.400852] [<ffffffff812d9dc6>] kref_put+0x36/0x70
[ 29.400873] [<ffffffffa01660de>] ttm_bo_unref+0x3e/0x50 [ttm]
[ 29.400904] [<ffffffffa01c63e7>] radeon_bo_unref+0x47/0x80 [radeon]
[ 29.400937] [<ffffffffa012df30>] ? drm_gem_object_release+0x20/0x20 [drm]
[ 29.400971] [<ffffffffa01ddb46>] radeon_gem_object_free+0x26/0x30 [radeon]
[ 29.401000] [<ffffffffa012df5a>] drm_gem_object_free+0x2a/0x30 [drm]
[ 29.401024] [<ffffffff812d9dc6>] kref_put+0x36/0x70
[ 29.401052] [<ffffffffa01d5994>] radeon_user_framebuffer_destroy+0x44/0x70 [radeon]
[ 29.401085] [<ffffffffa013b983>] drm_fb_release+0x83/0xb0 [drm]
[ 29.401112] [<ffffffffa012d930>] drm_release+0x340/0x3e0 [drm]
[ 29.401136] [<ffffffff811631ee>] __fput+0xbe/0x210
[ 29.402583] [<ffffffff81163365>] fput+0x25/0x30
[ 29.404042] [<ffffffff8115f9f6>] flip_close+0x66/0x90
[ 29.405462] [<ffffffff8115fad4>] sys_close+0xb4/0x120
[ 29.406859] [<ffffffff815e5942>] system_call_fastpath+0x16/0x1b
[ 29.408252] Code: ea ff ff ff 48 8b 8f 80 03 00 00 85 f6 78 21 3b b7 68 03 00 00 77 19 c1 e6 03 48 81 e2 00 f0 ff ff 48 63 f6 48 83 ca 67 48 01 f1
[ 29.408451] 89 11 31 c0 5d c3 0f 1f 40 00 55 48 89 e5 53 48 83 ec 08 0f
[ 29.411472] RIP [<ffffffffa01f8495>] rs600_gart_set_page+0x35/0x40 [radeon]
[ 29.412948] RSP <ffff88014d4dfbe8>
[ 29.414386] CR2: ffffc90011601088

Revision history for this message
Martin Stjernholm (msub) wrote :

Because testing good bisects is tedious, I went backwards:

Bisect 005 is inconclusive since it bugs out with no tty output. Got one kernel hang, but since there's no tty output I don't know if it's this bug or not (nothing in kern.log from that boot either).

Bisect 004 does not show the bug after 30 boots.

@josejuan05: I suspect the radeon_gart_unbind oops is unrelated. I have seen it too, as well as another user in comment #17. Perhaps there is (or should be) a separate ticket for it.

Revision history for this message
Seth Forshee (sforshee) wrote :

I'm trying to construct a list of good and bad commits to use for starting a new bisection run. Please take note that it only takes a single occurrence of the evergreen_cp_resume oops to qualify a commit as bad, so there's no need to continue testing once you've seen that oops.

I do agree that the radeon_gart_unbind oops looks different, so that shouldn't be considered a failure for the purposes of the bisection.

Here's the list I have now.

Bad builds: 000 003 005 006 007 008 009 010 011

Good builds: 001 004

Inconclusive: 005

Martin, just to verify -- with each of the builds you've indicated have failed for you, you've verified that you're getting the evergreen_cp_resume oops?

josejuan05, when you tested build 005, did you verify that you were getting an oops in evergreen_cp_resume? If so, we can move that one to the bad list.

Once this is all sorted out I'll use the information to seed a new bisection and start providing more builds to test.

Thanks everyone!
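
For testers unsure how to tell the two oopses apart, a minimal sketch of the check follows. It pulls function names out of call-trace lines in the format shown in this report and counts a boot as affected only if `evergreen_cp_resume` appears. The regular expression and the function names are assumptions for illustration; log formats vary between kernels.

```python
import re

# Matches call-trace entries such as
#   [<ffffffffa022aba0>] evergreen_cp_resume+0x3a0/0x630 [radeon]
# including the "? " marker that the kernel prints for unreliable frames.
FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?:\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def trace_functions(log_text):
    """Return the set of function names found in call-trace lines."""
    return set(FRAME.findall(log_text))


def counts_for_bisect(log_text):
    """True only if the evergreen_cp_resume oops shows up in the log text."""
    return "evergreen_cp_resume" in trace_functions(log_text)


sample = (
    "[ 29.400653] [<ffffffffa01c7303>] radeon_gart_unbind+0xf3/0x160 [radeon]\n"
    "[ 29.400937] [<ffffffffa012df30>] ? drm_gem_object_release+0x20/0x20 [drm]\n"
)
print(sorted(trace_functions(sample)))
print(counts_for_bisect(sample))
```

On a real system the text would come from the kernel log of the failed boot (on Ubuntu this is usually /var/log/kern.log, which is an assumption about the local setup). A trace that only contains radeon_gart_unbind, like the sample here, would not count as a failure for the bisection.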

Revision history for this message
Seth Forshee (sforshee) wrote :

Oops, just noticed that I included 005 in the bad and inconclusive lists. I meant for it to be only in the inconclusive list.

Revision history for this message
Martin Stjernholm (msub) wrote :

Yes, I have verified that it's an oops in evergreen_cp_resume that I've gotten at least once in each of bisects 006-011.

Revision history for this message
Seth Forshee (sforshee) wrote :

I've put up a new build (bisect012) at:

http://people.canonical.com/~sforshee/lp727620/bisect/

Here's the bisect log I used to re-seed the bisection.

# bad: [55922c9d1b84b89cb946c777fddccb3247e7df2c] Linux 3.0-rc1
# good: [61c4f2c81c61f73549928dfd9f3e8f26aa36a8cf] Linux 2.6.39
git bisect start 'v3.0-rc1' 'v2.6.39'
# good: [c44dead70a841d90ddc01968012f323c33217c9e] Merge branch 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
git bisect good c44dead70a841d90ddc01968012f323c33217c9e
# bad: [a09ed5e00084448453c8bada4dcd31e5fbfc2f21] vmscan: change shrink_slab() interfaces by passing shrink_control
git bisect bad a09ed5e00084448453c8bada4dcd31e5fbfc2f21
# good: [9461702d2a54cd4d9da09b7755c96815791a9d07] m68k: let Makefile sort out compiling mmu and non-mmu lib/checksum.c
git bisect good 9461702d2a54cd4d9da09b7755c96815791a9d07
# good: [2b030bda66b0a59f8ebf0ce2117088256a5f9f97] mach-ux500: set proper I2C platform data from MOP500s
git bisect good 2b030bda66b0a59f8ebf0ce2117088256a5f9f97
# bad: [931474c4c30633400ff0dff8fb452ae20e01d067] Merge branch 'drm-radeon-next' of /ssd/git/drm-radeon-next into drm-core-next
git bisect bad 931474c4c30633400ff0dff8fb452ae20e01d067
# skip: [69f7876b2ab61e8114675d6092ad0b482e233612] Merge remote branch 'keithp/drm-intel-next' of /ssd/git/drm-next into drm-core-next
git bisect skip 69f7876b2ab61e8114675d6092ad0b482e233612
# good: [4b65177b27ede9dee3186bc3a58c737997ee4749] drm/i915: add IS_IVYBRIDGE macro for checks
git bisect good 4b65177b27ede9dee3186bc3a58c737997ee4749
# good: [2c34b850ee1e9f86b41706149d0954eee58757a3] drm/i915: fix ilk rc6 teardown locking
git bisect good 2c34b850ee1e9f86b41706149d0954eee58757a3
# good: [fcca7926299944841569515da321bef9655b7703] drm/i915: reference counted forcewake
git bisect good fcca7926299944841569515da321bef9655b7703
# good: [8eb572942ca02890f590d9251233038e27dd3842] drm/i915: forcewake debugfs fix
git bisect good 8eb572942ca02890f590d9251233038e27dd3842
# good: [4697995b98417c6da9ab2708a36f5e2bc926c8ac] drm/i915: split irq handling into per-chipset functions
git bisect good 4697995b98417c6da9ab2708a36f5e2bc926c8ac

This build is testing commit fcfc768806f2ed8ad56d9fd3f0c6af1cdb5e10e2.
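
Note that the labels in this log follow from how the bisection was started: v3.0-rc1, where the oops is reportedly gone, is `bad`, and v2.6.39, where it occurs, is `good`, so the search converges on the commit that fixed the problem rather than on one that broke it. The sketch below, with a hypothetical helper name and illustrative wording, translates such log lines into what each label means for this bug.

```python
import re

# In this bisection the labels are inverted relative to everyday usage.
MEANING = {
    "good": "recorded as still affected",
    "bad": "recorded as no longer affected",
    "skip": "skipped as inconclusive",
}

# Only "git bisect good|bad|skip <sha>" lines are considered; comment lines
# and the "git bisect start" line are ignored.
VERDICT = re.compile(r"^git bisect (good|bad|skip) ([0-9a-f]{7,40})$")


def translate(log_text):
    """Return (short sha, meaning) pairs for each verdict line in a bisect log."""
    result = []
    for line in log_text.splitlines():
        match = VERDICT.match(line.strip())
        if match:
            result.append((match.group(2)[:7], MEANING[match.group(1)]))
    return result


log = """\
git bisect start 'v3.0-rc1' 'v2.6.39'
git bisect bad 931474c4c30633400ff0dff8fb452ae20e01d067
git bisect skip 69f7876b2ab61e8114675d6092ad0b482e233612
git bisect good 4697995b98417c6da9ab2708a36f5e2bc926c8ac
"""
for sha, meaning in translate(log):
    print(sha, meaning)
```

Reading the log this way makes it easier to match tester reports against the verdicts: a report of an evergreen_cp_resume oops corresponds to `git bisect good`, and a run of clean boots to `git bisect bad`.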

Revision history for this message
josejuan05 (josejuan05) wrote :

Ok.

So far I've had 11 consecutive successful boots with bisect012

I took another look at 005. While the screen goes black and I can bring up no TTY, kern.log shows no evergreen_cp_resume errors. Furthermore, the system eventually gets to the gdm login screen (I can hear the "ding noise" and can move my cursor and keys in order to restart the system, blindly), but the screen is still black. I managed to properly boot to the gdm login screen 20 times consecutively. Furthermore, I didn't get any kernel panics.

As it doesn't appear to exhibit the evergreen_cp_resume error, I believe for the purposes of the bisect that 005 should be in the "good" category, with 001 and 004 (and, by my testing, 012).

As for the _gart_unbind error, I will hold off on reporting it unless I see it again, since I have only seen it on a bisect.

Revision history for this message
Seth Forshee (sforshee) wrote :

bisect013 is now available.

http://people.canonical.com/~sforshee/lp727620/bisect/

# bad: [fcfc768806f2ed8ad56d9fd3f0c6af1cdb5e10e2] drm/nva3: support for memory timing map table
git bisect bad fcfc768806f2ed8ad56d9fd3f0c6af1cdb5e10e2

Now testing commit 2703c21a82301f5c31ba5679e2d56422bd4cd404.

Revision history for this message
Martin Stjernholm (msub) wrote :

Bisect 013 booted ok 24 times out of 30. 3 of the remaining times failed with an afaics unrelated oops in azx_interrupt in the snd_hda_intel driver. The last 3 times there were oops'es which didn't manage to get sufficiently logged on the tty to see where they were (the computer froze either before logging the relevant line, or after having scrolled it off the screen). I think it's most likely that those were in snd_hda_intel as well, since some other boots with that oops froze before logging the oops entirely.

So my conclusion is that bisect 013 is not affected by the evergreen_cp_resume bug. I therefore didn't test 012 since it should work by implication under the bisect assumption.

Revision history for this message
Seth Forshee (sforshee) wrote :

bisect014 is now available.

http://people.canonical.com/~sforshee/lp727620/bisect/

# bad: [2703c21a82301f5c31ba5679e2d56422bd4cd404] drm/nv50/gr: move to exec engine interfaces
git bisect bad 2703c21a82301f5c31ba5679e2d56422bd4cd404

Now testing commit 000703f44c77b152cd966eaf06f4ab043274ff46.

Revision history for this message
Martin Stjernholm (msub) wrote :

After 30 boots with bisect 014 I got 3 oopses but none mentioning evergreen_cp_resume, so I deem it not affected.

Revision history for this message
josejuan05 (josejuan05) wrote :

I have had 20 boots with one oops, but not evergreen_cp_resume on bisect 014. I second Martin.

Revision history for this message
Seth Forshee (sforshee) wrote :

bisect015 is now available.

http://people.canonical.com/~sforshee/lp727620/bisect/

# bad: [000703f44c77b152cd966eaf06f4ab043274ff46] mxm/wmi: add MXMX interface entry point.
git bisect bad 000703f44c77b152cd966eaf06f4ab043274ff46

Now testing commit 63f7d9828bf55cc8ee6f460830c5285fe06bef3e.

Revision history for this message
josejuan05 (josejuan05) wrote :

I have booted bisect015 twice, and both times got evergreen_cp_restart oops.

This commit is affected by the bug.

Revision history for this message
Seth Forshee (sforshee) wrote :

bisect016 is now available.

http://people.canonical.com/~sforshee/lp727620/bisect/

# good: [63f7d9828bf55cc8ee6f460830c5285fe06bef3e] drm/radeon/kms: add support for thermal chips on combios asics
git bisect good 63f7d9828bf55cc8ee6f460830c5285fe06bef3e

Now testing commit 99b38b4acc0d7dbbab443273577cff60080fcfad.

Revision history for this message
josejuan05 (josejuan05) wrote :

On five boots of bisect016 I have two evergreen_cp_start errors.

Revision history for this message
Seth Forshee (sforshee) wrote :

bisect017 is now available.

http://people.canonical.com/~sforshee/lp727620/bisect/

# good: [99b38b4acc0d7dbbab443273577cff60080fcfad] platform/x86: add MXM WMI driver.
git bisect good 99b38b4acc0d7dbbab443273577cff60080fcfad

Now testing commit 3448a19da479b6bd1e28e2a2be9fa16c6a6feb39.

Revision history for this message
josejuan05 (josejuan05) wrote :

On bisect017 I've had 19 successful boots, and one that ended with the gart_unbind error (not evergreen_cp_...)

Revision history for this message
Seth Forshee (sforshee) wrote :

bisect018 is now available. This should be the last bisect build, then we'll need to apply the commit identified as fixing the issue onto natty and see if that fixes the problem there.

http://people.canonical.com/~sforshee/lp727620/bisect/

# bad: [3448a19da479b6bd1e28e2a2be9fa16c6a6feb39] vgaarb: use bridges to control VGA routing where possible.
git bisect bad 3448a19da479b6bd1e28e2a2be9fa16c6a6feb39

Now testing commit 8116188fdef5946bcbb2d73e41d7412a57ffb034.

Revision history for this message
josejuan05 (josejuan05) wrote :

Out of three boots, the first two failed on an evergreen_cp_ error.

Revision history for this message
josejuan05 (josejuan05) wrote :

Oh. Clarification: Out of three boots of bisect018, the first two failed on an evergreen_cp_ error.

Revision history for this message
Seth Forshee (sforshee) wrote :

The bisect identified this as the commit that fixes the problem:

3448a19 vgaarb: use bridges to control VGA routing where possible.

A natty build with this patch applied is available at the link below. Please test to see whether or not the bug is reproducible in this build. Thanks!

http://people.canonical.com/~sforshee/lp727620/linux-2.6.38-11.48~lp727620v201108261554/

Revision history for this message
josejuan05 (josejuan05) wrote :

I have 20 consecutive successful boots on this commit. I have no boot failures yet.

Revision history for this message
aproposnix (aproposnix) wrote :

Cool! it seems to be working for me. I even got a little taste of
the Plymouth boot screen which I haven't seen in a long time :)
I'll continue testing and let you know if I experience the B[lack]SOD.

On Fri, Aug 26, 2011 at 8:19 PM, josejuan05 <email address hidden> wrote:

> I have 20 consecutive successful boots on this commit. I have no boot
> failures yet.

Revision history for this message
Martin Stjernholm (msub) wrote :

I've tested the patched natty kernel and have over 30 boots without the evergreen_cp_resume oops, so the bisected patch indeed appears to be the right one. Does it explain the race?

Revision history for this message
Seth Forshee (sforshee) wrote :

I've been looking at the patch we identified as fixing the problem, but I can't work out any causal relationship between what it does and the GPU being on when radeon probes. I've inquired about it on the upstream bugzilla to see if I'm missing something. But I'm beginning to suspect that the patch alters the timing of things enough to prevent the problem from being triggered.

Changed in linux (Ubuntu):
status: Incomplete → In Progress
Revision history for this message
aproposnix (aproposnix) wrote :

Hi Seth,

I have had one bad experience since installing the patched kernel. After countless good boots I decided to add the following lines to /etc/rc.local in order to use switcheroo:

chown "username" /sys/kernel/debug/vgaswitcheroo/switch # change "username" with your user name
echo OFF > /sys/kernel/debug/vgaswitcheroo/switch

Everything seemed to work for the first 5 or so boots, but then I started getting the black screens again. The message was similar but not the same. Unfortunately I didn't manage to get a screenshot of it.

Once I got back into my desktop I removed the lines from rc.local and then the issue disappeared.

I've gone ahead and added the lines again to see if I can recreate the issue but so far nothing.

Not sure if this info is of any use to you at all but I thought I would share just in case it would.

Revision history for this message
AceLan Kao (acelankao) wrote :

Seth,

I can confirm the commit
   3448a19 vgaarb: use bridges to control VGA routing where possible.
fixed this issue.
This issue is blocking hwcert, so is there anything I can do to help make the SRU process faster?

tags: added: blocks-hwcert
Revision history for this message
josejuan05 (josejuan05) wrote :

@harry
I don't know if it means anything, but in newer kernels you may be unable to use that command, since /sys/kernel/debug may not be owned by you. A quick and dirty (if dangerous) solution would be to change the first line to
chown "username(:group)" /sys/kernel/debug/ -R
where username is again your username and :group is the optional argument for the group ownership of the folder

Back to the bug, though: I noted one related oops in 50 boots (looking through my logs). When it happened it did cause a boot failure, but it did not give the evergreen error. Rather, I got the gart_set_page error. However, it didn't cause a kernel dump like in the bisects. I believe the gart_set_page error to be related only because it does not show up in kernels which were susceptible to the evergreen_cp_start oops. This still does not explain the race condition.

FWIW I did some plotting out of the bisects on paper and found that if there was a commit that fixed the gart_set error it was between bisects 014 and 017.

Revision history for this message
Seth Forshee (sforshee) wrote :

AceLan: The problem right now is that I suspect that the patch doesn't actually do anything to directly fix the problem. I.e., that the patch fixes this oops is just a side-effect of burning more time before the driver tries to access the hardware or something like that. I'm not sure though so I'd like to get confirmation from upstream whether or not the patch is a real fix for the problem, but so far I haven't received a response.

Revision history for this message
AceLan Kao (acelankao) wrote :

Seth,

To examine your assumption, I reverted that commit and replaced the vga_arbiter_check_bridge_sharing() function call with some delays, but I still encountered the issue.
The error message is the same.
Do you have any suggestions for how to test this?

==================
diff --git a/drivers/gpu/vga/vgaarb.c b/drivers/gpu/vga/vgaarb.c
index ace2b16..5b93935 100644
--- a/drivers/gpu/vga/vgaarb.c
+++ b/drivers/gpu/vga/vgaarb.c
@@ -500,6 +500,8 @@ static bool vga_arbiter_add_pci_device(struct pci_dev *pdev)
                vga_default = pci_dev_get(pdev);
 #endif

+ msleep(1000);
+
        /* Add to the list */
        list_add(&vgadev->list, &vga_list);
        vga_count++;
===================

Revision history for this message
josejuan05 (josejuan05) wrote :

If gart_ and evergreen_ are related errors (my presumption), I can confirm an actual failure (complete with debug dump) on the patched kernel.

It took about 100 or so boots for this to show up.

Changed in xserver-xorg-driver-ati:
importance: High → Critical
Changed in xserver-xorg-driver-ati:
status: Confirmed → Fix Released
Revision history for this message
Seth Forshee (sforshee) wrote :

Has anyone tested this yet since oneiric released? I'd like to get confirmation that the problem is fixed there. Thanks!

Revision history for this message
dsainty (dsainty) wrote :

Oneiric seems to do better (with respect to default handling of the hardware), in that at least it boots to the integrated card rather than to a black screen.

I haven't had any success switching to the discrete card via vgaswitcheroo though.
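
For reference, a minimal sketch of checking which GPU is active and requesting a switch. It assumes the usual debugfs path for vga_switcheroo (debugfs mounted, run as root) and status lines shaped like "0:IGD:+:Pwr:0000:00:02.0"; the function names and the SWITCH_FILE variable are made up for illustration.

```shell
# Assumed location of the vga_switcheroo control file.
SWITCH_FILE="${SWITCH_FILE:-/sys/kernel/debug/vgaswitcheroo/switch}"

# active_gpu [FILE]
# Print the id (IGD or DIS) of the GPU marked active with "+".
active_gpu() {
    # Field 2 is the GPU id, field 3 is "+" for the active GPU
    awk -F: '$3 == "+" { print $2 }' "${1:-$SWITCH_FILE}"
}

# request_discrete [FILE] [CMD]
# Write a switch command. DIS asks for an immediate switch;
# DDIS defers the switch until the X server restarts.
request_discrete() {
    echo "${2:-DIS}" > "${1:-$SWITCH_FILE}"
}
```

Running `active_gpu` before and after `request_discrete` shows whether the switch actually took effect; if X is holding the integrated card, the deferred DDIS form may be the one to try.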

Revision history for this message
Seth Forshee (sforshee) wrote :

Moving status to Fix Released based on positive test results with Oneiric noted in comment #172.

Changed in linux (Ubuntu):
status: In Progress → Fix Released
Revision history for this message
Elia (elia-baragiola) wrote :

One question: I have Ubuntu 10.04 LTS.
Will the fix be released for this version or not?

What should I do to fix the bug? :D
Thank you!

Revision history for this message
Ihorko (ihorchyhin) wrote :

My experience with Oneiric on my HP G62-a35er has not been as good. First, the default startup brightness is 0% (see bug #873191). Second, I tried to power on the discrete card by adding the corresponding command to /etc/rc.local, and once a kernel oops occurred at startup (I have since moved this command to run 10 seconds after startup, because of some problems with snd_hda_intel too). I can attach part of that failure log next week, because I only have a mobile broadband connection on my laptop from time to time.
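
A sketch of the delayed power-on described above, as it might appear in /etc/rc.local. It assumes the usual debugfs path for vga_switcheroo and that the command in question is ON; the function name and the optional second argument are hypothetical, and the 10-second delay is simply the value from this comment.

```shell
# power_on_discrete_later DELAY_SECONDS [SWITCH_FILE]
# Wait in the background, then ask vga_switcheroo to power on the
# inactive card, so the rest of the boot is not held up by the sleep.
power_on_discrete_later() {
    (
        delay="$1"
        switch_file="${2:-/sys/kernel/debug/vgaswitcheroo/switch}"
        sleep "$delay"
        # Skip quietly if the control file is missing or not writable
        if [ -w "$switch_file" ]; then
            echo ON > "$switch_file"
        fi
    ) &
}
```

In rc.local this would be called as `power_on_discrete_later 10`. Running the wait in a background subshell keeps rc.local from blocking, which seems to be the point of delaying the command in the first place.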

Revision history for this message
Cabalbl4 (i-vohmin) wrote :

The bug is back for me now. What I had done before to make it go away was to pass "nosplash" instead of "quiet splash" in grub and enable the options GRUB_TERMINAL=console and GRUB_GFXMODE=1024x768.
Now I have reverted these options, and the bug is back for me on Natty 2.6.38-10-generic.
So it seems to be related to graphical mode. (Maybe it depends on plymouth?)

Now reverting back to working options.
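
For reference, the options described in this comment would look roughly like this in /etc/default/grub. This is a sketch only: the values are the ones named above, and whether they help likely depends on the machine.

```shell
# /etc/default/grub (fragment)
# "nosplash" replaces the default "quiet splash"
GRUB_CMDLINE_LINUX_DEFAULT="nosplash"
# Use the plain text console instead of a graphical GRUB terminal
GRUB_TERMINAL=console
# Resolution GRUB uses when a graphical terminal is active
GRUB_GFXMODE=1024x768
```

After editing the file, `sudo update-grub` regenerates the boot menu so the changes apply on the next boot.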

Revision history for this message
Cabalbl4 (i-vohmin) wrote :

No, the fix above is not working anymore. I am now blacklisting radeon and loading it manually with modprobe when switching.
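
A minimal sketch of that blacklist approach, assuming the stock modprobe.d mechanism. The file name blacklist-radeon.conf and the helper name are made up for illustration.

```shell
# write_radeon_blacklist [CONF_DIR]
# Write a modprobe.d snippet that keeps radeon from being loaded
# automatically. CONF_DIR defaults to /etc/modprobe.d.
write_radeon_blacklist() {
    conf_dir="${1:-/etc/modprobe.d}"
    printf '%s\n' \
        '# Keep radeon from loading automatically at boot' \
        'blacklist radeon' \
        > "$conf_dir/blacklist-radeon.conf"
}
```

Run as root with no argument, this writes under /etc/modprobe.d; the driver can then be loaded by hand with `modprobe radeon` just before switching. If radeon is also included in the initramfs, `update-initramfs -u` is probably needed for the blacklist to take effect at boot.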

Revision history for this message
Daniel Buchner (danieljb2) wrote :

I just ran into this bug after installing the latest 11.10 release on a new ENVY, looks like the fix here didn't work :(

Revision history for this message
Elia (elia-baragiola) wrote :

The bug persists :(
Has anyone updated the BIOS? Does that fix the graphics card?

tags: added: hybrid-graphics
Revision history for this message
Vangel Ajanovski (ajanovski) wrote :

HP dm4t (ATI 5470): after several installs last weekend, the bug is no longer present in stock Xubuntu 11.10, and it is also not present after upgrading to kernel 3.2.0-10. It was still present with stock Linux Mint Debian Edition (at that moment a 2.6 kernel), but after upgrading to the latest kernel it was fixed.
In fact I have not seen this for some months now, but I did a clean reinstall just to check.

Revision history for this message
Klaus Reichl (klaus-reichl) wrote : Invitation to connect on LinkedIn

LinkedIn
------------

Bug,

I'd like to add you to my professional network on LinkedIn.

- Klaus

Klaus Reichl
Technology Expert at Thales
Austria

Revision history for this message
Zentai Andras (andras-zentai) wrote :

Really clever way to invite all bug subscribers to your LinkedIn network... ;)

Revision history for this message
Cabalbl4 (i-vohmin) wrote :

Switched to 12.04. Bug seems to be gone.

Revision history for this message
kolwas (kolwas) wrote :

<Switched to 12.04. Bug seems to be gone.>
That is not true on my machine. I don't know whether there were some updates, but Ubuntu now starts only about 50% of the time.

Revision history for this message
Klaus Reichl (klaus-reichl) wrote : Re: [Bug 727620] Re: [Radeon HD 5650 and 5470] Kernel BUG during recovery boot and in normal boot (Hybrid graphics)

Hi all,

I really apologize for that; I still don't know what was going on.

Sorry,
Klaus
--
Klaus Reichl <email address hidden>
Danhausergasse 8/16 +43 6991 84 137 94
1040 Wien

On Tue, Jun 5, 2012 at 9:33 PM, Zentai Andras <email address hidden> wrote:

> Really clever way to invite all bug subscribers to your LinkedIn
> network... ;)

Revision history for this message
Cabalbl4 (i-vohmin) wrote :

As far as I understand, there is a race condition between the intel and radeon modules, and it may be related to vgaswitcheroo too. On earlier versions of Ubuntu, changing the waiting time and the gfx mode in grub seemed to affect the bug somehow. Some combinations of those even made it disappear :)
