[Radeon HD 5650 and 5470] Kernel BUG during recovery boot and in normal boot (Hybrid graphics)
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Linux | Confirmed | Medium | |
xserver-xorg-driver-ati | Fix Released | Critical | |
linux (Ubuntu) | Fix Released | High | Seth Forshee |
xserver-xorg-video-ati (Ubuntu) | Invalid | Wishlist | Unassigned |
Bug Description
[Problem]
On hybrid graphics hardware with this ATI chip and another (e.g. Intel), a failure occurs resulting in a black screen and errors from the radeon kernel module, as shown below.
[Cause]
From upstream developer:
"The switcheroo code needs more work to switch properly on some systems it seems. There are a set acpi methods required to activate/deactivate the respective gpus. The drivers need to load and initialize active hw. If the hw is not active when the driver loads, then the hw is not set up properly and it won't work. Probably some ordering issues in how the switcheroo acpi methods are called."
[Workarounds]
Several options:
1. If your BIOS includes functionality to disable the Intel card, use BIOS settings to select which chip to load.
2. Disable KMS by adding `radeon.modeset=0` in the boot line. Note that the default radeon gallium driver only works with KMS, so YMMV.
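To confirm that the second workaround actually took effect on a given boot, you can inspect the running kernel's command line. The following is a minimal Python sketch, not part of the original report: the helper names `has_kernel_param` and `read_cmdline` are hypothetical, and the only assumption is that `/proc/cmdline` is readable, as it is on an ordinary Linux system.

```python
def has_kernel_param(cmdline, name, value):
    """Return True if `name=value` appears as a whole token on the command line."""
    return f"{name}={value}" in cmdline.split()


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read()
```

Calling `has_kernel_param(read_cmdline(), "radeon.modeset", "0")` after boot tells you whether the parameter reached the kernel. It says nothing about whether the radeon module honoured it.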
[Original Report]
I'm running natty, and ever since the upgrade to 6.14.0 I've been unable to boot consistently. After some discussion in the forums, I tried repeatedly to boot into recovery mode. In most cases, I got a black screen. One time, though, when I was able to increase the brightness, I saw some errors from the radeon module. I took a photo (available at http://
Stack:
ffff880149eb8000 ffff880149eb8000 0000000000000011 0000000000000911
00000000fffffff4 ffff88014b6c7800 ffff88014b0f7b58 ffffffffa022aba0
ffff8801460f7b58 ffff880149eb8000 0000000000000000 0000000000410028
Call Trace:
[<ffffffffa022
[<ffffffffa022
[<ffffffffa01f
[<ffffffffa022
[<ffffffffa01a
[<ffffffffa01a
[<ffffffffa007
[<ffffffff8115
[<ffffffffa023
[<ffffffff812f
[<ffffffff8130
[<ffffffff813b
[<ffffffff813b
[<ffffffff813b
[<ffffffff813b
[<ffffffff813b
[<ffffffff813b
[<ffffffff813b
[<ffffffff813b
[<ffffffffa001
[<ffffffff813b
[<ffffffffa001
[<ffffffff812f
[<ffffffffa008
[<ffffffff815b
[<ffffffffa001
[<ffffffffa007
[<ffffffffa001
[<ffffffff8100
[<ffffffff810a
[<ffffffff8100
Code: 00 45 8b 84 24 e4 0a 00 00 45 85 c0 0f 8e c7 09 00 00 41 8b 84 24 d4 0a 00 00 89 c2 83 c0 01 40 c1 e2 02 49 03 94 24 c8 0a 00 00 <c7> 02 00 44 05 c0 41 8b 94 24 e4 0a 00 00 41 23 84 24 f4 0a 00
RIP [<ffffffffa0227
RSP <ffff88014b0f7af8>
CR2: ffffc90411ce1ffc
---[ end trace 37702c56f2e23247 ]---
udevd-work[94]: '/sbin/modprobe -bv pci:v00001002d0
There is also some register info dumped at the top of the screen, visible in the photo, that I didn't transcribe, as I'd almost certainly get something wrong.
tags: | added: natty |
afoglia (afoglia) wrote : | #1 |
Vangel Ajanovski (ajanovski) wrote : | #2 |
I also have the same problem. Sometimes it takes just 1-2 resets to be able to boot, but this time I reset the computer 8 times (2 with a full power-off) before it finally booted. I think that it fails right before showing the Ubuntu logo and progress bar, when switching from console to graphics mode.
My computer is an HP Pavilion dm4t-1100 with ATI 5470HD and Intel.
summary: |
- [Radeon HD 5650] Driver crash during recovery boot
+ [Radeon HD 5650 and 5470] Driver crash during recovery boot and in normal boot |
Bryce Harrington (bryce) wrote : Re: [Radeon HD 5650 and 5470] Driver crash during recovery boot and in normal boot | #3 |
Hi afoglia,
Does it resolve if you downgrade to an older version of -ati?
You can get older .deb files of the driver from Launchpad here:
https:/
Click on the link under Version of the version you want to test, then under Builds click the link for your hardware architecture, then grab the -ati and -radeon .debs and install them.
If that doesn't do it, then next guess would be you are having a kernel issue - if you still have a prior kernel you can try booting it (hold down the left shift key during boot to bring up the menu.)
Changed in xserver-xorg-video-ati (Ubuntu): | |
status: | New → Incomplete |
Vangel Ajanovski (ajanovski) wrote : | #4 |
- logproblem.txt (11.3 KiB, text/plain)
This is what I found in the logs in my case.
I compared the log of a failed boot with the log of a normal boot. Besides the similar stack dump, there is one significant difference. The problematic log contains:
[drm] radeon: 3584M of VRAM memory ready
[drm] radeon: 512M of GTT memory ready.
whereas in the normal log:
[drm] radeon: 512M of VRAM memory ready
[drm] radeon: 512M of GTT memory ready.
The laptop has only 4GB RAM and ATI is supposed to have only 512MB.
I have attached the relevant part of the log
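The VRAM discrepancy described above is easy to check for mechanically. Here is a small Python sketch, offered as an illustration rather than as anything the reporters used: it extracts the VRAM size from `[drm] radeon: ...M of VRAM memory ready` lines and flags values that differ from what the card is expected to have. The function names and the `expected_mb` parameter are hypothetical.

```python
import re

# Matches lines such as "[drm] radeon: 3584M of VRAM memory ready".
VRAM_RE = re.compile(r"\[drm\] radeon: (\d+)M of VRAM memory ready")


def reported_vram_mb(log_text):
    """Return every VRAM size (in MB) that the radeon driver reported."""
    return [int(size) for size in VRAM_RE.findall(log_text)]


def suspicious_vram(log_text, expected_mb):
    """Return the reported sizes that do not match the expected VRAM size."""
    return [size for size in reported_vram_mb(log_text) if size != expected_mb]
```

For the 512 MB card discussed in this comment, `suspicious_vram(log_text, 512)` would return the 3584 value from the failing boot and nothing for the normal boot.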
Johan Fornander (fornander-johan) wrote : | #5 |
- kern.log.bz2 (117.9 KiB, application/octet-stream)
This is my complete kern.log file showing the crash being identical to that of afoglia and Vangel Ajanovski.
afoglia (afoglia) wrote : | #6 |
How do I install the old versions? I tried installing 6.13.2+
I also tried the maverick version on that page (6.13.1-1ubuntu5) and again, dpkg has dependency issues, this time the required package is xorg-video-abi-8.0, and that this version of xserver-
If this helps, I did not have these problems under maverick, and while I had minor problems in natty a few weeks ago, they got noticeably, drastically worse when the 6.14 drivers were released.
afoglia (afoglia) wrote : | #7 |
I tried Bryce's second suggestion of using old kernels. I have two previous versions of 2.6.38 installed, 2.6.38-3-generic and 2.6.38-4-generic. I booted each into recovery and normal mode 4 times, for a total of 16 boots. Here's the number of times the boot was a success, where I either got to the recovery boot menu or gdm, (regardless of whether the screen brightness had to be manually increased from 0, or if the plymouth boot screen displayed).
2.6.38-4-generic, normal: 1 success, 3 failures
2.6.38-4-generic, recovery: 4 successes
2.6.38-3-generic, normal: 4 successes
2.6.38-3-generic, recovery: 3 successes, 1 failure
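To compare the two kernels at a glance, the four tallies can be summed per kernel. The sketch below is only an illustration of that arithmetic; the numbers are copied from the list above (each configuration was booted 4 times), and the function name `per_kernel` is hypothetical.

```python
# (kernel, boot mode) -> (successful boots, attempted boots), from the list above.
results = {
    ("2.6.38-4-generic", "normal"): (1, 4),
    ("2.6.38-4-generic", "recovery"): (4, 4),
    ("2.6.38-3-generic", "normal"): (4, 4),
    ("2.6.38-3-generic", "recovery"): (3, 4),
}


def per_kernel(results):
    """Sum successes and attempts over both boot modes for each kernel."""
    totals = {}
    for (kernel, _mode), (ok, attempts) in results.items():
        prev_ok, prev_attempts = totals.get(kernel, (0, 0))
        totals[kernel] = (prev_ok + ok, prev_attempts + attempts)
    return totals


for kernel, (ok, attempts) in sorted(per_kernel(results).items()):
    print(f"{kernel}: {ok} of {attempts} boots succeeded")
```

With only 8 boots per kernel, the totals are a rough indication and not a statistically solid result.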
At no time did I see a stack trace like the one I posted, but I've only seen that in recovery mode. (Would it be written somewhere persistent between boots? It's not in /var/log/syslog.)
I took more notes on the failures. They're pretty vague and qualitative, but have slightly more detail of what each boot was like.
Bryce Harrington (bryce) wrote : | #8 |
Okay, thanks for the testing. That suggests a regression in the kernel between 2.6.38-3 and -4 (the one failure with -3 may be a random outlier).
Bryce Harrington (bryce) wrote : Re: [Radeon HD 5650 and 5470] Driver crash during recovery boot and in normal boot (Regression from 2.6.38-3 to -4) | #9 |
Even though this seems to be pinpointed to the kernel, I'll leave the X task open for now so we can keep track of the bug's progress from the X end.
summary: |
- [Radeon HD 5650 and 5470] Driver crash during recovery boot and in normal boot
+ [Radeon HD 5650 and 5470] Driver crash during recovery boot and in normal boot (Regression from 2.6.38-3 to -4) |
Changed in xserver-xorg-video-ati (Ubuntu): | |
importance: | Undecided → High |
status: | Incomplete → Triaged |
kolio kostadinov (koliokostadinov) wrote : | #10 |
Hi guys,
I have the same problem. I like this OS, Ubuntu, very much, but why has such a good system not resolved this problem for such a long time? I see posts from the last 2 or 3 years, which seems strange to me. I don't want to experiment with my PC. I tried, but it always ends in a crash, a black screen, or a red screen. I will wait for a new, better release, unless you can help me with something that really works. Thank you very much.
tags: | added: crash |
Bryce Harrington (bryce) wrote : | #11 |
kolio, not sure what you're talking about. The Radeon HD 5650 came on the market on Jan 7, 2010, so it did not exist 2 or 3 years ago. Whatever posts you're looking at are unrelated to this problem.
Bryce Harrington (bryce) wrote : | #12 |
afoglia, just to confirm: are you still seeing this crash with the current kernel?
Changed in xserver-xorg-video-ati (Ubuntu): | |
status: | Triaged → Confirmed |
afoglia (afoglia) wrote : | #13 |
Yes and no. I did six normal boots with 2.6.38-7.35, then realized there was an update, and booted that both normally and in recovery mode. Here's what I saw:
2.6.38-7.36 normal, 6 boots, 5 reached gdm login screen, 1 gdm started but hung before login window appeared (only one of the 5 successful boots showed the plymouth boot screen)
2.6.38-7.36 recovery mode, 5 boots, all hung with the monitor off, no plymouth, brightness key did nothing.
2.6.38-7.35 normal, 6 boots, 3 hung with monitor off, 3 reached gdm
Since I still can't boot in recovery (and I don't see anything in the changelog for -7.36 obviously related), I'd say the bug is still there.
Guillaume Modard (guillaumemodard) wrote : | #14 |
I confirm that the bug is still there. I install last update this morning (I saw xserver-
When I restart, I need to reboot more than 5 times before I get a desktop. And now, when the boot crashes, I don't get any shell; the screen stays black (as if it were off).
I really hope this bug will be solved before the final release. If not, Ubuntu won't work on many of the latest HP Pavilion laptops.
Guillaume Modard (guillaumemodard) wrote : | #15 |
Note : Here is the result of
lspci -v | grep -A 12 VGA :
guillaume@
00:02.0 VGA compatible controller: Intel Corporation Core Processor Integrated Graphics Controller (rev 02) (prog-if 00 [VGA controller])
Subsystem: Hewlett-Packard Company Device 163c
Flags: bus master, fast devsel, latency 0, IRQ 44
Memory at c0000000 (64-bit, non-prefetchable) [size=4M]
Memory at b0000000 (64-bit, prefetchable) [size=256M]
I/O ports at 5050 [size=8]
Expansion ROM at <unassigned> [disabled]
Capabilities: <access denied>
Kernel driver in use: i915
Kernel modules: i915
00:16.0 Communication controller: Intel Corporation 5 Series/3400 Series Chipset HECI Controller (rev 06)
Subsystem: Hewlett-Packard Company Device 163c
--
01:00.0 VGA compatible controller: ATI Technologies Inc Robson CE [AMD Radeon HD 6300 Series] (prog-if 00 [VGA controller])
Subsystem: Hewlett-Packard Company Device 163c
Flags: bus master, fast devsel, latency 0, IRQ 43
Memory at a0000000 (64-bit, prefetchable) [size=256M]
Memory at c4400000 (64-bit, non-prefetchable) [size=128K]
I/O ports at 4000 [size=256]
Expansion ROM at c4440000 [disabled] [size=128K]
Capabilities: <access denied>
Kernel driver in use: radeon
Kernel modules: radeon
01:00.1 Audio device: ATI Technologies Inc Manhattan HDMI Audio [Mobility Radeon HD 5000 Series]
Subsystem: Hewlett-Packard Company Device 163c
guillaume@
Johan Fornander (fornander-johan) wrote : | #16 |
I have a theory... Maybe there is a race condition between the intel and ati driver involved here? My notebook starts up in two seemingly random configurations, or three including the radeon crash:
1. X server is on VT8 -> unable to unload radeon module because it is in use (by some framebuffer I guess). I am also unable to switch to consoles VT1-7. If I use vga_switcheroo to switch to integrated gpu in this mode then radeon crashes.
2. X server is on VT7 -> I can unload the radeon module and use the consoles VT1-6. vga_switcheroo works and I can also use acpi calls to turn off the gpu.
3. The radeon driver crashes, forcing a reboot through REISUB.
This makes it difficult to control the temperature, since I cannot know whether the radeon module is in use or not (i.e. I might or might not be able to use vga_switcheroo, or unload the module and use a specific acpi call to shut off the gpu).
Johan Fornander (fornander-johan) wrote : | #17 |
When I have booted into an environment where both the X server and the framebuffer use intel, sometimes when I try to unload the radeon module it crashes like this:
[ 346.860598] radeon 0000:02:00.0: ffff88014b362000 unpin not necessary
[ 346.860619] BUG: unable to handle kernel paging request at ffffc90022680000
[ 346.861758] IP: [<ffffffffa01f0
[ 346.863120] PGD 157818067 PUD 157819067 PMD 14959b067 PTE 0
[ 346.864707] Oops: 0002 [#1] SMP
[ 346.866297] last sysfs file: /sys/devices/
[ 346.867941] CPU 3
[ 346.867963] Modules linked in: cryptd aes_x86_64 aes_generic binfmt_misc parport_pc ppdev dm_crypt wl(P) lib80211 snd_hda_codec_hdmi snd_hda_
[ 346.877408]
[ 346.879334] Pid: 2518, comm: rmmod Tainted: P C 2.6.38-7-generic #38-Ubuntu Acer Aspire 3820/JM31_CP
[ 346.881377] RIP: 0010:[<
[ 346.883456] RSP: 0018:ffff880125
[ 346.885512] RAX: 00000000ffffffea RBX: ffff880149e00000 RCX: ffffc90022680000
[ 346.887607] RDX: 0000000036822067 RSI: 0000000000000000 RDI: ffff880149e00000
[ 346.889724] RBP: ffff8801259e7c68 R08: 0000000000000000 R09: ffff88014adf7748
[ 346.891854] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000111
[ 346.893993] R13: 0000000000000111 R14: 0000000000000888 R15: 0000000000000001
[ 346.896139] FS: 00007f8e31ee872
[ 346.898319] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 346.900511] CR2: ffffc90022680000 CR3: 0000000125af6000 CR4: 00000000000006e0
[ 346.902754] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 346.905025] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 346.907314] Process rmmod (pid: 2518, threadinfo ffff8801259e6000, task ffff88013d67c440)
[ 346.909638] Stack:
[ 346.911920] ffff8801259e7cb8 ffffffffa01bf146 ffff880149e01338 0002000000000000
[ 346.914277] ffff8801259e7cb8 ffff880149e00000 ffff88014f0c6000 ffffffffa025a590
[ 346.916600] ffff88014f80e000 0000000000000001 ffff8801259e7cd8 ffffffffa01bf49d
[ 346.918952] Call Trace:
[ 346.921281] [<ffffffffa01bf
[ 346.923664] [<ffffffffa01bf
[ 346.926056] [<ffffffffa022a
[ 346.928466] [<ffffffffa022d
[ 346.930871] [<ffffffffa01a5
[ 346.933291] [<ffffffffa01a7
[ 346.935709] [<ffffffffa0020
[ 346.938125] [<ffffffffa018b
Guillaume Modard (guillaumemodard) wrote : | #18 |
Is anybody working on this bug?
Dweia (dweia) wrote : | #19 |
Bryce Harrington wrote on 2011-03-04:[...] That suggests a regression in the kernel between 2.6.38-3 and -4
The error must have occurred a lot earlier. I tried a bunch of different kernels, each at the most recent version available at the moment (highest number after the dash):
2.6.35-25 - works
2.6.36-1 - works
2.6.37-12 - crashes most of the time
2.6.38-7 - crashes most of the time
I also tried some other versions, including the mentioned 2.6.38-3, but no luck there for me.
There's another bug, which may or may not be related, and which apparently got fixed in kernel 2.6.37, maybe thereby introducing this problem with the crashes? This older problem causes entries like the following in the kernel log when shutting down or rebooting the system.
kernel: [ 36.068256] [drm:atom_op_jump] *ERROR* atombios stuck in loop for more than 5secs aborting
kernel: [ 36.068719] [drm:atom_
kernel: [ 41.070113] [drm:atom_op_jump] *ERROR* atombios stuck in loop for more than 5secs aborting
kernel: [ 41.070654] [drm:atom_
kernel: [ 46.271694] [drm:atom_op_jump] *ERROR* atombios stuck in loop for more than 5secs aborting
kernel: [ 46.272333] [drm:atom_
These entries occur once per second in kernel 2.6.35, every 5 seconds in 2.6.36, and disappear altogether in 2.6.37.
P.S. I don't think the bug has to do with xserver-xorg at all - it's most probably the radeon kernel-module, since the errors occur long before X is starting. Also the bugs go away when I blacklist the radeon module. Unfortunately I cannot switch off the Radeon-graphics card, when the module for it isn't loaded. :-(
Dweia (dweia) wrote : | #20 |
Unfortunately I discovered yesterday that I was wrong. When used without the battery (this is an Aspire 3820TG laptop) and with the power cord connected, the crash also occurs with kernel 2.6.36-1. Probably the BIOS does something to or with the graphics cards when external power is connected. All the previous tests had been run on battery power.
I couldn't test the 2.6.35 kernel yet, since I had already removed it. I need to reinstall it and see what happens.
Bryce Harrington (bryce) wrote : | #21 |
It's starting to sound like this is due to confusion (maybe a regression) in the plumbing layer between X and the kernel, such as module-init-tools or one of the related packages.
I think either apw or cjwatson need to look into this issue. apw's on vacation though.
Chris Halse Rogers (raof) wrote : | #22 |
This does look a lot like some bad interaction between i915/radeon(
It would be useful to have logs - both of good and bad boots - with the “drm.debug=0x0e” kernel argument added to the boot line.
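The value `0x0e` is a bitmask selecting which DRM debug categories are logged. The Python sketch below decodes such a mask. The category labels are an assumption based on the `DRM_UT_*` definitions in `include/drm/drmP.h` of kernels from this period and should be checked against the kernel tree actually in use; the function name `decode_drm_debug` is hypothetical.

```python
# ASSUMPTION: labels follow the DRM_UT_* bits in include/drm/drmP.h of
# 2.6.38-era kernels; verify against your own kernel source.
DRM_DEBUG_BITS = {
    0x01: "core",
    0x02: "driver",
    0x04: "kms",
    0x08: "mode",
}


def decode_drm_debug(mask):
    """Return the labels of all debug categories enabled by `mask`."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if mask & bit]
```

Under that assumption, `decode_drm_debug(0x0e)` yields every category except the core one, which keeps the logs focused on driver and modesetting messages.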
Johan Fornander (fornander-johan) wrote : | #23 |
- 3_sets_of_kernel_logs.tar.bz2 (335.2 KiB, application/x-tar)
I have taken logs from a set of ubuntu kernels starting up using the requested kernel boot argument "drm.debug=0x0e":
2.6.38-7 ---> fb0: radeondrmfb frame buffer device
2.6.38-8 ---> fb0: inteldrmfb frame buffer device
2.6.39-rc1--> kernel oops (pointing to evergreen something), not caught in the logs. It seems to be at a different offset from the one reported above
Please see the attached files containing dmesg and kern.log for each kernel. There are some other interesting things in the logs, like an invalid DSDT, that I will look into further.
Bryce Harrington (bryce) wrote : | #24 |
afoglia - I've forwarded this bug upstream to http://
Johan, I attached your logs to the upstream bug report. Generally upstream prefers that the logs come from the original reporter, so I'm not sure if they will accept the bug report.
Changed in xserver-xorg-video-ati (Ubuntu): | |
status: | Confirmed → Triaged |
Bryce Harrington (bryce) wrote : | #25 |
Upstream would like to know whether setting the video to the radeon/discrete setting in the BIOS configuration makes it function properly.
If so, this may be a known issue in the new vga_switcheroo functionality, ala bug https:/
Changed in xserver-xorg-video-ati (Ubuntu): | |
status: | Triaged → Incomplete |
Dweia (dweia) wrote : | #26 |
Chris Halse Rogers wrote in #22: "This does look a lot like some bad interaction between i915/radeon"
I agree - I did some more testing and placed an entry for the radeon-module into /etc/initramfs-
What proves the "bad interaction" even more is: when also placing "i915" into /etc/initramfs-
I'll try to get logs of four different initrd-
P.S. off-topic, I knew a cjwatson once - hi Kamion ;-)
Dweia (dweia) wrote : | #27 |
Bryce Harrington wrote in #25 "Upstream would like to see if setting the video to the radeon/discrete setting in the BIOS configuration makes it function properly."
Answer: Yes, it does. I tried that a while ago already, but can't (don't want to) use that for regular running, because the radeon card makes the laptop battery run out too fast. I also read that there's a patched/hacked BIOS somewhere that allows switching off the radeon card via the BIOS, but if it can be solved with software I'd prefer that ;)
Changed in xserver-xorg-driver-ati: | |
importance: | Unknown → High |
status: | Unknown → Confirmed |
Johan Fornander (fornander-johan) wrote : | #28 |
@Bryce: Yes, booting with only discrete or integrated enabled does solve the problem for me as well.
@Dweia: I patched the CMOS for my 3820TG and unlocked the Intel menu. Now I can choose to only have the IGD activated and PEG (radeon) completely shut off drawing zero power. Same thing as going through vga_switcheroo or using the acpi calls but less hassle.
Johan Fornander (fornander-johan) wrote : | #29 |
Btw, should we work on the bug here on Launchpad, or keep the discussion on freedesktop and work directly with the AMD devs?
Dweia (dweia) wrote : | #30 |
Sorry, I got sidetracked while getting a set of logs. However, some (yet slightly vague) findings may be useful - even if debugging gets maybe even harder:
Firstly: the computer (BIOS or whatever) behaves differently when external power is connected or only battery used, and secondly: it behaves differently depending on the last state of the vgaswitcheroo BEFORE the reboot. I need to do more testing regarding the former (probably frequency of crashes higher with external power), but the latter seemed to me pretty consistently only crashing after "echo OFF > /sys/kernel/
Yesterday I did a kernel update to 2.6.38-8. I'll try to reproduce the findings and see whether anything has changed in the behaviour.
Vangel Ajanovski (ajanovski) wrote : | #31 |
If I add
radeon.modeset=0
to the boot line when starting, the crash does not happen and the system continues using the integrated Intel graphics.
Timo Aaltonen (tjaalton) wrote : | #32 |
marking the bug as confirmed
Changed in xserver-xorg-video-ati (Ubuntu): | |
status: | Incomplete → Confirmed |
Bryce Harrington (bryce) wrote : Re: [Radeon HD 5650 and 5470] Driver crash during recovery boot and in normal boot (Hybrid graphics) | #33 |
I've updated the title and description based on recent findings.
Hybrid graphics switching support is still fairly embryonic upstream and I don't feel it is stable or reliable enough yet for us to support in Ubuntu, so I am setting the importance of the X task here to Wishlist.
However, even aside from switching graphics, the kernel should not be failing with this particular hardware configuration, even if it is not able to properly switch; it should pick one driver or the other and not load both, even if it just has to pick at random. So I'm leaving the kernel task here open, in hopes that some fix can at least paper over the crash.
description: | updated |
summary: |
- [Radeon HD 5650 and 5470] Driver crash during recovery boot and in normal boot (Regression from 2.6.38-3 to -4)
+ [Radeon HD 5650 and 5470] Driver crash during recovery boot and in normal boot (Hybrid graphics) |
Changed in xserver-xorg-video-ati (Ubuntu): | |
importance: | High → Wishlist |
status: | Confirmed → Triaged |
summary: |
- [Radeon HD 5650 and 5470] Driver crash during recovery boot and in normal boot (Hybrid graphics)
+ [Radeon HD 5650 and 5470] Kernel BUG during recovery boot and in normal boot (Hybrid graphics) |
Bryce Harrington (bryce) wrote : | #34 |
@JFo, this hardware results in the kernel triggering a BUG. Please add this to the kernel team's list of bugs to investigate.
Changed in linux (Ubuntu): | |
status: | New → Triaged |
importance: | Undecided → High |
status: | Triaged → New |
assignee: | nobody → Jeremy Foshee (jeremyfoshee) |
wedens (frigid20) wrote : | #35 |
Jeremy Foshee (jeremyfoshee) wrote : | #36 |
added to the hot bugs listing for team review.
~JFo
tags: | added: kernel-key |
Changed in linux (Ubuntu): | |
assignee: | Jeremy Foshee (jeremyfoshee) → nobody |
status: | New → Triaged |
tags: | added: oneiric |
Bryce Harrington (bryce) wrote : | #37 |
[I've marked this bug for inclusion in our oneiric bug queue. While technically this bug has not been re-confirmed against oneiric, I feel it is worth continued development attention. We will need to ask that it be re-confirmed once oneiric is further along, perhaps once we get closer to alpha.]
Cabalbl4 (i-vohmin) wrote : | #38 |
can confirm this bug on hybrid radeon 6550 and intel card on acer aspire 3820TG
natty kernel 2.6.38-8-generic.
Cabalbl4 (i-vohmin) wrote : | #39 |
issue seems to be gone with kernel 2.6.38-9-generic from proposed.
Cabalbl4 (i-vohmin) wrote : | #40 |
No, it still exists on 2.6.38-9-generic when power cable is unplugged (
Seth Forshee (sforshee) wrote : | #41 |
@Cabalbl4, do you mean that it happens when you boot with the power cable unplugged? Or that it crashes when you unplug the power cable after successfully booting? Either way it would be useful to capture dmesg after it happens, if possible.
It might also be useful to have the DSDT from machines affected by this bug. To collect the DSDT, open a terminal and execute the following commands:
sudo apt-get install fwts
sudo fwts --disassemble-aml
Then attach the DSDT.dsl file generated by fwts to this bug. Thanks!
Cabalbl4 (i-vohmin) wrote : | #42 |
- syslog exert (tail) (14.6 KiB, text/plain)
Well, as far as I understood now, it is somehow related to cable plug, but not all the times. Maybe it affects module load or something.
Here is a piece of syslog I captured earlier by tailing its output to a file before everything hung on kernel 2.6.38-8.
I will try to use fwts soon.
Cabalbl4 (i-vohmin) wrote : | #43 |
Well, after an update from the xorg-edgers ppa and some messing with initramfs (I tried to put the radeon module before intel, but later reverted it), the bug seems to happen very rarely, but at random :( As a drawback, my ttys are gone again. But that is another bug, in the intel driver. And here is my DSDT.dsl
Cabalbl4 (i-vohmin) wrote : | #44 |
@Seth Forshee to get things clear about the cable. I never tried to plug/unplug cable in the boot process. I have only booted with cable already plugged in or unplugged.
Cabalbl4 (i-vohmin) wrote : | #45 |
The bug never happens after a successful boot, even when I switch cards with vgaswitcheroo.
Seth Forshee (sforshee) wrote : | #46 |
@Cabalbl4, thanks for the clarification. Can you also attach the SSDT.dsl file that fwts generated? I'm not finding the relevant fields in the DSDT.
Jean Demange (jea-demange) wrote : | #47 |
Cabalbl4 (i-vohmin) wrote : | #48 |
Cabalbl4 (i-vohmin) wrote : | #49 |
Johan Fornander (fornander-johan) wrote : | #50 |
Confirmed using 2.6.39-
Johan Fornander (fornander-johan) wrote : | #51 |
- DSDT-Acer_Aspire_3820TG.dsl.bz2 (26.3 KiB, application/octet-stream)
Attaching my DSDT.dsl as well. Acer Aspire 3820TG equipped with a Radeon HD 5650.
belltown (sea-av80r) wrote : | #52 |
- /var/log/kern.log with default boot parameters (92.4 KiB, text/plain)
I also have an HP ENVY 14 with switchable graphics (ATI HD5650/Intel Integrated). I just did a clean install of Ubuntu 11.04 and installed all the recommended updates from the update manager. I get a black screen when I boot unless I put radeon.modeset=0 in the boot command line.
I've attached a copy of /var/log/kern.log created when using the default boot parameters.
belltown (sea-av80r) wrote : | #53 |
- /var/log/kern.log with modeset=0 (29.6 KiB, text/plain)
Here's a copy of /var/log/kern.log created using radeon.modeset=0 on the boot parameter line.
belltown (sea-av80r) wrote : | #54 |
- /var/log/kern.log with modeset=0 (82.7 KiB, text/plain)
Sorry, last post was Xorg.0.log with radeon.modeset=0.
Here is /var/log/kern.log with radeon.modeset=0
riyasmp (riyasmp) wrote : | #55 |
Hi guys
I have a similar issue with my HP Pavilion dv6-3150 SA.
I have been using 10.10 so far, which had the same problem with switchable graphics, and the ATI card never worked.
I recently made a fresh install of 11.04 on the same laptop, and a warm boot gave me a blank screen when I chose 11.04. I did a cold boot and it took me to the 11.04 Unity interface. I logged out, chose the classic Ubuntu desktop, and tried that as well.
The interesting thing is that the live CD worked all right on this laptop. At the moment I am using 10.10, as 11.04 is not working. I can provide any file from 10.10 if it helps.
Since then, X crashes when I restart and I can't use 11.04. I tried to seek some help from the #ubuntu-uk channel, and someone asked me to run a command in rescue mode ( sudo mv /etc/X11/xorg.conf /etc/X11/
I did that and it returned the output /etc/X11/xorg.conf no such file or directory.
the output for cat /var/log/Xorg.0.log | pastebinit is http://
please refer to this link https:/
with regards
Guillaume Modard (guillaumemodard) wrote : | #56 |
I can confirm that:
- The bug is due to switchable ATI + Intel graphics
- Neither the free nor the proprietary driver works correctly:
--> Free driver: boots correctly once every 5 to 10 attempts, without complete support for the ATI graphics card (it sometimes gets very hot...)
--> Proprietary driver: boots correctly every time, but with the Intel integrated graphics card (no Unity, only the 2D GNOME panel without effects)
- Most of the latest HP laptops have this bug (= Ubuntu does not work on most of the latest HP laptops)
Is there any deadline for fixing this critical bug?
Is any team really working on it?
Cabalbl4 (i-vohmin) wrote : | #57 |
For now, my bug is completely gone. But it triggers Bug #571573 (tty loss on intel driver). The radeon and intel drivers now manage to boot correctly, however, the TTYs are gone until I switch off intel card (to radeon) and then turn it back on.
Jean Demange (jea-demange) wrote : | #58 |
For me the bug is still present. It is almost impossible to boot on battery; with the power plugged in, the boot succeeds after two or three attempts. I think it still needs some adjustments. Moreover, is there any way to deactivate this module while it doesn't work correctly?
Jan-Åke Larsson (jalar) wrote : | #59 |
This is erratic. I have blacklisted "radeon" from autoloading by adding it in /etc/modprobe.d/. I then load it manually in rc.local to be able to turn the card off. I now can boot on battery fine. Or could. Lately I've had boot trouble again, but this might be for Other Reasons.
Seth Forshee (sforshee) wrote : | #60 |
I've been poring over the logs attached here, but there doesn't seem to be enough information to piece together what's different between a good and a bad boot. I'd like to reiterate the previous request for kernel logs of _both_ good and bad boots with the “drm.debug=0x0e” kernel argument added to the boot line, with both the good and bad logs collected using the same hardware and the same kernel version.
Thanks in advance!
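When a single kern.log contains several boots, it helps to split it into one chunk per boot before comparing a good boot with a bad one. The Python sketch below uses a simple heuristic: a new boot is assumed to start whenever the bracketed kernel timestamp drops below the previous one. The function name `split_boots` is hypothetical, and the heuristic is only an assumption about how such logs typically look, not something stated in this report.

```python
import re

# Kernel timestamp such as "[   18.460593]"; tags like "[drm]" do not match.
TS_RE = re.compile(r"\[\s*(\d+\.\d+)\]")


def split_boots(lines):
    """Group log lines into boots, starting a new group when the
    kernel timestamp goes backwards."""
    boots, current, last_ts = [], [], None
    for line in lines:
        match = TS_RE.search(line)
        if match:
            ts = float(match.group(1))
            if last_ts is not None and ts < last_ts and current:
                boots.append(current)
                current = []
            last_ts = ts
        current.append(line)
    if current:
        boots.append(current)
    return boots
```

Each returned chunk can then be saved to its own file and compared with a tool such as `diff`, after labelling it as a good or a bad boot by hand.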
Changed in linux (Ubuntu): | |
assignee: | nobody → Seth Forshee (sforshee) |
status: | Triaged → Incomplete |
Bryce Harrington (bryce) wrote : Re: [Bug 727620] Re: [Radeon HD 5650 and 5470] Kernel BUG during recovery boot and in normal boot (Hybrid graphics) | #61 |
On Wed, May 18, 2011 at 09:13:15PM -0000, Seth Forshee wrote:
> I've been poring over the logs attached here, but there doesn't seem to
> be enough information to piece together what's different between a good
> and a bad boot. I'd like to reiterate the previous request for kernel
> logs of _both_ good and bad boots with the “drm.debug=0x0e” kernel
> argument added to the boot line, with both the good and bad logs
> collected using the same hardware and the same kernel version.
>
> Thanks in advance!
Btw, the xdiagnose utility can be used to add "drm.debug=0x0e" to the
kernel. Install it, then run 'sudo xdiagnose'; it's the first checkbox
in the dialog.
belltown (sea-av80r) wrote : | #62 |
@Seth Forshee
Is this what you're looking for?
I'm using an HP ENVY 14 with switchable graphics (Radeon HD5650 and integrated Intel HD graphics).
I have attached a kern.log file with 3 boots:
1st boot - Bad. Black screen occurs on boot when trying to use Radeon driver without radeon.modeset=0
2nd boot - Good. Boot with radeon.modeset=0
3rd boot - Good. Blacklisted radeon driver. Booted using integrated intel driver
belltown (sea-av80r) wrote : | #63 |
belltown (sea-av80r) wrote : | #64 |
@Seth Forshee
This log might be a better one to look at. I tried booting several times, each boot was EXACTLY the same, no blacklisting of the radeon driver, no use of modeset=0 or other boot parameters. The only variable was the time delay between getting the GRUB menu and pressing the enter key on the menu to get the boot to occur.
The very last boot (time entry 20:50:16) resulted in a successful boot. I believe all the others resulted in a black screen.
I used drm.debug=0x0e for all boots.
Ap. Syvertsen (taperkatt) wrote : | #65 |
I have an Acer Aspire Timeline X 4820TG with Radeon HD5650 and integrated Intel graphics.
This bug used to bugger me about every second boot, but after installing tons of different kernels it's actually been more sporadic. I don't know whether that is because I spend more time in the GRUB-menu, clicking to Previous versions and then choosing 2.6.38-8.
Anyways, I added the drm.debug=0x0e for all kernels and kept booting till I got some good and bad boots. Chronologically I had 1 good, 3 bad and 1 good boot, but in the attached kern.log it only shows 1 good, 1 bad, 1 good. In all the bad boots it stopped with a completely black screen (no backlight) but I was able to switch xserver (or something) by pressing Alt+F1, Alt+F2 and Alt+F7. The system did not respond to Ctrl-Alt-Del but printed some SAK line when pressing AltGr+SysRq+K without doing anything.
The last line in one terminal on all bad boots was this:
[drm:intel_
The last thing in the other terminal was the call trace ending with this:
[ 27.218815] RIP [<ffffffffa0569
Hope this helps, let me know if you need more info.
Ap. Syvertsen (taperkatt) wrote : | #66 |
Okay, first of all, sorry for the long comment but here are my observations from scanning the kern.log:
If this block:
[ 18.460593] i915 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[ 18.460617] i915 0000:00:02.0: setting latency timer to 64
comes before this block:
[ 18.559570] [drm] radeon defaulting to kernel modesetting.
[ 18.559574] [drm] radeon kernel modesetting enabled.
[ 18.559599] VGA switcheroo: detected switching method \_SB_.PCI0.
[ 18.559652] radeon 0000:01:00.0: enabling device (0000 -> 0003)
[ 18.559660] radeon 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[ 18.559666] radeon 0000:01:00.0: setting latency timer to 64
We will get:
(...)
[ 18.560350] i915 0000:00:02.0: irq 42 for MSI/MSI-X
(...)
[ 18.630999] [drm:intel_
(...)(mux load)
[drm:intel-
[ 18.827202] vga_switcheroo: enabled
[ 18.827287] radeon atpx: version is 1
[ 18.842578] HDA Intel 0000:00:1b.0: BAR 0: set to [mem 0xdc500000-
[ 18.842592] HDA Intel 0000:00:1b.0: enabling device (0000 -> 0002)
[ 18.842620] HDA Intel 0000:00:1b.0: PCI INT A -> GSI 22 (level, low) -> IRQ 22
[ 18.842757] HDA Intel 0000:00:1b.0: irq 43 for MSI/MSI-X
[ 18.842793] HDA Intel 0000:00:1b.0: setting latency timer to 64
(...)
[ 28.717369] ATOM BIOS: Acer
[ 28.717382] radeon 0000:01:00.0: GPU softreset
(...) (GPU reset stack)
[ 28.864335] radeon 0000:01:00.0: irq 45 for MSI/MSI-X
[ 28.864343] radeon 0000:01:00.0: radeon: using MSI.
[ 28.864356] radeon 0000:01:00.0: IH ring buffer overflow (0xFFFFFFFF, 0, 15)
[ 28.864386] [drm] radeon: irq initialized.
[ 28.864388] [drm] GART: num cpu pages 131072, num gpu pages 131072
[ 28.864892] [drm] Loading REDWOOD Microcode
[ 29.042576] radeon 0000:01:00.0: Wait for MC idle timedout !
[ 29.187699] radeon 0000:01:00.0: Wait for MC idle timedout !
[ 29.189702] radeon 0000:01:00.0: WB enabled
[ 29.206269] BUG: unable to handle kernel paging request at ffffc9041b591ffc
-> CRASH
If, however, the two blocks at the top appear in the reverse order, we get:
(...)
[ 25.383004] ATOM BIOS: Acer
(...) (no GPU reset )
[ 25.383379] radeon 0000:01:00.0: irq 43 for MSI/MSI-X
[ 25.383386] radeon 0000:01:00.0: radeon: using MSI.
[ 25.383423] [drm] radeon: irq initialized.
[ 25.383426] [drm] GART: num cpu pages 131072, num gpu pages 131072
[ 25.383942] [drm] Loading REDWOOD Microcode
[ 25.417531] radeon 0000:01:00.0: WB enabled
[ 25.434108] [drm] ring test succeeded in 1 usecs
(...)
[ 26.836096] i915 0000:00:02.0: irq 44 for MSI/MSI-X
(...)
[ 26.905062] [drm:intel_
(....) (mux load)
[ 26.905137] vga_switcheroo: enabled
[drm:intel-stuff] x30
[ 27.090241] HDA Intel 0000:00:1b.0: BAR 0: set to [mem 0xdc500000-
[ 27.090256] HDA Intel 0000:00:1b.0: enabling device (0000 -> 0002)
[ 27.090290] HDA Intel 0000:00:1b.0: PCI INT A -> GSI 22 (level, low) -> IRQ 22
[ 27.090462] HDA Intel 0000:00:1b.0: irq 45 for MSI/MSI-X
[ 27.090495] HDA Intel 0000:00:1b.0: setting la...
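The ordering observation in this comment could be checked mechanically across many boots. Below is a minimal Python sketch, assuming kern.log lines in the format quoted above. The function name `probe_order` is hypothetical, and the rule "i915 probing before radeon precedes the crash" is only what these particular logs suggest, not a confirmed diagnosis.

```python
import re

# Matches the "PCI INT A" routing line each driver prints when it binds, e.g.
# "[   18.460593] i915 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16"
PROBE_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+(i915|radeon)\s+\S+\s+PCI INT A")

def probe_order(log_text):
    """Return driver names in the order their first 'PCI INT A' line appears.

    Assumption: log_text covers a single boot. A kern.log holding several
    boots would need to be split per boot first.
    """
    first_seen = {}
    for line in log_text.splitlines():
        m = PROBE_RE.search(line)
        if m and m.group(2) not in first_seen:
            first_seen[m.group(2)] = float(m.group(1))
    # Sort driver names by the kernel timestamp of their first probe line.
    return sorted(first_seen, key=first_seen.get)
```

If the pattern described above holds, a result of `["i915", "radeon"]` would correspond to the boots that crashed and `["radeon", "i915"]` to the boots that succeeded.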
Zentai Andras (andras-zentai) wrote : | #67 |
Dear All,
I experienced the same problem with another switchable graphics configuration:
ATI Radeon HD 3650 discrete and some Intel card. (Lenovo Thinkpad T500)
Boot hangs and I got the same error message, just the pci id was different:
udevd-work[86]: '/sbin/modprobe -bv unexpected exit with status 0x0009
Disabling the switchable graphics option in the BIOS resulted in a good boot with either the Radeon or the Intel video card.
I suggest adding the HD 3650 model to the title of the bug.
Jean Demange (jea-demange) wrote : | #68 |
- Part of the kern.log after a good boot (173.9 KiB, text/plain)
Hey,
After seven failed boots, I've managed to get a good boot: during this good boot, the disk check started.
When the boot was unsuccessful, I couldn't access a tty but the magic keys worked.
I attach the successful part of the kern.log, followed by the bad part, and I also attach the entire kern.log because I'm not really sure of my cuts.
The option “drm.debug=0x0e” was activated.
Jean Demange (jea-demange) wrote : | #69 |
Jean Demange (jea-demange) wrote : | #70 |
Changed in linux: | |
importance: | Unknown → Medium |
status: | Unknown → Confirmed |
Seth Forshee (sforshee) wrote : | #71 |
Thanks for the logs.
I've posted a test build that I'd like to receive some feedback on. It uses the ACPI method to try and enable power to the card before trying to do any hardware initialization. If it works I'll run it by the upstream developers to see whether this is an appropriate solution to the problem. You can get the build at
http://
Please post your feedback here after testing. Thanks!
Changed in linux (Ubuntu): | |
status: | Incomplete → In Progress |
status: | In Progress → Incomplete |
belltown (sea-av80r) wrote : | #72 |
@Seth Forshee
I'd like to try out this patch. My kernel version is 2.6.38-8-generic and I'm running an x64 system.
I assume I need your linux-headers-
Thanks.
Seth Forshee (sforshee) wrote : | #73 |
To install, download the two .deb files that match your installation (*_i386.deb for 32-bit and *_amd64.deb for 64-bit) and the linux-headers-
If you're unsure whether you have a 32-bit or 64-bit installation, run 'uname -m' in a terminal. If it outputs i686 get the *_i386.deb files and if it outputs x86_64 get the *_amd64.deb files.
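The architecture check described above can be sketched in Python. This is only an illustration: the function name `deb_suffix` is hypothetical, and treating i386/i486/i586 the same as i686 is an assumption beyond the two values named in the comment.

```python
import platform

def deb_suffix(machine=None):
    """Map a 'uname -m' value to the .deb filename suffix to download."""
    # platform.machine() reports the same string as 'uname -m' on Linux.
    machine = machine or platform.machine()
    if machine == "x86_64":
        return "_amd64.deb"
    # Only i686 is named in the comment; the other 32-bit names are assumed.
    if machine in ("i386", "i486", "i586", "i686"):
        return "_i386.deb"
    raise ValueError("unrecognised architecture: " + machine)
```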
Jean Demange (jea-demange) wrote : | #74 |
- kern_bad.log (219.7 KiB, text/plain)
Hello,
I've tested it. The boot starts correctly and I can get to gdm. But 5 seconds after the login screen appears, when I try to log in, I am returned to a black screen with white writing. I still have a mouse pointer and I can log in by entering my password blindly. I can hear the Ubuntu startup sound, but X11 doesn't work.
I attach the kern.log.
Ap. Syvertsen (taperkatt) wrote : | #75 |
- syvertsen_kern.log (706.2 KiB, text/plain)
Hi Seth,
Thanks a lot for working on this. Unfortunately I was unable to reach Unity at all with the new kernel but it crashes in a distinctively new way. Earlier the computer stopped without any backlight, but I was able to reach VT1, which says "Preparing flip with no unpin work?" and VT7, which displayed the kernel BUG from the driver loading.
Now, out of five times, two times the computer stopped with a black screen with backlight, but I was unable to reach any VT.
The following three times, the computer wanted to check the disk, which after diskchecking, resulted in the loading-logo (Ubuntu with dots underneath) being displayed on my computer for the first time since the upgrade to 11.04 (Usually the complete loading of linux is black).
The computer froze after this, twice with the Ubuntu logo being displayed and once with the call trace from the BUG being displayed. I was unable to use Alt-F1/F7.
Attached is the kern.log
aproposnix (aproposnix) wrote : | #76 |
I just wanted to add that with the HD5650 on an Acer Aspire I have the same issues. One thing that doesn't seem to get mentioned here much is vesafb. More than half the time I boot, I get an error stating that there was an error inserting vesafb. I sometimes also receive an error stating that the module vesafb.ko was not found. Either way, the system fails to boot.
I have no idea if this information is being output to a log somewhere as it seems to occur even before the filesystems are mounted. Can someone suggest to me which log to look for?
Marco Trevisan (Treviño) (3v1n0) wrote : | #77 |
Same here. Testing these new kernel .debs doesn't fix the issue. On the contrary, in my case I was never able to boot using this kernel, while it works using a non-patched kernel.
Seth Forshee (sforshee) wrote : | #78 |
Thanks to everyone who tested. I've passed this information along to upstream.
@harry, are you getting the vesafb messages with natty? The natty kernel has the vesafb driver built into the kernel, so there isn't any vesafb.ko. Do you have a log with the messages you're talking about?
aproposnix (aproposnix) wrote : | #79 |
Seth, yeah I'm on natty. Can you help me identify the log you need? I'm not sure which is relevant.
Seth Forshee (sforshee) wrote : | #80 |
harry, let's start with /var/log/kern.log. Grab it from a boot when you've seen the vesafb messages. Thanks!
aproposnix (aproposnix) wrote : | #81 |
@Seth, Does it matter that the error occurs before the drives are mounted? I'm not sure it's actually saving the log when the error occurs.
Either way, on the next occurrence, I'll send this log.
Seth Forshee (sforshee) wrote : | #82 |
harry, the whole of the kernel log will get saved to kern.log as long as the system is booting that far. If it's not booting that far and you're able to get a terminal then you can try to collect the output of the dmesg command. Failing that, you could try booting into recovery mode, and if you get the errors then try to collect dmesg. Otherwise the best you can do is probably to supply the exact text of whatever messages you see when this happens (taking a picture of the screen is one option).
Changed in linux (Ubuntu): | |
status: | Incomplete → In Progress |
z06gal (z06gal) wrote : | #83 |
I am running Mint 11 32bit and am experiencing this bug. I upgraded yesterday to the 2.6.39.1 kernel and fortunately it resolved the power regression issue I was having but I continue to get this message during boot. The first message that comes up is "could not start bootsplash = could not access a shared library" and this is followed by the error being discussed here. After those 2 lines come up, my computer will boot right up and there are no more issues. It boots the same whether I use battery or not. I have no idea if this is a part of this issue but when I run powertop, I see at the top i915 <interrupt> always. Here is the info on my dell xps:
robin@robin-
00:00.0 Host bridge: Intel Corporation 2nd Generation Core Processor Family DRAM Controller (rev 09)
00:01.0 PCI bridge: Intel Corporation 2nd Generation Core Processor Family PCI Express Root Port (rev 09)
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)
00:16.0 Communication controller: Intel Corporation 6 Series Chipset Family MEI Controller #1 (rev 04)
00:1a.0 USB Controller: Intel Corporation 6 Series Chipset Family USB Enhanced Host Controller #2 (rev 05)
00:1b.0 Audio device: Intel Corporation 6 Series Chipset Family High Definition Audio Controller (rev 05)
00:1c.0 PCI bridge: Intel Corporation 6 Series Chipset Family PCI Express Root Port 1 (rev b5)
00:1c.1 PCI bridge: Intel Corporation 6 Series Chipset Family PCI Express Root Port 2 (rev b5)
00:1c.3 PCI bridge: Intel Corporation 6 Series Chipset Family PCI Express Root Port 4 (rev b5)
00:1c.4 PCI bridge: Intel Corporation 6 Series Chipset Family PCI Express Root Port 5 (rev b5)
00:1c.5 PCI bridge: Intel Corporation 6 Series Chipset Family PCI Express Root Port 6 (rev b5)
00:1d.0 USB Controller: Intel Corporation 6 Series Chipset Family USB Enhanced Host Controller #1 (rev 05)
00:1f.0 ISA bridge: Intel Corporation HM67 Express Chipset Family LPC Controller (rev 05)
00:1f.2 SATA controller: Intel Corporation 6 Series Chipset Family 6 port SATA AHCI Controller (rev 05)
00:1f.3 SMBus: Intel Corporation 6 Series Chipset Family SMBus Controller (rev 05)
01:00.0 VGA compatible controller: nVidia Corporation Device 0dcd (rev a1)
03:00.0 Network controller: Intel Corporation Centrino Wireless-N 1030 (rev 34)
04:00.0 USB Controller: NEC Corporation uPD720200 USB 3.0 Host Controller (rev 04)
05:00.0 System peripheral: JMicron Technology Corp. SD/MMC Host Controller (rev 30)
05:00.2 SD Host controller: JMicron Technology Corp. Standard SD Host Controller (rev 30)
05:00.3 System peripheral: JMicron Technology Corp. MS Host Controller (rev 30)
05:00.4 System peripheral: JMicron Technology Corp. xD Host Controller (rev 30)
0a:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 06)
Vangel Ajanovski (ajanovski) wrote : | #84 |
The behaviour has changed a bit after latest updates.
I have a HP Pavilion dm4t
uname -a
2.6.38-10-generic #44-Ubuntu SMP Thu Jun 2 21:32:22 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
I have the feeling that a month ago I needed 6-7 reboots until working OK, now I only need 3-4.
Another change is that after it starts OK, the screen is dimmed so much that it is turned off (my laptop has a feature to turn off the backlight). So in order to log in I have to increase the brightness (switch the backlight on).
With previous kernels the screen was backlit after X loaded successfully.
Vangel Ajanovski (ajanovski) wrote : | #85 |
I forgot to say that I also have included the ubuntu-
aproposnix (aproposnix) wrote : | #86 |
- screens of the error (2.3 MiB, application/zip)
@Seth I still can't seem to find in the logs the error message that I see when the boot freezes. I tried taking a photo of it. Maybe it'll help?
uname -a
Linux ClarifyUbuntu 2.6.38-8-generic #42-Ubuntu SMP Mon Apr 11 03:31:24 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
Seth Forshee (sforshee) wrote : | #87 |
@harry, what that message probably means is that the bootloader is using something other than VESA for graphics, VGA perhaps. It's probably nothing to worry about, and I don't think it has anything to do with the problem during the radeon driver probe.
aproposnix (aproposnix) wrote : | #88 |
@Seth You wouldn't know how to fix this would you? :)
I do worry about it though, as I constantly have to reboot many times until I finally get into Ubuntu.
It's annoying as well as embarrassing around my Mac- and Windows-using colleagues. They think I'm an idiot for using Ubuntu... maybe I am?
Marco Trevisan (Treviño) (3v1n0) wrote : | #89 |
According to https:/
> I can't reproduce this bug on 2.6.39-git19 and 3.0-rc3.
> Seems bug fixed.
> Thanks!
Seth Forshee (sforshee) wrote : | #90 |
No one has identified what patches fix the issue, if it is indeed fixed at all. The thing to do now is to test these kernels on more affected machines, and if it fixes the problem universally we can try to find the fix and backport it. You can get mainline kernels for Ubuntu from the following link:
http://
I don't know exactly what version 2.6.39-git19 corresponds to, but a 3.0-rc3 is definitely available. If you find a build that does fix the problem it would also be useful to move backward one version at a time until you find the last version that is still broken.
I'll also look to see if I can find out what changes might have fixed the issue.
Thanks!
Changed in linux (Ubuntu): | |
status: | In Progress → Incomplete |
Jean Demange (jea-demange) wrote : | #91 |
Hi everybody,
I've tried some kernels and found that with kernel 3.0-rc2 it seems to work properly. But with 2.36.1 it doesn't work. I didn't try 3.0-rc1.
Seth Forshee (sforshee) wrote : | #92 |
@Jean, thanks for testing.
I don't see anything in the radeon driver changes that looks like it obviously fixes the issue, but I've identified a handful that could be responsible. These commits were introduced between 2.6.39-rc5 and -rc7. What would be most helpful right now would be to be able to state something like, "version x doesn't work but version x+1 does." At this point I'd suggest starting with 2.6.39-rc5 through -rc7.
Ap. Syvertsen (taperkatt) wrote : | #93 |
Hi,
Just a quick question. Isn't v2.6.39.1-oneiric/ after v2.6.39-
Because for me, 2.6.39-1 (which I'm guessing Jean is also referring to, since 11.04 comes with 2.6.38-8) doesn't work while 3.0-rc3 works. Testing rc1 now.
Ap. Syvertsen (taperkatt) wrote : | #94 |
Yup,
My linux crashes occasionally during startup with kernel 2.6.39-1 with this kernel BUG, while I'm unable to produce this on 3.0-rc1.
It is also worth noting that my rc.local script that turns off the power to the radeon card works on 3.0-rc1 while it was not run on earlier kernels even the times it booted. (I would have to run it manually after startup).
Changed in linux (Ubuntu): | |
status: | Incomplete → In Progress |
Seth Forshee (sforshee) wrote : | #95 |
Yes, 2.6.39-1 is after 2.6.39-rc7.
I've been looking between 2.6.39 and 3.0-rc1 now, but nothing stands out as the change that potentially fixes this issue. The next step is to perform a bisection to try and locate the commit that fixes the problem. I'll provide a series of kernels, please test each one and let me know whether or not it contains the issue.
Everyone who is able to test please do so, as it can't hurt to have multiple people testing.
Bisect build #000 is available at:
http://
Bisect log so far (note that since we're hunting for the commit that fixes the problem, the meanings of 'good' and 'bad' are swapped in the log).
# bad: [55922c9d1b84b8
# good: [61c4f2c81c61f7
git bisect start 'v3.0-rc1' 'v2.6.39'
Changed in linux (Ubuntu): | |
status: | In Progress → Incomplete |
Ap. Syvertsen (taperkatt) wrote : | #96 |
Are these bisects between 2.6.39-1 and 3.0-rc1 or between 2.6.39 and 3.0-rc1?
I'm more than willing to try the bisects, but as I've already outlined, 2.6.39-1 is already "good", it crashes, and I don't want to try too many bisects.
Seth Forshee (sforshee) wrote : | #97 |
@Ap. Syvertsen: Let's try to lay out everything clearly to make sure there isn't any confusion. The swapping of the meanings of "good" and "bad" is bound to cause confusion, so let's use the usual meanings for any discussion, with the understanding that the meanings are swapped *only in the bisect logs.*
You said, "My linux crashes occasionally during startup with kernel 2.6.39-1 with this kernel BUG, while I'm unable to produce this on 3.0-rc1." I took that to mean 2.6.39.1 is bad, i.e. exhibits an oops message similar to that in the bug description. And version 3.0-rc1 does not exhibit the oops. Is that understanding correct?
It's a little problematic to bisect between 2.6.39.1 and 3.0-rc1. The stable kernels (anything with the fourth component of the version number, i.e. 2.6.39.y) are somewhat of a branch off of normal kernel development, and the bisection will go more quickly if we start with 2.6.39. This version should still have the bug if 2.6.39.1 does, unless it was fixed and then broken again, but that's unlikely. It can't hurt to verify that 2.6.39 really does have the problem however.
So I started the bisection by marking 2.6.39 as bad (i.e. "good" in the bisect log) and 3.0-rc1 as good (i.e. "bad" in the bisect logs). git picked an intermediate version for testing, and that is what the bisect000 build represents. That is correct so far as my understanding of the situation as communicated above is correct. Does that make sense?
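The swapped-label bisection described above can be sketched as a plain binary search. This is a simplified Python illustration, not how git itself works internally: it assumes a linear list of commits (real kernel history is a graph with merges) and a reliable `has_bug` test, which this thread suggests is hard to get for an intermittent crash. The names `find_fixing_commit` and `has_bug` are hypothetical.

```python
def find_fixing_commit(commits, has_bug):
    """Binary-search for the first commit where the bug is gone.

    commits: list ordered oldest to newest; commits[0] still has the bug
             (like v2.6.39), commits[-1] does not (like v3.0-rc1).
    has_bug: callable(commit) -> bool, e.g. the outcome of test boots.
    """
    lo, hi = 0, len(commits) - 1  # lo: git "good" (still broken), hi: git "bad" (fixed)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if has_bug(commits[mid]):
            lo = mid  # still crashes: recorded as "git bisect good"
        else:
            hi = mid  # no crash: recorded as "git bisect bad"
    return commits[hi]
```

Each step halves the remaining range, which is why a single mislabelled build (a buggy kernel that happened to boot cleanly during testing) can send the search into the wrong half.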
Jean Demange (jea-demange) wrote : | #98 |
For me bisect #000 does not work. The bug affects it.
Seth Forshee (sforshee) wrote : | #99 |
Second build (bisect001) is now available in the same location.
Bisect log:
# bad: [55922c9d1b84b8
# good: [61c4f2c81c61f7
git bisect start 'v3.0-rc1' 'v2.6.39'
# good: [c44dead70a841d
git bisect good c44dead70a841d9
Klaus Reichl (klaus-reichl) wrote : Re: [Bug 727620] Re: [Radeon HD 5650 and 5470] Kernel BUG during recovery boot and in normal boot (Hybrid graphics) | #100 |
Unfortunately I don't have the laptop around where the bug showed up
originally for me.
I'll be there again on one of the next weekends.
But be warned, I thought the bug had gone after updating natty two
weeks ago.
I happily installed packages, got googleearth running, everything
looked great, and then ...
after the next reboot - lose lose lose.
Switched back to poor graphics with radeon blacklisted to have at least a
base system working.
Klaus
Jean Demange (jea-demange) wrote : | #101 |
Hello everybody,
Bisect #001 works for me. The bug does not show up. With this kernel my CPU is always at 100%, but that has nothing to do with the bug.
Seth Forshee (sforshee) wrote : | #102 |
Hrm, I've got a lurking suspicion that this issue has a timing component
(and that could actually be why it works in 3.0: not that it's really
fixed, but that some change in timing makes the problem
much harder to trigger). If your CPU is being pegged then that could be
affecting the results.
We'll mark this one as good for now, but if the bisect goes bad we may
come back to this point.
Third build (bisect002) is now available.
# bad: [a09ed5e0008444
git bisect bad a09ed5e00084448
Jean Demange (jea-demange) wrote : | #103 |
Same behaviour with this bisect. No X bug but CPU at 100% all the time.
Ap. Syvertsen (taperkatt) wrote : | #104 |
Hi Seth, thanks a lot for clarifying.
I've now tested the three bisects, and bisect000 crashes while bisect001 boots every time. I have no problems with CPU use either.
For me however, bisect002 also crashes during start-up (two out of three times).
As a side note, I had to install the headers of the bisects to make it crash, if I didn't install the headers I could boot all the kernels.
Seth Forshee (sforshee) wrote : | #105 |
@Ap. Syvertsen: That's interesting about the headers. With the next
build that tests bad for you, can you compare the lsmod output with and
without the headers packages installed?
bisect003 is now available.
# good: [9461702d2a54cd
git bisect good 9461702d2a54cd4
Ap. Syvertsen (taperkatt) wrote : | #106 |
First of all, bisect003 crashes as well, with or without headers.
It looks like I kinda jumped to a conclusion too fast, as I am unable to reproduce the no-header-boot link.
The reason why I was led to believe this is that I installed bisect001-image without headers and suddenly I could boot 2.6.39-1. I keep this image and also try it to check if something else has changed the boot behavior. I've now tried for an hour to reproduce the no-header-boot link with no luck, so I would regard my previous statement as false.
Jean Demange (jea-demange) wrote : | #107 |
So, I've tried the last bisect, and it doesn't work: 2 boots out of 10 succeeded. So I was wondering whether the others were really not working. Bisect 002 crashes every time too, whereas bisect 001 worked every time.
For information, for every kernel I have to add the option --force-depends.
Seth Forshee (sforshee) wrote : | #108 |
bisect004 is now available.
# good: [2b030bda66b0a5
git bisect good 2b030bda66b0a59
Jean Demange (jea-demange) wrote : | #109 |
Bisect 004 works like bisect 001. No X bug but CPU at 100% every time.
Ap. Syvertsen (taperkatt) wrote : | #110 |
Yup, same with me. bisect004 boots fine.
Seth Forshee (sforshee) wrote : | #111 |
bisect005 is now available.
# bad: [931474c4c30633
git bisect bad 931474c4c306334
Jean Demange (jea-demange) wrote : | #112 |
Bisect 005 works fine : no X bug and no CPU issue.
Seth Forshee (sforshee) wrote : | #113 |
bisect006 is now available.
# bad: [69f7876b2ab61e
git bisect bad 69f7876b2ab61e8
Jean Demange (jea-demange) wrote : | #114 |
Bisect 006 works fine too : no X bug and no CPU issue.
Seth Forshee (sforshee) wrote : | #115 |
bisect007 is now available.
# bad: [4b65177b27ede9
git bisect bad 4b65177b27ede9d
Jean Demange (jea-demange) wrote : | #116 |
Bisect 007 does not work as well as the others. Half of the boots end badly. But when it fails there is no writing, just some dots all over the screen. It is impossible to change tty.
Seth Forshee (sforshee) wrote : | #117 |
@Jean: So as I understand it you were unable to tell whether you were
getting the oops or not? Did you check your kern.log to see whether or
not it appears when booting with the bisect007 kernel?
For now I'm going to wait and see if we get some more testing on this
one.
bruise lee (workformydream) wrote : | #118 |
confirm: 2.6.39-2 from mainline does not fix it. 50% chance for a good boot
HP DV6tqe with dual GPU of intel/ati(good boot reports 2G NVRAM, bad boot report 3G)
Jean Demange (jea-demange) wrote : | #119 |
Seth Forshee (sforshee) wrote : | #120 |
@Jean, that log does display the bug.
bisect008 is now available. We're getting close to the end, should be 3 builds after this one. It's looking like it will end up identifying a change to the i915 driver.
# good: [2c34b850ee1e9f
git bisect good 2c34b850ee1e9f8
DLHDavidLH (dlhdavidlh-yahoo) wrote : | #121 |
this bug also affects
AMD Radeon HD 3300
DLHDavidLH (dlhdavidlh-yahoo) wrote : | #122 |
this bug affects
Ubuntu 11.04 -and- 11.10
Jean Demange (jea-demange) wrote : | #123 |
Same issue as the previous one.
Seth Forshee (sforshee) wrote : | #124 |
bisect009 is now available.
# good: [fcca7926299944
git bisect good fcca79262999448
Viktor Pal (deere) wrote : | #125 |
Can confirm this bug with AMD Radeon HD 6470M
adrianszwej (adrian-szwej) wrote : | #126 |
Yet another confirmation with AMD Radeon HD 6470M on Dell Vostro 3350
lspci | grep -i radeon
01:00.0 VGA compatible controller: ATI Technologies Inc NI Seymour [AMD Radeon HD 6470M] (rev ff)
uname -a
2.6.38-8-generic #42-Ubuntu SMP Mon Apr 11 03:31:24 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
bin$ cat /etc/issue
Ubuntu 11.04 \n \l
I noticed that switching between mains power and battery triggers this more frequently.
But having had this laptop for only one week, I don't really see the pattern.
Does the radeon driver remember that it should somehow resume the card?
Now I have blacklisted both radeon and fglrx and I seem to be able to boot with less struggle.
In /etc/rc.local I modprobe radeon and run switcheroo. Most of the time the fan goes off, but sometimes I have to run the switcheroo command in my gnome-session.
/Adrian
adrianszwej (adrian-szwej) wrote : | #127 |
aproposnix (aproposnix) wrote : | #128 |
Sorry to sound stupid, but I think there are many people affected by this.... in layman terms, what's the status of this issue?
Seth Forshee (sforshee) wrote : | #129 |
On Sat, Jul 09, 2011 at 09:34:42PM -0000, harry wrote:
> Sorry to sound stupid, but I think there are many people affected by
> this.... in layman terms, what's the status of this issue?
The status is that there are multiple reports of this being fixed in
kernel version 3.0, and if this is true then the issue is fixed for
oneiric. We're currently undergoing a process known as bisection to
identify what changes fixed the problem in 3.0 so that the problem can
be fixed in earlier versions. Anyone experiencing this issue can help
with the bisection process by testing the bisect builds that I am
posting and reporting whether or not the crash is occurring for each of
the builds.
Martin Stjernholm (msub) wrote : | #130 |
[Side note: Had to install module-init-tools from oneiric to satisfy deps in these bisect builds.]
Just to double check a little, I've tried the three latest bisects 007-009, and I've seen the oops in evertgreen_
However, the trig ratio is fairly low for me - bisect 009 gave the oops only 2 times out of 10, which means that to conclude its absence with a reasonable degree of certainty would require 50 boots or so.
I also consistently get the no backlight bug with all builds (is there a separate report for that?). If one isn't aware of that issue, a successful start (wrt this bug) may be confused with a hang with no tty output.
Btw, bisect 007 also gave an oops in radeon_gart_unbind. It was not inside any pci probe stuff but rather in some sort of cleanup code inside drm_release. I can attach the log if it's interesting.
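The boot-count estimate in this comment can be worked through numerically. Below is a minimal Python sketch, assuming each boot is an independent trial with a fixed crash probability. That assumption is a simplification: other comments in this thread suggest timing (for example the delay in the GRUB menu) influences the outcome. The function names are hypothetical.

```python
import math

def miss_probability(p_crash, boots):
    """Chance that `boots` boots in a row all succeed on a kernel that still has the bug."""
    return (1.0 - p_crash) ** boots

def boots_needed(p_crash, max_miss):
    """Smallest number of clean boots that brings the miss probability down to max_miss."""
    return math.ceil(math.log(max_miss) / math.log(1.0 - p_crash))
```

With the 2-in-10 rate reported for bisect 009, `miss_probability(0.2, 10)` is about 0.11, so ten clean boots would still leave roughly a one-in-nine chance of wrongly calling a buggy build good. Under the same assumptions, `boots_needed(0.2, 0.05)` gives 14 boots for 95% confidence, and 50 boots would push the miss probability far below 1%.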
Seth Forshee (sforshee) wrote : | #131 |
Martin, thanks for testing. Your results at least match those reported
for the builds you tested. I don't know whether there's a bug for your
backlight issue. If you can't find one then you should open a new one.
bisect010 is now available.
# good: [8eb572942ca028
git bisect good 8eb572942ca0289
Martin Stjernholm (msub) wrote : | #132 |
I have now tried bisect 010 and got the oops in evergreen_cp_resume on the third boot.
Seth Forshee (sforshee) wrote : | #133 |
bisect011 is now available.
# good: [4697995b98417c
git bisect good 4697995b98417c6
Bryce Harrington (bryce) wrote : | #134 |
[Closing out the -ati task; the issue has been pretty definitively narrowed to the kernel and the kernel team is active on it; nothing else we should do at the X end for this bug.]
Changed in xserver-xorg-video-ati (Ubuntu): | |
status: | Triaged → Invalid |
Martin Stjernholm (msub) wrote : | #135 |
Can confirm the evergreen_cp_resume oops in bisect 11 (on fifth boot).
Martin Stjernholm (msub) wrote : | #136 |
Since I've always gotten the bug in all the bisects I've tried, I thought I better reverify bisect 006 which has been reported as working by others. Unfortunately it didn't work for me, so we're getting different results here. :(
I didn't get the oops in the log file, but I checked on screen that the top 6 frame names were the same as in the oops at the top of this ticket. If it's of any use, I can try to trig it again (in bisect 006, that is) and write it down.
Seth Forshee (sforshee) wrote : | #137 |
Thanks for checking, Martin. I was afraid that might happen with this bug. Can you also try the other builds before 006 that were reported as good (001, 004, 005) and see if any of those fail?
It's best if we can get as many testers as possible for each build. It's pretty clear when we get a bad result, but with the good results it's a lot less clear whether or not it's really good. More testers means a better chance of hitting the bug for each build.
josejuan05 (josejuan05) wrote : | #138 |
I have a HP dm4-1100 (sometimes notated dm4t) with the 5470 switchable graphics. I tested bisects 001, 004, 005, 006, and 011.
001, 004, and 006 each booted properly 10 times out of 10.
005 failed the first three times before I gave up on it.
011 booted 10 out of 13 times. The last two times I booted 011 I didn't get the evergreen error, and my cursor appeared (and would move properly), but I couldn't do anything else. I took some pictures with my phone of the error on the last two times, and if you think it would be helpful I can transcribe them, but I'd rather not if you do not think it would be of any help.
josejuan05 (josejuan05) wrote : | #139 |
Here's what I got as an error with bisect 011:
[ 29.400514] Stack:
[ 29.400524] ffff88014d4dfc38 ffffffffa01c7303 ffff88014d4dfc08 00000631815dd48e
[ 29.400561] ffff88014d4dfc38 ffff88013f2b3240 0000000000000002 ffff880143a60968
[ 29.400595] ffff880143a60560 ffff88014db00148 ffff88014d4dfc58 ffffffffa01c4e41
[ 29.400630] Call Trace:
[ 29.400653] [<ffffffffa01c7
[ 29.400687] [<ffffffffa01c4
[ 29.400717] [<ffffffffa0165
[ 29.400741] [<ffffffffa0165
[ 29.400769] [<ffffffffa0166
[ 29.400798] [<ffffffffa0166
[ 29.400822] [<ffffffffa0166
[ 29.400852] [<ffffffff812d9
[ 29.400873] [<ffffffffa0166
[ 29.400904] [<ffffffffa01c6
[ 29.400937] [<ffffffffa012d
[ 29.400971] [<ffffffffa01dd
[ 29.401000] [<ffffffffa012d
[ 29.401024] [<ffffffff812d9
[ 29.401052] [<ffffffffa01d5
[ 29.401085] [<ffffffffa013b
[ 29.401112] [<ffffffffa012d
[ 29.401136] [<ffffffff81163
[ 29.402583] [<ffffffff81163
[ 29.404042] [<ffffffff8115f
[ 29.405462] [<ffffffff8115f
[ 29.406859] [<ffffffff815e5
[ 29.408252] Code: ea ff ff ff 48 8b 8f 80 03 00 00 85 f6 78 21 3b b7 68 03 00 00 77 19 c1 e6 03 48 81 e2 00 f0 ff ff 48 63 f6 48 83 ca 67 48 01 f1
[ 29.408451] 89 11 31 c0 5d c3 0f 1f 40 00 55 48 89 e5 53 48 83 ec 08 0f
[ 29.411472] RIP [<ffffffffa01f8
[ 29.412948] RSP <ffff88014d4dfbe8>
[ 29.414386] CR2: ffffc90011601088
Martin Stjernholm (msub) wrote : | #140 |
Because testing good bisects is tedious, I went backwards:
Bisect 005 is inconclusive since it bugs out with no tty output. Got one kernel hang, but since there's no tty output I don't know if it's this bug or not (nothing in kern.log from that boot either).
Bisect 004 does not show the bug after 30 boots.
@josejuan05: I suspect the radeon_gart_unbind oops is unrelated. I have seen it too, as well as another user in comment #17. Perhaps there is (or should be) a separate ticket for it.
Seth Forshee (sforshee) wrote : | #141 |
I'm trying to construct a list of good and bad commits to use for starting a new bisection run. Please take note that it only takes a single occurrence of the evergreen_cp_resume oops to qualify a commit as bad, so there's no need to continue testing once you've seen that oops.
I do agree that the radeon_gart_unbind oops looks different, so that shouldn't be considered a failure for the purposes of the bisection.
Here's the list I have now.
Bad builds: 000 003 005 006 007 008 009 010 011
Good builds: 001 004
Inconclusive: 005
Martin, just to verify -- with each of the builds you've indicated have failed for you, you've verified that you're getting the evergreen_cp_resume oops?
josejuan05, when you tested build 005, did you verify that you were getting an oops in evergreen_
Once this is all sorted out I'll use the information to seed a new bisection and start providing more builds to test.
Thanks everyone!
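The classification rule described above (one evergreen_cp_resume oops makes a build bad, other oopses do not count) could be sketched in Python roughly as follows. This is only an illustration: it assumes the function name appears somewhere in the logged trace, the helper names are invented here, and the sample log lines are made-up placeholders rather than real kernel output.

```python
TARGET_SYMBOL = "evergreen_cp_resume"  # the oops this bisection is chasing


def classify_boot_log(log_text: str, target: str = TARGET_SYMBOL) -> str:
    """Classify one boot's log as 'target-oops', 'other-oops' or 'clean'.

    Simplification: any line mentioning `target` counts as the target oops,
    and any "Call Trace:" line without it counts as some other oops.
    """
    lines = log_text.splitlines()
    if any(target in line for line in lines):
        return "target-oops"
    if any("Call Trace:" in line for line in lines):
        return "other-oops"
    return "clean"


def classify_build(boot_logs, target: str = TARGET_SYMBOL) -> str:
    """A build is 'affected' as soon as a single boot shows the target oops."""
    results = [classify_boot_log(text, target) for text in boot_logs]
    return "affected" if "target-oops" in results else "not-seen"


# Made-up placeholder logs, one string per boot.
sample_logs = [
    "boot completed without errors",
    "Call Trace:\n radeon_gart_unbind+0x00/0x00 [radeon]",
    "Call Trace:\n evergreen_cp_resume+0x00/0x00 [radeon]",
]
print(classify_build(sample_logs[:2]))
print(classify_build(sample_logs))
```

Note that "not-seen" is deliberately not called "good": a boot that hangs before anything reaches the log would also land there, which is the inconclusive case discussed for build 005.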
Seth Forshee (sforshee) wrote : | #142 |
Oops, just noticed that I included 005 in the bad and inconclusive lists. I meant for it to be only in the inconclusive list.
Martin Stjernholm (msub) wrote : | #143 |
Yes, I have verified that it's an oops in evergreen_cp_resume that I've gotten at least once in each of bisects 006-011.
Seth Forshee (sforshee) wrote : | #144 |
I've put up a new build (bisect012) at:
http://
Here's the bisect log I used to re-seed the bisection.
# bad: [55922c9d1b84b8
# good: [61c4f2c81c61f7
git bisect start 'v3.0-rc1' 'v2.6.39'
# good: [c44dead70a841d
git bisect good c44dead70a841d9
# bad: [a09ed5e0008444
git bisect bad a09ed5e00084448
# good: [9461702d2a54cd
git bisect good 9461702d2a54cd4
# good: [2b030bda66b0a5
git bisect good 2b030bda66b0a59
# bad: [931474c4c30633
git bisect bad 931474c4c306334
# skip: [69f7876b2ab61e
git bisect skip 69f7876b2ab61e8
# good: [4b65177b27ede9
git bisect good 4b65177b27ede9d
# good: [2c34b850ee1e9f
git bisect good 2c34b850ee1e9f8
# good: [fcca7926299944
git bisect good fcca79262999448
# good: [8eb572942ca028
git bisect good 8eb572942ca0289
# good: [4697995b98417c
git bisect good 4697995b98417c6
This build is testing commit fcfc768806f2ed8
josejuan05 (josejuan05) wrote : | #145 |
Ok.
So far I've had 11 consecutive successful boots with bisect012
I took another look at 005. While the screen goes black and I can bring up no TTY, kern.log shows no evergreen_cp_resume errors. Furthermore, the system eventually gets to the gdm login screen (I can hear the "ding noise" and can move my cursor and keys in order to restart the system, blindly), but the screen is still black. I managed to properly boot to the gdm login screen 20 times consecutively. Furthermore, I didn't get any kernel panics.
As it doesn't appear to exhibit the evergreen_cp_resume error, I believe for the purposes of the bisect that 005 should be in the "good" category, with 001 and 004 (and, by my testing, 012).
As for the _gart_unbind error, I will hold off on reporting it unless I see it again, since I have only seen it on a bisect.
Seth Forshee (sforshee) wrote : | #146 |
bisect013 is now available.
http://
# bad: [fcfc768806f2ed
git bisect bad fcfc768806f2ed8
Now testing commit 2703c21a82301f5
Martin Stjernholm (msub) wrote : | #147 |
Bisect 013 booted ok 24 times out of 30. 3 of the remaining times failed with an afaics unrelated oops in azx_interrupt in the snd_hda_intel driver. The last 3 times there were oopses which didn't manage to get sufficiently logged on the tty to see where they were (the computer froze either before logging the relevant line, or after having scrolled it off the screen). I think it's most likely that those were in snd_hda_intel as well, since some other boots with that oops froze before logging the oops entirely.
So my conclusion is that bisect 013 is not affected by the evergreen_cp_resume bug. I therefore didn't test 012 since it should work by implication under the bisect assumption.
Seth Forshee (sforshee) wrote : | #148 |
bisect014 is now available.
http://
# bad: [2703c21a82301f
git bisect bad 2703c21a82301f5
Now testing commit 000703f44c77b15
Martin Stjernholm (msub) wrote : | #149 |
After 30 boots with bisect 014 I got 3 oopses but none mentioning evergreen_
josejuan05 (josejuan05) wrote : | #150 |
I have had 20 boots with one oops, but not evergreen_cp_resume on bisect 014. I second Martin.
Seth Forshee (sforshee) wrote : | #151 |
bisect015 is now available.
http://
# bad: [000703f44c77b1
git bisect bad 000703f44c77b15
Now testing commit 63f7d9828bf55cc
josejuan05 (josejuan05) wrote : | #152 |
I have booted bisect015 twice, and both times got evergreen_
This commit is affected by the bug.
Seth Forshee (sforshee) wrote : | #153 |
bisect016 is now available.
http://
# good: [63f7d9828bf55c
git bisect good 63f7d9828bf55cc
Now testing commit 99b38b4acc0d7db
josejuan05 (josejuan05) wrote : | #154 |
On five boots of bisect016 I have two evergreen_cp_start errors.
Seth Forshee (sforshee) wrote : | #155 |
bisect017 is now available.
http://
# good: [99b38b4acc0d7d
git bisect good 99b38b4acc0d7db
Now testing commit 3448a19da479b6b
josejuan05 (josejuan05) wrote : | #156 |
On bisect017 I've had 19 successful boots, and one that ended with the gart_unbind error (not evergreen_cp_...)
Seth Forshee (sforshee) wrote : | #157 |
bisect018 is now available. This should be the last bisect build, then we'll need to apply the commit identified as fixing the issue onto natty and see if that fixes the problem there.
http://
# bad: [3448a19da479b6
git bisect bad 3448a19da479b6b
Now testing commit 8116188fdef5946
josejuan05 (josejuan05) wrote : | #158 |
Out of three boots, the first two failed on an evergreen_cp_ error.
josejuan05 (josejuan05) wrote : | #159 |
Oh. Clarification: Out of three boots of bisect018, the first two failed on an evergreen_cp_ error.
Seth Forshee (sforshee) wrote : | #160 |
The bisect identified this as the commit that fixes the problem:
3448a19 vgaarb: use bridges to control VGA routing where possible.
A natty build with this patch applied is available at the link below. Please test to see whether or not the bug is reproducible in this build. Thanks!
http://
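A note on the labels in the re-seeded log: since the run was started with v3.0-rc1 as bad and v2.6.39 as good, the terms appear to be swapped relative to everyday usage, with "good" meaning "oops still present" and "bad" meaning "boots cleanly". That reading is an inference from the log lines and tester reports in this thread, not something stated outright. A small Python sketch of the mapping, checked against the reports for bisect012 through bisect017 (the function name is hypothetical):

```python
def bisect_verdict(oops_seen: bool) -> str:
    """Mark to record when bisecting for the commit that FIXES a bug.

    git bisect searches for the first "bad" commit, so when hunting a fix the
    terms are swapped: a build that still shows the oops is "good" (old
    behaviour) and a build that boots cleanly is "bad" (new behaviour).
    """
    return "good" if oops_seen else "bad"


# Tester reports as summarised in this thread (True = evergreen_cp_* oops seen).
reported = {
    "bisect012": False,
    "bisect013": False,
    "bisect014": False,
    "bisect015": True,
    "bisect016": True,
    "bisect017": False,
}

# Marks taken from the bisect log lines posted with each following build.
logged = {
    "bisect012": "bad",
    "bisect013": "bad",
    "bisect014": "bad",
    "bisect015": "good",
    "bisect016": "good",
    "bisect017": "bad",
}

for build, oops_seen in reported.items():
    # Every recorded mark matches the swapped-terms reading.
    assert bisect_verdict(oops_seen) == logged[build], build
```

With the terms read this way, the "first bad commit" that the bisection converges on is the first commit where the oops no longer shows up, which is how 3448a19 ends up identified as the fix.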
josejuan05 (josejuan05) wrote : | #161 |
I have 20 consecutive successful boots on this commit. I have no boot failures yet.
aproposnix (aproposnix) wrote : | #162 |
Cool! It seems to be working for me. I even got a little taste of the Plymouth boot screen, which I haven't seen in a long time :)
I'll continue testing and let you know if I experience the B[lack]SOD.
Martin Stjernholm (msub) wrote : | #163 |
I've tested the patched natty kernel and have over 30 boots without the evergreen_cp_resume oops, so the bisected patch indeed appears to be the right one. Does it explain the race?
Seth Forshee (sforshee) wrote : | #164 |
I've been looking at the patch we identified as fixing the problem, but I can't work out any causal relationship between what it does and the GPU being on when radeon probes. I've inquired about it on the upstream bugzilla to see if I'm missing something. But I'm beginning to suspect that the patch alters the timing of things enough to prevent the problem from being triggered.
Changed in linux (Ubuntu): | |
status: | Incomplete → In Progress |
aproposnix (aproposnix) wrote : | #165 |
Hi Seth,
I have had one bad experience since installing the patched kernel. After countless good boots I decided to add the following lines to /etc/rc.local in order to use switcheroo:
chown "username" /sys/kernel/
echo OFF > /sys/kernel/
Everything seemed to work for the first 5 or so boots but then I started getting the blackscreens again. The message was similar but not the same. Unfortunately I didn't manage to get a screen of it.
Once I got back into my desktop I removed the lines from rc.local and then the issue disappeared.
I've gone ahead and added the lines again to see if I can recreate the issue but so far nothing.
Not sure if this info is of any use to you at all but I thought I would share just in case it would.
AceLan Kao (acelankao) wrote : | #166 |
Seth,
I can confirm the commit
3448a19 vgaarb: use bridges to control VGA routing where possible.
fixed this issue.
And this issue is a hwcert blocking issue, so is there anything I can do to help make the SRU process faster?
tags: | added: blocks-hwcert |
josejuan05 (josejuan05) wrote : | #167 |
@harry
I don't know if it means anything, but in newer kernels you may be unable to use that command, since /sys/kernel/debug may not be owned by you. A quick and dirty (if dangerous) solution would be to change the first line to
chown "username(:group)" /sys/kernel/debug/ -R
where username is again your username and :group is the optional argument for the group ownership of the folder
Back to the bug, though, I did note one related oops in 50 boots (looking through my logs). I did note when it happened: it did cause a boot failure, but it did not give the evergreen error. Rather, I got the gart_set_page error. However, it didn't cause a kernel dump like in the bisects. I only believe the gart_set_page error to be related because it does not show up in kernels which were susceptible to the evergreen_cp_start oops. This still does not explain the race condition.
FWIW I did some plotting out of the bisects on paper and found that if there was a commit that fixed the gart_set error it was between bisects 014 and 017.
Seth Forshee (sforshee) wrote : | #168 |
AceLan: The problem right now is that I suspect that the patch doesn't actually do anything to directly fix the problem. I.e., that the patch fixes this oops is just a side-effect of burning more time before the driver tries to access the hardware or something like that. I'm not sure though so I'd like to get confirmation from upstream whether or not the patch is a real fix for the problem, but so far I haven't received a response.
AceLan Kao (acelankao) wrote : | #169 |
Seth,
To examine your assumption, I reverted that commit and replaced the vga_arbiter_
The error message is the same.
Do you have any suggestions for how to do the test?
==================
diff --git a/drivers/
index ace2b16..5b93935 100644
--- a/drivers/
+++ b/drivers/
@@ -500,6 +500,8 @@ static bool vga_arbiter_
#endif
+ msleep(1000);
+
/* Add to the list */
===================
josejuan05 (josejuan05) wrote : | #170 |
If gart_ and evergreen_ are related errors (my presumption), I can confirm an actual failure (complete with debug dump) on the patched kernel.
It took about 100 or so boots for this to show up.
Changed in xserver-xorg-driver-ati: | |
importance: | High → Critical |
Changed in xserver-xorg-driver-ati: | |
status: | Confirmed → Fix Released |
Seth Forshee (sforshee) wrote : | #171 |
Has anyone tested this yet since oneiric released? I'd like to get confirmation that the problem is fixed there. Thanks!
dsainty (dsainty) wrote : | #172 |
In Oneiric it seems to do better (with respect to default handling of the hardware), in that at least it boots to the integrated card, rather than a black screen.
I haven't had any success switching to the discrete card via vgaswitcheroo though.
Seth Forshee (sforshee) wrote : | #173 |
Moving status to Fix Released based on positive test results with Oneiric noted in comment #172.
Changed in linux (Ubuntu): | |
status: | In Progress → Fix Released |
Elia (elia-baragiola) wrote : | #174 |
One question: I have Ubuntu 10.04 LTS.
Will the fix be released for this version or not?
What should I do to fix the bug? :D
Thank you!
Ihorko (ihorchyhin) wrote : | #175 |
I have not had such a good experience with Oneiric on my HP G62-a35er. First of all, the default startup brightness is 0% (see bug #873191). Second, I tried to power on the discrete card by adding the corresponding command to /etc/rc.local, and once a kernel oops occurred at startup (I have now moved this command to run 10 seconds after startup because of some problems with snd_hda_intel too). I can attach part of that failure log next week, because I only have a mobile broadband connection from time to time on my laptop.
Cabalbl4 (i-vohmin) wrote : | #176 |
The bug is back now for me (( What I did before to make it go away was to pass "nosplash" instead of "quiet splash" in grub and enable option GRUB_TERMINAL=
Now I have reverted these options, and the bug is back for me on Natty 2.6.38-10-generic.
So it seems to be graphical-mode related. (Maybe Plymouth-dependent?)
Now reverting back to working options.
Cabalbl4 (i-vohmin) wrote : | #177 |
No, the later fix is not working anymore. Now blacklisting radeon to modprobe it manually when switching.
Daniel Buchner (danieljb2) wrote : | #178 |
I just ran into this bug after installing the latest 11.10 release on a new ENVY, looks like the fix here didn't work :(
Elia (elia-baragiola) wrote : | #179 |
The bug persists :(
Has anyone updated the BIOS? Can that fix the graphics card now?
tags: | added: hybrid-graphics |
Vangel Ajanovski (ajanovski) wrote : | #180 |
HP dm4t (ATI 5470): after several installs last weekend, the bug is no longer present in stock Xubuntu 11.10, and also not present after upgrading to kernel 3.2.0-10. It was still present with stock Linux Mint Debian Edition (at that moment a 2.6 kernel), but after upgrading to the latest kernel it was fixed.
In fact I have not seen this for some months now, but I did a clean reinstall just in order to check it.
Klaus Reichl (klaus-reichl) wrote : Invitation to connect on LinkedIn | #181 |
LinkedIn
------------
Bug,
I'd like to add you to my professional network on LinkedIn.
- Klaus
Klaus Reichl
Technology Expert at Thales
Austria
Zentai Andras (andras-zentai) wrote : | #182 |
Really clever way to invite all bug subscribers to your LinkedIn network... ;)
Cabalbl4 (i-vohmin) wrote : | #183 |
Switched to 12.04. Bug seems to be gone.
kolwas (kolwas) wrote : | #184 |
<Switched to 12.04. Bug seems to be gone.>
On my machine that is not true. I don't know if there were some updates, but now Ubuntu starts only about 50% of the time.
Klaus Reichl (klaus-reichl) wrote : Re: [Bug 727620] Re: [Radeon HD 5650 and 5470] Kernel BUG during recovery boot and in normal boot (Hybrid graphics) | #185 |
Hi all,
I really apologize for that, still don't know what was going on.
Sorry,
Klaus
Cabalbl4 (i-vohmin) wrote : | #186 |
As far as I understand, there is a race condition between the intel and radeon modules, and it may be related to vgaswitcheroo too. On earlier versions of Ubuntu, changing the waiting time and the gfx mode in grub seemed to affect the bug somehow. Some combinations of those even made it disappear :)
I forgot to mention, my computer is an HP Envy 14, so I have the discrete ATI card, and also integrated graphics from the core i5 (which uses the i915 driver). Just in case it's some interaction between the two that causes the crash.