[ATI] GPU lockup with gfxpayload=keep

Bug #605614 reported by andrew thomas on 2010-07-14
This bug affects 9 people
Affects            Importance   Assigned to
grub2 (Ubuntu)     Undecided    Unassigned
  Maverick         Undecided    Unassigned
linux (Ubuntu)     Undecided    Unassigned
  Maverick         Undecided    Unassigned

Bug Description

Binary package hint: grub2

The addition of
    recordfail
    load_video
    set gfxpayload=keep
to the new 10_linux file results in an unbootable system. If I edit out the three lines, I am able to boot properly. I have not yet experimented to find out which line needs to be eliminated, although I suspect that it may be the load_video. Let me know if you need any more info. While I do not believe that it is relevant, I am using the xorg-edgers ppa.

Nate Muench (Mink) (n-muench) wrote :

I can confirm this. But I haven't booted my Maverick VM since Monday (I forgot yesterday).

Colin Watson (cjwatson) wrote :

See this post for background:

  https://lists.ubuntu.com/archives/ubuntu-devel/2010-July/030995.html

What happens if you change 'set gfxpayload=keep' to 'set gfxpayload=text'? In what way does it break - is it "just" a corrupted console (e.g. is the machine pingable) or does it fail in some other way? What kernel version are you booting?

Colin Watson (cjwatson) wrote :

Oh, and what graphics card do you have?

andrew thomas (atswartz) wrote :

OK, I see what part of the problem was. I had uninstalled the vesa driver. On the other distros that I use, I compile my own kernels without vesa so I thought that I didn't need it. Once I installed the vesa driver, I was able to boot into low-graphics mode. When I changed 'set gfxpayload=keep' to 'set gfxpayload=text' it booted ok. I have a Radeon 3870 using the open-source drivers. I usually only get the logo for a few seconds just before gdm login. For the first 10 seconds or so, I have a blinking cursor in the upper-left corner. Then I get a message about the k10temp module being unsuitable for my processor (Phenom 9500), then I get the ubuntu logo for maybe 5-10 seconds before the gdm login screen. Without vesa, I was getting to about the point where the ubuntu logo should display, and then the top half of the monitor would be pitch black and the bottom half would be grey (back-lit, kind of like the color of the default gnome panel) and I could do nothing but ctrl-alt-delete. I am booting 2.6.35-7-generic. I have to go now, but will read more of your link tomorrow. Thanks for your quick response.

Michael Bienia (geser) wrote :

I'm having a similar problem. Without editing the boot entry to "set gfxpayload=text" I can't boot into X. Shortly after boot, both my monitors lose the video signal; I only get a glimpse of the text messages from booting. I can only log in remotely (ssh) to reboot. I have an "ATI Technologies Inc RV670PRO [Radeon HD 3850]" and I'm using the open-source Xorg driver (in case it's important). I also use "GRUB_GFXMODE=1280x1024", but the few experiments I did with "GRUB_GFXMODE=640x480" didn't show any difference. I'm currently using kernel 2.6.35-7-generic (AMD64).

Looking at the syslog for the problematic boot I see:

Jul 15 09:34:19 vorlon kernel: [ 16.910054] radeon 0000:02:00.0: GPU lockup CP stall for more than 1000msec
Jul 15 09:34:19 vorlon kernel: [ 16.910058] ------------[ cut here ]------------
Jul 15 09:34:19 vorlon kernel: [ 16.910083] WARNING: at /build/buildd/linux-2.6.35/drivers/gpu/drm/radeon/radeon_fence.c:235 radeon_fence_wait+0x365/0x3d0 [radeon]()
Jul 15 09:34:19 vorlon kernel: [ 16.910085] Hardware name: M56S-S3
Jul 15 09:34:19 vorlon kernel: [ 16.910087] GPU lockup (waiting for 0x00000002 last fence id 0x00000001)
Jul 15 09:34:19 vorlon kernel: [ 16.910088] Modules linked in: snd_hda_codec_atihdmi vga16fb vgastate snd_hda_codec_realtek radeon ttm ppdev drm_kms_helper snd_hda_intel lp snd_hda_codec drm snd_seq_midi snd_hwdep psmouse serio_raw snd_rawmidi snd_pcm snd_seq_midi_event snd_seq snd_timer snd_seq_device parport_pc parport i2c_algo_bit snd soundcore snd_page_alloc i2c_nforce2 k8temp edac_core edac_mce_amd raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx raid1 raid0 firewire_ohci firewire_core multipath ahci crc_itu_t forcedeth pata_amd libahci linear
Jul 15 09:34:19 vorlon kernel: [ 16.910119] Pid: 1076, comm: Xorg Not tainted 2.6.35-7-generic #12-Ubuntu
Jul 15 09:34:19 vorlon kernel: [ 16.910121] Call Trace:
Jul 15 09:34:19 vorlon kernel: [ 16.910128] [<ffffffff8105f6ef>] warn_slowpath_common+0x7f/0xc0
Jul 15 09:34:19 vorlon kernel: [ 16.910131] [<ffffffff8105f7e6>] warn_slowpath_fmt+0x46/0x50
Jul 15 09:34:19 vorlon kernel: [ 16.910144] [<ffffffffa02d64d5>] radeon_fence_wait+0x365/0x3d0 [radeon]
Jul 15 09:34:19 vorlon kernel: [ 16.910148] [<ffffffff8107e4d0>] ? autoremove_wake_function+0x0/0x40
Jul 15 09:34:19 vorlon kernel: [ 16.910160] [<ffffffffa02d6cd1>] radeon_sync_obj_wait+0x11/0x20 [radeon]
Jul 15 09:34:19 vorlon kernel: [ 16.910168] [<ffffffffa00f51b3>] ttm_bo_wait+0x103/0x1c0 [ttm]
Jul 15 09:34:19 vorlon kernel: [ 16.910182] [<ffffffffa02ed927>] radeon_gem_wait_idle_ioctl+0x97/0x140 [radeon]
Jul 15 09:34:19 vorlon kernel: [ 16.910197] [<ffffffffa020035a>] drm_ioctl+0x34a/0x4c0 [drm]
Jul 15 09:34:19 vorlon kernel: [ 16.910200] [<ffffffff811e6de9>] ? ext4_file_write+0x39/0xb0
Jul 15 09:34:19 vorlon kernel: [ 16.910215] [<ffffffffa02ed890>] ? radeon_gem_wait_idle_ioctl+0x0/0x140 [radeon]
Jul 15 09:34:19 vorlon kernel: [ 16.910218] [<ffffffff8116159d>] vfs_ioctl+0x3d/0xd0
Jul 15 09:34:19 vorlon kernel: [ 16.910221] [<ffffffff81161e71>] do_vfs_ioctl+0x81/0x340
Jul 15 09:34:19 vorlon kernel: [ 16.910224] [<...

Andy Whitcroft (apw) on 2010-07-15
tags: added: kernel-candidate kernel-graphics kernel-reviewed
Colin Watson (cjwatson) on 2010-07-15
summary: - Maverick's grub-pc package (1.98+20100710-1ubuntu1) causes unbootable
- system
+ [ATI] Maverick's grub-pc package (1.98+20100710-1ubuntu1) causes
+ unbootable system
summary: - [ATI] Maverick's grub-pc package (1.98+20100710-1ubuntu1) causes
- unbootable system
+ [ATI] GPU lockup with gfxpayload=keep
Colin Watson (cjwatson) wrote :

I'd like to give the kernel guys an opportunity to debug this; we did expect some problems switching from VESA to native, but we were never going to shake them out without giving it a try. I expect it's possible to blacklist cards from gfxpayload=keep with a bit of work, although I'd like to save that for a last resort.

Colin Watson (cjwatson) wrote :

Nate, it's interesting that you're seeing this in a VM. Perhaps you could file a separate bug for that, and extract the syslog from that boot attempt and attach it?

(It may be the same type of issue, but this will probably be driver-specific to some extent, so I'd like to keep instances for different drivers separate.)

andrew thomas (atswartz) wrote :

In addition to comment #4:
When I try to use 'set gfxpayload=keep', after my screen goes bad I can get a terminal with ctrl-alt-f2, log in, and reboot.
On f1 I get "[drm_radeon_cs_ioctl] *ERROR* Failed to schedule IB!" followed by a bunch of similar messages. The log is attached.

andrew thomas (atswartz) wrote :

Would it be a bad idea for me to edit /etc/grub.d/10_linux to change:

  # Use a simple linear framebuffer appropriate to the platform if support
  # for it is known to be built into the kernel. We need it to be built-in
  # rather than modular, as otherwise early output from the kernel won't
  # work.
  if [ "x$GRUB_GFXPAYLOAD_LINUX" = x ]; then
      cat << EOF
    load_video
EOF
      if [ "x$LINUX_CONFIG_FB" != x ] \
      && grep -qx "$LINUX_CONFIG_FB=y" /boot/config-${version} 2> /dev/null \
      && grep -qx "CONFIG_VT_HW_CONSOLE_BINDING=y" /boot/config-${version} 2> /dev/null; then
      cat << EOF

EOF
      fi
  else
      cat << EOF
    set gfxpayload=$GRUB_GFXPAYLOAD_LINUX
EOF
  fi

changing set gfxpayload=keep to set gfxpayload=text?

andrew thomas (atswartz) wrote :

Sorry, I made an error in my cut and paste:

  # Use a simple linear framebuffer appropriate to the platform if support
  # for it is known to be built into the kernel. We need it to be built-in
  # rather than modular, as otherwise early output from the kernel won't
  # work.
  if [ "x$GRUB_GFXPAYLOAD_LINUX" = x ]; then
      cat << EOF
    load_video
EOF
      if [ "x$LINUX_CONFIG_FB" != x ] \
      && grep -qx "$LINUX_CONFIG_FB=y" /boot/config-${version} 2> /dev/null \
      && grep -qx "CONFIG_VT_HW_CONSOLE_BINDING=y" /boot/config-${version} 2> /dev/null; then
      cat << EOF
    set gfxpayload=text # change keep to text?
EOF
      fi
  else
      cat << EOF
    set gfxpayload=$GRUB_GFXPAYLOAD_LINUX
EOF
  fi

It would be far simpler to add this line to /etc/default/grub:

GRUB_GFXPAYLOAD_LINUX=text

Doing that is fine to get things going again for you - it would be good
if you could be available to revert that once the kernel folks have a
candidate fix to test, though.
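
The /etc/default/grub change can also be scripted. This is only a minimal sketch, not anything shipped in the grub2 package: set_gfxpayload is a hypothetical helper name, it assumes GNU sed (for -i) and a simple value such as text or keep, and update-grub still has to be run afterwards to regenerate grub.cfg.

```shell
#!/bin/sh
# Hypothetical helper: set GRUB_GFXPAYLOAD_LINUX in a grub defaults file.
# Usage: set_gfxpayload FILE VALUE
set_gfxpayload() {
    file=$1
    value=$2
    if grep -q '^GRUB_GFXPAYLOAD_LINUX=' "$file"; then
        # An assignment already exists: rewrite it in place.
        sed -i "s/^GRUB_GFXPAYLOAD_LINUX=.*/GRUB_GFXPAYLOAD_LINUX=$value/" "$file"
    else
        # No assignment yet: append one at the end of the file.
        printf 'GRUB_GFXPAYLOAD_LINUX=%s\n' "$value" >> "$file"
    fi
}

# Example (as root):
#   set_gfxpayload /etc/default/grub text && update-grub
```

Calling it again with keep reverts the workaround, which is useful once a candidate kernel fix is ready for testing.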

Nate Muench (Mink) (n-muench) wrote :

Colin,

2 questions:

1) How do I extract the syslog?

2) I assume you want the log with the GRUB that works, not the one that doesn't.

Colin Watson (cjwatson) wrote :

On Thu, Jul 15, 2010 at 07:45:11PM -0000, Nate Muench wrote:
> 1) How do I extract the syslog?

Assuming that it doesn't lock up so badly that it can't write to the
syslog, then you should just be able to boot again with 'set
gfxpayload=text' and then look back in the syslog to the previous boot
attempt.

> 2) I assume you want the log with the GRUB that works, not the one
> that doesn't.

No - logs of things working are boring. Logs of things failing are
interesting.
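
A sketch of looking back through the syslog for a failed boot, under a few assumptions: Ubuntu writes the syslog to /var/log/syslog by default (older entries rotate to /var/log/syslog.1), the failure logs a "GPU lockup" message like the radeon one quoted earlier in this report, and find_gpu_lockups is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print (with line numbers) every line that mentions
# a GPU lockup in the syslog file given as the first argument.
find_gpu_lockups() {
    grep -n 'GPU lockup' "$1"
}

# Example:
#   find_gpu_lockups /var/log/syslog
#   find_gpu_lockups /var/log/syslog.1   # previous, rotated log
```

The line numbers make it easier to cut the surrounding call trace out of the file before attaching it to a bug.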

Nate Muench (Mink) (n-muench) wrote :

One more stupid question:

Where would this syslog be located?

Also, I want to make a note on the VM I'm using. It's running on VMware Workstation 7.1 with my own personal flavor of open-vm-tools (located here: https://launchpad.net/~n-muench/+archive/ppa/+packages), since the packages currently available in Maverick aren't compatible with Maverick. My flavor is the one from Maverick's repos, but with the necessary patches (from this bug report: https://bugs.launchpad.net/ubuntu/+source/open-vm-tools/+bug/598542) to make it work with Maverick.

I also have a forthcoming build of open-vm-tools, which is the newest version (from June) and comes with the Debian config (there have been 2 revisions for this release so far) and the updated patches (to make it work with Maverick).

Dave Gilbert (ubuntu-treblig) wrote :

Also happens for me; upgrade from Lucid - Radeon HD4350 on an ASRock P55M Pro

Dave

Jasper Frumau (jfrumau) wrote :

I have the same issue using Maverick Meerkat and a 2.6.35.x kernel in VmWare Fusion Virtual Box 3.1.0. I have attached the latest 100 syslog lines. I can start up Ubuntu MM using set gfxpayload=text instead of set gfxpayload=keep, but loading is still slow. And of course I would prefer a more permanent fix.

Dan Andreșan (danyer) wrote :

No image on the monitor after booting Maverick with a newly inserted ATI Radeon HD 3450.
I cannot switch to the text consoles.
I'm going to try some workarounds and post back.

tags: added: kj-triage
John Webster (civil-bigpond) wrote :

I appear to have the same problem with my plain vanilla Compaq C700 laptop (Celeron 560 processor, integrated graphics), except for the small oddity that the boot process stalls with two side-by-side static Ubuntu logos displayed in the top half of the screen, rather than having an entirely dark screen.

I can't now be certain which version of the kernel introduced the problem. I know that 2.6.35.6 (the original Alpha 2 version) worked OK, and I know that the problem has persisted through versions -9 and -10. To date, I have worked around the issue simply by booting in recovery mode and manually restarting the X system.

Andy Whitcroft (apw) wrote :

We believe that this may be fixed by updates in the latest Maverick kernel. Could you test that (it should work with lucid userspace) and report back here? Thanks.

tags: removed: kernel-candidate
Michael Bienia (geser) wrote :

I've checked linux-image-2.6.35-11-generic 2.6.35-11.16 and with "set gfxpayload=keep" my box boots again into X. If only bug 605843 would get fixed soon too, then I could use this kernel and gfxpayload=keep.

andrew thomas (atswartz) wrote :

I am able to boot now with the 2.6.35-11 kernel, but I do get a moment of screen corruption when the ubuntu splash is supposed to be displayed (it is not) and then a normal procession to the gdm login screen.

andrew thomas (atswartz) wrote :

I have just installed the new 2.6.35-12.17 kernel. It is now hit and miss as to whether I can boot to a graphical desktop. I sometimes get corruption at shutdown. The weird thing is that if I get corruption at shutdown, it will boot to gdm, yet when the ubuntu logo is displayed at shutdown, the next boot locks and I have to ctl-alt-f2 and restart. I also have updated mesa to 7.9.0+git20100727.

Michael Bienia (geser) wrote :

This here is with 2.6.35-12.17 and "set gfxpayload=keep". At first boot I get some garbage when X starts (on both my monitors) but after a reboot X starts up fine. Here is some kernel log from the first unsuccessful boot:

Jul 30 09:19:48 vorlon kernel: [ 21.819264] ------------[ cut here ]------------
Jul 30 09:19:48 vorlon kernel: [ 21.819288] WARNING: at /build/buildd/linux-2.6.35/drivers/gpu/drm/radeon/radeon_fence.c:235 radeon_fence_wait+0x365/0x3d0 [radeon]()
Jul 30 09:19:48 vorlon kernel: [ 21.819290] Hardware name: M56S-S3
Jul 30 09:19:48 vorlon kernel: [ 21.819292] GPU lockup (waiting for 0x00000003 last fence id 0x00000001)
Jul 30 09:19:48 vorlon kernel: [ 21.819293] Modules linked in: snd_hda_codec_atihdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec radeon snd_hwdep snd_seq_midi snd_pcm ttm snd_rawmidi drm_kms_helper snd_seq_midi_event snd_seq snd_timer snd_seq_device psmouse drm snd serio_raw i2c_algo_bit k8temp i2c_nforce2 soundcore ppdev edac_core edac_mce_amd snd_page_alloc parport_pc lp parport raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx raid1 raid0 multipath ahci firewire_ohci firewire_core crc_itu_t linear forcedeth pata_amd libahci
Jul 30 09:19:48 vorlon kernel: [ 21.819329] Pid: 1086, comm: Xorg Tainted: G D 2.6.35-12-generic #17-Ubuntu
Jul 30 09:19:48 vorlon kernel: [ 21.819331] Call Trace:
Jul 30 09:19:48 vorlon kernel: [ 21.819338] [<ffffffff8105f6ef>] warn_slowpath_common+0x7f/0xc0
Jul 30 09:19:48 vorlon kernel: [ 21.819341] [<ffffffff8105f7e6>] warn_slowpath_fmt+0x46/0x50
Jul 30 09:19:48 vorlon kernel: [ 21.819354] [<ffffffffa0288575>] radeon_fence_wait+0x365/0x3d0 [radeon]
Jul 30 09:19:48 vorlon kernel: [ 21.819358] [<ffffffff8107e4a0>] ? autoremove_wake_function+0x0/0x40
Jul 30 09:19:48 vorlon kernel: [ 21.819370] [<ffffffffa0288d71>] radeon_sync_obj_wait+0x11/0x20 [radeon]
Jul 30 09:19:48 vorlon kernel: [ 21.819378] [<ffffffffa02071a3>] ttm_bo_wait+0x103/0x1c0 [ttm]
Jul 30 09:19:48 vorlon kernel: [ 21.819392] [<ffffffffa029fa27>] radeon_gem_wait_idle_ioctl+0x97/0x140 [radeon]
Jul 30 09:19:48 vorlon kernel: [ 21.819407] [<ffffffffa014d34a>] drm_ioctl+0x34a/0x4c0 [drm]
Jul 30 09:19:48 vorlon kernel: [ 21.819421] [<ffffffffa029f990>] ? radeon_gem_wait_idle_ioctl+0x0/0x140 [radeon]
Jul 30 09:19:48 vorlon kernel: [ 21.819425] [<ffffffff81588ee5>] ? _raw_spin_lock_irq+0x15/0x20
Jul 30 09:19:48 vorlon kernel: [ 21.819430] [<ffffffff810097d1>] ? handle_signal+0x131/0x280
Jul 30 09:19:48 vorlon kernel: [ 21.819432] [<ffffffff810099a1>] ? do_signal+0x81/0x1a0
Jul 30 09:19:48 vorlon kernel: [ 21.819436] [<ffffffff8116150d>] vfs_ioctl+0x3d/0xd0
Jul 30 09:19:48 vorlon kernel: [ 21.819438] [<ffffffff81161de1>] do_vfs_ioctl+0x81/0x340
Jul 30 09:19:48 vorlon kernel: [ 21.819441] [<ffffffff81009da5>] ? sys_rt_sigreturn+0x235/0x250
Jul 30 09:19:48 vorlon kernel: [ 21.819444] [<ffffffff81162121>] sys_ioctl+0x81/0xa0
Jul 30 09:19:48 vorlon kernel: [ 21.819447] [<ffffffff8100a0f2>] system_call_fastpath+0x16/0x1b
Jul 30 09:19:48 vorlon kernel: [ 21.819449] ---[ end trace d3627d239cbe15f9 ]---
Jul 30 09:19:48 vorlo...

Dave Gilbert (ubuntu-treblig) wrote :

This is still happening for me on 2.6.35-12.17 ; just get a black screen, removing the set gfxpayload and I'm back in business.
(Hd4350 on i7)

Dave

Jasper Frumau (jfrumau) wrote :

Still having the same startup problems. Just did the last Maverick Meerkat partial upgrade. Then I tried to reinstall VMware Tools, but it failed. On reboot I got a black screen. Then I restarted again and got the reboot menu. There I was able to adjust gfxpayload=text and boot up the latest kernel again.
FYI:
$ uname -a
Linux ubuntu 2.6.35-15-generic #21-Ubuntu SMP Wed Aug 11 16:41:40 UTC 2010 i686 GNU/Linux
$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=10.10
DISTRIB_CODENAME=maverick
DISTRIB_DESCRIPTION="Ubuntu maverick (development branch)"

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub2 - 1.98+20100804-4ubuntu1

---------------
grub2 (1.98+20100804-4ubuntu1) maverick; urgency=low

  * Resynchronise with Debian. Remaining changes:
    - Adjust for default Ubuntu boot options ("quiet splash").
    - Default to hiding the menu; holding down Shift at boot will show it.
    - Set a monochromatic theme for Ubuntu.
    - Apply Ubuntu GRUB Legacy changes to legacy update-grub script: title,
      recovery mode, quiet option, tweak how memtest86+ is displayed, and
      use UUIDs where appropriate.
    - Fix backslash-escaping in merge_debconf_into_conf.
    - Remove "GNU/Linux" from default distributor string.
    - Add crashkernel= options if kdump and makedumpfile are available.
    - If other operating systems are installed, then automatically unhide
      the menu. Otherwise, if GRUB_HIDDEN_TIMEOUT is 0, then use keystatus
      if available to check whether Shift is pressed. If it is, show the
      menu, otherwise boot immediately. If keystatus is not available, then
      fall back to a short delay interruptible with Escape.
    - Allow Shift to interrupt 'sleep --interruptible'.
    - Don't display introductory message about line editing unless we're
      actually offering a shell prompt. Don't clear the screen just before
      booting if we never drew the menu in the first place.
    - Remove some verbose messages printed before reading the configuration
      file.
    - Suppress progress messages as the kernel and initrd load for
      non-recovery kernel menu entries.
    - Change prepare_grub_to_access_device to handle filesystems
      loop-mounted on file images.
    - Ignore devices loop-mounted from files in 10_linux.
    - Show the boot menu if the previous boot failed, that is if it failed
      to get to the end of one of the normal runlevels.
    - Handle RAID devices containing virtio components.
    - Don't generate /boot/grub/device.map during grub-install or
      grub-mkconfig by default.
    - Adjust upgrade version checks for Ubuntu.
    - Don't display "GRUB loading" unless Shift is held down.
    - Adjust versions of grub-doc and grub-legacy-doc conflicts to tolerate
      our backport of the grub-doc split.
    - Fix LVM/RAID probing in the absence of /boot/grub/device.map.
    - Look for .mo files in /usr/share/locale-langpack as well, in
      preference.
    - Make sure GRUB_TIMEOUT isn't quoted unnecessarily.
    - Probe all devices in 'grub-probe --target=drive' if
      /boot/grub/device.map is missing.
    - Adjust hostdisk id for hard disks, allowing grub-setup to use its
      standard workaround for broken BIOSes.
    - Build-depend on qemu-kvm rather than qemu-system for grub-pc tests.
    - Use qemu rather than qemu-system-i386.
    - Extend the EFI version of grub-install to be able to install into an
      EFI System Partition mounted on /boot/efi in a location that complies
      with the EFI specification.
    - Upgrade the installed core image when upgrading grub-efi-ia32 or
      grub-efi-amd64, although only if /boot/efi/EFI/ubuntu already exists.
    - Make grub-efi-ia32 and grub-efi-amd64 depend on efibootmgr so that
      grub-install...

Changed in grub2 (Ubuntu):
status: New → Fix Released

I'm closing the Maverick linux kernel task for now as grub2 switched back to default to gfxpayload=text for Maverick. However, we do want to keep the linux task open for Natty. Thanks.

Changed in linux (Ubuntu Maverick):
status: New → Won't Fix
tags: added: kernel-key-gfxpayload

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 605614

and then change the status of the bug back to 'New'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete