libdrm-nouveau2 crashes X with kernel 3.13.0-58

Bug #1477801 reported by Daniel Barrett
This bug affects 2 people
Affects: Status, Importance, Assigned to
Nouveau Xorg driver: Fix Released, Critical
libdrm (Debian): Fix Released, Unknown
libdrm (Ubuntu): Confirmed, Undecided, Unassigned

Bug Description

(Apologies for not using ubuntu-bug, but I've been waiting an hour for it to get past "Collecting problem information".)

I upgraded my system to kernel 3.16.0-43.58~14.04.1 (amd64) today. X now crashes with the error "nouveau - gpu lockup" whenever I run Google Chrome (google-chrome-stable 44.0.2403.89-1). Big black boxes appear when I open a bunch of tabs and hover the mouse over them. Eventually X crashes, freezing the whole screen with garbage all over it, and syslog contains the "nouveau" errors below.

The problem went away when I downgraded libdrm2 and libdrm-nouveau2 to 2.4.56-1~ubuntu2 (was 2.4.60-2~ubuntu14.04.1). I did this after reading https://bugs.freedesktop.org/show_bug.cgi?id=89842#c19 (see comment #19).

Here's syslog. The "nouveau gpu lockup" error doesn't appear, but it did appear onscreen.

Jul 23 19:22:35 myhost kernel: [ 2568.498288] nouveau E[chrome[7073]] multiple instances of buffer 322 on validation list
Jul 23 19:22:35 myhost kernel: [ 2568.498297] nouveau E[chrome[7073]] validate_init
Jul 23 19:22:35 myhost kernel: [ 2568.498299] nouveau E[chrome[7073]] validate: -22
Jul 23 19:22:35 myhost kernel: [ 2568.514019] nouveau E[ PFIFO][0000:01:00.0] PFIFO: read fault at 0x0008101000 [PAGE_NOT_PRESENT] from (unknown enum 0x00000000)/GPC0/(unknown enum 0x0000000f) on channel 0x007f9af000 [unknown]
Jul 23 19:23:06 myhost kernel: [ 2598.744949] nouveau E[ DRM] GPU lockup - switching to software fbcon
Jul 23 19:23:23 myhost kernel: [ 2616.347931] nouveau E[Xorg[1550]] failed to idle channel 0xcccc0001 [Xorg[1550]]
Jul 23 19:23:38 myhost kernel: [ 2631.335533] nouveau E[Xorg[1550]] failed to idle channel 0xcccc0001 [Xorg[1550]]
Jul 23 19:23:40 myhost kernel: [ 2633.336751] nouveau E[ PFIFO][0000:01:00.0] playlist 0 update timeout
Jul 23 19:23:43 myhost kernel: [ 2635.629793] nouveau W[ PFIFO][0000:01:00.0] unknown status 0x00000100
Jul 23 19:23:47 myhost kernel: [ 2639.921155] nouveau W[ PFIFO][0000:01:00.0] unknown status 0x00000100
Jul 23 19:23:50 myhost vmnet-dhcpd: DHCPINFORM for 192.168.142.131 from 00:0c:29:d5:2d:15 via vmnet8
Jul 23 19:23:50 myhost vmnet-dhcpd: DHCPACK on 192.168.142.131 to 00:0c:29:d5:2d:15 via vmnet8
Jul 23 19:23:51 myhost kernel: [ 2644.212519] nouveau W[ PFIFO][0000:01:00.0] unknown status 0x00000100
Jul 23 19:23:55 myhost kernel: [ 2648.321472] nouveau E[Xorg[1550]] failed to idle channel 0xcccc0000 [Xorg[1550]]
Jul 23 19:23:56 myhost kernel: [ 2648.503871] nouveau W[ PFIFO][0000:01:00.0] unknown status 0x00000100
Jul 23 19:24:00 myhost kernel: [ 2652.795221] nouveau W[ PFIFO][0000:01:00.0] unknown status 0x00000100
Jul 23 19:24:04 myhost kernel: [ 2657.086572] nouveau W[ PFIFO][0000:01:00.0] unknown status 0x00000100
Jul 23 19:24:08 myhost kernel: [ 2661.377922] nouveau W[ PFIFO][0000:01:00.0] unknown status 0x00000100
Jul 23 19:24:10 myhost kernel: [ 2663.309074] nouveau E[Xorg[1550]] failed to idle channel 0xcccc0000 [Xorg[1550]]
Jul 23 19:24:12 myhost kernel: [ 2665.310067] nouveau E[ PFIFO][0000:01:00.0] playlist 0 update timeout
Jul 23 19:24:12 myhost colord: Automatic remove of icc-ed0e29bb4d99e8caee0ed705188568cc from xrandr-Dell Inc.-DELL 2405FPW-T61335980T0S
Jul 23 19:24:12 myhost colord: Profile removed: icc-ed0e29bb4d99e8caee0ed705188568cc
Jul 23 19:24:12 myhost colord: Profile removed: icc-cc453361e0e5fe47e15ec698dbee0254
Jul 23 19:24:12 myhost colord: device removed: xrandr-Dell Inc.-DELL 2405FPW-T61335980T0S
Jul 23 19:24:13 myhost kernel: [ 2665.669272] nouveau W[ PFIFO][0000:01:00.0] unknown status 0x00000100
Jul 23 19:24:17 myhost kernel: [ 2669.960633] nouveau W[ PFIFO][0000:01:00.0] unknown status 0x00000100
Jul 23 19:24:21 myhost kernel: [ 2674.252007] nouveau W[ PFIFO][0000:01:00.0] unknown status 0x00000100
Jul 23 19:24:26 myhost kernel: [ 2678.543381] nouveau W[ PFIFO][0000:01:00.0] unknown status 0x00000100
Jul 23 19:24:27 myhost kernel: [ 2680.434907] nouveau E[chrome[7073]] failed to idle channel 0xcccc0000 [chrome[7073]]
Jul 23 19:24:30 myhost kernel: [ 2682.834756] nouveau W[ PFIFO][0000:01:00.0] unknown status 0x00000100
Jul 23 19:24:34 myhost kernel: [ 2687.126132] nouveau W[ PFIFO][0000:01:00.0] unknown status 0x00000100
Jul 23 19:24:38 myhost kernel: [ 2691.417509] nouveau W[ PFIFO][0000:01:00.0] unknown status 0x00000100
Jul 23 19:24:42 myhost kernel: [ 2695.422508] nouveau E[chrome[7073]] failed to idle channel 0xcccc0000 [chrome[7073]]
Jul 23 19:24:43 myhost kernel: [ 2695.708885] nouveau W[ PFIFO][0000:01:00.0] unknown status 0x00000100
Jul 23 19:24:44 myhost kernel: [ 2697.420975] nouveau E[ PFIFO][0000:01:00.0] channel 4 [chrome[7073]] kick timeout
Jul 23 19:24:44 myhost kernel: [ 2697.423943] nouveau W[ PFIFO][0000:01:00.0] unknown status 0x00000100
Jul 23 19:24:46 myhost kernel: [ 2699.422229] nouveau E[ PFIFO][0000:01:00.0] playlist 0 update timeout
Jul 23 19:24:46 myhost kernel: [ 2699.422279] nouveau ![ PFIFO][0000:01:00.0] unhandled status 0x00000001
Jul 23 19:24:49 myhost kernel: [ 2701.560679] nouveau E[ PFIFO][0000:01:00.0] playlist 0 update timeout
Jul 23 19:24:51 myhost kernel: [ 2703.576392] nouveau E[ PFIFO][0000:01:00.0] playlist 0 update timeout
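
The first line of the log above ("multiple instances of buffer ... on validation list") is the same message quoted in the upstream comments that follow. A minimal sketch for picking that message out of a saved syslog or journal dump is shown here; the function name and the regular-expression group names are labels invented for this example, and the pattern is only assumed to match the message format as it appears in this report.

```python
import re

# Pattern for the kernel message quoted in this report. The group names are
# labels chosen for this sketch, not nouveau terminology.
DUP_BUFFER = re.compile(
    r"nouveau E\[(?P<proc>[^\[\]]+)\[(?P<pid>\d+)\]\] "
    r"multiple instances of buffer (?P<buffer>\d+) on validation list"
)


def find_duplicate_buffer_errors(lines):
    """Return a (process, pid, buffer) tuple for each matching log line."""
    hits = []
    for line in lines:
        match = DUP_BUFFER.search(line)
        if match:
            hits.append((match["proc"], int(match["pid"]), int(match["buffer"])))
    return hits


sample = [
    "Jul 23 19:22:35 myhost kernel: [ 2568.498288] nouveau E[chrome[7073]] "
    "multiple instances of buffer 322 on validation list",
    "Jul 23 19:22:35 myhost kernel: [ 2568.498299] nouveau E[chrome[7073]] validate: -22",
]
print(find_duplicate_buffer_errors(sample))
```

To scan a real file, the same function can be given an open file object, since it only iterates over lines.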

Revision history for this message
In , Óscar García Amor (oscar-garcia-amor) wrote :

Created attachment 114766
Nouveau with Gnome 3.16.0 crash log

With the nouveau driver, playing with the new legacy tray in GNOME 3.16.0 (opening it, closing it, and reopening it) hangs the entire system, and I have to restart GNOME Shell and GDM.

Card: GeForce GTS 250

Versions:
Kernel 3.19.3-1-ARCH
xf86-video-nouveau 1.0.11-3
mesa 10.5.2-1

I have attached the journal log.

Note: I opened a related bug in GNOME: https://bugzilla.gnome.org/show_bug.cgi?id=747115 They advised me to open a bug here too.

In , grahams1 (gps1539) wrote :

Created attachment 114831
Attaching journalctl output after gnome 3.16 freeze. Freeze happened @ ~ 23:06

In , grahams1 (gps1539) wrote :

The gnome team thinks I may be hitting the same issue

NVIDIA Corporation GF119 [GeForce GT 610] (rev a1)

Versions:
Kernel 3.19.3-1-ARCH
xf86-video-nouveau 1.0.11-3
mesa 10.5.2-1

Attached my journal log

In , Ilia Mirkin (imirkin) wrote :

Somehow gnome-shell is able to convince nouveau to do something very dumb. I didn't even think this was possible... libdrm is supposed to de-dup these, no?

nouveau E[gnome-shell[1773]] multiple instances of buffer 215 on validation list
nouveau E[gnome-shell[1773]] validate_init
nouveau E[gnome-shell[1773]] validate: -22
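
For readers unfamiliar with the messages above: -22 is -EINVAL on Linux, so the submission was rejected as invalid because the same buffer appeared more than once in one list. The toy model below only illustrates that idea and what "de-dup" means here; it is not libdrm or nouveau code, and both function names are hypothetical.

```python
import errno


def validate(buffer_handles):
    """Toy stand-in for the kernel-side check: reject repeated handles."""
    if len(set(buffer_handles)) != len(buffer_handles):
        return -errno.EINVAL  # the value reported as "validate: -22"
    return 0


def dedup(buffer_handles):
    """Drop repeated handles while keeping first-seen order."""
    return list(dict.fromkeys(buffer_handles))


submitted = [215, 101, 215]        # buffer 215 listed twice
print(validate(submitted))         # rejected with -EINVAL
print(validate(dedup(submitted)))  # accepted once duplicates are removed
```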

In , Arjen-x (arjen-x) wrote :

I'm also getting the same errors since April 1st:

Apr 01 11:07:08 arjen-imac.office.react.nl kernel: nouveau E[gnome-shell[4997]] multiple instances of buffer 228 on validation list
Apr 01 11:36:47 arjen-imac.office.react.nl kernel: nouveau E[gnome-shell[905]] multiple instances of buffer 255 on validation list
Apr 01 13:51:37 arjen-imac.office.react.nl kernel: nouveau E[gnome-shell[2939]] multiple instances of buffer 146 on validation list
Apr 02 12:20:26 arjen-imac.office.react.nl kernel: nouveau E[gnome-shell[2939]] multiple instances of buffer 415 on validation list
Apr 02 17:00:46 arjen-imac.office.react.nl kernel: nouveau E[gnome-shell[895]] multiple instances of buffer 327 on validation list

Just before the first crash I upgraded mesa:

[2015-04-01 09:55] [ALPM] upgraded mesa (10.5.1-2 -> 10.5.2-1)

Versions:
Kernel 3.19.2-1-ARCH
xf86-video-nouveau 1.0.11-3
mesa 10.5.2-1
Gnome 3.14.2

So I think this is mesa related, and not related to Gnome 3.16.

In , Ilia Mirkin (imirkin) wrote :

I'm guessing all you guys have libdrm-2.4.60 -- can you try downgrading to libdrm-2.4.59?

In , Renato Garcia (renatao-garcia) wrote :

I have the same issue and can confirm that downgrading
from libdrm-2.4.60 to libdrm-2.4.59 seems to stop the issue from
happening as there are no more hangs.

Thanks,
Rennie

In , Sean Bogie (spbogie) wrote :

git bisect puts the first bad commit @
commit 5ea6f1c32628887c9df0c53bc8c199eb12633fec
Author: Maarten Lankhorst <email address hidden>
Date: Thu Feb 26 11:54:03 2015 +0100

    nouveau: make nouveau importing global buffers completely thread-safe, with tests
...

ArchLinux bug report (https://bugs.archlinux.org/task/44680) suggests an additional reproduction method "when I move my mouse over VLC's seekbar and it shows a small tooltip to show the time gnome-shell freezes".

In , Ilia Mirkin (imirkin) wrote :

My favourite is "run mplayer with vdpau, then move the window". I arrived at that one by accident, but that repros it 100%. No compositors or anything like that.

In , Ilia Mirkin (imirkin) wrote :

*** Bug 90201 has been marked as a duplicate of this bug. ***

In , Skeggsb (skeggsb) wrote :

I just pushed a commit[1] to libdrm which should fix this issue.

[1] http://cgit.freedesktop.org/mesa/drm/commit/?id=812e8fe6ce46d733c30207ee26c788c61f546294

In , Ilia Mirkin (imirkin) wrote :

(In reply to Ben Skeggs from comment #10)
> I just pushed a commit[1] to libdrm which should fix this issue.
>
> [1] http://cgit.freedesktop.org/mesa/drm/commit/?id=812e8fe6ce46d733c30207ee26c788c61f546294

I can confirm that this fixes my repro case (move mplayer vdpau window around). I knew it was something relating to named bo's, so good to see that the fix also involved those.

In , Joev-8450 (joev-8450) wrote :

Created attachment 115697
journalctl events when gnome-shell freezes

Gnome version is 3.14.4-2-fc21. I had reported this event to the gnome team (#749128); they referred me here. I had initiated a download in Firefox when this freeze occurred, but I have experienced it in other applications.

In , Joev-8450 (joev-8450) wrote :

In closer review of the thread above I checked on downgrading libdrm from 2.4.60. In my installation yum tells me I need to also downgrade libdrm-devel and apparently the downgrade version is 2.4.58 rather than 2.4.59. Is that what you recommend?

In , Matthew Miller (mattdm) wrote :

(In reply to Joe Verreau from comment #13)
> In closer review of the thread above I checked on downgrading libdrm from
> 2.4.60. In my installation yum tells me I need to also downgrade
> libdrm-devel and apparently the downgrade version is 2.4.58 rather than
> 2.4.59. Is that what you recommend?

Joe, are you on the Fedora 22 beta? If so, an update will be going out soon. You can get it immediately from https://admin.fedoraproject.org/updates/FEDORA-2015-7930/libdrm-2.4.61-3.fc22

This is version 2.4.61, which fixes the regression. If you're not on F22, I expect updates will be coming soon. In the meantime, go ahead and downgrade to whatever works.

In , Joev-8450 (joev-8450) wrote :

Matthew, actually I'm on Fedora 21, so I will downgrade libdrm and libdrm-devel to 2.4.58 and await the update to 2.4.61 in the normal distribution. Thanks.

In , Richard Barlow (richardpbarlow) wrote :

I too am experiencing the issue described in this ticket. It is affecting 6 machines all running Fedora 21. They generally hang around 2-3 times a day. Is there going to be an update to 2.4.61 pushed for Fedora 21 at some point?

In , Victor Porton (porton) wrote :

I have a similar bug with

Debian Linux "testing" ("stretch")
GeForce 8400 GS Rev. 3
Linux 3.10-2-amd64
xserver-xorg-video-nouveau 1:1.0.11-1+b1
libdrm-nouveau1a 2.4.40-1~deb7u2
libdrm-nouveau2 2.4.60-3
libgl1-mesa-dri 10.5.7-1
libgl1-mesa-glx 10.5.7-1
Gnome 3.14.0-1

In , Victor Porton (porton) wrote :

Created attachment 116767
when manipulating Gnome tray

See the error log produced by journalctl when manipulating Gnome tray.

In , Ilia Mirkin (imirkin) wrote :

This is fixed by not using libdrm 2.4.60 which was a buggy release on the nouveau end. libdrm 2.4.59 or libdrm 2.4.61 should work fine.
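
As a rough illustration of the advice above, the sketch below checks whether a distribution package version string is built from the upstream 2.4.60 release. The helper names are hypothetical, and the parsing assumes the package version starts with the upstream x.y.z number (optionally after an epoch such as "1:"), as the versions quoted in this thread do. A match only marks a package as worth checking, since a distribution may have patched its 2.4.60 build.

```python
import re

BAD_UPSTREAM = (2, 4, 60)  # the release named as buggy in this thread


def upstream_version(package_version):
    """Extract the leading x.y.z from a distro package version string."""
    match = re.match(r"(?:\d+:)?(\d+)\.(\d+)\.(\d+)", package_version)
    if match is None:
        raise ValueError(f"unrecognised version: {package_version!r}")
    return tuple(int(part) for part in match.groups())


def is_suspect(package_version):
    """True when the package is based on the upstream release reported as buggy."""
    return upstream_version(package_version) == BAD_UPSTREAM


for version in ("2.4.60-2~ubuntu14.04.1", "2.4.56-1~ubuntu2", "2.4.61-3.fc22"):
    print(version, is_suspect(version))
```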

Daniel Barrett (dbarrett-m) wrote :
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in libdrm (Ubuntu):
status: New → Confirmed
Peter Hurley (phurley) wrote :

libdrm-nouveau2 2.4.60 is broken.

Fix is here https://bugs.freedesktop.org/show_bug.cgi?id=89842#c10

Daniel Barrett (dbarrett-m) wrote :

Thank you Peter. Do you know if the 2.4.61 fix can be applied to Ubuntu 14.04 LTS officially? This is one deadly bug.

Peter Hurley (phurley) wrote :

Haven't tried yet; I will tonight.

Peter Hurley (phurley) wrote :

Yep, that bug fix works. I applied it to 2.4.60-2~ubuntu14.04.1, which is the broken version from trusty-updates, and confirm it fixes the observed problem in chrome + libdrm-nouveau2.

I pushed the repackage to PPA @ ppa:phurley/libdrm (or just downgrade to libdrm=2.4.56-1~ubuntu2).

Changed in nouveau:
importance: Unknown → Critical
status: Unknown → Fix Released
Daniel Barrett (dbarrett-m) wrote :

Thank you Peter!!!

I am not familiar with the migration path from a PPA to the official Ubuntu release. Is your fix likely to become part of official 14.04.1 LTS, and if so, how long does that usually take?

Changed in libdrm (Debian):
status: Unknown → Fix Released
In , Joev-8450 (joev-8450) wrote :

... at this point I'm inferring there will not be upgraded versions of libdrm and libdrm-devel for fc21? I did downgrade my laptop from 2.4.60 to 2.4.58 in June as noted below. I'm guessing the fix really is to go to fc22. I ask because now my desktop is also experiencing these freeze-ups, though not at the frequency that others have reported.

Witold Szczeponik (wsz) wrote :

I get the same crash with Ubuntu 14.04.3 LTS and "libdrm" from ubuntu-updates. Any chances of progressing the patched version from https://bugs.launchpad.net/ubuntu/+source/libdrm/+bug/1477801/comments/6 to the repositories?
