10de:0867 [iMac9,1] Module nouveau systematically crashes after a few minutes, whereas module nvidia from nvidia-331-updates crashes only after several hours of work

Bug #1396840 reported by Etienne URBAH
This bug affects 3 people
Affects: linux (Ubuntu)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned

Bug Description

Module 'nouveau' systematically crashes after a few minutes, whereas module 'nvidia' from 'nvidia-331-updates' crashes only after several hours of work.

$ uname -r
3.16.0-25-generic

$ dpkg-query -W 'linux-*image-*3.16.0-25*'
linux-image-3.16.0-25-generic 3.16.0-25.33
linux-image-extra-3.16.0-25-generic 3.16.0-25.33
linux-signed-image-3.16.0-25-generic 3.16.0-25.33

$ lsmod | head -1 ; lsmod | grep '^nouveau'
Module Size Used by
nouveau 1234845 3

$ sudo lspci -nn -vv -s 03:00.0
03:00.0 VGA compatible controller [0300]: NVIDIA Corporation C79 [GeForce 9400] [10de:0867] (rev b1) (prog-if 00 [VGA controller])
 Subsystem: Apple Inc. iMac 9,1 [106b:00ad]
 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
 Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
 Latency: 0, Cache Line Size: 256 bytes
 Interrupt: pin A routed to IRQ 45
 Region 0: Memory at d2000000 (32-bit, non-prefetchable) [size=16M]
 Region 1: Memory at c0000000 (64-bit, prefetchable) [size=256M]
 Region 3: Memory at d0000000 (64-bit, prefetchable) [size=32M]
 Region 5: I/O ports at 1000 [size=128]
 Expansion ROM at d3000000 [disabled] [size=128K]
 Capabilities: [60] Power Management version 2
  Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
  Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
 Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
  Address: 00000000fee0300c Data: 4152
 Kernel driver in use: nouveau

ProblemType: Bug
DistroRelease: Ubuntu 14.10
Package: linux-image-extra-3.16.0-25-generic 3.16.0-25.33
ProcVersionSignature: Ubuntu 3.16.0-25.33-generic 3.16.7
Uname: Linux 3.16.0-25-generic x86_64
NonfreeKernelModules: wl
ApportVersion: 2.14.7-0ubuntu8
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: urbah 3073 F.... pulseaudio
CurrentDesktop: GNOME
Date: Thu Nov 27 02:05:32 2014
HibernationDevice: RESUME=UUID=d78e46a2-825e-4ad0-8f14-c7d2935c33c7
InstallationDate: Installed on 2014-11-03 (23 days ago)
InstallationMedia: Ubuntu-GNOME 14.10 "Utopic Unicorn" - Release amd64 (20141022.1)
MachineType: Apple Inc. iMac9,1
ProcFB: 0 nouveaufb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.16.0-25-generic.efi.signed root=UUID=b587542c-4076-491a-ae4d-21942b20daf6 ro quiet splash vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-3.16.0-25-generic N/A
 linux-backports-modules-3.16.0-25-generic N/A
 linux-firmware 1.138
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 04/27/09
dmi.bios.vendor: Apple Inc.
dmi.bios.version: IM91.88Z.008D.B08.0904271717
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.name: Mac-F2218FC8
dmi.board.vendor: Apple Inc.
dmi.chassis.asset.tag: Asset Tag#
dmi.chassis.type: 13
dmi.chassis.vendor: Apple Inc.
dmi.chassis.version: Mac-F2218FC8
dmi.modalias: dmi:bvnAppleInc.:bvrIM91.88Z.008D.B08.0904271717:bd04/27/09:svnAppleInc.:pniMac9,1:pvr1.0:rvnAppleInc.:rnMac-F2218FC8:rvr:cvnAppleInc.:ct13:cvrMac-F2218FC8:
dmi.product.name: iMac9,1
dmi.product.version: 1.0
dmi.sys.vendor: Apple Inc.

Etienne URBAH (eurbah) wrote :
Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
penalvch (penalvch) wrote : Re: Module nouveau systematically crashes after a few minutes, whereas module nvidia from nvidia-331-updates crashes only after several hours of work

Etienne URBAH, thank you for reporting this and helping make Ubuntu better. Could you please test the latest upstream kernel, listed on the very top line of the page (the release names are irrelevant for testing, and please do not test the daily folder), following https://wiki.ubuntu.com/KernelMainlineBuilds ? It will allow additional upstream developers to examine the issue.

If you were unable to test for the issue (e.g. you couldn't boot into the OS), please comment in your report about this, and continue with the next most recent kernel version until you can test for the issue. Once you've tested the upstream kernel, please comment on which kernel version specifically you tested. If this bug is fixed in the mainline kernel, please add the following tags:
kernel-fixed-upstream
kernel-fixed-upstream-VERSION-NUMBER

where VERSION-NUMBER is the exact version number of the kernel you tested, for example:
kernel-fixed-upstream-3.18-rc6

This can be done by clicking on the yellow circle with a black pencil icon next to the word Tags located at the bottom of the bug description.

If the mainline kernel does not fix this bug, please add the following tags:
kernel-bug-exists-upstream
kernel-bug-exists-upstream-VERSION-NUMBER

Once testing of the upstream kernel is complete, please mark this bug's Status as Confirmed. Please let us know your results. Thank you for your understanding.

Changed in linux (Ubuntu):
importance: Undecided → High
status: Confirmed → Incomplete
Etienne URBAH (eurbah) wrote :

This bug is about the 'nouveau' module, which is usually contained in the linux-image-EXTRA-.. package.

Following https://wiki.ubuntu.com/KernelMainlineBuilds :

I went to the http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.18-rc6-vivid/ folder, which contains the 3 following DEB files :
linux-headers-3.18.0-031800rc6-generic_3.18.0-031800rc6.201411231935_amd64.deb
linux-headers-3.18.0-031800rc6_3.18.0-031800rc6.201411231935_all.deb
linux-image-3.18.0-031800rc6-generic_3.18.0-031800rc6.201411231935_amd64.deb

But this folder does NOT contain the linux-image-EXTRA-3.18.0-031800rc6-generic_3.18.0-031800rc6.201411231935_amd64.deb file which should contain the 'nouveau' module.

Does the linux-image-EXTRA-3.18.0-031800rc6-generic_3.18.0-031800rc6.201411231935_amd64.deb file exist ?
- If yes, where can I find it ?
- If not, where can I find a debian package containing the latest 'nouveau' module ?

For the record, I downloaded and installed the 3 existing DEB files, then rebooted, and not surprisingly, the graphical interface stayed entirely black. I could reboot using tty1.

Again, this bug is about the 'nouveau' module, which is usually contained in the linux-image-EXTRA-.. package.

penalvch (penalvch) wrote :

Etienne URBAH, as can be verified by opening the mainline kernel package with Archive Manager (e.g. /lib/modules/3.18.0-031800rc6-generic/kernel/drivers/gpu/drm/nouveau/ ), testing nouveau does not require the extra file. The extra file is for modules built outside the mainline kernel proper (e.g. proprietary or out-of-tree drivers maintained downstream like nvidia, fglrx, etc.).
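
The same check can be done from a terminal instead of Archive Manager. This is only a minimal sketch: `dpkg-deb -c` lists the files inside a .deb without installing it, and the helper name `has_nouveau` is hypothetical. The path pattern assumes the module sits under the directory quoted above.

```shell
# Hypothetical helper: reads a package file listing (as printed by
# `dpkg-deb -c`) on stdin and succeeds if it mentions the nouveau module.
has_nouveau() {
    grep -q '/kernel/drivers/gpu/drm/nouveau/nouveau\.ko'
}

# Usage sketch (not executed here; needs the downloaded package):
#   dpkg-deb -c linux-image-3.18.0-031800rc6-generic_3.18.0-031800rc6.201411231935_amd64.deb \
#       | has_nouveau && echo "nouveau is included in the main image package"
```

If the helper succeeds, the main image package already carries the module and no separate extra package is needed for the test.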

As is described in the article, did you first uninstall the nvidia driver before rebooting into the mainline kernel? If you didn't, that is one reason why booting into the mainline kernel would boot to a black screen.

Etienne URBAH (eurbah) wrote :

Many thanks to Christopher M. Penalver (penalvch) for his explanations and his patience with me.

After uninstallation of the 'nvidia-331-updates' package, the installation of linux-image-3.18.0-031800rc6-generic_3.18.0-031800rc6.201411231935_amd64.deb succeeded (except for the 'wl' driver, which is NOT critical for me).

After reboot :

- The graphical interface nicely displayed the GNOME login screen with all the registered users.

- On tty1, I verified that the running kernel really is 'linux 3.18.0 rc6', and that it contains the 'nouveau' module :

   $ lsmod | head -1 ; lsmod | grep '^nouveau'
   Module Size Used by
   nouveau 1365969 3

- But graphical login systematically failed, showing just a black screen.

- I could reboot using tty1.

So this bug is NOT fixed upstream yet.

tags: added: kernel-bug-exists-upstream kernel-bug-exists-upstream-3.18-rc6
penalvch (penalvch) wrote :

Etienne URBAH, does the kernel boot parameter nomodeset provide a WORKAROUND to this issue using nouveau (not nvidia)?

tags: added: needs-crash-log
Etienne URBAH (eurbah)
tags: added: kernel-bug-exists-upstream-3.18-rc7
Etienne URBAH (eurbah) wrote :

I just installed 'linux 3.18.0 rc7' :

- After regular boot, graphical login systematically fails.

- After a boot in 'recovery mode' (which includes the 'nomodeset' parameter), graphical login systematically succeeds, but Cinnamon complains that the graphical driver lacks support for hardware acceleration, and Cinnamon works in software rendering mode.

- After having replaced 'quiet splash' by 'nomodeset' in the 'regular boot' menu entry of 'grub.cfg' and rebooted, the behavior is exactly the same as in 'recovery mode'.
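
The edit above was made directly in 'grub.cfg', which is regenerated on kernel updates. A more persistent variant, sketched here under assumptions, changes the defaults file instead: the path /etc/default/grub, the variable GRUB_CMDLINE_LINUX_DEFAULT, and the `update-grub` command are the usual Ubuntu ones, and the helper name `set_nomodeset` is hypothetical.

```shell
# Hypothetical helper: in the given GRUB defaults file, replace
# 'quiet splash' with 'nomodeset' on the GRUB_CMDLINE_LINUX_DEFAULT line.
# Other lines are left untouched.
set_nomodeset() {
    sed 's/^\(GRUB_CMDLINE_LINUX_DEFAULT="\)quiet splash"/\1nomodeset"/' "$1" > "$1.new" &&
        mv "$1.new" "$1"
}

# Usage sketch (not executed here; run from a root shell):
#   cp /etc/default/grub /etc/default/grub.bak
#   set_nomodeset /etc/default/grub
#   update-grub
```

If the line does not read exactly "quiet splash", the helper changes nothing, so it is worth inspecting the file afterwards.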

I see that you have added a tag requesting a crash log.
I am willing to provide it.
Can you explain where I can find a crash log, or how I can trigger its generation ?

penalvch (penalvch) wrote :

Etienne URBAH, one may gather a system crash log via https://help.ubuntu.com/community/DebuggingSystemCrash . It needs to contain the wording "call trace" (for a kernel crash) or "backtrace" (for an xorg crash).
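
To check whether a collected log contains either marker, a case-insensitive search is usually enough. This is a minimal sketch, and the helper name `find_traces` is hypothetical. The log paths in the usage comment are the common Ubuntu locations and are an assumption here.

```shell
# Hypothetical helper: print every line mentioning "call trace" or
# "backtrace" (any letter case), prefixed with its line number.
find_traces() {
    grep -n -i -E 'call trace|backtrace' "$@"
}

# Usage sketch (not executed here):
#   find_traces /var/log/kern.log
#   find_traces /var/log/Xorg.0.log
```

The reported line numbers make it easy to point at the relevant part of a long log.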

Etienne URBAH (eurbah) wrote :
Etienne URBAH (eurbah) wrote :

Thanks to remote debugging through SSH, the attached 'kern.log' hopefully contains useful information for 2 separate X freezes on graphical login with module 'nouveau' from linux-3.18.0-rc7 :

- From line 2415 beginning with 'Dec 7 16:23:24'
to line 12681 beginning with 'Dec 7 16:35:06'.

- From line 15671 beginning with 'Dec 7 23:47:53'
to line 28115 beginning with 'Dec 8 00:11:56'
(I set LogLevel to 1 at line 23806 beginning with 'Dec 8 00:11:28').
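
The line ranges listed above can be cut out of the log for separate reading. This is a minimal sketch using `sed -n` with a line range. The helper name `extract_lines` and the output file names are hypothetical.

```shell
# Hypothetical helper: print lines $1 through $2 (inclusive) of file $3.
extract_lines() {
    sed -n "${1},${2}p" "$3"
}

# Usage sketch (not executed here; needs the attached kern.log):
#   extract_lines 2415 12681 kern.log > freeze-1.log
#   extract_lines 15671 28115 kern.log > freeze-2.log
```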

I have already attached the corresponding 'Xorg.0.log'.

penalvch (penalvch) wrote :

Etienne URBAH, the issue you are reporting is an upstream one. Could you please report this problem through the appropriate channel by following the instructions _verbatim_ at http://nouveau.freedesktop.org/wiki/Bugs/ ?

Please provide a direct URL to your bug report once you have made it so that it may be tracked.

Thank you for your understanding.

tags: removed: needs-crash-log
Changed in linux (Ubuntu):
status: Incomplete → Triaged
summary: - Module nouveau systematically crashes after a few minutes, whereas
- module nvidia from nvidia-331-updates crashes only after several hours
- of work
+ 10de:0867 [iMac9,1] Module nouveau systematically crashes after a few
+ minutes, whereas module nvidia from nvidia-331-updates crashes only
+ after several hours of work
Etienne URBAH (eurbah) wrote :

Yes, my 2 previous attachments were related to an upstream Linux kernel.

But this time, my attachments relate to the CURRENT Linux kernel 3.16.0-28.38.
I hope that they will be useful.

Etienne URBAH (eurbah) wrote :

The end of the attached 'kern.log' covers an X bug with the 'nouveau' module from Linux kernel 3.16.0-28.38 :

- Line 4253 beginning with 'Dec 22 04:13:22' is the start of the test with the 'nouveau' module from Linux kernel 3.16.0-28.38

- Line 5624 beginning with 'Dec 22 04:27:46' shows the changing of LogLevel to 1

- Then, I logged in using cinnamon, started firefox, ...
   I lost system responsiveness.
   With Ctrl-Alt-F1, I could still connect to a system console and use it.
   After Ctrl-Alt-F7, GDM showed its login screen, but login froze.
   After Ctrl-Alt-F1 then Ctrl-Alt-F7, GDM tried to show its login screen, but NO user icon appeared.

- Line 5673 beginning with 'Dec 22 04:49:39' is the beginning of the request to show the kernel state.

penalvch (penalvch) wrote :

Etienne URBAH, before following the directions in https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1396840/comments/12 , please ensure the issue is reproducible with the latest mainline kernel 3.19-rc1 (not 3.18.x, 3.16.x, etc).

Etienne URBAH (eurbah) wrote :

Latest mainline kernel 3.19-rc2 does NOT solve the bug.
See https://bugs.freedesktop.org/show_bug.cgi?id=87819

tags: added: kernel-bug-exists-upstream-3.19-rc2
penalvch (penalvch)
tags: removed: kernel-bug-exists-upstream-3.18-rc6 kernel-bug-exists-upstream-3.18-rc7
Etienne URBAH (eurbah)
tags: added: kernel-bug-fixed-upstream-3.19.1
penalvch (penalvch)
tags: added: needs-reverse-bisect
penalvch (penalvch) wrote :

Etienne URBAH, to see if this is already fixed in Ubuntu, could you please test for this via http://cdimage.ubuntu.com/daily-live/current/ and advise on the results?

Changed in linux (Ubuntu):
status: Triaged → Incomplete
Etienne URBAH (eurbah) wrote :

I am currently running the live CD of Ubuntu Vivid Beta 2.

$ uname -r
3.19.0-10-generic

Even with 4 simultaneous Firefox windows, each streaming one video, I can NOT trigger the bug anymore.
So, this bug is probably fixed.

tags: added: kernel-bug-fixed-upstream-3.19.0-10
penalvch (penalvch) wrote :

Etienne URBAH, would you need a backport to a release prior to Vivid, or may this be closed as Status Invalid?

Etienne URBAH (eurbah) wrote :

I have upgraded all my machines from 'utopic' (with linux 3.16) to 'vivid' (with linux 3.19) where this bug is fixed.
So, I do not PERSONALLY need a backport.

But this bug still exists for 'utopic', and perhaps even LTS 'trusty'.
So, it is possible that this bug needs a backport to 'utopic' and LTS 'trusty'.

For 'utopic' and LTS 'trusty', this bug may be closed only AFTER :
- a backport corrects the problem (then the status would be "Fix Released") or
- the Ubuntu release has reached its end of life (then the status would be "Won't Fix").

Anyway, this bug is perfectly valid, and its status must NOT be set to "Invalid".

penalvch (penalvch) wrote :

Etienne URBAH, this bug report is being closed due to your last comment https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1396840/comments/20 regarding this being fixed with an update, and you don't need a backport. For future reference you can manage the status of your own bugs by clicking on the current status in the yellow line and then choosing a new status in the revealed drop down box. You can learn more about bug statuses at https://wiki.ubuntu.com/Bugs/Status. Thank you again for taking the time to report this bug and helping to make Ubuntu better. Please submit any future bugs you may find.

Changed in linux (Ubuntu):
status: Incomplete → Invalid
Hans Erik van Elburg (hanserik) wrote :

This is still an issue in Ubuntu 14.04 LTS on iMac9,1. Why is this issue closed?

Changed in linux (Ubuntu):
status: Invalid → Confirmed
penalvch (penalvch) wrote :

Hans Erik van Elburg:
>"...why is this issue closed?"

Please see https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1396840/comments/21 .

Despite this, if you have an issue in Ubuntu, please file a new report via a terminal:
ubuntu-bug linux

Feel free to subscribe me to it.

Changed in linux (Ubuntu):
importance: High → Undecided
status: Confirmed → Invalid