[Witherspoon-DD2.2][Ubu 18.10] [4.18.0-7-generic ] OS booting thrown with nouveau errors; OS booted successfully

Bug #1794055 reported by bugproxy
This bug affects 1 person
Affects                            Status        Importance  Assigned to            Milestone
The Ubuntu-power-systems project   Fix Released  High        Canonical Kernel Team
linux (Ubuntu)                     Won't Fix     High        Canonical Kernel Team
  Bionic                           Won't Fix     Undecided   Unassigned
  Cosmic                           Won't Fix     High        Canonical Kernel Team
linux-firmware (Ubuntu)            Fix Released  Undecided   Unassigned
  Bionic                           Fix Released  Medium      Seth Forshee
  Cosmic                           Fix Released  High        Seth Forshee

Bug Description

SRU Justification

Impact: Missing firmware for nouveau is causing errors to appear in dmesg.

Fix: Add missing firmware files from upstream linux-firmware.

Test Case: Confirm that errors in dmesg are gone once new firmware files are present.

Regression Potential: New and updated firmware always has the potential to cause regressions; however, this firmware has been in disco for several months with no reported issues.
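
The test case can be approximated with a small filter over captured dmesg text. This is only a sketch: the helper name `nouveau_failures` is made up for illustration, and it assumes the log was saved beforehand (for example with `dmesg > dmesg.txt`).

```python
def nouveau_failures(dmesg_text):
    """Return the dmesg lines that mention nouveau and contain 'fail'.

    Roughly equivalent to `dmesg | grep nouveau | grep fail`.
    """
    return [
        line
        for line in dmesg_text.splitlines()
        if "nouveau" in line and "fail" in line
    ]


# Hypothetical usage: an empty result would suggest the test case passes.
# with open("dmesg.txt") as handle:
#     print(nouveau_failures(handle.read()))
```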

---

== Comment: #0 - Kalpana Shetty <email address hidden> - 2018-09-15 23:55:13 ==
---Problem Description---
[Witherspoon-DD2.2][Ubu 18.10] [4.18.0-7-generic ] OS booting thrown with nouveau errors

Contact Information = <email address hidden>, <email address hidden>

---uname output---
root@ltc-wcwsp3:~# uname -a
Linux ltc-wcwsp3 4.18.0-7-generic #8-Ubuntu SMP Tue Aug 28 18:20:56 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux

Machine Type = Witherspoon DD2.2 LC

Steps:
1. Netinstall Ubu 18.10 on Witherspoon-LC-DD2.2 6GPU system ------> PASS
2. Boot the OS ---> PASS, but errors related to the open source NVIDIA driver (nouveau) are thrown on the console.

  [Disk: sdb2 / c0302064-c5a3-49a7-8bd4-402283e6fcbe]
    Ubuntu, with Linux 4.18.0-7-generic (recovery mode)
    Ubuntu, with Linux 4.18.0-7-generic
    Ubuntu
  [Disk: nvme0n1p2 / c5d042f1-812e-49e0-94b2-ade477084061]
    Ubuntu, with Linux 4.18.0-7-generic (recovery mode)
 * Ubuntu, with Linux 4.18.0-7-generic
    Ubuntu

  System information
  System configuration
  System status log
  Language
  Rescan devices
  Retrieve config from URL
  Plugins (0)
  Exit to shell
 Enter=accept, e=edit, n=new, x=exit, l=language, g=log, h=help
The system is going down NOW!
Sent SIGTERM to all processes
Sent SIGKILL to all processes
[ 57.513329] kexec_core: Starting new kernel
[ 149.358703978,5] OPAL: Switch to big-endian OS
[ 153.355498935,5] OPAL: Switch to little-endian OS
[ 2.943735] integrity: Unable to open file: /etc/keys/x509_ima.der (-2)
[ 2.943738] integrity: Unable to open file: /etc/keys/x509_evm.der (-2)
[ 3.132733] vio vio: uevent: failed to send synthetic uevent
[ 4.058698] nouveau 0004:04:00.0: gr: failed to load gr/sw_nonctx
[ 4.129215] nouveau 0004:04:00.0: DRM: failed to create kernel channel, -22
[ 19.126509] nouveau 0004:04:00.0: DRM: failed to idle channel 0 [DRM]
[ 19.281450] nouveau 0004:05:00.0: gr: failed to load gr/sw_nonctx
[ 19.351322] nouveau 0004:05:00.0: DRM: failed to create kernel channel, -22
[ 34.350509] nouveau 0004:05:00.0: DRM: failed to idle channel 0 [DRM]
[ 34.502063] nouveau 0004:06:00.0: gr: failed to load gr/sw_nonctx
[ 34.572144] nouveau 0004:06:00.0: DRM: failed to create kernel channel, -22
[ 49.570509] nouveau 0004:06:00.0: DRM: failed to idle channel 0 [DRM]
[ 49.734754] nouveau 0035:03:00.0: gr: failed to load gr/sw_nonctx
[ 49.805057] nouveau 0035:03:00.0: DRM: failed to create kernel channel, -22
[ 64.802510] nouveau 0035:03:00.0: DRM: failed to idle channel 0 [DRM]
[ 64.955442] nouveau 0035:04:00.0: gr: failed to load gr/sw_nonctx
[ 65.025537] nouveau 0035:04:00.0: DRM: failed to create kernel channel, -22

[ 80.022509] nouveau 0035:04:00.0: DRM: failed to idle channel 0 [DRM]
[ 80.181169] nouveau 0035:05:00.0: gr: failed to load gr/sw_nonctx
[ 80.251481] nouveau 0035:05:00.0: DRM: failed to create kernel channel, -22
[ 95.250509] nouveau 0035:05:00.0: DRM: failed to idle channel 0 [DRM]
/dev/nvme0n1p2: recovering journal
/dev/nvme0n1p2: clean, 72569/97681408 files, 7384418/390701312 blocks
-.mount
kmod-static-nodes.service
dev-hugepages.mount
dev-mqueue.mount
sys-kernel-debug.mount
ufw.service
lvm2-lvmetad.service
systemd-remount-fs.service
systemd-random-seed.service
systemd-sysusers.service
keyboard-setup.service
systemd-tmpfiles-setup-dev.service
lvm2-monitor.service
finalrd.service
console-setup.service
swapfile.swap
ebtables.service
systemd-udevd.service
systemd-journald.service
systemd-journal-flush.service
systemd-tmpfiles-setup.service
systemd-update-utmp.service
[ 100.997765] vio vio: uevent: failed to send synthetic uevent
systemd-udev-trigger.service
systemd-timesyncd.service
apparmor.service
lvm2-pvscan@8:3.service
systemd-modules-load.service
sys-kernel-config.mount
sys-fs-fuse-connections.mount
systemd-sysctl.service
ondemand.service
dbus.service
irqbalance.service
opal-prd.service
lxcfs.service
atd.service
cron.service
iprdump.service
iprinit.service
systemd-logind.service
iprupdate.service
systemd-networkd.service
rsyslog.service
polkit.service
accounts-daemon.service
lxd-containers.service
networkd-dispatcher.service
var-lib-lxcfs.mount
tmp-selftest\x2dmountpoint\x2d039055037.mount
snapd.service
snapd.seeded.service
systemd-resolved.service
systemd-networkd-wait-online.service
blk-availability.service
systemd-user-sessions.service
apport.service

Ubuntu Cosmic Cuttlefish (development branch) ltc-wcwsp3 hvc0

ltc-wcwsp3 login:
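
Going by the timestamps in the console log above, each GPU seems to hold up the boot for roughly 15 seconds between its "failed to create kernel channel" and "failed to idle channel" messages. A rough sketch for measuring that gap from a saved log is below. The function name and the regular expression are illustrative assumptions, not part of any existing tool.

```python
import re

# Matches lines such as:
# [ 4.129215] nouveau 0004:04:00.0: DRM: failed to create kernel channel, -22
LINE_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+nouveau\s+(\S+):\s+(.*)")


def idle_delays(log_text):
    """Map PCI address -> seconds between the two DRM failure messages."""
    started = {}
    delays = {}
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if not match:
            continue
        timestamp = float(match.group(1))
        device, message = match.group(2), match.group(3)
        if "failed to create kernel channel" in message:
            started[device] = timestamp
        elif "failed to idle channel" in message and device in started:
            delays[device] = timestamp - started[device]
    return delays
```

Summing the returned values gives an estimate of how much boot time the failing GPUs account for in total.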

== Comment: #2 - Kalpana Shetty <email address hidden> - 2018-09-16 00:07:26 ==
sosreport -> http://9.114.13.132/repo/bugs/ubu/sosreport-BZ171506.171506-20180915235600.tar.xz

== Comment: #3 - Kalpana Shetty <email address hidden> - 2018-09-16 00:33:02 ==

== Comment: #4 - Praveen K. Pandey <email address hidden> - 2018-09-19 05:52:23 ==
Facing nouveau-related errors on a POWER8 system as well:

[ 4.764818] nouveau 0002:01:00.0: fifo: fault 00 [READ] at 0000000000020000 engine 0c [HOST6] client 06 [GPC0/L1_2] reason 02 [PTE] on channel 0 [03ffb18000 DRM]
[ 4.942169] nouveau 000a:01:00.0: fifo: fault 00 [READ] at 0000000000020000 engine 0c [HOST6] client 06 [GPC0/L1_2] reason 02 [PTE] on channel 0 [03ffb18000 DRM]
/dev/sdb2: clean, 132397/61054976 files, 5995714/244188416 blocks
[ 11.206278] vio vio: uevent: failed to send synthetic uevent
[ OK ] Started Show Plymouth Boot Screen.
[ OK ] Reached target Local Encrypted Volumes.
[ OK ] Started Forward Password Requests to Plymouth Directory Watch.
plymouth-start.service
[ OK ] Started ebtables ruleset management.

== Comment: #5 - Chandni Verma <email address hidden> - 2018-09-20 16:41:49 ==
--- screening ---

From provided dmesg, I notice:

1294 [ 19.281478] nouveau 0004:05:00.0: bios: version 88.00.13.00.02
1295 [ 19.282753] nouveau 0004:05:00.0: Direct firmware load for nvidia/gv100/gr/sw_nonctx.bin failed with error -2
1296 [ 19.282755] nouveau 0004:05:00.0: gr: failed to load gr/sw_nonctx
1297 [ 19.282813] nouveau 0004:05:00.0: Using 32-bit DMA via iommu

..

1322 [ 34.367713] nouveau 0004:06:00.0: NVIDIA GV100 (140000a1)
1323 [ 34.497152] nouveau 0004:06:00.0: bios: version 88.00.13.00.02
1324 [ 34.502736] nouveau 0004:06:00.0: Direct firmware load for nvidia/gv100/gr/sw_nonctx.bin failed with error -2
1325 [ 34.502738] nouveau 0004:06:00.0: gr: failed to load gr/sw_nonctx
1326 [ 34.502797] nouveau 0004:06:00.0: Using 32-bit DMA via iommu

..

up to 6 instances of the above...

Looks like an NVIDIA firmware issue.
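
To see which firmware files the driver asked for and on which devices, the "Direct firmware load" lines can be pulled out of the dmesg text. A minimal sketch, assuming the message format shown in the excerpt above; the function name is hypothetical.

```python
import re

# Matches lines such as:
# nouveau 0004:05:00.0: Direct firmware load for nvidia/gv100/gr/sw_nonctx.bin failed with error -2
FW_RE = re.compile(
    r"nouveau (\S+): Direct firmware load for (\S+) failed with error (-?\d+)"
)


def missing_firmware(log_text):
    """Map each requested firmware path to the devices that failed to load it."""
    result = {}
    for line in log_text.splitlines():
        match = FW_RE.search(line)
        if match:
            device, path = match.group(1), match.group(2)
            result.setdefault(path, []).append(device)
    return result
```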

== Comment: #6 - Luciano Chavez <email address hidden> - 2018-09-20 17:03:31 ==
(In reply to comment #5)
> --- screening ---
>
> From provided dmesg, I notice:
>
>
> 1294 [ 19.281478] nouveau 0004:05:00.0: bios: version 88.00.13.00.02
> 1295 [ 19.282753] nouveau 0004:05:00.0: Direct firmware load for
> nvidia/gv100/gr/sw_nonctx.bin failed with error -2
> 1296 [ 19.282755] nouveau 0004:05:00.0: gr: failed to load gr/sw_nonctx
> 1297 [ 19.282813] nouveau 0004:05:00.0: Using 32-bit DMA via iommu
>
> ..
>
> 1322 [ 34.367713] nouveau 0004:06:00.0: NVIDIA GV100 (140000a1)
> 1323 [ 34.497152] nouveau 0004:06:00.0: bios: version 88.00.13.00.02
> 1324 [ 34.502736] nouveau 0004:06:00.0: Direct firmware load for
> nvidia/gv100/gr/sw_nonctx.bin failed with error -2
> 1325 [ 34.502738] nouveau 0004:06:00.0: gr: failed to load gr/sw_nonctx
> 1326 [ 34.502797] nouveau 0004:06:00.0: Using 32-bit DMA via iommu
>
> ..
>
> upto 6 instances of the above...
>
>
> Looks like an NVIDIA firmware issue.

Well, I think those messages mean that the nouveau module can't find the firmware file, as opposed to it being a FW issue. It might be a packaging issue, if this is not actually causing any real problems. Probably best to mirror this to Canonical for their comment.
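
For reference, the negative numbers in these messages are kernel errno values, and -2 is ENOENT ("No such file or directory"), which fits the reading that the file is simply not found. A small sketch for decoding them follows. It assumes it is run on Linux, since Python's errno table reflects the platform it runs on, and the helper name is illustrative.

```python
import errno
import os


def describe_kernel_error(code):
    """Translate a negative kernel return code such as -2 into (name, text)."""
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)


# The codes that appear in this report: -2, -22 and -12.
for code in (-2, -22, -12):
    print(code, *describe_kernel_error(code))
```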

== Comment: #10 - Chandni Verma <email address hidden> - 2018-09-24 03:25:35 ==

bugproxy (bugproxy) wrote : dmesg
  • dmesg (112.3 KiB, application/octet-stream)

Default Comment by Bridge

tags: added: architecture-ppc64le bugnameltc-171506 severity-high targetmilestone-inin1810
bugproxy (bugproxy) wrote : sosreport

Default Comment by Bridge

Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → linux (Ubuntu)
Andrew Cloke (andrew-cloke) wrote :

It looks like this issue is seen with the "in development" 18.10 4.18 kernel. Does the same issue occur with 18.04 (4.15 kernel)?

Changed in ubuntu-power-systems:
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
importance: Undecided → High
Manoj Iyer (manjo) wrote :

Looks like we don't ship the nvidia/gv100/gr/sw_nonctx.bin in our linux-firmware package. This firmware is also missing in git://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git and would need to be upstreamed first to the linux-firmware tree so that we can sync to Ubuntu.

Currently the following nvidia firmware is available upstream:

linux-firmware$ find . -name sw_nonctx.bin
./nvidia/gm204/gr/sw_nonctx.bin
./nvidia/gp107/gr/sw_nonctx.bin
./nvidia/gp108/gr/sw_nonctx.bin
./nvidia/gm206/gr/sw_nonctx.bin
./nvidia/gp100/gr/sw_nonctx.bin
./nvidia/gm20b/gr/sw_nonctx.bin
./nvidia/gp10b/gr/sw_nonctx.bin
./nvidia/gp106/gr/sw_nonctx.bin
./nvidia/gm200/gr/sw_nonctx.bin
./nvidia/gk20a/sw_nonctx.bin
./nvidia/gp104/gr/sw_nonctx.bin
./nvidia/gp102/gr/sw_nonctx.bin
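
The same check can be scripted against a linux-firmware checkout to list which chip directories ship a given file. This is a sketch only: `chips_with_file` is a made-up helper, and it expects the path of the `nvidia` directory inside the checkout.

```python
from pathlib import Path


def chips_with_file(nvidia_dir, filename="sw_nonctx.bin"):
    """Return the sorted chip directory names under nvidia_dir that contain filename."""
    root = Path(nvidia_dir)
    chips = set()
    for path in root.rglob(filename):
        parts = path.relative_to(root).parts
        if len(parts) > 1:  # skip a file sitting directly in nvidia_dir
            chips.add(parts[0])
    return sorted(chips)


# Hypothetical usage: "gv100" being absent from the result would match the
# observation above.
# print(chips_with_file("linux-firmware/nvidia"))
```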

Changed in linux (Ubuntu):
assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → Canonical Kernel Team (canonical-kernel-team)
importance: Undecided → High
Manoj Iyer (manjo)
Changed in ubuntu-power-systems:
status: New → Incomplete
Changed in linux (Ubuntu):
status: New → Incomplete
Andrew Cloke (andrew-cloke) wrote :

Marking as incomplete awaiting nvidia firmware for this card landing in linux-firmware upstream.

Changed in linux (Ubuntu):
status: Incomplete → In Progress
assignee: Canonical Kernel Team (canonical-kernel-team) → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Cosmic):
assignee: Joseph Salisbury (jsalisbury) → nobody
status: In Progress → Incomplete
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2018-10-29 10:34 EDT-------
Is there anything IBM can help with on making the nvidia firmware for this card available in linux-firmware upstream?

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-11-12 16:03 EDT-------
It looks like this is upstream now (added right after the check for it):
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/nvidia/gv100/

Kalpana, do you want to just add the gv100 code to your /lib/firmware/nvidia to see if that resolves it?
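
One way to try this by hand is to copy the gv100 directory from an upstream linux-firmware checkout into the local firmware tree. A minimal sketch, assuming a checkout already exists on disk, that the script runs with enough privileges to write under /lib/firmware, and Python 3.8 or newer; the function name and defaults are illustrative. The driver would presumably need to be reloaded, or the system rebooted, before the new files are picked up.

```python
import shutil
from pathlib import Path


def install_chip_firmware(checkout_dir, chip="gv100", firmware_root="/lib/firmware"):
    """Copy nvidia/<chip> from a linux-firmware checkout into firmware_root."""
    source = Path(checkout_dir) / "nvidia" / chip
    target = Path(firmware_root) / "nvidia" / chip
    if not source.is_dir():
        raise FileNotFoundError(f"{source} not found in checkout")
    # dirs_exist_ok (Python 3.8+) lets the copy update an existing directory.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```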

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-11-12 20:21 EDT-------
(In reply to comment #20)
> It looks like this is upstream now (added right after the check for it):
> https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/
> tree/nvidia/gv100/
>
> Kalpana, do you want to just add the gv100 code to your /lib/firmware/nvidia
> to see if that resolves it?

sure, let me try this.

tags: added: kernel-da-key
Manoj Iyer (manjo) wrote :

After you have verified that adding the firmware fixes this for you, please add a note here so that we can start the SRU process of adding that firmware to Ubuntu.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-12-06 02:22 EDT-------
I recreated the problem where I could see the errors in dmesg (and the console) and then added the firmware to /lib/firmware/nvidia/gv100. After that:
mranweil@ltc-wspoon5:~$ dmesg|grep -i nouv
[ 6.632529] nouveau 0004:04:00.0: enabling device (0140 -> 0142)
[ 6.632613] nouveau 0004:04:00.0: Using 32-bit DMA via iommu
[ 6.632721] nouveau 0004:04:00.0: NVIDIA GV100 (140000a1)
<snip>
[ 7.061963] nouveau 0035:03:00.0: DRM: Pointer to TMDS table invalid
[ 7.061966] nouveau 0035:03:00.0: DRM: DCB version 4.1
[ 7.063141] nouveau 0035:03:00.0: DRM: MM: using COPY for buffer copies
[ 7.063154] [drm] Initialized nouveau 1.3.1 20120801 for 0035:03:00.0 on minor 2
mranweil@ltc-wspoon5:~$

So it looks like the firmware from the current git tree addresses the error messages. I didn't do anything further with the driver.

Changed in ubuntu-power-systems:
status: Incomplete → Confirmed
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Changed in linux (Ubuntu Cosmic):
status: Incomplete → Confirmed
Changed in linux-firmware (Ubuntu Cosmic):
status: New → Confirmed
importance: Undecided → High
assignee: nobody → Seth Forshee (sforshee)
Seth Forshee (sforshee) wrote :

Added bionic nomination too since the 18.10 kernel is available there in the hwe packages.

Changed in linux-firmware (Ubuntu Bionic):
assignee: nobody → Seth Forshee (sforshee)
importance: Undecided → Medium
status: New → In Progress
Changed in linux-firmware (Ubuntu Cosmic):
status: Confirmed → In Progress
Seth Forshee (sforshee)
Changed in linux-firmware (Ubuntu):
status: New → Fix Released
Changed in linux-firmware (Ubuntu Cosmic):
status: In Progress → Fix Committed
Changed in linux-firmware (Ubuntu Bionic):
status: In Progress → Fix Committed
Seth Forshee (sforshee)
description: updated
Manoj Iyer (manjo)
Changed in ubuntu-power-systems:
status: Confirmed → Fix Committed
Changed in linux (Ubuntu):
status: Confirmed → Won't Fix
Changed in linux (Ubuntu Bionic):
status: New → Won't Fix
Changed in linux (Ubuntu Cosmic):
status: Confirmed → Won't Fix
Steve Langasek (vorlon) wrote : Please test proposed package

Hello bugproxy, or anyone else affected,

Accepted linux-firmware into cosmic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/linux-firmware/1.175.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-cosmic to verification-done-cosmic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-cosmic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

bugproxy (bugproxy)
tags: added: verification-needed-cosmic
Timo Aaltonen (tjaalton) wrote :

Hello bugproxy, or anyone else affected,

Accepted linux-firmware into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/linux-firmware/1.173.5 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2019-04-01 15:04 EDT-------
I tried this on cosmic. Here's the before:
ubuntu@ltc-wspoon5:~$ dmesg |grep nouveau |grep fail
[ 3.640787] nouveau 0004:04:00.0: Direct firmware load for nvidia/gv100/gr/sw_nonctx.bin failed with error -2
[ 3.640789] nouveau 0004:04:00.0: gr: failed to load gr/sw_nonctx
[ 3.714073] nouveau 0004:04:00.0: DRM: failed to create kernel channel, -22
[ 18.713045] nouveau 0004:04:00.0: DRM: failed to idle channel 0 [DRM]
[ 18.880156] nouveau 0035:03:00.0: Direct firmware load for nvidia/gv100/gr/sw_nonctx.bin failed with error -2
[ 18.880160] nouveau 0035:03:00.0: gr: failed to load gr/sw_nonctx
[ 18.951277] nouveau 0035:03:00.0: DRM: failed to create kernel channel, -22
[ 33.949044] nouveau 0035:03:00.0: DRM: failed to idle channel 0 [DRM]
ubuntu@ltc-wspoon5:~$ dpkg --list |grep firmware
ii linux-firmware 1.175.1 all Firmware for Linux kernel drivers

And after updating to linux-firmware from -proposed:
ubuntu@ltc-wspoon5:~$ dmesg |grep nouveau |grep fail
ubuntu@ltc-wspoon5:~$ dpkg --list |grep firmware
ii linux-firmware 1.175.3 all Firmware for Linux kernel drivers

Looks fixed, thank you!
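
The before/after comparison hinges on the installed linux-firmware version. Going by the changelog entries later in this report, the GV100 firmware first appears in 1.175.2 for cosmic and 1.173.4 for bionic. The sketch below compares simple dotted versions against such a threshold. It is an illustration only: it does not implement full Debian version ordering (epochs, tildes, revisions), for which `dpkg --compare-versions` is the proper tool, and the helper names are made up.

```python
def version_tuple(version):
    """Split a purely numeric dotted version like '1.175.3' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def has_fix(installed, fixed_in):
    """True if the installed version is at or past the version carrying the fix."""
    return version_tuple(installed) >= version_tuple(fixed_in)


# Versions taken from the cosmic verification above.
print(has_fix("1.175.1", "1.175.2"))  # before the update
print(has_fix("1.175.3", "1.175.2"))  # after the update from -proposed
```

The threshold differs per series, so the cosmic and bionic packages need to be compared against their own fixed-in versions.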

tags: added: verification-done-cosmic
removed: verification-needed-cosmic
Seth Forshee (sforshee) wrote :

Thanks for verifying on cosmic! Looks like verification for bionic is still missing however.

tags: added: verification-needed verification-needed-bionic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-firmware - 1.175.3

---------------
linux-firmware (1.175.3) cosmic; urgency=medium

  * To add power setting bin file for SAR support (QCA6174) (LP: #1817817)
    - ath10k: QCA6174 hw3.0: update board-2.bin

  * iwlwifi Intel 8265 firmware crashing on lenovo x1 Gen 6 (LP: #1808389)
    - UBUNTU: revert most iwlwifi firmware back to versions from 1.173.3

linux-firmware (1.175.2) cosmic; urgency=medium

  * iwlwifi Intel 8265 firmware crashing on lenovo x1 Gen 6 (LP: #1808389)
    - iwlwifi: update firmwares for 7000, 8000 and 9000 series
    - iwlwifi: update firmwares for 8000 series

  * OS booting thrown with nouveau errors (LP: #1794055)
    - nvidia: add GV100 signed firmware

 -- Seth Forshee <email address hidden> Thu, 21 Mar 2019 15:29:33 -0500

Changed in linux-firmware (Ubuntu Cosmic):
status: Fix Committed → Fix Released
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for linux-firmware has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Manoj Iyer (manjo) wrote :

ubuntu@bobone:~$ uname -a
Linux bobone 4.18.0-17-generic #18-Ubuntu SMP Wed Mar 13 14:30:03 UTC 2019 ppc64le ppc64le ppc64le GNU/Linux

ubuntu@bobone:~$ dmesg | grep nouveau | grep error
ubuntu@bobone:~$

ubuntu@bobone:~$ find /lib/firmware/nvidia/gv100/ -name sw_nonctx.bin
/lib/firmware/nvidia/gv100/gr/sw_nonctx.bin
ubuntu@bobone:~$

ubuntu@bobone:~$ tree /lib/firmware/nvidia/gv100/
/lib/firmware/nvidia/gv100/
├── acr
│   ├── bl.bin
│   ├── ucode_load.bin
│   ├── ucode_unload.bin
│   └── unload_bl.bin
├── gr
│   ├── fecs_bl.bin
│   ├── fecs_data.bin
│   ├── fecs_inst.bin
│   ├── fecs_sig.bin
│   ├── gpccs_bl.bin
│   ├── gpccs_data.bin
│   ├── gpccs_inst.bin
│   ├── gpccs_sig.bin
│   ├── sw_bundle_init.bin
│   ├── sw_ctx.bin
│   ├── sw_method_init.bin
│   └── sw_nonctx.bin
├── nvdec
│   └── scrubber.bin
└── sec2
    ├── desc.bin
    ├── image.bin
    └── sig.bin

4 directories, 20 files
ubuntu@bobone:~$
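
The tree listing above can also be turned into a quick completeness check. The sketch below hard-codes the 20 relative paths from that listing and reports any that are absent. Whether nouveau needs every one of these files is not something this report establishes, so treat the list as "what the package installed here" rather than a definitive requirement; the names are illustrative.

```python
from pathlib import Path

# Relative paths copied from the `tree /lib/firmware/nvidia/gv100/` output.
EXPECTED_GV100 = [
    "acr/bl.bin", "acr/ucode_load.bin", "acr/ucode_unload.bin", "acr/unload_bl.bin",
    "gr/fecs_bl.bin", "gr/fecs_data.bin", "gr/fecs_inst.bin", "gr/fecs_sig.bin",
    "gr/gpccs_bl.bin", "gr/gpccs_data.bin", "gr/gpccs_inst.bin", "gr/gpccs_sig.bin",
    "gr/sw_bundle_init.bin", "gr/sw_ctx.bin", "gr/sw_method_init.bin", "gr/sw_nonctx.bin",
    "nvdec/scrubber.bin",
    "sec2/desc.bin", "sec2/image.bin", "sec2/sig.bin",
]


def missing_files(chip_dir, expected=EXPECTED_GV100):
    """Return the expected relative paths that are not regular files under chip_dir."""
    root = Path(chip_dir)
    return [rel for rel in expected if not (root / rel).is_file()]


# Hypothetical usage on an installed system:
# print(missing_files("/lib/firmware/nvidia/gv100"))
```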

Manoj Iyer (manjo) wrote :

ubuntu@bobone:~$ uname -a
Linux bobone 4.15.0-47-generic #50-Ubuntu SMP Wed Mar 13 10:40:40 UTC 2019 ppc64le ppc64le ppc64le GNU/Linux
ubuntu@bobone:~$ dmesg | grep nouveau | grep error
[ 3.020472] nouveau: probe of 0004:04:00.0 failed with error -12
[ 3.020667] nouveau: probe of 0004:05:00.0 failed with error -12
[ 3.021693] nouveau: probe of 0035:03:00.0 failed with error -12
[ 3.022595] nouveau: probe of 0035:04:00.0 failed with error -12
ubuntu@bobone:~$ find /lib/firmware/nvidia/gv100/ -name sw_nonctx.bin
find: ‘/lib/firmware/nvidia/gv100/’: No such file or directory
ubuntu@bobone:~$ apt-cache policy linux-firmware
linux-firmware:
  Installed: 1.173.3
  Candidate: 1.173.3
  Version table:
 *** 1.173.3 500
        500 http://ports.ubuntu.com/ubuntu-ports bionic-updates/main ppc64el Packages
        500 http://ports.ubuntu.com/ubuntu-ports bionic-security/main ppc64el Packages
        100 /var/lib/dpkg/status
     1.173 500
        500 http://ports.ubuntu.com/ubuntu-ports bionic/main ppc64el Packages
ubuntu@bobone:~$

== install linux-firmware from proposed ==
The nouveau driver does not seem to identify the Nvidia chipset in the 4.15 kernel. But it seems to load the firmware from linux-firmware, and load the driver.

ubuntu@bobone:~$ apt-cache policy linux-firmware
linux-firmware:
  Installed: 1.173.5
  Candidate: 1.173.5
  Version table:
 *** 1.173.5 500
        500 http://ports.ubuntu.com/ubuntu-ports bionic-proposed/main ppc64el Packages
        100 /var/lib/dpkg/status
     1.173.3 500
        500 http://ports.ubuntu.com/ubuntu-ports bionic-updates/main ppc64el Packages
        500 http://ports.ubuntu.com/ubuntu-ports bionic-security/main ppc64el Packages
     1.173 500
        500 http://ports.ubuntu.com/ubuntu-ports bionic/main ppc64el Packages
ubuntu@bobone:~$

ubuntu@bobone:~$ find /lib/firmware/nvidia/gv100/ -name sw_nonctx.bin
/lib/firmware/nvidia/gv100/gr/sw_nonctx.bin
ubuntu@bobone:~$

ubuntu@bobone:~$ tree /lib/firmware/nvidia/gv100/
/lib/firmware/nvidia/gv100/
├── acr
│   ├── bl.bin
│   ├── ucode_load.bin
│   ├── ucode_unload.bin
│   └── unload_bl.bin
├── gr
│   ├── fecs_bl.bin
│   ├── fecs_data.bin
│   ├── fecs_inst.bin
│   ├── fecs_sig.bin
│   ├── gpccs_bl.bin
│   ├── gpccs_data.bin
│   ├── gpccs_inst.bin
│   ├── gpccs_sig.bin
│   ├── sw_bundle_init.bin
│   ├── sw_ctx.bin
│   ├── sw_method_init.bin
│   └── sw_nonctx.bin
├── nvdec
│   └── scrubber.bin
└── sec2
    ├── desc.bin
    ├── image.bin
    └── sig.bin

4 directories, 20 files
ubuntu@bobone:~$

ubuntu@bobone:~$ dmesg | grep nouveau | grep error
[ 157.554971] nouveau: probe of 0004:04:00.0 failed with error -12
[ 157.555041] nouveau: probe of 0004:05:00.0 failed with error -12
[ 157.555957] nouveau: probe of 0035:03:00.0 failed with error -12
[ 157.556078] nouveau: probe of 0035:04:00.0 failed with error -12
ubuntu@bobone:~$

[ 157.554928] nouveau 0004:04:00.0: unknown chipset (140000a1)
[ 157.554994] nouveau 0004:05:00.0: unknown chipset (140000a1)
[ 157.555880] nouveau 0035:03:00.0: unknown chipset (140000a1)
[ 157.556002] nouveau 0035:04:00.0: unknown chipset (140000a1)

ubuntu@bobone:~$ lsmod | grep nouveau
nouveau 2150398 0...


Manoj Iyer (manjo) wrote :

I should also have mentioned that the kernel was upgraded from -proposed on bionic as well.

ubuntu@bobone:~$ uname -a
Linux bobone 4.15.0-48-generic #51-Ubuntu SMP Wed Apr 3 08:26:19 UTC 2019 ppc64le ppc64le ppc64le GNU/Linux
ubuntu@bobone:~$

Manoj Iyer (manjo) wrote :

We see unknown chipset errors in 4.15 GA kernel in Bionic, but the missing firmware issue is now fixed in bionic. We have support for this Nvidia GPU in the bionic (4.18) HWE kernel and so I am marking this as verification-done.
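
Since two different problems show up in these logs (firmware that cannot be found versus a kernel that does not recognise the chipset), it can help to sort nouveau messages by kind before deciding whether a firmware update is relevant. A rough sketch; the category labels and the function name are invented for illustration, and the matching is based only on the message texts quoted in this report.

```python
def classify_nouveau_line(line):
    """Return a coarse category for a dmesg line, or None if it is not from nouveau."""
    if "nouveau" not in line:
        return None
    if "Direct firmware load" in line or "failed to load" in line:
        return "missing-firmware"
    if "unknown chipset" in line:
        return "unsupported-chipset"
    if "probe of" in line and "failed with error" in line:
        return "probe-failed"
    return "other"
```

With the 4.15 log above, the lines would fall under "unsupported-chipset" and "probe-failed", which a linux-firmware update alone would not be expected to change.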

tags: added: verification-done verification-done-bionic
removed: verification-needed verification-needed-bionic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-firmware - 1.173.5

---------------
linux-firmware (1.173.5) bionic; urgency=medium

  * To add power setting bin file for SAR support (QCA6174) (LP: #1817817)
    - ath10k: QCA6174 hw3.0: update board-2.bin

  * iwlwifi Intel 8265 firmware crashing on lenovo x1 Gen 6 (LP: #1808389)
    - UBUNTU: revert most iwlwifi firmware back to versions from 1.173.3

  * Update linux-firmware in bionic for 5.0-based hwe-edge kernel
    (LP: #1821239)
    - amdgpu: sync up bonaire firmware with 18.20 release
    - amdgpu: sync up hawaii firmware with 18.20 release
    - amdgpu: sync up kabini firmware with 18.20 release
    - amdgpu: sync up mullins firmware with 18.20 release
    - amdgpu: sync up kaveri firmware with 18.20 release
    - amdgpu: sync up hainan firmware with 18.20 release
    - amdgpu: sync up oland firmware with 18.20 release
    - amdgpu: sync up tahiti firmware with 18.20 release
    - amdgpu: sync up pitcairn firmware with 18.20 release
    - amdgpu: sync up verde firmware with 18.20 release
    - linux-firmware: mediatek: add MT7622 Bluetooth firmwares and license file
    - linux-firmware: add firmware for mt76x0
    - qed: Add firmware 8.37.7.0
    - firmware/icl/dmc: Add v1.07 of DMC for Icelake
    - iwlwifi: add -41.ucode firmwares for 9000 series
    - ath10k: QCA9377 hw1.0: add firmware-6.bin to WLAN.TF.2.1-00021-QCARMSWP-1
    - linux-firmware: add firmware for mt7610e
    - linux-firmware: add firmware for mt7650e
    - amdgpu: add raven dmcu firmware
    - amdgpu: Add new polaris SMC firmwares
    - amdgpu: Add new polaris MC firmwares
    - amdgpu: add firmware for vega12
    - amdgpu: update polaris10 fw for 18.50 release
    - amdgpu: update polaris11 fw for 18.50 release
    - amdgpu: update vega12 fw for 18.50 release
    - Mellanox: Add new mlxsw_spectrum firmware 13.1910.622
    - brcm: provide new firmwares for BCM4366 chipset
    - iwlwifi: update -41.ucode for 9000 series
    - iwlwifi: add -43.ucode for 9000 series
    - rtl_bt: Add firmware and configuration files for the Bluetooth part of RTL8723BS
    - iwlwifi: update firmwares for 9000 series
    - amdgpu: add picasso fw for 18.50 release
    - amdgpu: add raven2 fw for 18.50 release
    - amdgpu: add firmware for vega20 from 18.50
    - amdgpu: update raven2 rlc firmware
    - drm/amdgpu: update vega20 to latest from 18.50 branch
    - drm/amdgpu: update polaris12 to latest from 18.50 branch
    - drm/amdgpu: update picasso to latest from 18.50 branch

linux-firmware (1.173.4) bionic; urgency=medium

  * iwlwifi Intel 8265 firmware crashing on lenovo x1 Gen 6 (LP: #1808389)
    - WHENCE: Fix typo Version
    - iwlwifi: update firmwares for 7000, 8000 and 9000 series
    - iwlwifi: update firmwares for 8000 series

  * OS booting thrown with nouveau errors (LP: #1794055)
    - nvidia: add GV100 signed firmware

 -- Seth Forshee <email address hidden> Thu, 21 Mar 2019 15:17:46 -0500

Changed in linux-firmware (Ubuntu Bionic):
status: Fix Committed → Fix Released
Changed in ubuntu-power-systems:
status: Fix Committed → Fix Released