some nvidia p1000 graphic cards hang during the boot

Bug #1791569 reported by Hui Wang
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
HWE Next
Undecided
Unassigned
linux (Ubuntu)
Critical
Hui Wang
Bionic
Undecided
Unassigned
linux-oem (Ubuntu)
Undecided
Unassigned
Bionic
Undecided
Unassigned

Bug Description

BugLink: https://bugs.launchpad.net/bugs/1791569

This patch is in the 4.18 already, no need to send it to cosmic.

Due to the context conflict, if we want to apply this patch as it is, we
need to apply a large amount of patches ahead of this patch, it is possible
to introduce some regression. So I made some change in this patch, it is
some differnt from the orignal patch, but they have the same logic, and it
can be applied to bionic kernel.

[Impact]
We have 2 nvidia graphic cards, and the nouveau driver in the bionic kernel can't
work well with both of these 2 cards, one of the cards hang during the boot
process, we compared the output of lspci and vbios version of these 2 cards, they
are same; and according to nivida's reply, it is possible that they have some
difference on computational units (https://devtalk.nvidia.com/default/topic/1038973/
linux/2-same-quadro-p1000-cards-but-only-one-can-install-ubuntu-/), and kernel-4.18
fixed this problem, through bisect, this patch was found.

[Fix]
backport a upstream patch to fix this problem. without this patch, the number of tpc
(texture process cluster) is hardcoded to be 5 for some nv graphic families, but
in practice, the tpc number of many families is not 5. And the p1000 grphic card belong
to gp107 family, it is 3 intead of 5.

[Test Case]
tested this patch with P1000, P2000, P620 and P500 graphic cards, all work well as before.

[Regression Potential]
Very low, this patch comes from upstream, and I have tested it with many nv graphic
cards, they all worked well as before.

Hui Wang (hui.wang)
Changed in linux (Ubuntu):
importance: Undecided → Critical
tags: added: originate-from-1784286 sutton
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1791569

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Hui Wang (hui.wang)
description: updated
Stefan Bader (smb)
Changed in linux (Ubuntu Bionic):
status: New → Fix Committed
Changed in linux (Ubuntu):
status: Incomplete → Fix Released
Revision history for this message
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Hui Wang (hui.wang)
tags: added: verification-done-bionic
removed: verification-needed-bionic
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (20.0 KiB)

This bug was fixed in the package linux - 4.15.0-38.41

---------------
linux (4.15.0-38.41) bionic; urgency=medium

  * linux: 4.15.0-38.41 -proposed tracker (LP: #1797061)

  * Silent data corruption in Linux kernel 4.15 (LP: #1796542)
    - block: add a lower-level bio_add_page interface
    - block: bio_iov_iter_get_pages: fix size of last iovec
    - blkdev: __blkdev_direct_IO_simple: fix leak in error case
    - block: bio_iov_iter_get_pages: pin more pages for multi-segment IOs

linux (4.15.0-37.40) bionic; urgency=medium

  * linux: 4.15.0-37.40 -proposed tracker (LP: #1795564)

  * hns3: enable ethtool rx-vlan-filter on supported hw (LP: #1793394)
    - net: hns3: Add vlan filter setting by ethtool command -K

  * hns3: Modifying channel parameters will reset ring parameters back to
    defaults (LP: #1793404)
    - net: hns3: Fix desc num set to default when setting channel

  * hisi_sas: Add SATA FIX check for v3 hw (LP: #1794151)
    - scsi: hisi_sas: Add SATA FIS check for v3 hw

  * Fix potential corruption using SAS controller on HiSilicon arm64 boards
    (LP: #1794156)
    - scsi: hisi_sas: add memory barrier in task delivery function

  * hisi_sas: Reduce unnecessary spin lock contention (LP: #1794165)
    - scsi: hisi_sas: Tidy hisi_sas_task_prep()

  * Add functional level reset support for the SAS controller on HiSilicon D06
    systems (LP: #1794166)
    - scsi: hisi_sas: tidy host controller reset function a bit
    - scsi: hisi_sas: relocate some common code for v3 hw
    - scsi: hisi_sas: Implement handlers of PCIe FLR for v3 hw

  * HiSilicon SAS controller doesn't recover from PHY STP link timeout
    (LP: #1794172)
    - scsi: hisi_sas: tidy channel interrupt handler for v3 hw
    - scsi: hisi_sas: Fix the failure of recovering PHY from STP link timeout

  * getxattr: always handle namespaced attributes (LP: #1789746)
    - getxattr: use correct xattr length

  * Fix unusable NVIDIA GPU after S3 (LP: #1793338)
    - PCI: Reprogram bridge prefetch registers on resume

  * Fails to boot under Xen PV: BUG: unable to handle kernel paging request at
    edc21fd9 (LP: #1789118)
    - x86/EISA: Don't probe EISA bus for Xen PV guests

  * qeth: use vzalloc for QUERY OAT buffer (LP: #1793086)
    - s390/qeth: use vzalloc for QUERY OAT buffer

  * SRU: Enable middle button of touchpad on ThinkPad P72 (LP: #1793463)
    - Input: elantech - enable middle button of touchpad on ThinkPad P72

  * Dell new AIO requires a new uart backlight driver (LP: #1727235)
    - SAUCE: platform/x86: dell-uart-backlight: new backlight driver for DELL AIO
    - updateconfigs for Dell UART backlight driver

  * [Ubuntu] s390/crypto: Fix return code checking in cbc_paes_crypt.
    (LP: #1794294)
    - s390/crypto: Fix return code checking in cbc_paes_crypt()

  * hns3: Retrieve RoCE MSI-X config from firmware (LP: #1793221)
    - net: hns3: Fix MSIX allocation issue for VF
    - net: hns3: Refine the MSIX allocation for PF

  * net: hns: Avoid hang when link is changed while handling packets
    (LP: #1792209)
    - net: hns: add the code for cleaning pkt in chip
    - net: hns: add netif_carrier_off before change speed and duplex

  * Page leaki...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-oem - 4.15.0-1024.29

---------------
linux-oem (4.15.0-1024.29) bionic; urgency=medium

  * linux-oem: 4.15.0-1024.29 -proposed tracker (LP: #1797069)

  * Keyboard backlight sysfs sometimes is missing on Dell laptops (LP: #1797304)
    - platform/x86: dell-smbios: Correct some style warnings
    - platform/x86: dell-smbios: Rename dell-smbios source to dell-smbios-base
    - platform/x86: dell-smbios: Link all dell-smbios-* modules together
    - [Config] CONFIG_DELL_SMBIOS_SMM=y, CONFIG_DELL_SMBIOS_WMI=y
    - [Config] CONFIG_DELL_SMBIOS_SMM=y, CONFIG_DELL_SMBIOS_WMI=y

  [ Ubuntu: 4.15.0-38.41 ]

  * linux: 4.15.0-38.41 -proposed tracker (LP: #1797061)
  * Silent data corruption in Linux kernel 4.15 (LP: #1796542)
    - block: add a lower-level bio_add_page interface
    - block: bio_iov_iter_get_pages: fix size of last iovec
    - blkdev: __blkdev_direct_IO_simple: fix leak in error case
    - block: bio_iov_iter_get_pages: pin more pages for multi-segment IOs

 -- Chia-Lin Kao (AceLan) <email address hidden> Tue, 16 Oct 2018 10:32:03 +0800

Changed in linux-oem (Ubuntu Bionic):
status: New → Fix Released
Changed in linux-oem (Ubuntu):
status: New → Fix Released
Changed in hwe-next:
status: New → Fix Released
Brad Figg (brad-figg)
tags: added: cscc
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers