nvidia-dkms-* FTBS with linux 6.5

Bug #2028165 reported by Paolo Pisati
This bug affects 76 people
Affects                                      Status        Importance  Assigned to     Milestone
nvidia-graphics-drivers-390 (Ubuntu)         Fix Released  Undecided   Alberto Milone
  Jammy                                      Confirmed     Undecided   Unassigned
  Mantic                                     Fix Released  Undecided   Alberto Milone
nvidia-graphics-drivers-450-server (Ubuntu)  Fix Released  Undecided   Unassigned
  Jammy                                      Invalid       Undecided   Unassigned
  Mantic                                     Fix Released  Undecided   Unassigned
nvidia-graphics-drivers-470 (Ubuntu)         Fix Released  Undecided   Alberto Milone
  Jammy                                      Invalid       Undecided   Unassigned
  Mantic                                     Fix Released  Undecided   Alberto Milone
nvidia-graphics-drivers-470-server (Ubuntu)  Fix Released  Undecided   Alberto Milone
  Jammy                                      Invalid       Undecided   Unassigned
  Mantic                                     Fix Released  Undecided   Alberto Milone
nvidia-graphics-drivers-525 (Ubuntu)         Fix Released  Undecided   Alberto Milone
  Jammy                                      Invalid       Undecided   Unassigned
  Mantic                                     Fix Released  Undecided   Alberto Milone
nvidia-graphics-drivers-525-server (Ubuntu)  Fix Released  Undecided   Unassigned
  Jammy                                      Invalid       Undecided   Unassigned
  Mantic                                     Fix Released  Undecided   Unassigned

Bug Description

[Impact]

...
In file included from /var/lib/dkms/nvidia/390.157/build/common/inc/nv-linux.h:21,
                 from /var/lib/dkms/nvidia/390.157/build/nvidia/nv-instance.c:13:
/var/lib/dkms/nvidia/390.157/build/common/inc/nv-mm.h: In function ‘NV_GET_USER_PAGES_REMOTE’:
/var/lib/dkms/nvidia/390.157/build/common/inc/nv-mm.h:164:45: error: passing argument 1 of ‘get_user_pages_remote’ from incompatible pointer type [-Werror=incompatible-pointer-types]
  164 | return get_user_pages_remote(tsk, mm, start, nr_pages, flags,
      | ^~~
      | |
      | struct task_struct *
...

[Fix]

Apply the attached fix.

[How to test]

Install (and build) the patched package.

[Regression potential]

The fix is composed of two patches:

1) The first patch simply garbage-collects a reference to a function that was never used but whose API changed in Linux 6.5, so it is a trivial change.

2) The second patch reimplements part of the vma scanning that was removed from __get_user_pages_locked() in upstream commit b2cac248191b7466c5819e0da617b0705a26e197 "mm/gup: remove vmas array from internal GUP functions". This is where any regression would most likely be found.
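The idea behind the second patch can be sketched as follows. This is a userspace mock, not the attached patch: every name in it (mock_vma, mock_mm, mock_vma_lookup, mock_gup_no_vmas, compat_gup_with_vmas) is hypothetical and only illustrates the shape of the change. It assumes a GUP-style call that no longer fills in a vmas array, and shows a wrapper that rebuilds that array by looking up the area for each pinned page. In kernel code the lookup would presumably use a helper such as find_vma() with the mmap lock held, but that is an assumption about the real patch.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_PAGE_SIZE 4096UL

/* Hypothetical stand-in for the kernel's struct vm_area_struct. */
struct mock_vma {
    unsigned long vm_start; /* first address covered (inclusive) */
    unsigned long vm_end;   /* first address past the area (exclusive) */
};

/* Hypothetical stand-in for an address space: a plain array of areas. */
struct mock_mm {
    const struct mock_vma *vmas;
    size_t nr_vmas;
};

/* Return the area containing addr, or NULL if addr is unmapped. */
static const struct mock_vma *mock_vma_lookup(const struct mock_mm *mm,
                                              unsigned long addr)
{
    for (size_t i = 0; i < mm->nr_vmas; i++) {
        if (addr >= mm->vmas[i].vm_start && addr < mm->vmas[i].vm_end)
            return &mm->vmas[i];
    }
    return NULL;
}

/*
 * Mock of a 6.5-style call: there is no vmas out-parameter any more.
 * It "pins" consecutive pages from start until it hits an unmapped one
 * and returns how many pages it pinned.
 */
static long mock_gup_no_vmas(const struct mock_mm *mm, unsigned long start,
                             unsigned long nr_pages)
{
    long pinned = 0;

    for (unsigned long i = 0; i < nr_pages; i++) {
        if (mock_vma_lookup(mm, start + i * MOCK_PAGE_SIZE) == NULL)
            break;
        pinned++;
    }
    return pinned;
}

/*
 * Compatibility wrapper: call the new-style function, then rebuild the
 * per-page vmas array that the old API used to return. Entries past the
 * number of pinned pages are left untouched.
 */
static long compat_gup_with_vmas(const struct mock_mm *mm, unsigned long start,
                                 unsigned long nr_pages,
                                 const struct mock_vma **vmas)
{
    long pinned = mock_gup_no_vmas(mm, start, nr_pages);

    if (vmas == NULL)
        return pinned;

    for (long i = 0; i < pinned; i++) {
        vmas[i] = mock_vma_lookup(mm, start + (unsigned long)i * MOCK_PAGE_SIZE);
        assert(vmas[i] != NULL); /* a pinned page must belong to some area */
    }
    return pinned;
}
```

The extra lookup pass is the part that did not exist before the API change, which is consistent with the description's view that regressions are most likely to show up there.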

Paolo Pisati (p-pisati)
description: updated
Paolo Pisati (p-pisati) wrote :
tags: added: patch
Paolo Pisati (p-pisati) wrote :
Changed in nvidia-graphics-drivers-390 (Ubuntu Mantic):
assignee: nobody → Alberto Milone (albertomilone)
Changed in nvidia-graphics-drivers-470 (Ubuntu Mantic):
assignee: nobody → Alberto Milone (albertomilone)
Changed in nvidia-graphics-drivers-470-server (Ubuntu Mantic):
assignee: nobody → Alberto Milone (albertomilone)
Changed in nvidia-graphics-drivers-525 (Ubuntu Mantic):
assignee: nobody → Alberto Milone (albertomilone)
Changed in nvidia-graphics-drivers-390 (Ubuntu Mantic):
status: New → In Progress
Changed in nvidia-graphics-drivers-470 (Ubuntu Mantic):
status: New → In Progress
Changed in nvidia-graphics-drivers-470-server (Ubuntu Mantic):
status: New → In Progress
Changed in nvidia-graphics-drivers-525 (Ubuntu Mantic):
status: New → In Progress
Paolo Pisati (p-pisati) wrote :

New patch for the 470 driver

Paolo Pisati (p-pisati) wrote :
Paolo Pisati (p-pisati) wrote :

525-server fix for Linux 6.5

Paolo Pisati (p-pisati) wrote :

450-server Linux 6.5 fix.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nvidia-graphics-drivers-450-server (Ubuntu):
status: New → Confirmed
Changed in nvidia-graphics-drivers-525-server (Ubuntu):
status: New → Confirmed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nvidia-graphics-drivers-525-server - 525.125.06-0ubuntu2

---------------
nvidia-graphics-drivers-525-server (525.125.06-0ubuntu2) mantic; urgency=medium

  * debian/dkms_nvidia/patches/buildfix_kernel_6.5-fix-get_user_pages-get_user_pages_remote.patch,
    debian/dkms_nvidia/patches/buildfix_kernel_6.5-fix-pin_user_pages.patch,
    debian/dkms_nvidia/patches/buildfix_kernel_6.5-fix-pin_user_pages_remote.patch:
    - Fix build with Linux 6.5 (LP: #2028165)

 -- Paolo Pisati <email address hidden> Wed, 26 Jul 2023 09:50:27 +0000

Changed in nvidia-graphics-drivers-525-server (Ubuntu Mantic):
status: Confirmed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nvidia-graphics-drivers-470 - 470.199.02-0ubuntu2

---------------
nvidia-graphics-drivers-470 (470.199.02-0ubuntu2) mantic; urgency=medium

  * debian/dkms_nvidia/patches/buildfix_kernel_6.5-fix-get_user_pages-get_user_pages_remote.patch,
    debian/dkms_nvidia/patches/buildfix_kernel_6.5-fix-pin_user_pages.patch,
    debian/dkms_nvidia/patches/buildfix_kernel_6.5-fix-pin_user_pages_remote.patch:
    - Fix build with Linux 6.5 (LP: #2028165)

 -- Paolo Pisati <email address hidden> Fri, 21 Jul 2023 10:56:36 +0000

Changed in nvidia-graphics-drivers-470 (Ubuntu Mantic):
status: In Progress → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nvidia-graphics-drivers-470-server - 470.199.02-0ubuntu2

---------------
nvidia-graphics-drivers-470-server (470.199.02-0ubuntu2) mantic; urgency=medium

  * debian/dkms_nvidia/patches/buildfix_kernel_6.5-fix-get_user_pages-get_user_pages_remote.patch,
    debian/dkms_nvidia/patches/buildfix_kernel_6.5-fix-pin_user_pages.patch,
    debian/dkms_nvidia/patches/buildfix_kernel_6.5-fix-pin_user_pages_remote.patch:
    - Fix build with Linux 6.5 (LP: #2028165)

 -- Paolo Pisati <email address hidden> Tue, 25 Jul 2023 15:23:06 +0000

Changed in nvidia-graphics-drivers-470-server (Ubuntu Mantic):
status: In Progress → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nvidia-graphics-drivers-450-server - 450.248.02-0ubuntu2

---------------
nvidia-graphics-drivers-450-server (450.248.02-0ubuntu2) mantic; urgency=medium

  * debian/dkms_nvidia/patches/buildfix_kernel_6.5-fix-get_user_pages-get_user_pages_remote.patch,
    debian/dkms_nvidia/patches/buildfix_kernel_6.5-fix-pin_user_pages.patch,
    debian/dkms_nvidia/patches/buildfix_kernel_6.5-fix-pin_user_pages_remote.patch:
    - Fix build with Linux 6.5 (LP: #2028165)

 -- Paolo Pisati <email address hidden> Wed, 26 Jul 2023 15:21:26 +0000

Changed in nvidia-graphics-drivers-450-server (Ubuntu Mantic):
status: Confirmed → Fix Released
Paolo Pisati (p-pisati) wrote :
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nvidia-graphics-drivers-525 - 525.125.06-0ubuntu4

---------------
nvidia-graphics-drivers-525 (525.125.06-0ubuntu4) mantic; urgency=medium

  * debian/open-kernel/patches/buildfix_kernel_6.5-fix-get_user_pages-get_user_pages_remote.patch,
    debian/open-kernel/patches/buildfix_kernel_6.5-fix-pin_user_pages.patch,
    debian/open-kernel/patches/buildfix_kernel_6.5-fix-pin_user_pages_remote.patch:
    - Fix build with Linux 6.5 for the -open variant (LP: #2028165)

 -- Paolo Pisati <email address hidden> Tue, 29 Aug 2023 08:57:20 +0000

Changed in nvidia-graphics-drivers-525 (Ubuntu Mantic):
status: In Progress → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nvidia-graphics-drivers-390 - 390.157-0ubuntu8

---------------
nvidia-graphics-drivers-390 (390.157-0ubuntu8) mantic; urgency=medium

  * debian/dkms_nvidia.conf,
    debian/templates/dkms_nvidia.conf.in,
    debian/dkms_nvidia/patches/buildfix_kernel_6.5-garbage-collect-all-references-to-get_user.patch,
    debian/dkms_nvidia/patches/buildfix_kernel_6.5-handle-get_user_pages-vmas-argument-remova.patch:
    - Support linux 6.5 ABI (LP: #2028165).

 -- Paolo Pisati <email address hidden> Wed, 19 Jul 2023 13:06:56 +0000

Changed in nvidia-graphics-drivers-390 (Ubuntu Mantic):
status: In Progress → Fix Released
Daniel van Vugt (vanvugt) wrote :

I've opened a jammy task for nvidia 390. We're getting a bunch of bug reports from jammy users complaining their Nvidia driver doesn't work anymore since they got automatically upgraded to kernel 6.5.

Changed in nvidia-graphics-drivers-390 (Ubuntu Jammy):
status: New → Confirmed
Changed in nvidia-graphics-drivers-450-server (Ubuntu Jammy):
status: New → Invalid
Changed in nvidia-graphics-drivers-470 (Ubuntu Jammy):
status: New → Invalid
Changed in nvidia-graphics-drivers-470-server (Ubuntu Jammy):
status: New → Invalid
Changed in nvidia-graphics-drivers-525 (Ubuntu Jammy):
status: New → Invalid
Changed in nvidia-graphics-drivers-525-server (Ubuntu Jammy):
status: New → Invalid
Daniel van Vugt (vanvugt) wrote :

This is a regression for jammy in the update of the linux-generic-hwe-22.04 package to version 6.5. So I guess that's more regression-update than regression-release. It won't be a regression-release unless the issue is still unresolved in 22.04.4.

tags: added: regression-update
Sunit (sunit-parab) wrote :

So are we saying this would be 'fixed' in 22.04.4? And when is that expected? I'm asking because this has left me either using the nouveau driver (which is laggy) or getting a new graphics card (which I'm unsure would even work).

Daniel Letzeisen (dtl131) wrote :

@Sunit: use the PPA until (if?) Ubuntu fixes this: https://launchpad.net/~dtl131/+archive/ubuntu/nvidiaexp

Daniel van Vugt (vanvugt) wrote :

If you need nvidia-390 on 22.04 then the best workaround I can suggest is to boot an older kernel like 6.2 instead.

Daniel Letzeisen (dtl131) wrote :
Daniel van Vugt (vanvugt) wrote :

It's only a workaround. Sometimes workarounds force you to use older software.

Sunit (sunit-parab) wrote :

Thanks Daniel Letzeisen, that worked.

tags: added: regression-release
Steve Langasek (vorlon)
tags: removed: regression-release
Daniel Letzeisen (dtl131) wrote :

@Steve Langasek, I added the regression-release tag based on vanvugt's comment that this would be such a bug if not fixed by the 22.04.4 release. Please explain why you removed it.

Brian Murray (brian-murray) wrote :

Probably because regression-update is more appropriate as this was caused by a package update in a stable release of Ubuntu.

https://wiki.ubuntu.com/Bugs/Tags

Dmitry Lapshin (lapshin-dv) wrote :

Why is nvidia-graphics-drivers-390 for jammy unassigned and only Confirmed? @p-pisati was the last uploader of the package for jammy, @albertomilone handled the mantic one, and the fix from the PPA seems to work fine.

Daniel Letzeisen (dtl131) wrote :

@Dmitry: Ubuntu devs made a conscious decision to drop the 390 driver and some other branches: https://bugs.launchpad.net/ubuntu/+source/nvidia-graphics-drivers-390/+bug/2035189
So if they haven't fixed the Jammy version by now, it's probably not going to happen.

Good news though: I patched the package in my PPA for kernel 6.8 and added a version for Ubuntu 24.04.
You should be able to use the versions in the PPA until the year 2029.

I will try to keep up with future kernel versions, but no guarantees. I'm very good at getting distracted by life and/or procrastinating.

Test (3560holy) wrote :

@dtl131 Why did the Ubuntu devs make this decision? Does this mean that the driver functionality will not work correctly and Ubuntu users will not be able to use Ubuntu?

Daniel van Vugt (vanvugt) wrote :

I don't think the "simpledrm" reason listed in 2035189 holds up. Unless you're in a virtual machine, SimpleDRM is only used for the few seconds of initrd, or for the duration of the disk unlock prompt. Even if old Nvidia drivers can't support SimpleDRM, they can support legacy framebuffers which we are continuing to ship in initrd. So old Nvidia drivers should remain usable, even if low res, during boot.

Also the unresolved Jammy task at the top of this page would not still say Confirmed if it was decided to never be fixed.

Daniel Letzeisen (dtl131) wrote :

@vanvugt: I appreciate you opening the Jammy task, but I'm not holding my breath for this to get fixed after reading https://bugs.launchpad.net/ubuntu/+source/nvidia-graphics-drivers-390/+bug/2035189/comments/1
Basically, they knew this bug was coming when they removed the 390 package, but they seem to be okay with that as long as people can use 5.15.

The 6.5 kernel has been out for months and 6.8 is coming down the pipe soon. The fix is simple and they already had the appropriate patches applied for mantic. So maybe this wasn't marked "Won't Fix" yet, but to me, the silence speaks volumes.

Daniel van Vugt (vanvugt) wrote :

Everyone is very busy with the 24.04 release as well as other responsibilities. So don't take silence to imply anything :)

Daniel Letzeisen (dtl131) wrote :

Sorry, but I don't believe that for a second. The fix is not difficult or time-consuming, and they already had a patched version.

"Eventually 6.5+ based jammy kernels will not have it in lrm either. And we will only keep it against older kernels in bionic..jammy (ga only)"
Unless they reverse ^that decision, it means Won't Fix. I'll believe otherwise when I see it.
