Random whole system lockups on Lenovo ThinkStation P350 Tiny

Bug #2054121 reported by Allen
This bug affects 2 people
Affects         Status  Importance  Assigned to  Milestone
linux (Ubuntu)  New     Undecided   Unassigned

Bug Description

Random lockups in Ubuntu.

ProblemType: Bug
DistroRelease: Ubuntu 23.10
Package: xorg 1:7.7+23ubuntu2
ProcVersionSignature: Ubuntu 6.5.0-17.17-generic 6.5.8
Uname: Linux 6.5.0-17-generic x86_64
NonfreeKernelModules: nvidia_modeset nvidia
.proc.driver.nvidia.capabilities.gpu0: Error: path was not a regular file.
.proc.driver.nvidia.capabilities.mig: Error: path was not a regular file.
.proc.driver.nvidia.gpus.0000.01.00.0: Error: path was not a regular file.
.proc.driver.nvidia.registry: Binary: ""
.proc.driver.nvidia.suspend: suspend hibernate resume
.proc.driver.nvidia.suspend_depth: default modeset uvm
.proc.driver.nvidia.version:
 NVRM version: NVIDIA UNIX x86_64 Kernel Module 535.154.05 Thu Dec 28 15:37:48 UTC 2023
 GCC version:
ApportVersion: 2.27.0-0ubuntu5
Architecture: amd64
BootLog: Error: [Errno 13] Permission denied: '/var/log/boot.log'
CasperMD5CheckResult: unknown
CompositorRunning: None
CurrentDesktop: ubuntu:GNOME
Date: Fri Feb 16 09:08:41 2024
DistUpgraded: Fresh install
DistroCodename: mantic
DistroVariant: ubuntu
ExtraDebuggingInterest: Yes, if not too technical
GpuHangFrequency: Several times a day
GpuHangReproducibility: Seems to happen randomly
GpuHangStarted: Immediately after installing this version of Ubuntu
GraphicsCard:
 NVIDIA Corporation TU117GLM [Quadro T1000 Mobile] [10de:1fb0] (rev a1) (prog-if 00 [VGA controller])
   Subsystem: Lenovo TU117GLM [Quadro T1000 Mobile] [17aa:12db]
InstallationDate: Installed on 2024-02-16 (0 days ago)
InstallationMedia: Ubuntu 23.10.1 "Mantic Minotaur" - Release amd64 (20231016.1)
MachineType: LENOVO 30EF004VUS
ProcEnviron:
 LANG=en_US.UTF-8
 PATH=(custom, no user)
 SHELL=/bin/bash
 TERM=xterm-256color
 XDG_RUNTIME_DIR=<set>
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-6.5.0-17-generic root=/dev/mapper/ubuntu--vg-ubuntu--lv ro quiet splash vt.handoff=7
SourcePackage: xorg
Symptom: display
Title: Xorg freeze
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 11/15/2023
dmi.bios.release: 1.60
dmi.bios.vendor: LENOVO
dmi.bios.version: M3JKT3CA
dmi.board.name: 32DD
dmi.board.vendor: LENOVO
dmi.board.version: SDK0J40697 WIN 3305435660291
dmi.chassis.type: 35
dmi.chassis.vendor: LENOVO
dmi.chassis.version: None
dmi.ec.firmware.release: 1.25
dmi.modalias: dmi:bvnLENOVO:bvrM3JKT3CA:bd11/15/2023:br1.60:efr1.25:svnLENOVO:pn30EF004VUS:pvrThinkStationP350Tiny:rvnLENOVO:rn32DD:rvrSDK0J40697WIN3305435660291:cvnLENOVO:ct35:cvrNone:skuLENOVO_MT_30EF_BU_Think_FM_ThinkStationP350Tiny:
dmi.product.family: ThinkStation P350 Tiny
dmi.product.name: 30EF004VUS
dmi.product.sku: LENOVO_MT_30EF_BU_Think_FM_ThinkStation P350 Tiny
dmi.product.version: ThinkStation P350 Tiny
dmi.sys.vendor: LENOVO
version.compiz: compiz N/A
version.libdrm2: libdrm2 2.4.115-1
version.libgl1-mesa-dri: libgl1-mesa-dri 23.2.1-1ubuntu3.1
version.libgl1-mesa-glx: libgl1-mesa-glx N/A
version.nvidia-graphics-drivers: nvidia-graphics-drivers-* N/A
version.xserver-xorg-core: xserver-xorg-core 2:21.1.7-3ubuntu2.7
version.xserver-xorg-input-evdev: xserver-xorg-input-evdev N/A
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:19.1.0-3
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.99.917+git20210115-1
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:1.0.17-2build1

Allen (ccwtech) wrote :
Allen (ccwtech) wrote :

Attached files

Allen (ccwtech) wrote :

Attached

affects: ubuntu → xorg (Ubuntu)
Daniel van Vugt (vanvugt) wrote :

Thanks for the bug report. I can't yet see any cause for the freezes. Next time it happens please:

1. Note the exact time and date of the freeze.

2. Reboot.

3. Run: journalctl -b-1 > prevboot.txt

4. Attach the resulting text file here along with a note telling us the exact time and date of the freeze.

5. Check /var/crash for crash files.
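
A minimal sketch of the collection steps above, assuming a systemd journal that is stored persistently so the previous boot is still available. The function names (save_prev_boot, show_window, list_crashes) are hypothetical helpers, not part of any Ubuntu tool; the --since and --until options narrow the output to the minutes around the noted freeze time:

```shell
#!/bin/sh
# Hypothetical helpers for gathering logs after a freeze and reboot.

# Save the whole journal of the previous boot to a file (default: prevboot.txt).
save_prev_boot() {
    journalctl -b -1 --no-pager > "${1:-prevboot.txt}"
}

# Show only the previous boot's entries between two timestamps.
# Both arguments use journalctl's "YYYY-MM-DD HH:MM:SS" format.
show_window() {
    journalctl -b -1 --no-pager -o short-precise --since "$1" --until "$2"
}

# List crash reports, newest first, or say that the directory is empty.
list_crashes() {
    dir=${1:-/var/crash}
    if [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
        ls -lt "$dir"
    else
        echo "no crash files in $dir"
    fi
}
```

After sourcing the file, running save_prev_boot followed by list_crashes covers steps 3 and 5; show_window takes the noted freeze time (with the real date filled in) as the end of the window.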

affects: xorg (Ubuntu) → ubuntu
Changed in ubuntu:
status: New → Incomplete
Allen (ccwtech) wrote :

Will do. It has stopped freezing since I switched to Cinnamon desktop.

Allen (ccwtech) wrote :

Prevboot

Daniel van Vugt (vanvugt) wrote :

What was the exact date and time of that freeze? Next time it happens please follow the instructions in comment #4.

Allen (ccwtech) wrote :

1 - 14:38:34
2 - Done
3 - Done
4 - Attached - See above
5 - /var/crash is empty

Daniel van Vugt (vanvugt) wrote :

OK thanks. It looks like whatever caused the problem didn't make it into the system log before the reboot. To fix that we add an extra step:

1. Note the exact time and date of the freeze.

2. Press 'Alt + PrtSc + S' a couple of times. Or just waiting 30 seconds might also work.

3. Reboot.

4. Run: journalctl -b-1 > prevboot.txt

5. Attach the resulting text file here along with a note telling us the exact time and date of the freeze.

6. Check /var/crash for crash files.
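
'Alt + PrtSc + S' is the magic SysRq "sync" request, and it only has an effect if the kernel.sysrq setting permits it. The following is a sketch of how that could be checked beforehand; the function names are hypothetical, and the assumption is that the setting is readable from /proc/sys/kernel/sysrq as on a stock kernel (Ubuntu installs often ship the value 176, which is worth verifying locally):

```shell
#!/bin/sh
# Bit 0x10 (16) of kernel.sysrq gates the SysRq "sync" command;
# the special value 1 enables every SysRq function, 0 disables them all.
sysrq_allows_sync() {
    value=$1
    [ "$value" -eq 1 ] || [ $(( value & 16 )) -ne 0 ]
}

# Report whether the running kernel would honour Alt+PrtSc+S.
report_sysrq() {
    current=$(cat /proc/sys/kernel/sysrq)
    if sysrq_allows_sync "$current"; then
        echo "kernel.sysrq=$current: sync is permitted"
    else
        echo "kernel.sysrq=$current: sync is NOT permitted"
    fi
}
```

Calling report_sysrq before the next freeze shows whether step 2 can work at all. Note that a kernel that is completely hung may not process the key combination even when it is permitted.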

Allen (ccwtech) wrote :

I can try that; however, every time it has locked up in the past, it has been 100% unresponsive. I have run extensive hardware diagnostics and nothing is wrong. I ran Windows just to test things and there were no lockups. So it's for sure something not 100% compatible with Linux.

Allen (ccwtech) wrote :

16:11:59
I waited over 30 seconds.
I pressed ALT + PrtSc + S several times.

Here is the log

Allen (ccwtech) wrote :

And another...

16:27:38
I waited over 30 seconds.
I pressed ALT + PrtSc + S several times.

Here is the log.

Still nothing in /var/crash

Daniel van Vugt (vanvugt) wrote :

Indeed those logs still don't explain any freeze.

The WD 4TB USB hard drive keeps showing some hardware errors but I also doubt that's the main problem here. Still, to find the cause of the freezes we should start removing things from the equation. Please try each of these separately:

 * Leaving the WD 4TB USB hard drive unplugged
 * Selecting 'Ubuntu on Wayland' on the login screen
 * Selecting a different Nvidia driver version in the 'Additional Drivers' app.

Allen (ccwtech) wrote :

Ok, I have unplugged the WD 4TB drive.

I am going to be having surgery tomorrow so you may not hear back from me for a couple of days. As soon as I can report back I will. Thanks for your help.

Allen (ccwtech) wrote :

Unplugging the 4 TB HD didn't stop the freezes.
Now running Ubuntu on Wayland.

Allen (ccwtech) wrote :

Ubuntu on Wayland still crashes. I have tried various different versions of the NVIDIA drivers before and it makes no difference. Here is another prevboot.txt file.

Lockup at 17:50:44

Allen (ccwtech) wrote :

17:45 or 17:46. I didn't get the exact time.

Allen (ccwtech) wrote :

Is any of this helpful? Is there anything else I can do? (Other than buying a new computer without an NVIDIA card!?!)

Allen (ccwtech) wrote :

11:51:23
Attached file

Allen (ccwtech) wrote :

17:24 or 17:25

Allen (ccwtech) wrote :

18:13:11

Allen (ccwtech) wrote :

In the Ubuntu IRC chat, two things were suggested:

 * Uninstall Brave Browser
 * Disable hardware acceleration in Firefox (the browser I am using)

I was using Chrome and had tried disabling hardware acceleration on it to troubleshoot. That didn't help which is why I switched to Firefox.

It never locks up when I am not using the PC, only when I am using it. In other words, I never come into the office after it has been on overnight and find it locked up.

Allen (ccwtech) wrote :

18:44:29
Sadly, issue persists.

Allen (ccwtech) wrote :

19:27

Allen (ccwtech) wrote :

19:31:51

Allen (ccwtech) wrote :

Looks like there is a new BIOS available for this PC. I updated to M3JKT3DA/1.0.0.61.

Allen (ccwtech) wrote :

BIOS Update didn't resolve anything.

New lockup at 21:13:20

Daniel van Vugt (vanvugt) wrote :

Thanks for your efforts. The cause of the freezes still seems to be invisible in the logs.

Next please use the Additional Drivers app to uninstall the Nvidia 535 driver and try installing 545 instead.

Allen (ccwtech) wrote :

4:49:41 with NVIDIA 545 Driver (open kernel)

Daniel van Vugt (vanvugt) wrote :

Does the mouse cursor also freeze when the bug occurs?

Please try deleting any local GNOME extensions:

  cd ~/.local/share/gnome-shell/
  rm -rf extensions

and then log in again. Does the bug still occur?
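
If the extensions should be recoverable afterwards, a gentler variant is to move the directory aside rather than delete it. This is only a sketch: stash_extensions is a hypothetical helper and the backup name is an arbitrary choice, but the effect on the next login should be the same as the rm above, since the extensions directory no longer exists:

```shell
#!/bin/sh
# Move the per-user GNOME Shell extensions directory out of the way.
# $1 optionally overrides the gnome-shell data directory.
stash_extensions() {
    base=${1:-"$HOME/.local/share/gnome-shell"}
    if [ -d "$base/extensions" ]; then
        # Timestamped name so repeated runs never overwrite an older backup.
        target="$base/extensions.disabled.$(date +%Y%m%d%H%M%S)"
        mv "$base/extensions" "$target"
        echo "moved to $target"
    else
        echo "nothing to move"
    fi
}
```

Renaming the backup directory back to extensions restores the previous state once the test is over.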

Allen (ccwtech) wrote :

The mouse freezes as well.

Currently, I am using the 545 proprietary driver. So far I have 18:48 of uptime. I am going to keep it running to see if it crashes before I delete the extensions. I'll report back.

Of course in the meantime there have also been some small updates as well.

Allen (ccwtech) wrote :

I spoke too soon. Here is the latest file. I will continue with your instructions now.

Allen (ccwtech) wrote :

8:06:14 was time of crash.

I ran the commands.

Allen (ccwtech) wrote :

11:51:05 is latest crash.

Daniel van Vugt (vanvugt) wrote :

The cause of the freeze is still invisible. I think the next step is to figure out if it's the kernel or userspace. Please:

  sudo apt install openssh-server

and then log in using 'ssh' from another machine. This should give us a better idea of whether the non-graphical parts of the system are still responsive during or after a freeze.
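
One way to make that check less dependent on timing is to poll the machine over ssh from the second computer and log each result with a timestamp, so the last "up" line brackets the moment of the freeze. This is a rough sketch with hypothetical function names; it assumes key-based ssh login is already set up, because BatchMode=yes disables password prompts:

```shell
#!/bin/sh
# Succeed if a non-interactive ssh login to the host completes.
probe_ssh() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Print one timestamped status line for a host.
# $2 optionally names a different probe command (handy for dry runs).
check_host() {
    host=$1
    probe=${2:-probe_ssh}
    if "$probe" "$host" >/dev/null 2>&1; then state=up; else state=DOWN; fi
    printf '%s %s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$host" "$state"
}

# Poll a host forever; $2 is the interval in seconds (default 10).
watch_host() {
    while :; do
        check_host "$1"
        sleep "${2:-10}"
    done
}
```

For example, watch_host thinkstation.local 10 | tee ssh-watch.log (the host name is a placeholder) leaves a log on the second machine that survives the lockup. If the lines switch from up to DOWN at the same moment the display freezes, the whole system is affected and not only the graphical session.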

Allen (ccwtech) wrote :

Will do.

Allen (ccwtech) wrote :

Latest lockup - 14:22:51
Non responsive k/b & mouse
Can't SSH in after the computer locked up.

It's 100% locked up.

Daniel van Vugt (vanvugt) wrote :

Thanks. Next I would resort to drastic measures: Uninstall the Nvidia driver completely. Then wait and see if that's eliminated the freezes.

affects: ubuntu → linux (Ubuntu)
Changed in linux (Ubuntu):
status: Incomplete → New
tags: added: nvidia
Changed in linux (Ubuntu):
status: New → Incomplete
Changed in nvidia-graphics-drivers-545 (Ubuntu):
status: New → Incomplete
Allen (ccwtech) wrote :

Use the X.Org X server Nouveau display driver?

Daniel van Vugt (vanvugt) wrote :

You will actually get Wayland by default if the Nvidia driver is uninstalled. So try that as well as Xorg (selectable on the login screen).

Allen (ccwtech) wrote :

I am running the x.org X server Nouveau display driver now. I will advise.

Allen (ccwtech) wrote :

Here is the latest crash file.
Unresponsive - even trying ssh won't work

Allen (ccwtech) wrote :

And another. Crash at 8:51:28

Daniel van Vugt (vanvugt) wrote :

Thanks. Please also check that ssh did work before the freeze :)

Changed in nvidia-graphics-drivers-545 (Ubuntu):
status: Incomplete → Invalid
no longer affects: nvidia-graphics-drivers-545 (Ubuntu)
Changed in linux (Ubuntu):
status: Incomplete → New
Allen (ccwtech) wrote :

Ssh works fine until it freezes.

Allen (ccwtech) wrote :

Should I continue to report this? Is there anything else?

Daniel van Vugt (vanvugt) wrote :

I'm running out of ideas. If it was my machine I would try booting a different kernel:

  https://kernel.ubuntu.com/mainline/?C=M;O=D

And if that didn't work then I would even try changing the RAM in the machine.

Allen (ccwtech) wrote :

Newest or a specific one?

It never locks up if I run Windows, and it passes all hardware testing (repeated overnight testing, several passes).

Daniel van Vugt (vanvugt) wrote :

Please try the newest one first:

  https://kernel.ubuntu.com/mainline/v6.8-rc7/amd64/

And if the freezes still happen in kernel 6.8 then try some older versions.

Daniel van Vugt (vanvugt) wrote :

Also, a warning: in order to test such unsigned kernels I think you need to disable Secure Boot in the BIOS.

Allen (ccwtech) wrote :

Ok, running 6.8.

uname -r
6.8.0-060800rc7-generic

Allen (ccwtech) wrote :

Still lockups- No ssh
18:08:04 attached

Daniel van Vugt (vanvugt) wrote :

Please try an older v5.x kernel

Allen (ccwtech) wrote :

 uname -r
5.19.17-051917-generic

Allen (ccwtech) wrote :

11:02:30 - Locked up, but instead of staying locked up the desktop crashed and took me to the login screen.

Allen (ccwtech) wrote :

13:05:38 - Full lockup, no ssh available.

Daniel van Vugt (vanvugt) wrote :

Maybe try an even older kernel next, and/or disabling the Nvidia GPU in the BIOS if there's an option to do so.

Allen (ccwtech) wrote :

What kernel do you suggest?
Disabling NVIDIA isn't an option. I need all 4 monitors.

Daniel van Vugt (vanvugt) wrote :

Perhaps we're approaching it wrong. Try going back to the Ubuntu kernel (6.5) and enabling debug messages with these kernel parameters:

  drm.debug=0xff loglevel=8

then collect another prevboot.txt after the freeze occurs.

Allen (ccwtech) wrote :

Came in this morning and it was locked up. NO SSH.

Going to move back to 6.5

Allen (ccwtech) wrote :

I'm not sure how to do this:

"drm.debug=0xff loglevel=8"

Daniel van Vugt (vanvugt) wrote :

1. Edit /etc/default/grub and add them to GRUB_CMDLINE_LINUX_DEFAULT=...

2. Run: sudo update-grub

3. Reboot.
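
To illustrate step 1, here is a sketch that previews the edited file without writing anything back. preview_grub_cmdline is a hypothetical helper, and it assumes the file has a single double-quoted GRUB_CMDLINE_LINUX_DEFAULT line, as a default Ubuntu install normally does:

```shell
#!/bin/sh
# Print FILE with EXTRA appended inside the quoted
# GRUB_CMDLINE_LINUX_DEFAULT value. The file itself is left untouched.
# $1 = path to the grub defaults file, $2 = parameters to append.
preview_grub_cmdline() {
    file=$1
    extra=$2
    sed -E "s|^(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*)\"|\1 ${extra}\"|" "$file"
}
```

Running preview_grub_cmdline /etc/default/grub "drm.debug=0xff loglevel=8" shows what the line should look like after the manual edit. Once steps 2 and 3 are done, cat /proc/cmdline confirms whether the running kernel actually received the parameters.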

Changed in linux (Ubuntu):
status: New → Incomplete
Allen (ccwtech) wrote :

Trying to upload the latest file. It's 1.3 GB, and Launchpad responds with this error:

Uh oh!
Something has gone wrong. We're sorry!

If we are in the middle of an update, Launchpad will be back in a couple of minutes. Otherwise, we are working to fix the unexpected problems. Check @launchpadstatus on Twitter or @<email address hidden> on Mastodon for updates.

If the problem persists, let us know in the #launchpad IRC channel on libera.chat.

Technically, the load balancer took too long to connect to an application server.

Reload this page or try again in a few minutes

No one on Launchpad chat responds to me.

Allen (ccwtech) wrote :

9:14:36 - No SSH possible. Attached.

Daniel van Vugt (vanvugt) wrote :

Thanks. It seems I asked for too much of the wrong information there. Please:

1. Remove drm.debug=0xff but keep loglevel=8

2. Reinstall the Nvidia driver using the 'Additional Drivers' app.

3. Reboot

Allen (ccwtech) wrote :

Got it. Testing.

Allen (ccwtech) wrote :

17:15:47 - SSH not available

Changed in linux (Ubuntu):
status: Incomplete → New
Allen (ccwtech) wrote :

I see the status change of the bug. Did the logs finally show anything?

Daniel van Vugt (vanvugt) wrote :

That's because I ran out of ideas, sorry.

summary: - Random Freezing
+ Random whole system lockups
summary: - Random whole system lockups
+ Random whole system lockups on Lenovo ThinkStation P350 Tiny
Allen (ccwtech) wrote :

Should I still submit logs?

I think I'll just buy a new PC w/o NVIDIA and re-purpose this one...

Here is the latest

Lockup 07:02:44 - SSH not available

Daniel van Vugt (vanvugt) wrote :

I would not give up on NVIDIA just yet. No one else with NVIDIA hardware is reporting an issue like this so it's just as likely a problem with some other kernel driver or system component.

That said, unless you want to experiment with swapping components (like the NVIDIA card or RAM) then changing to a new PC would be the fastest solution.
