[Dell G5 5590] lspci freezes computer on Ubuntu 20.04

Bug #1917194 reported by Ariel Torti
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Fix Committed
Importance: Undecided
Assigned to: Unassigned

Bug Description

After upgrading from Ubuntu 18.04 to 20.04 my computer started to freeze when running lspci.

I ended up noticing that it happened when trying to query device '1f:00.0', a USB port. After some kernel and BIOS updates the problem got worse. Now the computer freezes ~5 seconds after booting up.

I tried:
1) Running in recovery mode, also freezes
2) Running an older kernel version (5.4.0-66-generic), also freezes
3) Removing nvidia drivers (just in case), also freezes
4) Running all combinations of nomodeset, acpi=off and nouveau blacklisting
5) Updating all drivers

The computer freezes forever: I can't switch TTYs, move the mouse, or do anything else, and the only way out is a forced shutdown. In the most recent cases, after switching to a TTY, I could see these messages before the freeze occurred:

kernel: xhci_hcd 0000:1f:00.0: PCI post-resume error -19!
kernel: xhci_hcd 0000:1f:00.0: HC died; cleaning up

Running Ubuntu 20.04 from a USB stick works fine, as does a fresh install.
All logs were obtained by chrooting to the faulty partition from a fresh install, if there are any more logs that would be helpful, let me know.
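
For anyone triaging a similar freeze from saved logs, here is a minimal sketch that pulls the PCI address of any xHCI controller reported as dead out of kernel log text. The function name `dead_xhci_controllers` is hypothetical, and the pattern is only assumed to match messages shaped like the two lines quoted above.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print the unique PCI
# addresses of xHCI controllers for which the kernel logged "HC died".
dead_xhci_controllers() {
    # Capture the domain:bus:device.function address that follows "xhci_hcd ".
    sed -n 's/.*xhci_hcd \([0-9a-f]\{4\}:[0-9a-f]\{2\}:[0-9a-f]\{2\}\.[0-7]\): HC died.*/\1/p' | sort -u
}
```

From a chroot or a live session it could be fed a saved log, for example `dead_xhci_controllers < saved-dmesg.txt`, where `saved-dmesg.txt` is a placeholder file name.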

Tags: focal
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1917194

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: focal
Ariel Torti (atorti)
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Guilherme G. Piccoli (gpiccoli) wrote :

Hi Ariel, thanks for your report! So, let me clarify: now your system is freezing on boot, you don't need to run "lspci" to observe the freeze anymore?
Can you try adding "ignore_loglevel" to the kernel command line in GRUB, and watch the boot progress? If so, then please take a screenshot/photo when it freezes, so we can see the last messages observed.

If the machine boots fine from the USB stick, can you boot it that way and try lspci from the live Ubuntu image? If lspci is a reproducer, it is worth collecting a dmesg while the problem happens, so we can investigate whether any message pops up during the event.
Thanks,

Guilherme
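
One way to make "ignore_loglevel" persistent, rather than typing it at the GRUB menu on each boot, is sketched below. It assumes the usual Ubuntu layout where `/etc/default/grub` holds a double-quoted `GRUB_CMDLINE_LINUX_DEFAULT="..."` line and that GNU sed is available; the function name `add_kernel_param` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT
# in the given file unless it is already there. The parameter must not contain
# "/" because it is spliced into a sed expression.
add_kernel_param() {
    file=$1
    param=$2
    # Do nothing if the parameter already appears on the cmdline line.
    if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | grep -qwF -- "$param"; then
        return 0
    fi
    # Insert the parameter just before the closing double quote.
    sed -i "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 ${param}\"/" "$file"
}
```

It is safest to try it on a copy first, for example `cp /etc/default/grub /tmp/grub.test && add_kernel_param /tmp/grub.test ignore_loglevel`. After changing the real file as root, `sudo update-grub` regenerates the boot menu.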

Ariel Torti (atorti) wrote :

Hey !

I'm able to boot and get into a tty. But after a few seconds (even if I don't do anything) it freezes.

Machine boots fine with the USB stick and running lspci doesn't trigger a freeze.

Attached is the dmesg log, but it seems to be truncated. I managed to log into a tty and run `dmesg -w` before it froze. I'll attach the images below.

Ariel Torti (atorti) wrote :

You can see the image contains a few more logs that couldn't be written to the disk

Ariel Torti (atorti) wrote :

This is what happens if I go into the tty and don't do anything. All screenshots and logs were obtained running with `ignore_loglevel` kernel param

Guilherme G. Piccoli (gpiccoli) wrote :

Hi Ariel, thanks for the pictures! This is interesting: it seems you're facing a USB host controller breakage, so all your USB devices are frozen. Can you try something so we can be sure about that? Please set up an SSH server [0], and after the freeze, try to connect to the machine using SSH. I'm wondering if the system is working "fine" and only USB is broken. That will help us debug.

Oh, and in case you don't have multiple computers, you could use an Android/iOS SSH client to access the computer, just for testing purposes.

Thanks,

Guilherme

[0] https://help.ubuntu.com/community/SSH/OpenSSH/Configuring

Ariel Torti (atorti) wrote :

Seems like everything gets locked, including the kernel.
I tried connecting through SSH but it isn't answering.

debug1: Connecting to ariel-nt [ariel-nt] port 22.
debug1: connect to address ariel-nt port 22: No route to host
ssh: connect to host ariel-nt port 22: No route to host

I also tried pinging it; when it freezes, it stops answering pings too.

# Before booting
From _ icmp_seq=30 Destination Host Unreachable
From _ icmp_seq=31 Destination Host Unreachable
From _ icmp_seq=32 Destination Host Unreachable
From _ icmp_seq=33 Destination Host Unreachable
From _ icmp_seq=34 Destination Host Unreachable
From _ icmp_seq=35 Destination Host Unreachable
From _ icmp_seq=36 Destination Host Unreachable
From _ icmp_seq=37 Destination Host Unreachable
From _ icmp_seq=38 Destination Host Unreachable
From _ icmp_seq=39 Destination Host Unreachable
# Connected to the internet
64 bytes from ariel-nt: icmp_seq=40 ttl=64 time=2512 ms
64 bytes from ariel-nt: icmp_seq=41 ttl=64 time=1490 ms
64 bytes from ariel-nt: icmp_seq=42 ttl=64 time=467 ms
64 bytes from ariel-nt: icmp_seq=43 ttl=64 time=33.6 ms
64 bytes from ariel-nt: icmp_seq=44 ttl=64 time=39.4 ms
64 bytes from ariel-nt: icmp_seq=45 ttl=64 time=6.22 ms
64 bytes from ariel-nt: icmp_seq=46 ttl=64 time=7.62 ms
64 bytes from ariel-nt: icmp_seq=47 ttl=64 time=6.84 ms
64 bytes from ariel-nt: icmp_seq=48 ttl=64 time=3.22 ms
# Froze
From _ icmp_seq=97 Destination Host Unreachable
From _ icmp_seq=98 Destination Host Unreachable
From _ icmp_seq=99 Destination Host Unreachable
From _ icmp_seq=100 Destination Host Unreachable
From _ icmp_seq=101 Destination Host Unreachable
From _ icmp_seq=102 Destination Host Unreachable
From _ icmp_seq=103 Destination Host Unreachable
From _ icmp_seq=107 Destination Host Unreachable
From _ icmp_seq=108 Destination Host Unreachable

Guilherme G. Piccoli (gpiccoli) wrote :

So, when the system is working fine, before the freeze, can you SSH into it?

Also, what is the kernel version running in the Live image you're booting there? This information shows early in the kernel logs...but you can also collect using the "uname -a" command. I'd like to compare the version of the booting USB stick versus the regular version showing issues.
Thanks in advance,

Guilherme

Ariel Torti (atorti) wrote :

> So, when the system is working fine, before the freeze, can you SSH into it?
Yes, but I get a timeout after the freeze.

The USB stick is running Linux 5.8.0-43-generic

Guilherme G. Piccoli (gpiccoli) wrote :

OK, so how about installing kernel version 5.8.0-43 in your system? In order to do that, I suggest you boot using the Live CD and use chroot. There are good instructions here: https://askubuntu.com/a/28100

The only thing to change is step 6: please run "sudo apt-get install linux-image-5.8.0-43-generic linux-modules-extra-5.8.0-43-generic". After that, try to boot your regular system using that kernel, which seems to work for you in the Live CD session.
Thanks,

Guilherme
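
A rough sketch of what that chroot procedure might look like follows. It only prints the commands so they can be reviewed before being run. The partition `/dev/sdX2` and mount point `/mnt` are placeholders, not values from this report, the function name is hypothetical, and the exact steps are an assumption rather than a copy of the linked answer. A separate /boot or EFI partition would need extra mounts that are not shown.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a command sequence for installing a
# kernel into an installed system from a live session.
print_chroot_install_plan() {
    root_dev=$1   # root partition of the installed system (placeholder)
    mnt=$2        # mount point inside the live session
    kver=$3       # kernel version, e.g. 5.8.0-43
    cat <<EOF
sudo mount $root_dev $mnt
for d in dev dev/pts proc sys run; do sudo mount --bind /\$d $mnt/\$d; done
sudo chroot $mnt apt-get install linux-image-$kver-generic linux-modules-extra-$kver-generic
sudo chroot $mnt update-grub
for d in run sys proc dev/pts dev; do sudo umount $mnt/\$d; done
sudo umount $mnt
EOF
}
```

For example, `print_chroot_install_plan /dev/sdX2 /mnt 5.8.0-43` shows the plan for the kernel version discussed here.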

Ariel Torti (atorti) wrote :

Tried the 5.8.0-43 kernel and it didn't work.
I also tried booting Manjaro which uses 5.6.15-1-MANJARO and the issue is present.

I ended up with a workaround: compiling an Ubuntu kernel with `pci-stub` built in and `xhci-hcd` as a loadable module. That way I can make pci-stub bind to the culprit device and everything works fine.
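
The report does not say how pci-stub was bound to the device. One possibility, assuming the built-in pci-stub is told which IDs to claim through its `ids` parameter on the kernel command line, is sketched below. The helper name `pci_stub_ids_param` is hypothetical, and it reads the vendor and device IDs from a sysfs device directory passed as an argument. Whether reading those attributes is safe on the affected machine is not established in this report.

```shell
#!/bin/sh
# Hypothetical helper: build a pci-stub.ids= kernel parameter from the
# vendor and device attribute files of a PCI device directory in sysfs.
pci_stub_ids_param() {
    dev_dir=$1
    vendor=$(cat "$dev_dir/vendor") || return 1
    device=$(cat "$dev_dir/device") || return 1
    # sysfs reports IDs as 0x-prefixed hex; the parameter wants them bare.
    printf 'pci-stub.ids=%s:%s\n' "${vendor#0x}" "${device#0x}"
}
```

For the controller in this report the call would be `pci_stub_ids_param /sys/bus/pci/devices/0000:1f:00.0`. Note that `pci-stub.ids` matches by vendor and device ID, so any other controller with the same IDs would be claimed as well.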

koba (kobako) wrote :

@Ariel, would you like to try the OEM kernel?
#sudo apt install linux-image-oem-20.04b

koba (kobako)
Changed in linux (Ubuntu):
status: Confirmed → Fix Committed