Linux 5.11.0-25 & 5.11.0-27 panic when booting with Thunderbolt dock connected

Bug #1940113 reported by Joseph Maillardet
This bug affects 1 person
Affects         Status     Importance  Assigned to  Milestone
linux (Ubuntu)  Confirmed  Undecided   Unassigned

Bug Description

Hello,

My laptop runs well with the latest 5.4 Linux kernel, even with my Thunderbolt dock connected.

When I try to boot the 5.11.0-25 or 5.11.0-27 (HWE) kernel with the dock and 2 monitors connected, the kernel panics after 3 or 4 seconds. Without the dock, it works fine.

After many attempts, I was sometimes able to boot 5.11 with the dock, but only at random and only with 1 extra monitor connected (I usually use two).
---
ProblemType: Bug
ApportVersion: 2.20.11-0ubuntu27.18
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: jmaillar 1988 F.... pulseaudio
 /dev/snd/controlC1: jmaillar 1988 F.... pulseaudio
CasperMD5CheckResult: skip
CurrentDesktop: GNOME
DistroRelease: Ubuntu 20.04
EcryptfsInUse: Yes
HibernationDevice: RESUME=none
MachineType: Dell Inc. Precision 7530
Package: linux (not installed)
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=fr_FR.UTF-8
 SHELL=/bin/bash
ProcFB: 0 amdgpudrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.11.0-27-generic root=UUID=b21fab6e-02fe-4295-a308-9da15f339a36 ro
ProcVersionSignature: Ubuntu 5.11.0-27.29~20.04.1-generic 5.11.22
RelatedPackageVersions:
 linux-restricted-modules-5.11.0-27-generic N/A
 linux-backports-modules-5.11.0-27-generic N/A
 linux-firmware 1.187.15
Tags: focal
Uname: Linux 5.11.0-27-generic x86_64
UpgradeStatus: Upgraded to focal on 2020-04-20 (483 days ago)
UserGroups: adm audio cdrom dip disk input kismet kvm libvirt libvirtd lpadmin lxd netdev plugdev sambashare sudo systemd-network users vboxusers video wireshark
_MarkForUpload: True
dmi.bios.date: 05/25/2021
dmi.bios.release: 1.16
dmi.bios.vendor: Dell Inc.
dmi.bios.version: 1.16.1
dmi.board.name: 0425K7
dmi.board.vendor: Dell Inc.
dmi.board.version: A00
dmi.chassis.type: 10
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvr1.16.1:bd05/25/2021:br1.16:svnDellInc.:pnPrecision7530:pvr:rvnDellInc.:rn0425K7:rvrA00:cvnDellInc.:ct10:cvr:
dmi.product.family: Precision
dmi.product.name: Precision 7530
dmi.product.sku: 0831
dmi.sys.vendor: Dell Inc.
modified.conffile..etc.default.apport:
 # set this to 0 to disable apport, or to 1 to enable it
 # you can temporarily override this with
 # sudo service apport start force_start=1
 enabled=0
mtime.conffile..etc.default.apport: 2015-11-27T14:42:56.796345

Joseph Maillardet (jokx) wrote :
Joseph Maillardet (jokx) wrote : AlsaInfo.txt

apport information

tags: added: apport-collected focal
description: updated
Joseph Maillardet (jokx) wrote : CRDA.txt

apport information

Joseph Maillardet (jokx) wrote : CurrentDmesg.txt

apport information

Joseph Maillardet (jokx) wrote : IwConfig.txt

apport information

Joseph Maillardet (jokx) wrote : Lspci.txt

apport information

Joseph Maillardet (jokx) wrote : Lspci-vt.txt

apport information

Joseph Maillardet (jokx) wrote : Lsusb.txt

apport information

Joseph Maillardet (jokx) wrote : Lsusb-t.txt

apport information

Joseph Maillardet (jokx) wrote : Lsusb-v.txt

apport information

Joseph Maillardet (jokx) wrote : ProcCpuinfo.txt

apport information

Joseph Maillardet (jokx) wrote : ProcCpuinfoMinimal.txt

apport information

Joseph Maillardet (jokx) wrote : ProcInterrupts.txt

apport information

Joseph Maillardet (jokx) wrote : ProcModules.txt

apport information

Joseph Maillardet (jokx) wrote : PulseList.txt

apport information

Joseph Maillardet (jokx) wrote : RfKill.txt

apport information

Joseph Maillardet (jokx) wrote : UdevDb.txt

apport information

Joseph Maillardet (jokx) wrote : WifiSyslog.txt

apport information

Joseph Maillardet (jokx) wrote : acpidump.txt

apport information

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Chris Chiu (mschiu77) wrote :

Could you take a snapshot when you run into the "panic"? Even the journal log you posted doesn't contain any information about the panic. And does it only happen 100% when 2 monitors connected? Also, since it's AMDGPU, could you help verify the latest mainline kernel here? https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.14-rc7/

Please download and install the amd64/build generic deb files (files marked A; see the section "How do I install an upstream kernel?" in https://wiki.ubuntu.com/Kernel/MainlineBuilds)

Joseph Maillardet (jokx) wrote :

OK, but I don't know how to "take a snapshot" when my machine runs into a kernel panic.

Can you explain how to do that?

Joseph Maillardet (jokx) wrote :

5.14.0-rc7 boots without problems, even with the dock and the two additional screens plugged in.
I have attached the boot log.

Joseph Maillardet (jokx) wrote :
Joseph Maillardet (jokx) wrote :

If it helps, I am also posting the output of `journalctl -k -b all` here.

I have the impression that the kernel 5.11 panic does not appear in it; the crash occurs systematically between seconds 2 and 4 of the boot sequence.

However, this extract contains logs of successful boots on kernel 5.11 with the dock disconnected. I hope that is useful...

Joseph Maillardet (jokx) wrote :

OK, sorry, I went a little too fast!
Attachment #23 does not contain the right log. My apologies.

Fortunately, attachment #25 contains interesting items. Search for the "-- Reboot --" marker: at occurrences #54 and #55 there seem to be traces of a kernel panic. There are also two kernel 5.14 boot logs at occurrences #70 and #71.
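
For anyone digging through the same kind of log, the search described above could be automated with a short script. This is only a sketch: it assumes the log is a plain-text journalctl dump in which boots are separated by a line reading exactly "-- Reboot --", and the crash keywords it looks for are a guess at what a panic or oops trace contains, not something confirmed from the attachment. The function names are made up for illustration, and whether the numbering matches the occurrence numbers quoted above depends on how they were counted.

```python
import re

REBOOT_MARKER = "-- Reboot --"
# Assumption: substrings that usually accompany a kernel crash trace.
CRASH_PATTERN = re.compile(r"Kernel panic|Oops|BUG:|RIP: ")


def split_boots(journal_text):
    """Split a journalctl dump into one list of lines per boot.

    Segment 0 is everything before the first marker; segment k is
    everything after the k-th "-- Reboot --" marker.
    """
    boots = [[]]
    for line in journal_text.splitlines():
        if line.strip() == REBOOT_MARKER:
            boots.append([])
        else:
            boots[-1].append(line)
    return boots


def boots_with_crash(journal_text):
    """Return the segment numbers whose lines contain a crash keyword."""
    return [
        index
        for index, lines in enumerate(split_boots(journal_text))
        if any(CRASH_PATTERN.search(line) for line in lines)
    ]
```

It could be run as `boots_with_crash(open("journal.txt", encoding="utf-8").read())`, where `journal.txt` is a hypothetical local copy of the attachment.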

Joseph Maillardet (jokx) wrote :

@Chris Chiu (mschiu77): "And does it only happen 100% when 2 monitors connected?" → Yes!

Chris Chiu (mschiu77) wrote :

So the mainline 5.14 works for you even with 2 monitors connected, right?
Then could you help me find out which version introduced this problem by testing the kernels at the URLs below?
https://people.canonical.com/~mschiu77/lp1940113/H/
https://people.canonical.com/~mschiu77/lp1940113/I/

Just let me know the test results and paste the journal log here if the system freezes after reboot. Thanks

Joseph Maillardet (jokx) wrote :

I just tested both kernels and here are the results:

I first rebooted on the 5.11.0-33 kernel with the dock and the two screens connected.
The system froze after 3s of boot.

I unplugged the dock and rebooted on the 5.11.0-33 kernel.
The boot went well until the desktop. There, I reconnected the dock and everything froze.

I then tried the 5.13.0-14 kernel, dock unplugged.
The boot went well until the desktop. There, I plugged the dock in again and the system partially froze. The mouse no longer moved, but I could still operate the GNOME 3 shell with the keyboard. I tried to launch a terminal: the window opened but the shell (bash) failed to initialize (no prompt). I had to hard-reboot the computer.

I finally retried kernel 5.13.0-14 with the dock connected.
The startup slowed down progressively until systemd showed me a list of pending services...

Attached are the corresponding logs that journalctl managed to store (2 out of 4 attempts). They seem to be the ones from the attempts where I plugged the dock in again.

Joseph Maillardet (jokx) wrote :

@Chris Chiu (mschiu77): "So the mainline 5.14 works for you even with 2 monitors connected, right?" → Yes!

Chris Chiu (mschiu77) wrote :

Thanks for the test results. According to them, 5.13.12 still has the issue but it is gone in 5.14, which helps narrow down which commit fixes it. https://www.spinics.net/lists/kernel/msg4038892.html could be the patch we need, so I rebuilt a 5.13 kernel based on it. Please help verify the kernel in https://people.canonical.com/~mschiu77/lp1940113/I_dbg. Thanks

Joseph Maillardet (jokx) wrote :

You're welcome. Thank you for taking the time to solve this problem! The result was close to that of the previous 5.13 version:

The boot with the dock and dual screens crashed at 4s; then, after a dozen seconds, a kernel dump was displayed, then nothing.

Booting without the dock went well; connecting the dock afterwards again caused a partial freeze and an inability to start new processes.

Attached is the journalctl log.

Chris Chiu (mschiu77) wrote :

The dmesg of 5.13 kept showing the following error:
août 26 11:18:46 syncom kernel: Workqueue: kacpid acpi_os_execute_deferred
août 26 11:18:46 syncom kernel: RIP: 0010:acpi_ds_exec_end_op+0x182/0x779

But it didn't show in kernel 5.11. That's pretty confusing to me. I may need more ACPI debug messages for it. Before doing that, can you test the following kernels for me so that I can narrow down the range of the culprit? Thanks

https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.13.12/
https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.14-rc3/
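
To compare crash sites across several logs, the faulting function can be pulled out of each "RIP:" line. A minimal sketch, assuming the lines have the same shape as the one quoted above; the helper name `parse_rip` is hypothetical.

```python
import re

# Matches e.g. "RIP: 0010:acpi_ds_exec_end_op+0x182/0x779"
RIP_RE = re.compile(
    r"RIP: [0-9a-fA-F]{4}:(?P<symbol>[\w.]+)"
    r"\+0x(?P<offset>[0-9a-fA-F]+)/0x(?P<size>[0-9a-fA-F]+)"
)


def parse_rip(line):
    """Return (symbol, offset, size) from a kernel 'RIP:' line, or None."""
    match = RIP_RE.search(line)
    if match is None:
        return None
    return (
        match.group("symbol"),
        int(match.group("offset"), 16),
        int(match.group("size"), 16),
    )
```

Collecting the returned symbols per kernel version would show at a glance whether 5.11 and 5.13 crash in the same function.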

Joseph Maillardet (jokx) wrote :

I booted kernel 5.13.12 with the dock unplugged, and everything went well. Once the dock was plugged in, the screen froze, but strangely, this time I could still move the mouse pointer (with the laptop's trackpad). Maybe I didn't notice this possibility before?

I then started kernel 5.14-rc3 and everything worked fine, even with the dock plugged in.

You will find the logs attached.

Chris Chiu (mschiu77) wrote :

The ACPI error didn't show in 5.13.12, so 5.14 may fix the ACPI-related issues, but there should also be an AMDGPU-related issue between 5.11 and 5.13. Please help me find the ACPI-related fix first with the following kernels. Thanks

https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.14-rc1/
https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.14-rc2/

Joseph Maillardet (jokx) wrote :

There are no deb packages in the first link (rc1).

The rc2 version started and worked without problem with the dock and the two screens.

Chris Chiu (mschiu77) wrote :

There's some build problem with 5.14.0-rc1. Could you try https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.13.13/? Thanks.

Joseph Maillardet (jokx) wrote :

And here is the result: complete freeze of the screens. I attached the logs as usual.
So 5.14.0-rc2 works and 5.13.13 crashes. It remains to find out what was fixed between these two versions.
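
The mainline results reported so far can be written down as an ordered list, which makes the remaining search window explicit. A rough sketch, assuming the failure is monotonic (every build before the fix fails, every build after it works) and ordering builds by version number; that ordering is a simplification, since stable point releases such as v5.13.13 and mainline release candidates live on different branches. The function name is hypothetical.

```python
# Results taken from this thread: True means the dock worked.
RESULTS = [
    ("v5.13.12", False),
    ("v5.13.13", False),
    ("v5.14-rc2", True),
    ("v5.14-rc3", True),
    ("v5.14-rc7", True),
]


def find_fix_boundary(results):
    """Return (last failing version, first working version).

    Assumes `results` is ordered oldest to newest and that no build
    fails again after the first working one.
    """
    last_bad = None
    for version, works in results:
        if works:
            return last_bad, version
        last_bad = version
    return last_bad, None


print(find_fix_boundary(RESULTS))  # ('v5.13.13', 'v5.14-rc2')
```

Any further test kernel only needs to fall between the two versions returned.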

Chris Chiu (mschiu77) wrote :

Yes. There are some amdgpu updates between 5.13.13 and 5.14.0-rc2. I'll pick some of them to narrow down the range. Please help me test the kernel in https://people.canonical.com/~mschiu77/lp1940113/amd_3.2.137/. Thanks

Joseph Maillardet (jokx) wrote :

This time, GDM failed to launch (without the dock). The startup sequence ended on a blinking cursor in the upper left corner and that was it.
I was able to log in on TTY2, where I ran htop. There I plugged in the dock and a kernel oops was displayed. Then htop progressively overwrote part of it.
From there, it was impossible to start a new session. I also quit htop but didn't get the prompt back.

Note: there was a small problem during installation; I added the dpkg log at the beginning of the attached file.

Chris Chiu (mschiu77) wrote :

Good. Please help me test the kernel in https://people.canonical.com/~mschiu77/lp1940113/amd_3.2.140/. Thanks

Joseph Maillardet (jokx) wrote :

Result: partial freeze of the screens. No mouse, but keyboard OK. htop kept running, but I was unable to launch a new prompt or any other process.

Chris Chiu (mschiu77) wrote :

Hmm... It means there's still a huge difference.
Please try our 5.14 release candidate https://people.canonical.com/~mschiu77/lp1940113/5_14_next/. Thanks

Joseph Maillardet (jokx) wrote :

Sorry for the wait, I took a few days off.
This kernel worked well.

Joseph Maillardet (jokx) wrote :

I tried the new kernel 5.11.0-36-generic, but the problem is still there.
I'm attaching the log in case it's helpful.
