ACPI errors and USB reset loop when hot-plugging a Titan Ridge laptop into a dock (9500, TB16)

Bug #1922336 reported by Georgi Boiko
This bug affects 2 people
Affects                      Status     Importance  Assigned to  Milestone
Dell Sputnik                 New        Undecided   Unassigned
linux (Ubuntu)               Confirmed  Undecided   Unassigned
linux-meta-oem-5.6 (Ubuntu)  Invalid    Undecided   Unassigned
linux-oem-5.10 (Ubuntu)      Won't Fix  Undecided   Unassigned
linux-oem-5.13 (Ubuntu)      Won't Fix  Undecided   Unassigned

Bug Description

I've recently upgraded my workhorse from XPS 9560 (2016) to a newer generation XPS 9500 (2020) and ran into several things that feel like regressions, but are probably related to hardware changes. This is one of them.

tl;dr:
When using XPS 9500 with a TB16 dock, hot-plugging the dock causes ACPI errors in dmesg and some kind of reset loop with USB peripherals connected via the dock.

details:
USB peripherals keep reconnecting according to dmesg, but in practice they barely ever finish registering with the system before disappearing again. Maybe 1 out of 40 USB keyboard presses registers while this is happening. At this point my choices are to either reboot with the dock connected or work without the dock and consequently without any peripherals on my desk. Booting with an open lid and the dock connected works fine until I re-plug the dock.
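
One way to put a number on the reconnect loop is to count disconnect events per device in the kernel log. The sketch below assumes the usual kernel message form "usb <port>: USB disconnect, device number N"; the function name is made up for illustration and does not come from the original report.

```shell
# Count "USB disconnect" events per USB device path in kernel log text on stdin.
# Assumes messages of the form: "usb 3-1.2: USB disconnect, device number 12".
count_usb_disconnects() {
    grep -o 'usb [0-9][0-9.-]*: USB disconnect' |
        awk '{ sub(":", "", $2); n[$2]++ } END { for (d in n) print n[d], d }' |
        sort -rn
}
```

It can be fed from the live log (`sudo dmesg | count_usb_disconnects`) or from a saved file (`count_usb_disconnects < /var/log/kern.log`). A device stuck in the loop should stand out with a much higher count than the others.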

Notably, hot-plugging works every time with the XPS 9560 or Precision 5520 in the same setup. I have been using that setup for ages, and it had similar problems only until around kernel 4.13.

Since it was fine with the 9560 and 5520, and a friend with a 9570 has no issues either, my gut feeling is that this is due to the move from an Alpine Ridge to a Titan Ridge Thunderbolt controller in this generation: either something is wrong with the driver, or the firmware missed some of the "lessons learned" from Alpine Ridge and caused this regression. That would also make it applicable to the 9300, which has a "developer edition" option under Project Sputnik.

I have tried the 5.8 Ubuntu generic, 5.6 OEM (-20.04) and 5.10 OEM (-20.04b and -20.04-edge) kernels on the 9500 with the same results. I have tried rebinding the xHCI controller after reconnecting the dock. I have tried disabling USB autosuspend and ASPM via kernel parameters in GRUB. I have tried changing BIOS settings: "wake on Dell USB-C docks", disabling early sign of life (both checkboxes), disabling SGX and SMM, checking all 3 boxes for Thunderbolt, and switching off Thunderbolt security. None of these makes a noticeable difference.
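
For reference, rebinding an xHCI controller is normally done through the driver's unbind and bind files in sysfs. The report does not say which commands were actually used, so the following is only a sketch of the general approach; the function name, the example PCI address and the optional sysfs-root argument are assumptions added for illustration.

```shell
# Unbind a PCI device from the xhci_hcd driver and bind it again via sysfs.
# $1: PCI address of the xHCI controller (see `lspci -D | grep -i usb`)
# $2: sysfs root, defaults to /sys (overridable only so the function can be dry-run)
rebind_xhci() {
    addr="$1"
    drv="${2:-/sys}/bus/pci/drivers/xhci_hcd"
    printf '%s' "$addr" > "$drv/unbind" || return 1
    sleep 1   # give the kernel a moment to tear the device down
    printf '%s' "$addr" > "$drv/bind"
}
```

From a root shell this would be called as `rebind_xhci 0000:00:14.0`, where the address is a placeholder and must be replaced with the one reported by lspci on the affected machine.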

System config:

XPS 9500, i7, 32GB RAM
BIOS 1.6.1, TB3 firmware NVM60
Ubuntu 20.04.2 LTS, kernels 5.6-oem, 5.8-generic, 5.10-oem (same behaviour)
TB16 dock firmware 1.0.4 (MST 3.12.02)
---
ProblemType: Bug
ApportVersion: 2.20.11-0ubuntu27.16
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC3: gboiko 1990 F.... pulseaudio
 /dev/snd/controlC2: gboiko 1990 F.... pulseaudio
 /dev/snd/controlC1: gboiko 1990 F.... pulseaudio
 /dev/snd/controlC0: gboiko 1990 F.... pulseaudio
CasperMD5CheckResult: skip
CurrentDesktop: ubuntu:GNOME
DistroRelease: Ubuntu 20.04
InstallationDate: Installed on 2021-03-31 (6 days ago)
InstallationMedia: Ubuntu 20.04.2.0 LTS "Focal Fossa" - Release amd64 (20210209.1)
MachineType: Dell Inc. XPS 15 9500
NonfreeKernelModules: nvidia_modeset nvidia
Package: linux-meta-oem-5.6
ProcFB: 0 i915drmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.6.0-1052-oem root=/dev/mapper/vgubuntu-root ro net.ifnames=0 biosdevname=0 ipv6.disable=1 quiet splash vt.handoff=7
ProcVersionSignature: Ubuntu 5.6.0-1052.56-oem 5.6.19
RelatedPackageVersions:
 linux-restricted-modules-5.6.0-1052-oem N/A
 linux-backports-modules-5.6.0-1052-oem N/A
 linux-firmware 1.187.10
Tags: focal
Uname: Linux 5.6.0-1052-oem x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip lpadmin lxd plugdev sambashare sudo
_MarkForUpload: True
dmi.bios.date: 12/24/2020
dmi.bios.vendor: Dell Inc.
dmi.bios.version: 1.6.1
dmi.board.name: 0RHXRG
dmi.board.vendor: Dell Inc.
dmi.board.version: A03
dmi.chassis.type: 10
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvr1.6.1:bd12/24/2020:svnDellInc.:pnXPS159500:pvr:rvnDellInc.:rn0RHXRG:rvrA03:cvnDellInc.:ct10:cvr:
dmi.product.family: XPS
dmi.product.name: XPS 15 9500
dmi.product.sku: 097D
dmi.sys.vendor: Dell Inc.

Georgi Boiko (pandasauce) wrote :

Kernel logs with the ACPI errors

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1922336

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Georgi Boiko (pandasauce) wrote : apport information

tags: added: apport-collected focal
description: updated

Attached files: AlsaInfo.txt, CRDA.txt, CurrentDmesg.txt, IwConfig.txt, Lspci.txt, Lspci-vt.txt, Lsusb.txt, Lsusb-t.txt, Lsusb-v.txt, ProcCpuinfo.txt, ProcCpuinfoMinimal.txt, ProcEnviron.txt, ProcInterrupts.txt, ProcModules.txt, PulseList.txt, RfKill.txt, UdevDb.txt, WifiSyslog.txt, acpidump.txt

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
koba (kobako) wrote :

@Georgi,
would you please attach the output of:
# boltctl
# fwupdmgr get-devices
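
A small helper can capture output like this into files that are easy to attach to the bug. This is only a sketch: the function name and the output file names are hypothetical, and it assumes boltctl and fwupdmgr are installed on the affected machine.

```shell
# Run a command and save its combined stdout/stderr to a file for attaching.
# $1: output file; remaining arguments: the command to run.
save_output() {
    out="$1"
    shift
    if command -v "$1" >/dev/null 2>&1; then
        "$@" > "$out" 2>&1
    else
        printf '%s: command not found\n' "$1" > "$out"
    fi
}

# Intended use on the affected machine (file names are placeholders):
#   save_output boltctl.txt boltctl list
#   save_output fwupdmgr.txt fwupdmgr get-devices
```

Recording a "command not found" note instead of failing keeps it obvious in the attachment when a tool was simply missing.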

Georgi Boiko (pandasauce) wrote :

@koba, please find them attached and thanks for looking into this.

koba (kobako) wrote :

@Georgi,
When you tried drm-tip, did you still see this issue?

https://bugs.launchpad.net/dell-sputnik/+bug/1922334/comments/25

Georgi Boiko (pandasauce) wrote :

@koba, if I understood you correctly, yes, ACPI errors still appear on drm-tip. Actually, all hell broke loose on that kernel: after triggering the issue, Bluetooth became unresponsive; calling sudo to edit /etc/default/grub ended up hanging indefinitely; other applications being launched were showing weird rendering glitches and became unresponsive. Logs attached.

koba (kobako) wrote :

@Georgi,
would you please update the BIOS first? There's an update available.
https://www.dell.com/support/home/en-us/product-support/product/xps-15-9500-laptop/drivers

Georgi Boiko (pandasauce) wrote :

@koba, I updated as soon as this BIOS was available. All of my testing over the last few days was done on 1.7.1 already.

koba (kobako) wrote : Re: [Bug 1922336] Re: ACPI errors and USB reset loop when hot-plugging a Titan Ridge laptop into a dock (9500, TB16)

@Georgi, thanks for your effort.
Would you please try the 5.12 kernel?
Please use the generic amd64 build.
https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.12/

koba (kobako) wrote :

@Georgi,
Please also enable ACPI debug:

echo enable | sudo tee /sys/module/acpi/parameters/trace_state

echo 0x01000000 | sudo tee /sys/module/acpi/parameters/debug_level
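
As far as I know, these two sysfs files are only present on kernels built with CONFIG_ACPI_DEBUG, so it may be worth checking the kernel config before reproducing the problem. The sketch below assumes the config is installed at /boot/config-$(uname -r), as on Ubuntu; the function name is hypothetical.

```shell
# Succeed if a kernel config file enables CONFIG_ACPI_DEBUG.
# $1: config file, defaults to /boot/config-$(uname -r) (assumed Ubuntu location).
has_acpi_debug() {
    grep -q '^CONFIG_ACPI_DEBUG=y' "${1:-/boot/config-$(uname -r)}"
}
```

For example, `has_acpi_debug && echo "ACPI debug available"` prints the message only when the running kernel's config enables the option.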

Georgi Boiko (pandasauce) wrote :

@koba, please find the logs attached for 5.12 with debug enabled as instructed

Georgi Boiko (pandasauce) wrote :

I should add that on 5.12 the USB reset loop described in my original post is not observed; USB devices attached to the dock just don't work at all after re-plugging the dock, not even sporadically. After re-plugging the dock, the built-in laptop keyboard works, but calling anything that relies on sudo/root, even just "sudo dmesg" or "reboot", hangs indefinitely. The terminal keeps blinking, so it's not a complete hard freeze. Removing the dock again does not make it recover. A hard reset is the only option left at that point to get the system operational again.

koba (kobako) wrote :

@Georgi, thanks for collecting the logs.
If you remove all external devices (keyboard, mouse, monitor) from the TB16,
do you still get the issue?

Georgi Boiko (pandasauce) wrote :

@koba, with everything removed there is no ACPI error on plugging the dock back in. However, if I start plugging external devices back into the dock after it has registered with the system, the USB reset loop behaviour is still exactly the same as in the original post.

In the attached log, I removed the dock (with nothing attached to it) at 22:00:11, plugged it back in at 22:00:23 with nothing attached to it, started attaching external devices to it at 22:01:33.
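
To look only at the part of a log between those timestamps, the lines can be filtered by time. This is a sketch that assumes traditional syslog-format lines ("Mon DD HH:MM:SS host kernel: ..."), which may not match the format of the attached log; the function name is made up for illustration.

```shell
# Print syslog-format lines whose HH:MM:SS timestamp (third field) lies in [start, end].
# Compares the time strings lexically, so it only works within a single day.
log_window() {
    awk -v s="$1" -v e="$2" '$3 >= s && $3 <= e'
}
```

For the sequence described above this would be `log_window 22:00:11 22:01:33 < /var/log/kern.log`, which covers the unplug, the re-plug and the first device being attached.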

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-meta-oem-5.6 (Ubuntu):
status: New → Confirmed
Steve Baker (g-steve-2) wrote :

Hello, I have this problem too - XPS 9500, TB16. No problems using this dock with previous laptop (XPS 9550).

Some extra information that I didn't see mentioned is that USB storage connected to the dock (USB-A keyfob, USB-C Samsung T5) seems to work OK, and dock ethernet seems OK. For me, anyway. I also have a DVD-ROM drive plugged in that seems OK (haven't read a disc, but I can 'eject dvd' and the tray opens).

For me the USB mouse is the problem - it doesn't work. Every 1 min it reconnects, works for <1 second, then stops working. Have tried front USB3, rear USB2 & USB3 ports, all have same symptoms.

I'm on kernel 5.11, have updated TB16 firmware to 1.0.4. I dual-boot this system with Windows, with latest drivers/FW installed. BIOS is 1.7.1.

Hope this is helpful. I will try to assist if possible.

Steve Baker (g-steve-2) wrote :

I have some additional info (this laptop is <1wk old, I'm still discovering behaviour). If I boot the laptop with the dock plugged in, boltctl sees the dock and cable as connected but the attached ports and peripherals are not visible - except for the external monitor, which does work. If I un-plug then re-plug the dock, the port discovery messages show in kern.log and all the peripherals/usb-hubs/etc. appear.

Georgi Boiko (pandasauce) wrote :

BIOS 1.8.1, TBT firmware v65, TB16 firmware 1.0.5 and OEM kernel 5.13 still affected

Georgi Boiko (pandasauce) wrote :

FWIW, the C-states workaround from this post works for me and resolves the hot-plugging issue with TB16: https://www.dell.com/community/XPS/XPS-13-9300-and-WD19TB-linux-problem/m-p/7842208/highlight/true#M81522

Currently running BIOS 1.9.1, TBT firmware V65, TB16 firmware 1.0.5 and kernel 5.13.0-1012-oem.

I hope this helps pin down the bug or at least becomes useful for others who run into this.

Timo Aaltonen (tjaalton)
Changed in linux-meta-oem-5.6 (Ubuntu):
status: Confirmed → Invalid
Changed in linux-oem-5.10 (Ubuntu):
status: New → Won't Fix
Changed in linux-oem-5.13 (Ubuntu):
status: New → Won't Fix