xHCI race conditions 5.15.0-40-generic

Bug #1979886 reported by Harald Rudell
This bug affects 3 people
Affects: linux (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

In 22.04 there are plenty of USB troubles caused by xHCI.

On some hardware the xHCI controller goes dead and USB shuts down completely.
Other hardware works better with a first tier of hubs.
A second hub tier is still unreliable, which points to race conditions.

The troubled devices seem to be VIA 2109:0822 and 2109:0715.

Here I have a 5 Gb/s port, then a hub, then hub 2109:0822.

echoed to console:
Jun 19 23:51:15 kernel: [29817.094981] hub 2-5:1.0: hub_ext_port_status failed (err = -110)
Jun 19 23:51:17 kernel: [29819.142955] usb 2-5.3: Port disable: can't disable remote wake
Jun 19 23:51:18 kernel: [29820.166925] usb 2-5-port3: cannot disable (err = -110)
Jun 19 23:51:20 kernel: [29822.374883] usb 1-8: Failed to suspend device, error -110
Jun 19 23:51:23 kernel: [29825.286836] hub 2-5:1.0: hub_ext_port_status failed (err = -110)

The key message here is: Port disable: can't disable remote wake
-110 is ETIMEDOUT, which produces "xHCI dead" when the device is plugged directly into the host.

If the device is unplugged, the unplug goes undetected and leads to endless -110 and -71 (EPROTO) errors.
A missed unplug has also been seen with USB storage 2109:0715.
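
For anyone triaging similar logs, here is a minimal bash sketch that pulls the error number out of a kernel log line and names it. The function names `usb_errname` and `log_errno` are made up for this illustration, and only the two codes mentioned in this report are mapped (Linux errno values).

```bash
#!/bin/bash
# Map a (negative) kernel error number to its errno name.
# Linux values; only the two codes seen in this report are listed.
usb_errname() {
    case "${1#-}" in
        110) echo "ETIMEDOUT" ;;  # transfer timed out
        71)  echo "EPROTO" ;;     # protocol error
        *)   echo "unknown" ;;
    esac
}

# Extract the error number from a log line such as
# "hub 2-5:1.0: hub_ext_port_status failed (err = -110)" or
# "usb 1-8: Failed to suspend device, error -110".
# Prints nothing if the line carries no error number.
log_errno() {
    sed -n -E 's/.*(err = |error )(-[0-9]+).*/\2/p' <<< "$1"
}
```

Usage: `usb_errname "$(log_errno "$line")"` for each line of interest from `journalctl -k` or `dmesg`.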

other key errors:
LPM exit latency is zeroed, disabling LPM.
usb_reset_and_verify_device Failed to disable LPM

ProblemType: Bug
DistroRelease: Ubuntu 22.04
Package: linux-image-5.15.0-40-generic 5.15.0-40.43
ProcVersionSignature: Ubuntu 5.15.0-40.43-generic 5.15.35
Uname: Linux 5.15.0-40-generic x86_64
NonfreeKernelModules: zfs zunicode zcommon znvpair zavl icp
ApportVersion: 2.20.11-0ubuntu82.1
Architecture: amd64
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', '/dev/snd/controlC0', '/dev/snd/hwC0D2', '/dev/snd/pcmC0D10p', '/dev/snd/pcmC0D9p', '/dev/snd/pcmC0D8p', '/dev/snd/pcmC0D7p', '/dev/snd/pcmC0D3p', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: N/A
CasperMD5CheckResult: unknown
Date: Sat Jun 25 08:49:26 2022
HibernationDevice: RESUME=none
MachineType: Apple Inc. Macmini8,1
ProcEnviron:
 SHELL=/bin/bash
 LANG=en_US.UTF-8
 TERM=screen
 PATH=(custom, no user)
ProcFB: 0 i915drmfb
ProcKernelCmdLine: root=ZFS=rpool/ROOT/ubuntu_mc4at7 ro initrd=EFI\hostname\initrd.img
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
 linux-restricted-modules-5.15.0-40-generic N/A
 linux-backports-modules-5.15.0-40-generic N/A
 linux-firmware 20220329.git681281e4-0ubuntu3.2
RfKill:
 0: hci0: Bluetooth
  Soft blocked: no
  Hard blocked: no
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 04/24/2022
dmi.bios.release: 0.1
dmi.bios.vendor: Apple Inc.
dmi.bios.version: 1731.120.10.0.0 (iBridge: 19.16.15071.0.0,0)
dmi.board.name: Mac-7BA5B2DFE22DDD8C
dmi.board.vendor: Apple Inc.
dmi.board.version: Macmini8,1
dmi.chassis.type: 9
dmi.chassis.vendor: Apple Inc.
dmi.chassis.version: Mac-7BA5B2DFE22DDD8C
dmi.modalias: dmi:bvnAppleInc.:bvr1731.120.10.0.0(iBridge19.16.15071.0.0,0):bd04/24/2022:br0.1:svnAppleInc.:pnMacmini8,1:pvr1.0:rvnAppleInc.:rnMac-7BA5B2DFE22DDD8C:rvrMacmini8,1:cvnAppleInc.:ct9:cvrMac-7BA5B2DFE22DDD8C:sku:
dmi.product.family: Mac mini
dmi.product.name: Macmini8,1
dmi.product.version: 1.0
dmi.sys.vendor: Apple Inc.

Harald Rudell (harald-rudell) wrote :
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Harald Rudell (harald-rudell) wrote :

This worked on 20.04, although second-tier hubs did not work at all then, and neither did any kind of storage on hubs.

Harald Rudell (harald-rudell) wrote :

Success with 22.04 xHCI depends on the hardware vintage. USB 3.1 hardware from 2016 onward seems to work better than 2012 USB 3.0 hardware.

xHCI fails with particular devices, and a working combination of host hardware and single-tier intermediate hub has to be found by testing. Some combinations of host hardware and endpoint device cannot be made to work with 22.04 xHCI.

To avoid the xHCI dead bug, which shuts down all USB on the host, connect the troubled device via an intermediate hub, preferably one with a Realtek chip.

Although USB allows up to 7 tiers (root hub and end device included), possibly no device works behind 2 or more tiers of hubs with 22.04 xHCI.
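
To check how many hubs a device actually sits behind, the kernel's device names in the logs can be read directly: a name such as `2-5.3` is bus 2, root port 5, then port 3 of the hub on that port. A rough bash sketch, with `usb_hub_count` being a name invented for this example:

```bash
#!/bin/bash
# Count the external hubs between the root port and a device, based on the
# kernel's device naming scheme bus-port[.port...], e.g. "2-5.3".
usb_hub_count() {
    local path="${1#*-}"         # drop the bus number: "2-5.3" -> "5.3"
    path="${path%%:*}"           # drop an interface suffix: "5:1.0" -> "5"
    local dots="${path//[^.]/}"  # keep only the dots
    echo "${#dots}"              # one dot per hub in between
}
```

With this, `usb_hub_count 1-8` gives 0 (directly on a root port) and `usb_hub_count 2-5.3` gives 1, so a device behind two hub tiers would show two dots in its name.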

04e8:61f5 at 10 Gb/s, connected directly to the host on 2018 hardware, disconnects within 24 hours with 5.15.0-40-generic xHCI:
Jul 2 13:18:19 x kernel: [77760.394807] usb 6-2: USB disconnect, device number 3

Although one 0bda:8153 works on a dedicated 5 Gb/s port on 2012 hardware with 5.15.0-40-generic xHCI, a second device fails in less than 7 hours:
Jul 02 21:25:53 c32z kernel: usb 4-2: USB disconnect, device number 3
Jul 02 21:25:53 c32z kernel: xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
After that, the device is not recognized even after a reboot, although it works on an identical host-clone setup.
The second device could finally be made to work by inserting a 0bda:0423 hub, and it has now been up for more than 10 hours.

No VIA device (2109:) appears to work with 2012 hardware in any hub layout.

It appears xHCI is not tested with more than 1 tier of hubs, and I believe this has never worked for xHCI, as opposed to EHCI.

Jérôme Poulin (jeromepoulin) wrote :

I can confirm the same kind of behavior on an ASMedia Technology Inc. ASM2142/ASM3142 USB 3.1 Host Controller.

It has been plaguing me for years: xhci_hcd seems broken when using USB hubs such as the one in the Valve Index (VR headset). I have to use my reset_usb script once in a while to rebind xhci_hcd and reset the whole thing. At least I don't have to reboot anymore, but it is more than annoying.

Same behavior as yours: USB works, then suddenly one or more devices behind the hub stop responding. Everything is then frozen until I unplug a device, after which it loops with

[46588.094379] hub 3-2:1.0: hub_ext_port_status failed (err = -110)

This script resets the driver and makes everything work again but, of course, it will crash my virtual reality session.

```
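#!/bin/bash
# Re-run as root if needed; writing to sysfs requires it.
[ "$(id -u)" -ne 0 ] && exec sudo "$0" "$@"
cd /sys/bus/pci/drivers/xhci_hcd || exit 1
BUS="0000:b3:00.0"  # PCI address of the xHCI controller; machine-specific
# Unbind and rebind the controller to reset the driver.
echo "$BUS" > unbind
sleep 5
echo "$BUS" > bind
sleep 5
# Set every power/control file under the controller to "on" (no autosuspend).
find "$BUS/" -name control -exec /bin/sh -c 'echo on > "$1"' _ {} \;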
```
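
The `BUS` value in the script is specific to one machine. As a rough sketch, the PCI addresses currently bound to xhci_hcd can be listed from the driver directory, where bound devices appear as entries named by their PCI address. `list_bound_pci` is a hypothetical helper name, and the optional directory argument exists only to make it easy to try out.

```bash
#!/bin/bash
# List the PCI addresses bound to a driver. Entries in the driver directory
# whose names look like dddd:bb:dd.f are the bound devices; other entries
# (bind, unbind, new_id, ...) are skipped.
list_bound_pci() {
    local dir="${1:-/sys/bus/pci/drivers/xhci_hcd}"
    local entry
    for entry in "$dir"/*; do
        [[ "${entry##*/}" =~ ^[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]$ ]] &&
            echo "${entry##*/}"
    done
    return 0
}
```

Calling `list_bound_pci` with no argument prints one address per xHCI controller, any of which could be used as `BUS` in the reset script.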
