Comment 81 for bug 1766076

Oded Arbel (oded-geek) wrote :

I hadn't seen this issue for a while (since kernel 4.20, I think), but it just happened again on an up-to-date 19.04 with kernel 5.0.0-13:

[376539.800632] xhci_hcd 0000:0e:00.0: ERROR unknown event type 15
[376544.914548] xhci_hcd 0000:0e:00.0: xHCI host not responding to stop endpoint command.
[376544.914869] xhci_hcd 0000:0e:00.0: xHCI host controller not responding, assume dead
[376544.914937] xhci_hcd 0000:0e:00.0: HC died; cleaning up
[376544.914941] usb 3-1.5: Failed to suspend device, error -22
[376544.914948] usb 4-1.1.1: Failed to suspend device, error -22
[376544.914956] usb 3-1: USB disconnect, device number 2
[376544.914957] usb 3-1.1: USB disconnect, device number 3
[376544.914958] usb 3-1.1.1: USB disconnect, device number 5
[376544.914959] usb 3-1.1.1.4: USB disconnect, device number 8
[376544.915173] usb 4-1: USB disconnect, device number 2
[376544.915174] usb 4-1.1: USB disconnect, device number 3
[376544.915175] usb 4-1.1.1: USB disconnect, device number 5
[376544.915595] usb 4-1.2: USB disconnect, device number 4
[376544.958908] usb 3-1.1.1.5: USB disconnect, device number 11
[376544.959744] usb 3-1.1.2: USB disconnect, device number 7
[376544.959746] usb 3-1.1.2.4: USB disconnect, device number 10
[376545.175240] usb 3-1.1.5: USB disconnect, device number 9
[376545.175890] usb 3-1.5: USB disconnect, device number 6
[376545.176578] usb 3-1.6: USB disconnect, device number 12

After that, the USB hub was dead, while the DP-over-Thunderbolt monitors continued to work.

Firmware versions:
Precision M5520 Thunderbolt Controller: 26.01
Dell Thunderbolt Dock: 16.00
Precision 5520 System Firmware: 0.1.13.0

This has happened a few times already, including while I was writing up this report, and each time I worked around it either by power cycling the dock (pulling the power cable, because the power button on the dock powers down the laptop) or by running the XHCI reset script I'm pasting below.

Regarding other workarounds - at no point did I boot to MS-Windows (or at least not in the last half a year).

This is the XHCI reset script I'm using. It lists all the devices bound to the xhci_hcd driver and runs the unbind-rebind process specified in the original bug report. The next step would be a systemd service that monitors the kernel log for the "xHCI host controller not responding, assume dead" line and triggers the rebind automatically :-/

```
#!/bin/bash
# Unbind and rebind every PCI device bound to xhci_hcd (must be run as root).
drv=/sys/bus/pci/drivers/xhci_hcd
# Bound devices show up as entries named by PCI address, e.g. 0000:0e:00.0
tbtid="$(ls "$drv" | grep -E '^[0-9a-f]+:')"
echo "Resetting thunderbolt bus" >&2
for id in $tbtid; do
  echo -n "$id" > "$drv/unbind"
done
sleep 1
for id in $tbtid; do
  echo -n "$id" > "$drv/bind"
done
echo "Done resetting thunderbolt bus" >&2
```
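
For the "next step" mentioned above, here is a rough, untested-on-hardware sketch of what the watcher part could look like. It reads kernel log lines and runs the reset script whenever an xhci_hcd "assume dead" line shows up. The path in `RESET_SCRIPT`, the function name `watch_xhci` and the `--follow` argument are all made up for this sketch; the systemd unit itself is not shown.

```shell
#!/bin/bash
# Hypothetical location of the reset script above; change it to wherever you saved it.
RESET_SCRIPT="${RESET_SCRIPT:-/usr/local/sbin/xhci-reset.sh}"

# Read kernel log lines on stdin and run the reset script
# for every xhci_hcd "assume dead" line.
watch_xhci() {
  while IFS= read -r line; do
    case "$line" in
      *xhci_hcd*"assume dead"*)
        echo "xHCI controller died, running $RESET_SCRIPT" >&2
        # Detach stdin so the reset script cannot eat log lines.
        "$RESET_SCRIPT" </dev/null
        ;;
    esac
  done
}

# With --follow (an argument invented for this sketch), tail the kernel log
# starting from now. The reset itself needs root.
if [ "${1:-}" = "--follow" ]; then
  journalctl --dmesg --follow --lines=0 --output=cat | watch_xhci
fi
```

A service could then start this script with `--follow`. I kept the matching in a function that reads stdin so it can be tried out by piping a saved dmesg excerpt into it instead of waiting for the controller to die.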