xhci_hcd: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
- Zesty (17.04)
- Bug #1667750
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
HWE Next | Fix Released | Undecided | Unassigned |
Linux | Confirmed | High | |
linux (Arch Linux) | New | Undecided | Unassigned |
linux (Debian) | New | Undecided | Unassigned |
linux (Fedora) | Confirmed | Undecided | |
linux (Ubuntu) | Fix Released | Medium | Kai-Heng Feng |
Xenial | Fix Released | Undecided | Unassigned |
Zesty | Fix Released | Undecided | Unassigned |
Artful | Fix Released | Medium | Kai-Heng Feng |
Bug Description
[SRU Justification]
[Impact]
The Dell TB16 docking station has problems using gigabit ethernet: the ethernet
connection drops unless the link is limited to 100Mb/s.
[Test Case]
Download some big files from the web.
User confirms the patch fixes the issue.
[Regression Potential]
This patch only affects ASMedia's ASM1042A.
The regression potential is low and limited to that specific device.
---
My system contains a Realtek Semiconductor Corp. RTL8153 Gigabit Ethernet Adapter which is on usb3 bus in my docking station (Dell TB16) which is attached to my laptop (Dell XPS9550) via Thunderbolt 3.
I get usb related kernel error messages when I initiate a high speed transfer (by issuing wget http://
This does not happen when the Ethernet adapter is connected to a 100Mb/s switch, only when it is connected at 1000Mb/s. It also does not happen with slow traffic (e.g. web page browsing). This is not a new bug with kernel 4.10, but has been going on since at least 4.7 and maybe (probably?) since forever. I'm aware of several others with this configuration (RTL8153 on usb3 behind thunderbolt 3) who have the same issue. This bug is also not specific to Ubuntu; I also get it on Arch Linux. I've also tested and seen this bug with several different models of thunderbolt 3 docks.
Here are the relevant kernel log messages:
Feb 24 16:42:38 ubuntu kernel: xhci_hcd 0000:0e:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Feb 24 16:42:38 ubuntu kernel: xhci_hcd 0000:0e:00.0: Looking for event-dma 00000004777d9010 trb-start 0000000475a14fe0 trb-end 0000000475a14fe0 seg-start 0000000475a14000 seg-end 0000000475a14ff0
Feb 24 16:42:38 ubuntu kernel: xhci_hcd 0000:0e:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Feb 24 16:42:38 ubuntu kernel: xhci_hcd 0000:0e:00.0: Looking for event-dma 00000004777d9020 trb-start 0000000475a14fe0 trb-end 0000000475a14fe0 seg-start 0000000475a14000 seg-end 0000000475a14ff0
Feb 24 16:42:38 ubuntu kernel: xhci_hcd 0000:0e:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Feb 24 16:42:38 ubuntu kernel: xhci_hcd 0000:0e:00.0: Looking for event-dma 00000004777d9030 trb-start 0000000475a14fe0 trb-end 0000000475a14fe0 seg-start 0000000475a14000 seg-end 0000000475a14ff0
Feb 24 16:42:38 ubuntu kernel: xhci_hcd 0000:0e:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Feb 24 16:42:38 ubuntu kernel: xhci_hcd 0000:0e:00.0: Looking for event-dma 00000004777d9040 trb-start 0000000475a14fe0 trb-end 0000000475a14fe0 seg-start 0000000475a14000 seg-end 0000000475a14ff0
Feb 24 16:42:38 ubuntu kernel: xhci_hcd 0000:0e:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Feb 24 16:42:38 ubuntu kernel: xhci_hcd 0000:0e:00.0: Looking for event-dma 00000004777d9050 trb-start 0000000475a14fe0 trb-end 0000000475a14fe0 seg-start 0000000475a14000 seg-end 0000000475a14ff0
Feb 24 16:42:38 ubuntu kernel: xhci_hcd 0000:0e:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Feb 24 16:42:38 ubuntu kernel: xhci_hcd 0000:0e:00.0: Looking for event-dma 00000004777d9060 trb-start 0000000475a14fe0 trb-end 0000000475a14fe0 seg-start 0000000475a14000 seg-end 0000000475a14ff0
Feb 24 16:42:39 ubuntu kernel: xhci_hcd 0000:0e:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Feb 24 16:42:39 ubuntu kernel: xhci_hcd 0000:0e:00.0: Looking for event-dma 00000004777d9070 trb-start 0000000475a14fe0 trb-end 0000000475a14fe0 seg-start 0000000475a14000 seg-end 0000000475a14ff0
Feb 24 16:42:39 ubuntu kernel: xhci_hcd 0000:0e:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Feb 24 16:42:39 ubuntu kernel: xhci_hcd 0000:0e:00.0: Looking for event-dma 00000004777d9080 trb-start 0000000475a14fe0 trb-end 0000000475a14fe0 seg-start 0000000475a14000 seg-end 0000000475a14ff0
Feb 24 16:43:06 ubuntu kernel: r8152 4-1.2:1.0 enx204747f8f471: Tx timeout
Feb 24 16:43:06 ubuntu kernel: r8152 4-1.2:1.0 enx204747f8f471: Tx status -2
Feb 24 16:43:06 ubuntu kernel: r8152 4-1.2:1.0 enx204747f8f471: Tx status -2
Feb 24 16:43:06 ubuntu kernel: r8152 4-1.2:1.0 enx204747f8f471: Tx status -2
Feb 24 16:43:06 ubuntu kernel: r8152 4-1.2:1.0 enx204747f8f471: Tx status -2
Feb 24 16:43:09 ubuntu kernel: usb 4-1.2: reset SuperSpeed USB device number 3 using xhci_hcd
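As a quick sanity check on the log above, the addresses in one "Looking for" line can be compared directly. This is only a sketch: it assumes, from the field names alone, that seg-start and seg-end bound the ring segment the driver was searching, and `dma_in_segment` is a hypothetical helper name rather than part of any existing tool.

```shell
# Succeed (exit 0) if the event DMA address lies within [seg-start, seg-end].
# Arguments are the bare hex strings as printed in the kernel log (no 0x prefix).
dma_in_segment() {
    event=$((0x$1))
    seg_start=$((0x$2))
    seg_end=$((0x$3))
    [ "$event" -ge "$seg_start" ] && [ "$event" -le "$seg_end" ]
}

# Values copied from the first "Looking for" line in the log above.
if dma_in_segment 00000004777d9010 0000000475a14000 0000000475a14ff0; then
    echo "event-dma is inside the expected segment"
else
    echo "event-dma is outside the expected segment"
fi
```

For these values the check reports the event address as outside the segment, which is consistent with the "not part of current TD" wording of the error.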
I can't seem to make this bug appear with any other type of USB traffic. I've reported it to the realtek kernel dev team and they don't think their RTL8153 driver (in this case the r8152 module) is to blame, but instead that it's an xhci_hcd issue.
If you look through the dmesg log attached here, you'll see that at 45.967025 I plugged the thunderbolt 3 cable from my dock into my laptop.
ProblemType: Bug
DistroRelease: Ubuntu 17.04
Package: linux-image-
ProcVersionSign
Uname: Linux 4.10.0-8-generic x86_64
ApportVersion: 2.20.4-0ubuntu2
Architecture: amd64
CasperVersion: 1.380
CurrentDesktop: Unity:Unity7
Date: Fri Feb 24 16:53:35 2017
LiveMediaBuild: Ubuntu 17.04 "Zesty Zapus" - Alpha amd64 (20170224)
MachineType: Dell Inc. XPS 15 9550
ProcEnviron:
TERM=xterm-
PATH=(custom, no user)
XDG_RUNTIME_
LANG=en_US.UTF-8
SHELL=/bin/bash
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=
RelatedPackageV
linux-
linux-
linux-firmware 1.163
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 12/22/2016
dmi.bios.vendor: Dell Inc.
dmi.bios.version: 1.2.19
dmi.board.name: 0N7TVV
dmi.board.vendor: Dell Inc.
dmi.board.version: A00
dmi.chassis.type: 9
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.
dmi.product.name: XPS 15 9550
dmi.sys.vendor: Dell Inc.
l3iggs (l3iggs) wrote : | #1 |
- AlsaInfo.txt (33.0 KiB, text/plain; charset="utf-8")
- AudioDevicesInUse.txt (317 bytes, text/plain; charset="utf-8")
- CRDA.txt (426 bytes, text/plain; charset="utf-8")
- CurrentDmesg.txt (101.5 KiB, text/plain; charset="utf-8")
- Dependencies.txt (3.2 KiB, text/plain; charset="utf-8")
- IwConfig.txt (331 bytes, text/plain; charset="utf-8")
- JournalErrors.txt (64.8 KiB, text/plain; charset="utf-8")
- Lspci.txt (24.9 KiB, text/plain; charset="utf-8")
- Lsusb.txt (1.4 KiB, text/plain; charset="utf-8")
- ProcCpuinfo.txt (9.0 KiB, text/plain; charset="utf-8")
- ProcInterrupts.txt (6.4 KiB, text/plain; charset="utf-8")
- ProcModules.txt (9.0 KiB, text/plain; charset="utf-8")
- PulseList.txt (29.4 KiB, text/plain; charset="utf-8")
- RfKill.txt (113 bytes, text/plain; charset="utf-8")
- UdevDb.txt (291.3 KiB, text/plain; charset="utf-8")
- WifiSyslog.txt (146.4 KiB, text/plain; charset="utf-8")
Brad Figg (brad-figg) wrote : Status changed to Confirmed | #2 |
Changed in linux (Ubuntu): | |
status: | New → Confirmed |
description: | updated |
Joseph Salisbury (jsalisbury) wrote : | #3 |
Did this issue start happening after an update/upgrade? Was there a prior kernel version where you were not having this particular problem?
Would it be possible for you to test the latest upstream kernel? Refer to https:/
If this bug is fixed in the mainline kernel, please add the following tag 'kernel-
If the mainline kernel does not fix this bug, please add the tag: 'kernel-
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".
Thanks in advance.
Changed in linux (Ubuntu): | |
importance: | Undecided → Medium |
status: | Confirmed → Incomplete |
l3iggs (l3iggs) wrote : | #4 |
Hi Joseph. Are you a robot?
I believe I've answered your questions in my bug report (when I wrote that this bug has been going on since forever). Also, this bug report was made with linux-image-
Changed in linux (Ubuntu): | |
status: | Incomplete → Confirmed |
tags: | added: kernel-bug-exists-upstream |
l3iggs (l3iggs) wrote : | #5 |
Here are some more bug reports that might be related:
https:/
https:/
https:/
https:/
Jonathan Booth (svirpridon+ubuntu) wrote : | #6 |
+1, same configuration, same behavior, 16.10.
Harald Nordgård-Hansen (hhansen) wrote : | #7 |
A bit more info, as I've gotten hold of a separate usb network card with the same chip in it (Realtek 8153). Plugging it into the ASMedia ASM1042A USB controller found in the TB16 docking station gives the same result as reported here. But plugging it directly into the computer (Intel Sunrise Point-H USB controller) works without problems.
So the problem seems to be in the combination of the Realtek and ASMedia chips.
David Ibarra (dibarra) wrote : | #8 |
Hey all, just +1'ing this- seeing this on ubuntu 16.10, and Fedora 25 (kernel 4.9). TB16 dock and Dell Precision 5510.
André Düwel (aduewel) wrote : | #9 |
+1, Dell XPS15 9550 + Dell TB16 + Ubuntu 16.10.
Workaround:
Limiting the connection speed to 100MBit FDX via "ethtool -s eth..... speed 100 duplex full autoneg on" also circumvents the problem.
On Windows 10 it's working without issues at full speed (Gigabit).
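The 100 Mb/s workaround can be wrapped in a small helper. This is a minimal sketch, not something posted in this report: `limit_to_100` and the `DRY_RUN` switch are hypothetical names, and the interface name must be replaced with the one your dock's adapter actually gets (see `ip link`).

```shell
# Force an interface down to 100 Mb/s full duplex using ethtool's "-s"
# (change settings) option. With DRY_RUN set, the command is only printed.
limit_to_100() {
    iface="$1"
    ${DRY_RUN:+echo} ethtool -s "$iface" speed 100 duplex full autoneg on
}

# Example (needs root), using the interface name from the log in this report:
#   limit_to_100 enx204747f8f471
```

The setting does not survive a replug or reboot, so it has to be reapplied each time the interface appears.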
André Düwel (aduewel) wrote : | #10 |
Reloading the Realtek kernel module r8152 and restarting network-manager also fixes the problem temporarily:
sudo rmmod r8152.ko
sudo modprobe r8152.ko
sudo service network-manager restart
Kaz Wolfe (kazwolfe) wrote : | #11 |
Seems related to Bug #1663975. Same problem, I'd think...
Also, just for the sake of completeness, yet another error log: http://
4.8.0-41-generic #44~16.04.1-Ubuntu, HWE because reasons. Kernel is tainted (NVIDIA, VirtualBox), but this issue seems to exist anyways.
Posted same comment over on the other bug report, sorry for any spam that may report.
Hordur Heidarsson (hordur-z) wrote : | #12 |
+1, Dell Precision M5510 + Dell TB16 + Ubuntu 16.10 + 4.8.0-41-generic #44~16.04.1-Ubuntu SMP
@aduewel: thanks for the speed downgrade workaround!
Karlyn Fielding (karlyn) wrote : | #13 |
I can confirm that I have the same issue being reported here. I have a Dell XPS13 Developer Edition (9360) with Ubuntu 16.04, TB16 Dock and an upgraded Ubuntu Mainline build kernel of 4.10.4.
Additionally, I tried downloading the source code from Realtek for their v2.08.0 r8152 driver. I compiled that driver and manually removed and re-inserted the new kernel module into my running system. With that driver running, I see the exact same behavior described here.
I'd be happy to volunteer for any testing on my hardware that might help debug the issue.
I can also confirm that changing the connection speed to 100Mb with ethtool provides a work around to the problem.
Also, I have some kernel output I have saved while experiencing the issue if there is interest in it.
l3iggs (l3iggs) wrote : | #14 |
I'm starting to think this issue stems not from our Realtek RTL8153 Ethernet chip but rather from something upstream of it.
My best guess now is that there's something wrong with the handling of the "ASMedia Technology Inc. ASM1042A USB 3.0 Host Controller" that's in our docks. This is a usb3.0 <--> PCIe bridge which the RTL8153 hangs off of.
Robert Sandberg (srobban) wrote : | #15 |
Can confirm same issue on Dell Precision 5520 + TB16 running pre-installed Ubuntu 16.04 LTS.
Same issue on KDE Neon with different kernels 4.8.x, 4.10.x
I am also experiencing other USB issues when connecting various devices to the TB16, e.g. all other USB devices freeze.
l3iggs (l3iggs) wrote : | #16 |
Also, I wonder if this could somehow be a thunderbolt 3 bandwidth allocation issue.
This is pure uneducated speculation though ;-)
l3iggs (l3iggs) wrote : | #17 |
By the way, removing and reinserting the r8152 module as suggested above does not seem to prevent or work around this issue.
l3iggs (l3iggs) wrote : | #18 |
Higher transfer rates seem to have some impact here:
wget http://
errors out in a few seconds
wget --limit-rate=10k http://
might run for a few 10s of seconds before erroring out
wget --limit-rate=1k http://
continues to work until I run out of patience (forever?). I've not waited the 18 days required for this to complete though :P
l3iggs (l3iggs) wrote : | #19 |
- 4.11.trace.txt (4.2 MiB, text/plain)
I've attached a trace for the 4.11 kernel.
First
echo xhci-hcd >> /sys/kernel/
Then initiate network transport to create the bug.
/sys/kernel/
and
dmesg (as 4.11.dmesg.txt) are attached.
l3iggs (l3iggs) wrote : | #20 |
Alex Shchagin (qalex) wrote : | #21 |
+1 here Precision 5510 + TB16 + 16.10 4.8.0-41
However, I don't think this is an r8152 bug either. When I plug a Dell USB-C Ethernet adapter into the TB16, it works fine at full speed with the same r8152 module.
l3iggs (l3iggs) wrote : | #22 |
Alex, what's the model number of that Dell USB-C Ethernet adapter?
Alex Shchagin (qalex) wrote : | #23 |
@l3iggs Nothing is written on it, but it seems to be 470-ABQJ. It came with my Precision.
I think the culprit is the USB host controller 'ASM1042A USB 3.0 Host Controller' embedded in the TB16. See here:
> lshw -short
...
/0/100/1d.6/0 bridge DSL6340 Thunderbolt 3 Bridge [Alpine Ridge 2C 2015]
/0/100/1d.6/0/0 bridge DSL6340 Thunderbolt 3 Bridge [Alpine Ridge 2C 2015]
/0/100/1d.6/0/0/0 generic DSL6340 Thunderbolt 3 NHI [Alpine Ridge 2C 2015]
/0/100/1d.6/0/1 bridge DSL6340 Thunderbolt 3 Bridge [Alpine Ridge 2C 2015]
/0/100/1d.6/0/1/0 bridge DSL6540 Thunderbolt 3 Bridge [Alpine Ridge 4C 2015]
/0/100/1d.6/0/1/0/1 bridge DSL6540 Thunderbolt 3 Bridge [Alpine Ridge 4C 2015]
/0/100/1d.6/0/1/0/4 bridge DSL6540 Thunderbolt 3 Bridge [Alpine Ridge 4C 2015]
/0/100/
/0/100/
/0/100/
/0/100/
/0/100/
/0/100/
/0/100/
/0/100/
/0/100/
/0/100/
/0/100/
/0/100/
/0/100/
/0/100/
...
By the way, Dell listed some special driver for Windows at the TB16 page for this ASM controller.
Karlyn Fielding (karlyn) wrote : | #24 |
I can confirm that the issue remains with the latest 4.10.7 mainline kernel build for Ubuntu.
Same specs as before:
Dell XPS13 DE (9360)
TB16 Dock
Ubuntu 16.04 ( installed as shipped from Dell )
imperia (imperia777) wrote : | #25 |
I have the same problem with my USB 3.1 controller:
ASMedia Technology Inc. ASM1142 USB 3.1 Host Controller.
I pass the controller through to a Xen VM and then connect a USB TV tuner card to it.
I am using VDR, TV software that scans for new channels when it is not in use.
Sometimes after a few hours, sometimes after a few days, it crashes with the following error:
[131382.068144] xhci_hcd 0000:00:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 0 comp_code 3
[131382.068182] xhci_hcd 0000:00:00.0: Looking for event-dma 0000000210196600 trb-start 0000000210196740 trb-end 0000000210196760 seg-start 0000000210196000 seg-end 0000000210196ff0
The same problem is not present with the onboard USB 3.0 controller.
If I pass that one through to the Xen VM, it works without any problems.
So this must be some problem with the USB 3.1 driver (not 3.0) or the ASMedia firmware.
I can provide whatever information is necessary to fix this bug.
I can provide shell account to my VM also if somebody wants to debug it.
Li Dongyang (dongyang-li) wrote : | #26 |
Could someone try:
ethtool --offload <eth interface> tx off
ethtool --offload <eth interface> rx off
And then see if it works?
Robert Sandberg (srobban) wrote : | #27 |
I've tried:
ethtool --offload <eth interface> tx off
ethtool --offload <eth interface> rx off
But the issue remains.
The only workaround that works is to limit speed to 100, as suggested previously.
André Düwel (aduewel) wrote : | #28 |
Since I upgraded to Ubuntu 17.04 (fresh install), I can confirm that this bug also affects the (now) current release and therefore kernel version 4.10.0-19-generic.
I have also now implemented another "workaround" and bought a 7€ USB3->1Gb Ethernet dongle, which works without issues.
Additional Information:
lsusb
Bus 004 Device 004: ID 0bda:8153 Realtek Semiconductor Corp.
Bus 004 Device 003: ID 0bda:8153 Realtek Semiconductor Corp.
Bus 004 Device 002: ID 0424:5537 Standard Microsystems Corp.
Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 003 Device 007: ID 03f0:094a Hewlett-Packard Optical Mouse [672662-001]
Bus 003 Device 006: ID 046d:c31c Logitech, Inc. Keyboard K120
Bus 003 Device 005: ID 2109:2811 VIA Labs, Inc. Hub
Bus 003 Device 004: ID 2109:2811 VIA Labs, Inc. Hub
Bus 003 Device 003: ID 0bda:4014 Realtek Semiconductor Corp.
Bus 003 Device 002: ID 0424:2137 Standard Microsystems Corp.
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 003: ID 04f3:21d5 Elan Microelectronics Corp.
Bus 001 Device 002: ID 0a5c:6410 Broadcom Corp.
Bus 001 Device 004: ID 0c45:6713 Microdia
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
lshw --short: (see attachment)
André Düwel (aduewel) wrote : | #29 |
- lshw-short.txt (6.5 KiB, text/plain)
sorry, attached wrong file in last comment. here is the right one
André Düwel (aduewel) wrote : | #30 |
I need to correct myself: I am having issues during high load on the USB3 Ethernet adapter, too.
Only workaround is limiting to 100MBit.
Bram Biesbrouck (b-m) wrote : | #31 |
André,
I followed your advice and bought an inexpensive USB3 1gbit ethernet adapter and noticed the same drops and corruptions as the built-in ethernet port. However, when I plug it in the USB-C port of the dock (using a little USB3 to USB-C cable), everything seems to work correctly.
B.
Alex Shchagin (qalex) wrote : | #32 |
Bram,
This is because USB3 and USB-C ports in TB16 are connected to different controllers. See my lshw output here - https:/
/0/100/
/0/100/
Alex
André Düwel (aduewel) wrote : | #33 |
Ohh okay, thanks for this advice. I will order an adapter and try it out.
This seems to verify that the problem exists somewhere in the usb3 controller/driver (ASM1042A) in the TB16 and not in the Ethernet controller/driver itself.
Bram Biesbrouck (b-m) wrote : | #34 |
Ah, cool, didn't know that, thanks!
zwigno (zwigno) wrote : | #35 |
I have the same issue. I'm using a Dell Precision 5510 with the Dell TB16 Dock. I'm running Ubuntu 17.04 with kernel version 4.10.0-21-generic. What I first noticed is that some SSL-enabled websites failed to load with errors like, "SSL_ERROR_
Does anyone have a solution for setting the speed to 100Mb upon plugin of the Thunderbolt connector or when the interface comes up? Setting it at boot time isn't ideal because I don't often have the dock plugged in at first boot.
Zhenfang Wei (kopkop) wrote : | #36 |
the same issue, xps9360 + ubuntu 16.04 with kernel 4.4.78 + tb16
Mario Limonciello (superm1) wrote : | #37 |
This is an issue with the host controller. The vendor (ASMedia) has submitted a patch here that fixes the issue:
http://
Kai-Heng Feng (kaihengfeng) wrote : | #38 |
I can confirm the patch works on the TB15 I have on hand. Can you guys try the patched 4.11 kernel [1] on a TB16?
I applied the patch to 4.11 - the patch cannot be cleanly applied to Xenial/
I'll do the proper backport when the patch is being accepted by upstream maintainers.
André Düwel (aduewel) wrote : | #39 |
Hi Kai-Heng,
I can confirm your Kernel is working on my XPS 15 9550 + TB16 running Zesty and it fixes the Ethernet issue.
But, the whole TB16 USB3 Controller including Keyboard, Ethernet and other USB devices are still not working when connected during system start. I need to disconnect and reconnect it after booting.
Thanks again! :)
Kai-Heng Feng (kaihengfeng) wrote : | #40 |
Sounds like another issue. Can you file another bug?
André Düwel (aduewel) wrote : | #41 |
Bram Biesbrouck (b-m) wrote : | #42 |
I can also confirm this kernel solves my ethernet issues on my Dell TB16.
Karlyn Fielding (karlyn) wrote : | #43 |
I can confirm that the patched 4.11.0-6 kernel from Kai-Heng does in fact resolve my issue. I'll be interested in hearing when this is available in the upstream kernel.
In Red Hat Bugzilla #1460789, Christian (christian-redhat-bugs) wrote : | #96 |
This is a Dell XPS 13 connected to the network via the TB16 dock.
Kernel is: Linux ag13.local 4.12.0-
Host controller of the dock:
09:00.0 USB controller: ASMedia Technology Inc. ASM1042A USB 3.0 Host Controller
USB network interface in the dock:
/: Bus 04.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 5000M
|__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/7p, 5000M
|__ Port 2: Dev 3, If 0, Class=Vendor Specific Class, Driver=r8152, 5000M
[32930.573816] usb 4-1.2: new SuperSpeed USB device number 3 using xhci_hcd
[32930.591744] usb 4-1.2: New USB device found, idVendor=0bda, idProduct=8153
[32930.591752] usb 4-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=6
[32930.591757] usb 4-1.2: Product: USB 10/100/1000 LAN
[32930.591761] usb 4-1.2: Manufacturer: Realtek
[32930.591766] usb 4-1.2: SerialNumber: 000001000000
[32930.739428] usb 4-1.2: reset SuperSpeed USB device number 3 using xhci_hcd
I *sometimes* get the following in the log and with that the ethernet port stops working.
Jun 12 19:00:04 ag13.local kernel: xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Jun 12 19:00:04 ag13.local kernel: xhci_hcd 0000:09:00.0: Looking for event-dma 00000001c3eec010 trb-start 00000001c3eebfe0 trb-end 00000001c3eebfe0 seg-start 00000001c3eeb000 seg-end 00000001c3eebff0
Jun 12 19:00:04 ag13.local kernel: xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Jun 12 19:00:04 ag13.local kernel: xhci_hcd 0000:09:00.0: Looking for event-dma 00000001c3eec020 trb-start 00000001c3eebfe0 trb-end 00000001c3eebfe0 seg-start 00000001c3eeb000 seg-end 00000001c3eebff0
Jun 12 19:00:04 ag13.local kernel: xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Jun 12 19:00:04 ag13.local kernel: xhci_hcd 0000:09:00.0: Looking for event-dma 00000001c3eec030 trb-start 00000001c3eebfe0 trb-end 00000001c3eebfe0 seg-start 00000001c3eeb000 seg-end 00000001c3eebff0
Jun 12 19:00:04 ag13.local kernel: xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Jun 12 19:00:04 ag13.local kernel: xhci_hcd 0000:09:00.0: Looking for event-dma 00000001c3eec040 trb-start 00000001c3eebfe0 trb-end 00000001c3eebfe0 seg-start 00000001c3eeb000 seg-end 00000001c3eebff0
Jun 12 19:00:04 ag13.local kernel: xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Jun 12 19:00:04 ag13.local kernel: xhci_hcd 0000:09:00.0: Looking for event-dma 00000001c3eec050 trb-start 00000001c3eebfe0 trb-end 00000001c3eebfe0 seg-start 00000001c3eeb000 seg-end 00000001c3eebff0
Jun 12 19:00:04 ag13.local kernel: xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Jun 12 19:00:04 ag13.local kernel: xhci_hcd 0000:09:00.0: Looking for event-dma 00000001c3eec060 trb-start 00000001c3eebfe0 trb-end 00000001c3eebfe0 seg-start 00000001c3eeb000 seg-end 00000001c3eebff0
Jun 12 19:00:04 ag13.local kernel: xhci_hcd 0000:09...
In Red Hat Bugzilla #1460789, Christian (christian-redhat-bugs) wrote : | #97 |
There is an upstream patch for the ASM1042A host controller[1] that has been reported to help with the issue (see corresponding launchpad issue[2]).
[1] http://
[2] https:/
Zhenfang Wei (kopkop) wrote : | #44 |
Hi, Kai-Heng
I tried to install the kernel on my XPS 9360 with the factory-equipped xenial image, but some drivers such as hid-multitouch and intel-vbutton are not compatible with kernel 4.11 (they only support kernel 4.4).
Is it possible to help building kernel 4.4 which is widely supported by Ubuntu 16.04 LTS?
Thanks
Mario Limonciello (superm1) wrote : Re: [Bug 1667750] Re: xhci_hcd: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13 | #45 |
FWIW, you don't need to use those drivers with 4.11 (they're backport
drivers). This will eventually need to be backported to 4.4 too, but it
needs to be accepted upstream first. Upstream developers had some feedback
that will need to be addressed, and the patch resubmitted, first.
On Tue, Jun 13, 2017, 11:36 Zhenfang Wei <email address hidden> wrote:
> Hi, Kai-Heng
> I tried to install the kernel on my XPS 9360 with factory-equipped
> xenial image, some drivers such as hid-multitouch and intel-vbutton are not
> compatible with kernel 4.11 (They only support kernel 4.4).
> Is it possible to help building kernel 4.4 which is widely supported by
> Ubuntu 16.04 LTS?
>
> Thanks
Mario Limonciello (superm1) wrote : | #46 |
v4 was resubmitted here (by a different author):
http://
tags: | added: originate-from-1696057 somerville |
Patrick Doyle (wpdster) wrote : | #47 |
- /etc/NetworkManager/dispatcher.d/pre-up.d script to limit the ethernet rate to 100Mbps (184 bytes, text/plain)
I too have this problem with a Dell Precision 7520 attached to a TB16. After reading this bug report, I wrote a simple script, tb16_speed_limit, which I have attached. Place this in /etc/NetworkMan
This might help @zwigno
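For readers without access to the attachment, a dispatcher script along these lines might look like the sketch below. It is a hypothetical stand-in, not the attached tb16_speed_limit script: it assumes NetworkManager passes the interface name as the first argument and the action as the second, and it identifies the adapter purely by its kernel driver, so it would also slow down any other r8152 adapter on the system, including ones that work fine at gigabit speed.

```shell
#!/bin/sh
# Hypothetical sketch of a NetworkManager dispatcher script for pre-up.d.
# $1 is the interface name, $2 is the action (e.g. "pre-up").

# Print the name of the kernel driver bound to an interface, read from sysfs.
# Fails if the interface has no device/driver link.
driver_of() {
    link=$(readlink "/sys/class/net/$1/device/driver" 2>/dev/null) || return 1
    basename "$link"
}

iface="${1:-}"
action="${2:-}"

# Only act just before an r8152-driven interface is brought up.
if [ "$action" = "pre-up" ] && [ "$(driver_of "$iface")" = "r8152" ]; then
    ethtool -s "$iface" speed 100 duplex full autoneg on
fi
```

Dispatcher scripts generally need to be executable and owned by root before NetworkManager will run them.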
Mario Limonciello (superm1) wrote : | #48 |
ASMedia submitted v5 here:
http://
In Red Hat Bugzilla #1460789, Christian (christian-redhat-bugs) wrote : | #98 |
I added the v5 of the patch[1] to a kernel, scratch build:
https:/
dragon788 (dragon788) wrote : | #49 |
Patrick, I definitely like your script as a quick workaround (as even 100Mb/s is better than our busy wireless), and I'm trying to make it a little more generic.
If you run `udevadm info -e | grep -A 10 '^P.*enx'` what does it show for ID_MODEL_ID and ID_NET_DRIVER and ID_NET_NAME? My system reports 8153 and 8152 and enp14s0u1u2 respectively for the TB16 and the TB15. The only difference when testing with a Dell D59GG USB-C ethernet adapter was the ID_NET_NAME changed because it wasn't going through the ASMedia USB hub anymore.
Using the ID_NET_NAME may be a better way to identify the device, as the MAC address changes per host (my MAC was the same when connected to the TB15, the TB16, and the USB-C adapter, thanks to some previous kernel patches that have gone upstream integrating the host fixed MAC), but the ID_NET_NAME should be the same for all the docks (and hopefully has a low probability of overlapping with USB2/USB3 gigabit ethernet adapters commonly in use). The fixed MAC is nice as it keeps the same IP if I change between docks or the USB-C adapter within our LAN.
Mario Limonciello (superm1) wrote : | #50 |
This landed in the USB maintainer's tree, targeted to be pulled into 4.13-rcX, and was also CC'ed to stable.
https:/
Can this be backported to Ubuntu now for the next SRU?
In Red Hat Bugzilla #1460789, Mario (mario-redhat-bugs) wrote : | #99 |
v5 has landed in the maintainer's tree (to target to 4.13-rcX) along with CC to stable.
https:/
Changed in linux (Fedora): | |
importance: | Undecided → Unknown |
status: | New → Unknown |
Changed in linux (Ubuntu): | |
assignee: | nobody → Kai-Heng Feng (kaihengfeng) |
Kai-Heng Feng (kaihengfeng) wrote : | #51 |
Guys, please test [1].
I compiled Xenial (4.4), Zesty (4.10), Artful (4.11) for testing.
André Düwel (aduewel) wrote : | #52 |
Thanks! I will test it on Zesty.
André Düwel (aduewel) wrote : | #53 |
Works for Zesty, great job! :)
l3iggs (l3iggs) wrote : | #54 |
I've tested 4.13rc1 and I don't see the fix there. Let's hope it hits rc2.
Bram Biesbrouck (b-m) wrote : | #55 |
I'm testing on Xenial and so far so good, even downloading large files at high speeds.
description: | updated |
Changed in linux (Ubuntu Artful): | |
status: | Confirmed → Fix Committed |
Bram Biesbrouck (b-m) wrote : | #56 |
I don't want to spoil the party, but the 4.11 kernel André provided works (downloading large files-wise) far better than the 4.4 one Kai-Heng prepared on my Ubuntu Xenial laptop connected to a TB16 docking station. I can only hope I'm the only one though, so I'll be waiting for a stable 4.13 release with the patch embedded to try again.
Kai-Heng Feng (kaihengfeng) wrote : | #57 |
Bram, can you be more specific?
Linux kernel 4.4 in comment #51 doesn't work for you?
Bram Biesbrouck (b-m) wrote : | #58 |
Yes, it works (I had been running it for a few days now), but I needed to download a large file today and noticed the speed dropped to +/- 100kbs, whereas switching to André's kernel gave me +/- 700kbs (tried twice, same results, nothing else changed).
Like I said, I don't know if it's something to do with other stuff fixed in kernel 4.11 (compared to 4.4, since I'm on a fairly new Dell XPS 15 2017), but I found it remarkable the speed accelerated dramatically when switching to the 4.11 kernel.
Kai-Heng Feng (kaihengfeng) wrote : | #59 |
I think it's a separate issue, there are some changes to the r8152 driver, tons of changes on the general network stack.
Can you provide the iperf output under 4.4 and 4.11?
I'll try to find which changes made the speed increase, using the TB15 I have on hand.
Also, can you point out where André's kernel is? I can't find it.
André Düwel (aduewel) wrote : | #60 |
Hi Bram, I never uploaded or provided a kernel :D
imperia (imperia777) wrote : | #61 |
Hello,
I still have the bug with the patch presented here and kernel 4.12.2. (I am on debian):
https:/
If that is the latest patch, it applies successfully on 4.12.2 but xhci_hcd still crashes:
[41806.785462] xhci_hcd 0000:00:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 0 comp_code 3
[41806.785494] xhci_hcd 0000:00:00.0: Looking for event-dma 00000001a05eac90 trb-start 00000001a05eacb0 trb-end 00000001a05eacd0 seg-start 00000001a05ea000 seg-end 00000001a05eaff0
device is 1b21:1242
If that is the correct patch then I have /sys/kernel/
Just tell me where to upload them because they are big.
I am using USB TV Tuner card!!! Not Ethernet device.
Kai-Heng Feng (kaihengfeng) wrote : | #62 |
imperia,
If it's ASM1142 from comment #25, then it's different to ASM1042A (which is used in TB16).
Also, ftrace won't do much here - as you can see the patch writes magic values to registers, which only hardware vendor knows.
Bram Biesbrouck (b-m) wrote : | #63 |
My bad André, you're right. I was referring to the 4.11 kernel from #38, which is indeed also submitted by Kai-Heng.
Do you have some specific iperf commands you want me to run, or is a simple server/client test (-s and -c) enough?
imperia (imperia777) wrote : | #64 |
Thanks for your answer Kai-Heng.
I am not sure what ftrace is. I used these commands to get the logs:
mount -t debugfs none /sys/kernel/debug
echo xhci-hcd >> /sys/kernel/
cat /sys/kernel/
Can they be used to debug this? Or is there another procedure?
I understand that my device is different but I believe it has the same or very similar bug.
And I am seeking help. I don't know who to contact for this bug to be finally fixed.
Any suggestions are welcome.
Thanks.
Kai-Heng Feng (kaihengfeng) wrote : | #65 |
The way you got the log is via ftrace.
No, they are not useful if the issue you have is similar to this one.
You should raise the issue to linux-usb mailing list.
imperia (imperia777) wrote : | #66 |
Ok.
Can you please tell me the homepage of linux-usb mailing list so I can subscribe. Google returned many results:
spinics linux-usb marc.info .. which one is the correct and current linux-usb mailing list home page.
Bram Biesbrouck (b-m) wrote : | #67 |
Kai-Heng, here's the iperf of kernel 4.11:
bram@escher:~$ uname -a
Linux escher 4.11.0-6-generic #11~dell+tb+dock SMP Mon Jun 12 11:52:04 CST 2017 x86_64 x86_64 x86_64 GNU/Linux
bram@escher:~$ iperf -c 192.168.10.5
-------
Client connecting to 192.168.10.5, TCP port 5001
TCP window size: 85.0 KByte (default)
-------
[ 3] local 192.168.10.124 port 38706 connected with 192.168.10.5 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 730 MBytes 612 Mbits/sec
bram@escher:~$
Bram Biesbrouck (b-m) wrote : | #68 |
And here it is of kernel 4.4:
bram@escher:~$ uname -a
Linux escher 4.4.0-87-generic #110-Ubuntu SMP Tue Jul 18 12:55:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
bram@escher:~$ iperf -c 192.168.10.5
-------
Client connecting to 192.168.10.5, TCP port 5001
TCP window size: 85.0 KByte (default)
-------
[ 3] local 192.168.10.124 port 42372 connected with 192.168.10.5 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 670 MBytes 561 Mbits/sec
bram@escher:~$
Seems like there's not much of a difference, so I might be wrong (I initially tested with a public FTP of a client in Spain that I have no access to).
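For anyone cross-checking the two iperf runs above, the reported bandwidth can be re-derived from the transfer size and the interval. This is only a sketch: the helper name `mbits_per_sec` is made up for illustration, and it assumes iperf's usual convention of counting MBytes in units of 1024*1024 bytes and Mbits in units of 10^6 bits.

```sh
# Hypothetical helper: convert an iperf "Transfer" figure (MBytes) and an
# interval (seconds) into Mbits/sec.
# Assumption: 1 MByte = 1048576 bytes, 1 Mbit = 1000000 bits.
mbits_per_sec() {
    awk -v mb="$1" -v secs="$2" \
        'BEGIN { printf "%.0f\n", mb * 1048576 * 8 / secs / 1000000 }'
}

mbits_per_sec 730 10   # 4.11 run
mbits_per_sec 670 10   # 4.4 run
```

Under those assumptions this gives about 612 for the 4.11 run and about 562 for the 4.4 run, in line with what iperf printed. The one-unit difference for 4.4 is presumably down to rounding of the displayed transfer size and interval.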
Kai-Heng Feng (kaihengfeng) wrote : | #69 |
@imperia
<email address hidden>
@Bram
Seems so. Probably there were other factors that affected the speed.
Mario Limonciello (superm1) wrote : | #70 |
The most important part is that there are no longer any kernel oopses, tracebacks
or errors when the USB host controller has been configured the way it is
in the patch.
On Tue, Jul 25, 2017, 08:58 Kai-Heng Feng <email address hidden>
wrote:
> @imperia
> <email address hidden>
>
> @Bram
> Seems so. Probably there were other factors that affected the speed.
l3iggs (l3iggs) wrote : | #71 |
I've just tested the 4.13-rc2 kernel and the issue appears to be solved.
I assume the patch that fixes it was merged upstream sometime between 4.13-rc1 and 4.13-rc2.
In Red Hat Bugzilla #1460789, Martin (martin-redhat-bugs) wrote : | #100 |
After an initial hiccup with the LAN cable in the dock (and plugging it into a different socket), the performance is now much better (not sure if I can say it's perfect, yet) using the patched kernel.
Thanks!
Changed in linux (Ubuntu Xenial): | |
status: | New → In Progress |
status: | In Progress → Fix Committed |
Changed in linux (Ubuntu Zesty): | |
status: | New → Fix Committed |
Bram Biesbrouck (b-m) wrote : | #72 |
Aw, I seem to have accidentally changed the Xenial status to Released which was not my intention at all. Can somebody with more rights than me revert this back to Committed please?
Changed in linux (Ubuntu Xenial): | |
status: | Fix Committed → Fix Released |
Changed in linux (Ubuntu Xenial): | |
status: | Fix Released → Fix Committed |
Launchpad Janitor (janitor) wrote : | #73 |
This bug was fixed in the package linux - 4.11.0-13.19
---------------
linux (4.11.0-13.19) artful; urgency=low
* CVE-2017-7533
- dentry name snapshots
linux (4.11.0-12.18) artful; urgency=low
* linux: 4.11.0-12.18 -proposed tracker (LP: #1707635)
- no change rebuild to pick up the new binutils.
* Adt tests of src:linux time out often on armhf lxc containers (LP: #1705495)
- [Packaging] tests -- reduce rebuild test to one flavour
- [Packaging] tests -- reduce rebuild test to one flavour -- use filter
* [ARM64] config EDAC_GHES=y depends on EDAC_MM_EDAC=y (LP: #1706141)
- [Config] set EDAC_MM_EDAC=y for ARM64
* [Hyper-V] hv_netvsc: Exclude non-TCP port numbers from vRSS hashing
(LP: #1690174)
- hv_netvsc: Exclude non-TCP port numbers from vRSS hashing
* ath10k doesn't report full RSSI information (LP: #1706531)
- ath10k: add per chain RSSI reporting
* ideapad_laptop don't support v310-14isk (LP: #1705378)
- platform/x86: ideapad-laptop: Add several models to no_hw_rfkill
* Ubuntu 16.04.3: Qemu fails on P9 (LP: #1686019)
- KVM: PPC: Pass kvm* to kvmppc_find_table()
- KVM: PPC: Use preregistered memory API to access TCE list
- KVM: PPC: VFIO: Add in-kernel acceleration for VFIO
- powerpc/
- powerpc/
- powerpc/
- powerpc/mmu: Add real mode support for IOMMU preregistered memory
- KVM: PPC: Reserve KVM_CAP_
- KVM: PPC: Book3S HV: Add radix checks in real-mode hypercall handlers
* hns: ethtool selftest crashes system (LP: #1705712)
- net/hns:bugfix of ethtool -t phy self_test
* ThunderX: soft lockup on 4.8+ kernels when running qemu-efi with vhost=on
(LP: #1673564)
- KVM: arm/arm64: vgic-v3: Use PREbits to infer the number of ICH_APxRn_EL2
registers
- KVM: arm/arm64: vgic-v3: Fix nr_pre_bits bitfield extraction
- arm64: Add a facility to turn an ESR syndrome into a sysreg encoding
- KVM: arm/arm64: vgic-v3: Add accessors for the ICH_APxRn_EL2 registers
- KVM: arm64: Make kvm_condition_
- KVM: arm64: vgic-v3: Add hook to handle guest GICv3 sysreg accesses at EL2
- KVM: arm64: vgic-v3: Add ICV_BPR1_EL1 handler
- KVM: arm64: vgic-v3: Add ICV_IGRPEN1_EL1 handler
- KVM: arm64: vgic-v3: Add ICV_IAR1_EL1 handler
- KVM: arm64: vgic-v3: Add ICV_EOIR1_EL1 handler
- KVM: arm64: vgic-v3: Add ICV_AP1Rn_EL1 handler
- KVM: arm64: vgic-v3: Add ICV_HPPIR1_EL1 handler
- KVM: arm64: vgic-v3: Enable trapping of Group-1 system registers
- KVM: arm64: Enable GICv3 Group-1 sysreg trapping via command-line
- KVM: arm64: vgic-v3: Add ICV_BPR0_EL1 handler
- KVM: arm64: vgic-v3: Add ICV_IGNREN0_EL1 handler
- KVM: arm64: vgic-v3: Add misc Group-0 handlers
- KVM: arm64: vgic-v3: Enable trapping of Group-0 system registers
- KVM: arm64: Enable GICv3 Group-0 sysreg trapping via command-line
- arm64: Add MIDR values for Cavium cn83XX SoCs
- arm64: Add wor...
Changed in linux (Ubuntu Artful): | |
status: | Fix Committed → Fix Released |
Kleber Sacilotto de Souza (kleber-souza) wrote : | #74 |
This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-
If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.
See https:/
tags: | added: verification-needed-xenial |
tags: | added: verification-needed-zesty |
Kleber Sacilotto de Souza (kleber-souza) wrote : | #75 |
This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-
If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.
See https:/
Corey Schuhen (cschuhen) wrote : | #76 |
The issue appears solved for me with 4.10.0-33 on zesty. I assume that's an open invitation for anybody to change the tag to verified.
tags: |
added: verification-done-zesty removed: verification-needed-zesty |
Iron Davey (idb151) wrote : | #77 |
I've tried this with the proposed xenial 4.11.0-13 kernel and it seems to be working for me. I'm running a Precision 7510 with bios 1.13.6 . I initially had a problem where the network card wasn't showing up, but re-read earlier posts about disabling the thunderbolt dock security feature in the BIOS. The only test I've done is to download the 16.04.3 iso image, and saw a peak of 11MB/s and a low of ~6MB/s.
As an aside, the TB16 I'm testing just showed up today, so I'm assuming it has the latest firmware ;)
=======
HTTP request sent, awaiting response... 200 OK
Length: 1587609600 (1.5G) [application/
Saving to: ‘ubuntu-
ubuntu-16.04.3-desk 100%[==
2017-08-18 15:13:37 (7.72 MB/s) - ‘ubuntu-
Iron Davey (idb151) wrote : | #78 |
I need to slightly amend my previous post. When I plugged the dock in this morning, the network card showed that it picked up an IP address, but wasn't routing any packets. I could disable and re-enable it to my heart's content, but network activity wasn't working. A reboot cleared it up. I'm not sure what was causing it, but I never had this problem with the E-Port 2 dock, and I don't have any special firewall, routing, or network settings.
Mario Limonciello (superm1) wrote : | #79 |
@Iron Davey I believe that to be a separate issue and unrelated. The issue reported here specifically is resolved by the patch. If you can reproduce that problem again, please file a separate issue for it.
tags: |
added: verification-done-xenial removed: verification-needed-xenial |
In Red Hat Bugzilla #1460789, Christian (christian-redhat-bugs) wrote : | #101 |
For future reference, the mentioned patch got merged upstream, as commit 9da5a1092b13468
It also made it into stable, 4.12.4 I believe, as 5cc9b698a494827
So I think this should be fixed (or at least better) in F26, because we currently ship 4.12.5-
In Red Hat Bugzilla #1460789, Jiri (jiri-redhat-bugs) wrote : | #102 |
(In reply to Christian Kellner from comment #5)
> For future reference, the mentioned patch got merged upstream, as commit
> 9da5a1092b13468
> It also made it into stable, 4.12.4 I believe, as
> 5cc9b698a494827
>
> So I think this should be fixed (or at least better) in F26, because we
> currently ship 4.12.5-
The network works, but sadly it corrupts packets. Martin says that because of it he has difficulty downloading things, connecting to services...
In Red Hat Bugzilla #1460789, Mario (mario-redhat-bugs) wrote : | #103 |
@Jiri,
Are you sure that's a result of this patch? This is the first report I've heard of that.
In Red Hat Bugzilla #1460789, Christian (christian-redhat-bugs) wrote : | #104 |
@Mario, I think what Jiri means is that without the patch it doesn't work well at all but even with the patch the situation is not perfect. Let me cc Benjamin, maybe we can add a test in our Fedora Hardware test suite for that. We still have the TB16 dock in Munich right now, maybe we can be of help.
In Red Hat Bugzilla #1460789, Jiri (jiri-redhat-bugs) wrote : | #105 |
I'll let Martin speak for himself because it was him who complained about it to me.
I've been using kernel 4.12.8 which should have the patch included since the morning and haven't experienced any noticeable problems with the network.
In Red Hat Bugzilla #1460789, Martin (martin-redhat-bugs) wrote : | #106 |
Yes, for me, the Ethernet on the Docks is pretty broken. For example, when downloading a whole Koji build with about 13 packages, each time the download got broken at about the 4th or 5th package, with (I think) an SSL handshake error. Also when downloading a Fedora ISO 4 times in a row, each of them got corrupted (md5 check just didn't pass).
Also, the USB performance of the dock is terrible, I'm not sure if this is related to the issue the patch in question is supposed to solve but after updating the laptop firmware to 1.2.1.0, my mouse and keyboard get disconnected very often. On the other hand, dock audio works just fine and one would assume all of these devices are on the same USB hub.
I'm currently working around this by plugging a USB-C adapter with ethernet into the Thunderbolt port on the docking station.
In Red Hat Bugzilla #1460789, Benjamin (benjamin-redhat-bugs) wrote : | #107 |
Martin, could you maybe try disabling RX checksum offloading and see if that helps? Then the corrupted packets should be discarded by the kernel (even if they are only corrupted during the transfer over USB). i.e. try again after running:
ethtool --offload $DEVICE rx off
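Before and after running the command above, it can help to confirm what the driver actually reports. A minimal sketch, assuming `ethtool` is installed and that `ethtool --show-offload` prints an `rx-checksumming:` line for the interface (the exact output may vary by driver); `rx_csum_state` is a hypothetical helper name.

```sh
# Hypothetical helper: read `ethtool --show-offload` output on stdin and
# print the state of RX checksum offloading ("on" or "off").
rx_csum_state() {
    awk '$1 == "rx-checksumming:" { print $2; exit }'
}

# Usage (DEVICE is whatever name the dock's Ethernet interface has):
#   ethtool --show-offload "$DEVICE" | rx_csum_state
#   ethtool --offload "$DEVICE" rx off
#   ethtool --show-offload "$DEVICE" | rx_csum_state
```

Note that a setting changed this way is generally not persistent, so it will likely need to be reapplied after a reboot or after re-plugging the dock.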
In Red Hat Bugzilla #1460789, Mario (mario-redhat-bugs) wrote : | #108 |
@Martin
Just to make sure - this is a TB16 not TB15 right? This is sounding suspiciously like a hardware problem to me.
In Red Hat Bugzilla #1460789, Jiri (jiri-redhat-bugs) wrote : | #109 |
(In reply to Mario Limonciello from comment #12)
> @Martin
>
> Just to make sure - this is a TB16 not TB15 right? This is sounding
> suspiciously like a hardware problem to me.
It's TB16.
You mean the ethernet or USB problem? I think we've started mixing two (most likely) unrelated problems. I have not been able to reproduce the ethernet problem for the whole day. Martin also has Windows 10 installed on his XPS 13, so he could try it there and if the problem still occurs it's very likely a hardware problem.
The USB one doesn't seem like a hardware problem because I'm affected by that, too, after the last firmware update. Devices connected to the USB ports don't work at all or just for a short period of time after they're plugged in.
In Red Hat Bugzilla #1460789, Mario (mario-redhat-bugs) wrote : | #110 |
Well I'm not sure if they're related, but since the Ethernet device is a USB device on the hub, I would suspect them to be.
Can you please clarify on which XPS machines you guys are affected? There are at least 4 different XPS models that support TB16.
Please comment your last working and last failed BIOS versions too.
In Red Hat Bugzilla #1460789, Jiri (jiri-redhat-bugs) wrote : | #111 |
We both have XPS 13 9360. I had problems with Ethernet from the very beginning until I used a patched kernel. But after updating the firmware to 1.3.7 USB devices stopped working*. Now we're on 2.1.0 and they still don't work, no matter if we use the kernel patch or not. I have to have a USB hub connected directly to the laptop. The last working firmware for me was 1.3.5.
* It really depends on the type of the device. The mouse and keyboard don't work at all or just for a very short time after plugging in. I also have a USB sound card. It seems to work, the system identifies the sound card as an audio output, it plays sound, but there are audible corruptions (cracks etc) which don't occur when the sound card is connected directly to the laptop. What I'm experiencing with sound may be similar to what Martin is experiencing with the Ethernet.
In Red Hat Bugzilla #1460789, Mario (mario-redhat-bugs) wrote : | #112 |
Ah OK thanks. I just poked around the Dell forums a little bit and you guys aren't the first ones reporting this on 9360 after upgrade.
http://
I'll poke some of the Dell support guys to look at this, it sounds like it might have slipped through the cracks.
I also checked internally on what went into 1.3.6/1.3.7.
At least 1.3.6 had some tweaks for addressing noise which would be most suspicious to me as a possible impact.
For now, can you two downgrade to 1.3.5? Fwupd probably won't let you, but you can place the .EXE file on a FAT32 partition and do it from F12 menu at POST I expect.
In Red Hat Bugzilla #1460789, Jiri (jiri-redhat-bugs) wrote : | #113 |
We'll try to downgrade for the time being. BTW I also reported the issue to @DellCaresPRO like Barton George instructed me on Twitter. They said 10 days ago they had people looking into it, but there hasn't been any update since then, so I have no idea if someone is really looking into it and if they've made any progress, and who is "they".
In Red Hat Bugzilla #1460789, Mario (mario-redhat-bugs) wrote : | #114 |
I won't be able to shortcut the process by pinging people, but I understand this is being investigated, it will just take some time.
In Red Hat Bugzilla #1460789, Martin (martin-redhat-bugs) wrote : | #115 |
(In reply to Benjamin Berg from comment #11)
> Martin, could you maybe try disabling RX checksum offloading and see if that
> helps? Then the corrupted packets should be discarded by the kernel (even
> if they are only corrupted during the transfer over USB). i.e. try again
> after running:
>
> ethtool --offload $DEVICE rx off
With this, it seems to work alright, thanks! Kernel 4.13.0-
(In reply to Mario Limonciello from comment #16)
> For now, can you two downgrade to 1.3.5? Fwupd probably won't let you, but
> you can place the .EXE file on a FAT32 partition and do it from F12 menu at
> POST I expect.
I'm able to function this way so I'll probably not go for that - unless it'll be necessary to verify it actually happened between the mentioned versions.
I'd rather track if there's a new release and then upgrade when it's out and see if it fixes the USB problem.
Launchpad Janitor (janitor) wrote : | #80 |
This bug was fixed in the package linux - 4.4.0-93.116
---------------
linux (4.4.0-93.116) xenial; urgency=low
* linux: 4.4.0-93.116 -proposed tracker (LP: #1709296)
* Creating conntrack entry failure with kernel 4.4.0-89 (LP: #1709032)
- Revert "Revert "netfilter: synproxy: fix conntrackd interaction""
- netfilter: nf_ct_ext: fix possible panic after nf_ct_extend_
* CVE-2017-1000112
- Revert "udp: consistently apply ufo or fragmentation"
- udp: consistently apply ufo or fragmentation
* CVE-2017-1000111
- Revert "net-packet: fix race in packet_set_ring on PACKET_RESERVE"
- packet: fix tp_reserve race in packet_set_ring
* kernel BUG at [tty_ldisc_reinit] mm/slub.c! (LP: #1709126)
- tty: Simplify tty_set_ldisc() exit handling
- tty: Reset c_line from driver's init_termios
- tty: Handle NULL tty->ldisc
- tty: Move tty_ldisc_kill()
- tty: Use 'disc' for line discipline index name
- tty: Refactor tty_ldisc_reinit() for reuse
- tty: Destroy ldisc instance on hangup
* atheros bt failed after S3 (LP: #1706833)
- SAUCE: Bluetooth: Make request workqueue freezable
* The Precision Touchpad(PTP) button sends incorrect event code (LP: #1708372)
- HID: multitouch: handle external buttons for Precision Touchpads
* Set CONFIG_
- [Config] CONFIG_
* xfs slab objects (memory) leak when xfs shutdown is called (LP: #1706132)
- xfs: fix xfs_log_ticket leak in xfs_end_io() after fs shutdown
* Adt tests of src:linux time out often on armhf lxc containers (LP: #1705495)
- [Packaging] tests -- reduce rebuild test to one flavour
* CVE-2017-7495
- ext4: fix data exposure after a crash
* ubuntu/rsi driver downlink wifi throughput drops to 5-6 Mbps when BT
keyboard is connected (LP: #1706991)
- SAUCE: Redpine: enable power save by default for coex mode
- SAUCE: Redpine: uapsd configuration changes
* [Hyper-V] hv_netvsc: Exclude non-TCP port numbers from vRSS hashing
(LP: #1690174)
- hv_netvsc: Exclude non-TCP port numbers from vRSS hashing
* ath10k doesn't report full RSSI information (LP: #1706531)
- ath10k: add per chain RSSI reporting
* ideapad_laptop don't support v310-14isk (LP: #1705378)
- platform/x86: ideapad-laptop: Add several models to no_hw_rfkill
* [8087:0a2b] Failed to load bluetooth firmware(might affect some other Intel
bt devices) (LP: #1705633)
- Bluetooth: btintel: Create common Intel Version Read function
- Bluetooth: Use switch statement for Intel hardware variants
- Bluetooth: Replace constant hw_variant from Intel Bluetooth firmware
filename
- Bluetooth: hci_intel: Fix firmware file name to use hw_variant
- Bluetooth: btintel: Add MODULE_FIRMWARE entries for iBT 3.5 controllers
* xhci_hcd: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2
comp_code 13 (LP: #1667750)
- xhci: Bad Ethernet performance plugged in ASM1042A host
* OpenPower: Some multipaths temporarily have only a single path
(LP: #1696445)
- scsi: ses: don't get power status of SES device slot on probe
...
Changed in linux (Ubuntu Xenial): | |
status: | Fix Committed → Fix Released |
Launchpad Janitor (janitor) wrote : | #81 |
This bug was fixed in the package linux - 4.10.0-33.37
---------------
linux (4.10.0-33.37) zesty; urgency=low
* linux: 4.10.0-33.37 -proposed tracker (LP: #1709303)
* CVE-2017-1000112
- Revert "udp: consistently apply ufo or fragmentation"
- udp: consistently apply ufo or fragmentation
* CVE-2017-1000111
- Revert "net-packet: fix race in packet_set_ring on PACKET_RESERVE"
- packet: fix tp_reserve race in packet_set_ring
* ThunderX: soft lockup on 4.8+ kernels when running qemu-efi with vhost=on
(LP: #1673564)
- irqchip/gic-v3: Add missing system register definitions
- arm64: KVM: Do not use stack-protector to compile EL2 code
- KVM: arm/arm64: vgic-v3: Use PREbits to infer the number of ICH_APxRn_EL2
registers
- KVM: arm/arm64: vgic-v3: Fix nr_pre_bits bitfield extraction
- arm64: Add a facility to turn an ESR syndrome into a sysreg encoding
- KVM: arm/arm64: vgic-v3: Add accessors for the ICH_APxRn_EL2 registers
- KVM: arm64: Make kvm_condition_
- KVM: arm64: vgic-v3: Add hook to handle guest GICv3 sysreg accesses at EL2
- KVM: arm64: vgic-v3: Add ICV_BPR1_EL1 handler
- KVM: arm64: vgic-v3: Add ICV_IGRPEN1_EL1 handler
- KVM: arm64: vgic-v3: Add ICV_IAR1_EL1 handler
- KVM: arm64: vgic-v3: Add ICV_EOIR1_EL1 handler
- KVM: arm64: vgic-v3: Add ICV_AP1Rn_EL1 handler
- KVM: arm64: vgic-v3: Add ICV_HPPIR1_EL1 handler
- KVM: arm64: vgic-v3: Enable trapping of Group-1 system registers
- KVM: arm64: Enable GICv3 Group-1 sysreg trapping via command-line
- KVM: arm64: vgic-v3: Add ICV_BPR0_EL1 handler
- KVM: arm64: vgic-v3: Add ICV_IGNREN0_EL1 handler
- KVM: arm64: vgic-v3: Add misc Group-0 handlers
- KVM: arm64: vgic-v3: Enable trapping of Group-0 system registers
- KVM: arm64: Enable GICv3 Group-0 sysreg trapping via command-line
- arm64: Add MIDR values for Cavium cn83XX SoCs
- [Config] CONFIG_
- arm64: Add workaround for Cavium Thunder erratum 30115
- KVM: arm64: vgic-v3: Add ICV_DIR_EL1 handler
- KVM: arm64: vgic-v3: Add ICV_RPR_EL1 handler
- KVM: arm64: vgic-v3: Add ICV_CTLR_EL1 handler
- KVM: arm64: vgic-v3: Add ICV_PMR_EL1 handler
- KVM: arm64: Enable GICv3 common sysreg trapping via command-line
- KVM: arm64: vgic-v3: Log which GICv3 system registers are trapped
- arm64: KVM: Make unexpected reads from WO registers inject an undef
- KVM: arm64: Log an error if trapping a read-from-
- KVM: arm64: Log an error if trapping a write-to-read-only GICv3 access
* ibmvscsis: Do not send aborted task response (LP: #1689365)
- target: Fix unknown fabric callback queue-full errors
- ibmvscsis: Do not send aborted task response
- ibmvscsis: Clear left-over abort_cmd pointers
- ibmvscsis: Fix the incorrect req_lim_delta
* hisi_sas performance improvements (LP: #1708734)
- scsi: hisi_sas: define hisi_sas_
- scsi: hisi_sas: optimise the usage of hisi_hba.lock
- scsi: hisi_sas: relocate sata_done_v2_hw()
- scsi: hisi_sas: optimise DMA slot memory
* hisi_sas...
Changed in linux (Ubuntu Zesty): | |
status: | Fix Committed → Fix Released |
Johann Hartwig Hauschild (hardy) wrote : | #83 |
Hi,
I don't have the error messages in the kern.log, but flipping bits ...
$ uname -a
Linux elaine 4.10.0-33-generic #37-Ubuntu SMP Fri Aug 11 10:55:28 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Ethernet via TB16:
$ for i in 1 2 3 4; do curl -s http://
807c26430cc62c8
1971174b82abbfe
28e15ec270bdc4b
f7149fe9015467e
Ethernet via builtin:
$ for i in 1 2 3 4; do curl -s http://
4672ce371fb3c11
4672ce371fb3c11
4672ce371fb3c11
4672ce371fb3c11
Same cable, same switch
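The comparison above can be wrapped in a small script. This is a sketch under assumptions: the URL in the comment is truncated, so the URL here is a placeholder the reader must supply, and `md5sum` is assumed as the checksum tool since the original pipeline is not fully visible. Both function names are hypothetical.

```sh
# Download the same file N times and print one checksum per download.
# Assumptions: curl and md5sum are available; $1 is a URL you supply.
checksum_runs() {
    url="$1"
    runs="${2:-4}"
    i=0
    while [ "$i" -lt "$runs" ]; do
        curl -s "$url" | md5sum | awk '{ print $1 }'
        i=$((i + 1))
    done
}

# Count how many different checksums appear on stdin.
# 1 means every download was identical; more than 1 means the data
# differed between downloads.
count_distinct() {
    sort -u | awk 'END { print NR }'
}

# Usage:
#   checksum_runs "$URL" 4 | count_distinct
```

Running it once over the dock's Ethernet port and once over another interface, with the same cable and switch, would reproduce the comparison in this comment.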
dino99 (9d9) wrote : | #84 |
:::::::
4.12 IS NOT FIXED
so the needed patch is not automatically added with the latest kernels
https:/
Kai-Heng Feng (kaihengfeng) wrote : | #85 |
Johann Hartwig Hauschild,
Can you file a new bug?
Johann Hartwig Hauschild (hardy) wrote : | #86 |
Will do, we're verifying that it's not firmware-related.
In Red Hat Bugzilla #1460789, Martin (martin-redhat-bugs) wrote : | #116 |
Every now and then (especially when downloading large files), the ethernet simply stops working with the following log in dmesg.
Unloading the r8152 module results in gnome-shell dying. After reloading it, ethernet still doesn't work. Disconnecting the Dock in this state kills everything from GDM down to my user session.
[159642.248648] pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
[159642.248666] pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e0(Transmitter ID)
[159642.248680] pcieport 0000:00:1c.0: device [8086:9d10] error status/
[159642.248690] pcieport 0000:00:1c.0: [12] Replay Timer Timeout
[159661.087306] xhci_hcd 0000:0a:00.0: port 1 resume PLC timeout
[159667.687492] xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
[159667.687514] xhci_hcd 0000:09:00.0: Looking for event-dma 00000004694bc010 trb-start 00000003a0cfefe0 trb-end 00000003a0cfefe0 seg-start 00000003a0cfe000 seg-end 00000003a0cfeff0
[159667.687610] xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
[159667.687627] xhci_hcd 0000:09:00.0: Looking for event-dma 00000004694bc020 trb-start 00000003a0cfefe0 trb-end 00000003a0cfefe0 seg-start 00000003a0cfe000 seg-end 00000003a0cfeff0
[159667.687722] xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
[159667.687735] xhci_hcd 0000:09:00.0: Looking for event-dma 00000004694bc030 trb-start 00000003a0cfefe0 trb-end 00000003a0cfefe0 seg-start 00000003a0cfe000 seg-end 00000003a0cfeff0
[159667.687829] xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
[159667.687838] xhci_hcd 0000:09:00.0: Looking for event-dma 00000004694bc040 trb-start 00000003a0cfefe0 trb-end 00000003a0cfefe0 seg-start 00000003a0cfe000 seg-end 00000003a0cfeff0
[159667.687971] xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
[159667.687988] xhci_hcd 0000:09:00.0: Looking for event-dma 00000004694bc050 trb-start 00000003a0cfefe0 trb-end 00000003a0cfefe0 seg-start 00000003a0cfe000 seg-end 00000003a0cfeff0
[159667.723135] xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
[159667.723158] xhci_hcd 0000:09:00.0: Looking for event-dma 00000004694bc060 trb-start 00000003a0cfefe0 trb-end 00000003a0cfefe0 seg-start 00000003a0cfe000 seg-end 00000003a0cfeff0
[159667.723202] xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
[159667.723219] xhci_hcd 0000:09:00.0: Looking for event-dma 00000004694bc070 trb-start 00000003a0cfefe0 trb-end 00000003a0cfefe0 ...
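When reading the "Looking for event-dma" lines above, it may help to check where the reported event pointer falls relative to the current TD and the ring segment. The following is only an illustrative sketch, not anything taken from the kernel: `classify_event` is a hypothetical helper, and it assumes the five hex values are passed in the order they appear in the log line (event-dma, trb-start, trb-end, seg-start, seg-end). It also ignores the case where a TD wraps across segments.

```sh
# Hypothetical helper: classify an xhci "event-dma" address.
# Arguments are hex strings without a 0x prefix, as printed in dmesg:
#   $1 event-dma  $2 trb-start  $3 trb-end  $4 seg-start  $5 seg-end
classify_event() {
    ev=$((0x$1)); ts=$((0x$2)); te=$((0x$3))
    ss=$((0x$4)); se=$((0x$5))
    if [ "$ev" -ge "$ts" ] && [ "$ev" -le "$te" ]; then
        echo "in-td"
    elif [ "$ev" -ge "$ss" ] && [ "$ev" -le "$se" ]; then
        echo "in-segment"
    else
        echo "outside-segment"
    fi
}

# First "Looking for" line from the log above:
classify_event 00000004694bc010 00000003a0cfefe0 00000003a0cfefe0 \
               00000003a0cfe000 00000003a0cfeff0
```

For that line the helper reports `outside-segment`: the event pointer lies beyond both the TD and the segment the driver was looking at.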
In Red Hat Bugzilla #1460789, Mario (mario-redhat-bugs) wrote : | #117 |
As I understand it, the particular problem linked with the issue in BIOS 1.3.6/1.3.7 adjusts a voltage regulator (to fix something else; this was an unanticipated/
Daniel Aden (bigcmos) wrote : | #87 |
Same issue in kernel 4.4. Is there a backport for 16.04?
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.3 LTS
Release: 16.04
Codename: xenial
$ uname -a
Linux rig01 4.4.0-92-generic #115-Ubuntu SMP Thu Aug 10 09:04:33 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
$ dmesg
[81862.720112] xhci_hcd 0000:0d:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[81862.720120] xhci_hcd 0000:0d:00.0: Looking for event-dma 0000001fa64e2110 trb-start 0000001fa64e20f0 trb-end 0000001fa64e20f0 seg-start 0000001fa64e2000 seg-end 0000001fa64e2ff0
[81862.720299] xhci_hcd 0000:0d:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[81862.720302] xhci_hcd 0000:0d:00.0: Looking for event-dma 0000001fa64e2120 trb-start 0000001fa64e20f0 trb-end 0000001fa64e20f0 seg-start 0000001fa64e2000 seg-end 0000001fa64e2ff0
[81862.720687] xhci_hcd 0000:0d:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[81862.720694] xhci_hcd 0000:0d:00.0: Looking for event-dma 0000001fa64e2130 trb-start 0000001fa64e20f0 trb-end 0000001fa64e20f0 seg-start 0000001fa64e2000 seg-end 0000001fa64e2ff0
[81862.720911] xhci_hcd 0000:0d:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[81862.720914] xhci_hcd 0000:0d:00.0: Looking for event-dma 0000001fa64e2140 trb-start 0000001fa64e20f0 trb-end 0000001fa64e20f0 seg-start 0000001fa64e2000 seg-end 0000001fa64e2ff0
[81862.722506] xhci_hcd 0000:0d:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[81862.722513] xhci_hcd 0000:0d:00.0: Looking for event-dma 0000001fa64e2150 trb-start 0000001fa64e20f0 trb-end 0000001fa64e20f0 seg-start 0000001fa64e2000 seg-end 0000001fa64e2ff0
[81862.722584] xhci_hcd 0000:0d:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[81862.722588] xhci_hcd 0000:0d:00.0: Looking for event-dma 0000001fa64e2160 trb-start 0000001fa64e20f0 trb-end 0000001fa64e20f0 seg-start 0000001fa64e2000 seg-end 0000001fa64e2ff0
$ lsusb -s 006:002 -v
Bus 006 Device 002: ID 0b95:1790 ASIX Electronics Corp. AX88179 Gigabit Ethernet
Device Descriptor:
bLength 18
bDescriptorType 1
bcdUSB 3.00
bDeviceClass 255 Vendor Specific Class
bDeviceSubClass 255 Vendor Specific Subclass
bDeviceProtocol 0
bMaxPacketSize0 9
idVendor 0x0b95 ASIX Electronics Corp.
idProduct 0x1790 AX88179 Gigabit Ethernet
bcdDevice 1.00
iManufacturer 1 ASIX Elec. Corp.
iProduct 2 AX88179
iSerial 3 0000249B1E94E5
bNumConfigura
Configuration Descriptor:
bLength 9
bDescriptorType 2
wTotalLength 57
bNumInterfaces 1
bConfigurat
iConfiguration 0
bmAttributes 0xa0
(Bus Powered)
Remote Wakeup
MaxPower 124mA
Interface Descriptor:
bLength 9
bDescript
bInterfac
...
André Düwel (aduewel) wrote : | #88 |
@Daniel Aden: The fix was released in kernel 4.4.0-93.116, please update ;)
You posted that you are using 4.4.0-92-generic #115.
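To check this on an affected machine, the running kernel's package version can be compared against the fixed one. A rough sketch, assuming GNU `sort` with `-V` (version sort) is available; `version_ge` is a hypothetical helper, and mapping `4.4.0-92-generic #115` to the package version `4.4.0-92.115` is an assumption based on the usual Ubuntu numbering.

```sh
# Hypothetical helper: succeed if version $1 is >= version $2.
# Assumption: `sort -V` (GNU coreutils version sort) is available.
version_ge() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

fixed="4.4.0-93.116"
for v in 4.4.0-92.115 4.4.0-93.116; do
    if version_ge "$v" "$fixed"; then
        echo "$v: includes the fix"
    else
        echo "$v: older than $fixed"
    fi
done
```

On Debian and Ubuntu systems, `dpkg --compare-versions` is the more precise tool for this, since it follows the packaging version rules exactly.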
Mario Limonciello (superm1) wrote : | #89 |
FYI there are two separate issues. The first is the poor performance of the ethernet on the TB16. That's the original reason this bug was opened and has been fixed in kernel upgrades.
There is a second issue where a BIOS update causes problems with USB on the TB16 (such as corrupted packets). Dell support is aware of it and a new BIOS is on its way out very soon to resolve it. You can track the status of that with Dell here:
http://
Corey Schuhen (cschuhen) wrote : | #90 |
I do not see the same behaviour as Johann:
cschuhen@
4672ce371fb3c11
4672ce371fb3c11
4672ce371fb3c11
4672ce371fb3c11
cschuhen@
Linux loriel 4.10.0-33-generic #37-Ubuntu SMP Fri Aug 11 10:55:28 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
This is using the port on TB16.
André Düwel (aduewel) wrote : | #91 |
Thanks @Mario for this clarification, I recently updated my BIOS, too. Since then, under Linux my mouse and keyboard (connected to the USB ports of the TB16) stop working from time to time. In Windows the mouse and keyboard are just sometimes laggy but don't stop completely. I will try to downgrade the BIOS today to an earlier version.
André Düwel (aduewel) wrote : | #92 |
So instead of downgrading my BIOS, I've updated it from 1.2.29 to version 1.3.0 which was released two days ago by DELL for the XPS15 9550. The USB ports on the TB16 are now working without issues again.
I don't get any checksum errors either:
for i in 1 2 3 4; do curl -s http://
4672ce371fb3c11
4672ce371fb3c11
4672ce371fb3c11
4672ce371fb3c11
In Red Hat Bugzilla #1460789, Martin (martin-redhat-bugs) wrote : | #118 |
It got really annoying lately. How do I downgrade to 1.3.5, please? I can't find it on the Dell website and fwupd doesn't provide anything either.
Lance Parsons (lparsons) wrote : | #93 |
I had similar issues with checksum errors on Precision 5520. Updating to the recently released BIOS version 1.4 has resolved those issues. Finally, all is well with Ubuntu, TB16, and Precision 5520. Thanks all.
In Red Hat Bugzilla #1460789, Martin (martin-redhat-bugs) wrote : | #119 |
Running kernel-
BIOS 2.2.1 finally hit the Dell website. I can confirm that with this, the USB overall experience is now much much better (except the occasional mouse stutter but that may as well be on the OS side). There seems to be no problem at all with the dock Ethernet adapter.
In Red Hat Bugzilla #1460789, Martin (martin-redhat-bugs) wrote : | #120 |
On 4.13.4-
Georgi Boiko (pandasauce) wrote : | #94 |
I have the same checksum issues on Precision 5520 and BIOS 1.4 using TB16. This is on Ubuntu 16.04.3.
$ for i in 1 2 3 4; do curl -s de.releases.
116b2649ec67507
fd81a7fda3fcf5a
4672ce371fb3c11
4672ce371fb3c11
BIOS 1.3.2:
$ for i in 1 2 3 4; do curl -s de.releases.
4672ce371fb3c11
00683eb3f831c11
4672ce371fb3c11
4672ce371fb3c11
Luis Alvarado (luisalvarado) wrote : | #95 |
If I can add: the Logitech G930 headset gets a similar error just before it drops its connection (it uses a USB dongle that connects to the headset wirelessly over 2.4 GHz, not Bluetooth):
[12278.974880] perf: interrupt took too long (2512 > 2500), lowering kernel.
[16639.737569] perf: interrupt took too long (3165 > 3140), lowering kernel.
[19369.252915] usb 1-5: USB disconnect, device number 3
[19369.254678] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[19369.254685] xhci_hcd 0000:00:14.0: Looking for event-dma 000000101b3d3b60 trb-start 000000101b3d3b70 trb-end 000000101b3d3b70 seg-start 000000101b3d3000 seg-end 000000101b3d3ff0
[19369.255673] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[19369.255679] xhci_hcd 0000:00:14.0: Looking for event-dma 000000101b3d3b70 trb-start 000000101b3d3b80 trb-end 000000101b3d3b80 seg-start 000000101b3d3000 seg-end 000000101b3d3ff0
[19369.256674] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[19369.256681] xhci_hcd 0000:00:14.0: Looking for event-dma 000000101b3d3b80 trb-start 000000101b3d3b90 trb-end 000000101b3d3b90 seg-start 000000101b3d3000 seg-end 000000101b3d3ff0
[19369.257672] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[19369.257677] xhci_hcd 0000:00:14.0: Looking for event-dma 000000101b3d3b90 trb-start 000000101b3d3ba0 trb-end 000000101b3d3ba0 seg-start 000000101b3d3000 seg-end 000000101b3d3ff0
[19374.558284] usb 1-7: new full-speed USB device number 8 using xhci_hcd
[19375.274480] usb 1-7: New USB device found, idVendor=046d, idProduct=0a1f
[19375.274483] usb 1-7: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[19375.274485] usb 1-7: Product: Logitech G930 Headset
[19375.274487] usb 1-7: Manufacturer: Logitech
[19375.292170] input: Logitech Logitech G930 Headset as /devices/
[19375.354891] hid-generic 0003:046D:
[22613.309243] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
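For readers decoding the "Looking for" lines above: each one lists the DMA address of the event, followed by the TRB range of the TD the driver expected. The following is a minimal POSIX shell sketch, not a kernel tool; the helper name `trb_gap` is made up for illustration. It prints trb-start minus event-dma, in bytes, for each such line on stdin.

```shell
# Hypothetical helper: reads dmesg text on stdin and, for every
# "Looking for event-dma ... trb-start ..." line, prints
# (trb-start - event-dma) in bytes.
trb_gap() {
    # Pull the two hex addresses out of each matching line.
    sed -n 's/.*event-dma \([0-9a-f]*\) trb-start \([0-9a-f]*\).*/\1 \2/p' |
    while read -r ev start; do
        # The log prints bare hex digits, so add the 0x prefix here.
        echo $(( 0x$start - 0x$ev ))
    done
}

# Example use:
#   dmesg | trb_gap
```

Fed the log above, it prints 16 for every line: the event pointer is exactly one 16-byte TRB below the start of the TD the driver was looking for.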
Changed in linux (Fedora): | |
importance: | Unknown → Undecided |
status: | Unknown → Confirmed |
Luis Alvarado (luisalvarado) wrote : | #121 |
This bug is still present in the following kernels I have tested:
4.13.7
4.13.8
4.13.9
4.13.10
4.14-RC6
The error that typically shows is:
[12115.066777] retire_capture_urb: 23 callbacks suppressed
[12183.003052] usb 1-7: USB disconnect, device number 3
[12183.004602] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[12183.004604] xhci_hcd 0000:00:14.0: Looking for event-dma 00000010243ef7d0 trb-start 00000010243ef7e0 trb-end 00000010243ef7e0 seg-start 00000010243ef000 seg-end 00000010243efff0
[12183.005603] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[12183.005605] xhci_hcd 0000:00:14.0: Looking for event-dma 00000010243ef7e0 trb-start 00000010243ef7f0 trb-end 00000010243ef7f0 seg-start 00000010243ef000 seg-end 00000010243efff0
[12183.006602] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[12183.006604] xhci_hcd 0000:00:14.0: Looking for event-dma 00000010243ef7f0 trb-start 00000010243ef800 trb-end 00000010243ef800 seg-start 00000010243ef000 seg-end 00000010243efff0
[12183.007603] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[12183.007606] xhci_hcd 0000:00:14.0: Looking for event-dma 00000010243ef800 trb-start 00000010243ef810 trb-end 00000010243ef810 seg-start 00000010243ef000 seg-end 00000010243efff0
[12183.008601] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[12183.008602] xhci_hcd 0000:00:14.0: Looking for event-dma 00000010243ef810 trb-start 00000010243ef820 trb-end 00000010243ef820 seg-start 00000010243ef000 seg-end 00000010243efff0
[12183.009599] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[12183.009600] xhci_hcd 0000:00:14.0: Looking for event-dma 00000010243ef820 trb-start 00000010243ef830 trb-end 00000010243ef830 seg-start 00000010243ef000 seg-end 00000010243efff0
[12183.010626] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[12183.010628] xhci_hcd 0000:00:14.0: Looking for event-dma 00000010243ef830 trb-start 00000010243ef840 trb-end 00000010243ef840 seg-start 00000010243ef000 seg-end 00000010243efff0
[12186.007322] usb 1-4: new full-speed USB device number 7 using xhci_hcd
[12186.723745] usb 1-4: New USB device found, idVendor=046d, idProduct=0a1f
[12186.723746] usb 1-4: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[12186.723747] usb 1-4: Product: Logitech G930 Headset
[12186.723747] usb 1-4: Manufacturer: Logitech
[12186.740338] input: Logitech Logitech G930 Headset as /devices/
[12186.799495] hid-generic 0003:046D:
My hardware is:
sudo lshw -sanitize -numeric
computer
description: Desktop Computer
product: System Product Name (SKU)
vendor: System manufacturer
version: System Version
serial: [REMOVED]
...
Changed in hwe-next: | |
status: | New → Fix Released |
In Red Hat Bugzilla #1460789, Gerben (gerben-redhat-bugs) wrote : | #122 |
This is happening even on my 9560 with 4.13.9 vanilla; when running a background rsync backup job, packages downloaded in a Debian docker build frequently do not match their checksum and need multiple runs to succeed.
In Red Hat Bugzilla #1460789, Gerben (gerben-redhat-bugs) wrote : | #123 |
And just to illustrate my point, on 4.14.0 vanilla:
while true; do
dd if=/nfsmount/
With rx offload on (default):
489ed92b17aa9a4
742462292c76189
f11ba5f624dbab5
742462292c76189
e925ff013c99a1b
With rx offload off:
742462292c76189
742462292c76189
742462292c76189
742462292c76189
742462292c76189
In Red Hat Bugzilla #1460789, Kai-Heng (kai-heng-redhat-bugs) wrote : | #124 |
Please try this patch:
https:/
In Red Hat Bugzilla #1460789, Gerben (gerben-redhat-bugs) wrote : | #125 |
Applied to 4.14.14. Offload:
tcp-segmentatio
udp-fragmentati
generic-
generic-
rx-vlan-offload: on
tx-vlan-offload: on
dd | sha1sum loop:
742462292c76189
742462292c76189
742462292c76189
742462292c76189
742462292c76189
742462292c76189
Ran for 10 minutes, so looks like that patch works (doing around 90mbit/s of traffic).
Luis Alvarado (luisalvarado) wrote : | #126 |
If this helps, this is happening in Ubuntu 17.10 with the 4.13 and 4.15 kernels. I am using a Logitech G930.
This was the dmesg output:
[ 15.303655] logitech-
[ 49.671534] usb 3-2: USB disconnect, device number 2
[ 49.672965] xhci_hcd 0000:07:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[ 49.672975] xhci_hcd 0000:07:00.0: Looking for event-dma 000000101f51eec0 trb-start 000000101f51eed0 trb-end 000000101f51eed0 seg-start 000000101f51e000 seg-end 000000101f51eff0
[ 49.673966] xhci_hcd 0000:07:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[ 49.673980] xhci_hcd 0000:07:00.0: Looking for event-dma 000000101f51eed0 trb-start 000000101f51eee0 trb-end 000000101f51eee0 seg-start 000000101f51e000 seg-end 000000101f51eff0
[ 49.674932] xhci_hcd 0000:07:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[ 49.674943] xhci_hcd 0000:07:00.0: Looking for event-dma 000000101f51eee0 trb-start 000000101f51eef0 trb-end 000000101f51eef0 seg-start 000000101f51e000 seg-end 000000101f51eff0
[ 49.675952] xhci_hcd 0000:07:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[ 49.675962] xhci_hcd 0000:07:00.0: Looking for event-dma 000000101f51eef0 trb-start 000000101f51ef00 trb-end 000000101f51ef00 seg-start 000000101f51e000 seg-end 000000101f51eff0
[ 49.676939] xhci_hcd 0000:07:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[ 49.676947] xhci_hcd 0000:07:00.0: Looking for event-dma 000000101f51ef00 trb-start 000000101f51ef10 trb-end 000000101f51ef10 seg-start 000000101f51e000 seg-end 000000101f51eff0
[ 52.527220] usb 1-3: new full-speed USB device number 7 using xhci_hcd
[ 53.240955] usb 1-3: New USB device found, idVendor=046d, idProduct=0a1f
[ 53.240960] usb 1-3: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[ 53.240963] usb 1-3: Product: Logitech G930 Headset
[ 53.240965] usb 1-3: Manufacturer: Logitech
[ 53.259535] input: Logitech Logitech G930 Headset as /devices/
[ 53.316838] hid-generic 0003:046D:
Georgi Boiko (pandasauce) wrote : | #127 |
Update to my October post:
Dell Precision 5520 and BIOS 1.7 using TB16. This is on Ubuntu 16.04.3, kernel 4.13.0
The issue is still present. I tried limiting the bandwidth using `ethtool -s eth0 speed 100 duplex full autoneg on` and also as described in this blog post: http://
$ for i in 1 2 3 4; do curl -s http://
2641b55ed2e2038
63f41e8b8e4e5ad
^C%
$ sudo ethtool -s eth0 speed 100 duplex full autoneg on
$ for i in 1 2 3 4; do curl -s http://
4672ce371fb3c11
4672ce371fb3c11
4672ce371fb3c11
4672ce371fb3c11
$ for i in 1 2 3 4; do curl -s http://
ed13e9c6c45f027
4672ce371fb3c11
^C%
Next, I tried disabling offloading as described above. I haven't reset the device to 1 Gbps before doing so. It seems to be working fine so far. I will leave it running for an hour over lunch today to be completely sure.
$ sudo ethtool --offload eth0 tx off
Actual changes:
tx-checksumming: off
tx-
tx-
tcp-segmentatio
tx-
tx-
$ sudo ethtool --offload eth0 rx off
$ for i in 1 2 3 4 5 6; do curl -s http://
4672ce371fb3c11
4672ce371fb3c11
4672ce371fb3c11
4672ce371fb3c11
4672ce371fb3c11
4672ce371fb3c11
In about a week I will be able to test this on a 2017 XPS 9560 (non-DE) too.
Kai-Heng Feng (kaihengfeng) wrote : | #128 |
It's another bug. Please refer to LP: #1729674.
Guilherme G. Piccoli (gpiccoli) wrote : | #129 |
As an informative note: it was observed that the adapter "ASMedia Technology Inc. ASM1142 USB 3.1 Host Controller [1b21:1242]" has a similar bug, but it seems the quirk (from kernel commit 9da5a1092b13), even if applied to the right PCI ID, doesn't fix the issue.
I guess this was the case for the user imperia above (comment #61).
There's another LP for issues with the adapter "ASM1142 USB 3.1 Host Controller [1b21:1242]": https:/
Thanks,
Guilherme
In Red Hat Bugzilla #1460789, gerben (gerben-redhat-bugs-1) wrote : | #141 |
On 4.15.4 I see a lot of:
Feb 21 15:43:31 localhost kernel: [18401.483078] pcieport 0000:00:1d.6: AER: Corrected error received: id=00ee
Feb 21 15:43:31 localhost kernel: [18401.483095] pcieport 0000:00:1d.6: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00ee(Transmitter ID)
Feb 21 15:43:31 localhost kernel: [18401.483097] pcieport 0000:00:1d.6: device [8086:a11e] error status/
Feb 21 15:43:31 localhost kernel: [18401.483099] pcieport 0000:00:1d.6: [12] Replay Timer Timeout
Which may or may not be related. However, randomly, r8152 stops working entirely. Most recent dmesg:
Feb 21 15:43:42 localhost kernel: [18412.136941] ------------[ cut here ]------------
Feb 21 15:43:42 localhost kernel: [18412.136947] NETDEV WATCHDOG: enxa44cc8d0edff (r8152): transmit queue 0 timed out
Feb 21 15:43:42 localhost kernel: [18412.136969] WARNING: CPU: 1 PID: 0 at net/sched/
Feb 21 15:43:42 localhost kernel: [18412.136972] Modules linked in: sg uas usb_storage rfcomm nf_conntrack_
HECKSUM iptable_mangle ipt_MASQUERADE nf_nat_
Feb 21 15:43:42 localhost kernel: [18412.137037] i2c_designware_core iwlmvm input_leds i2c_hid mac80211 dell_smm_hwmon x86_pkg_
Feb 21 15:43:42 localhost kernel: [18412.137101] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G U 4.15.4 #5
Feb 21 15:43:42 localhost kernel: [18412.137104] Hardware name: Dell Inc. XPS 15 9560/05FFDN, BIOS 1.7.0 12/15/2017
Feb 21 15:43:42 localhost kernel: [18412.137108] RIP: 0010:dev_
Feb 21 15:43:42 localhost kernel: [18412.137112] RSP: 0018:ffff88087e
Feb 21 15:43:42 localhost kernel: [18412.137116] RAX: 0000000000000044 RBX: 0000000000000000 RCX: 0000000000000103
Feb 21 15:43:42 localhost kernel: [18412.137119] RDX: 0000000080000103 RSI: ffffffff82063a3a RDI: 000...
Georgi Boiko (pandasauce) wrote : | #130 |
@kaihengfeng
Thanks, I will repost it there. Can confirm the adapter dropping out with the same errors on 5520/TB16 at 1Gbps with latest 16.04 LTS though.
In Red Hat Bugzilla #1460789, jarod (jarod-redhat-bugs-1) wrote : | #142 |
Looks like this is more of a firmware issue with these docks and/or a driver issue with the 8152, so I'm throwing this back onto the queue where it was.
In Red Hat Bugzilla #1460789, marianne (marianne-redhat-bugs) wrote : | #143 |
I think I have the same issue with my laptop and dock (Dell TB16).
The laptop is new and runs Fedora 28. All firmware is up to date.
Ethernet works fine unless I want to transfer a large amount of data. The session (sftp, rsync or scp) is cut off abruptly after a few seconds. Nothing relevant appears in the system logs.
If I disable RX checksum offloading (as suggested above) using: ethtool --offload enp11s0u1u2 rx off
Everything works fine.
Tell me if you need more logs or information.
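To confirm that the workaround above took effect, the current offload state can be read back with `ethtool -k`. A minimal sketch, assuming the usual `rx-checksumming: on` or `rx-checksumming: off` line in that output; `rx_csum_state` is a hypothetical helper name.

```shell
# Hypothetical helper: extracts the rx-checksumming state ("on" or "off")
# from `ethtool -k <iface>` output supplied on stdin.
rx_csum_state() {
    awk '$1 == "rx-checksumming:" { print $2; exit }'
}

# Example use with the interface name from the comment above:
#   ethtool -k enp11s0u1u2 | rx_csum_state
```

Note that a setting made with `ethtool --offload` does not persist across reboots or re-plugging the dock, so the check is worth repeating after either.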
In Red Hat Bugzilla #1460789, mario_limonciello (mariolimonciello-redhat-bugs) wrote : | #144 |
FYI, this related commit ended up landing. I would recommend backporting it.
https:/
In Red Hat Bugzilla #1460789, jcline (jcline-redhat-bugs) wrote : | #145 |
Hi Mario, thanks for the pointer. Fedora stable releases are currently on 4.16.15 so that fix should be in place. I've got a TB16 at home so I can also try to reproduce this on Fedora 28 this evening.
marianne, adding the dmesg logs would be helpful. Thanks!
In Red Hat Bugzilla #1460789, ondrej.kolin (ondrej.kolin-redhat-bugs) wrote : | #146 |
Our bug report from Launchpad:
Hi.
Large amounts of data get corrupted when using the TB16 Ethernet port (rsync synchronization, etc.).
Linux E7490 4.15.0-29-generic #31-Ubuntu SMP Tue Jul 17 15:39:52 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
On my Fedora this is still an issue even with the announced bugfix (link copied from this discussion, #78).
Linux username-
It's fixed by turning the checksum offload off (tested on Fedora).
sudo ethtool --offload enp11s0u1u2 rx off
https:/
related in bugzilla:
lepirlouit (lepirlouit) wrote : | #131 |
I have no issue with my ethernet adapter, but I see the same logs :
for i in 1 2 3 4 5 6; do curl -s http://
4672ce371fb3c11
4672ce371fb3c11
4672ce371fb3c11
4672ce371fb3c11
4672ce371fb3c11
^C
dmesg
[ 1659.619538] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[ 1659.619545] xhci_hcd 0000:00:14.0: Looking for event-dma 000000032a3eeb80 trb-start 000000032a3eeb90 trb-end 000000032a3eeb90 seg-start 000000032a3ee000 seg-end 000000032a3eeff0
[ 1669.859587] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[ 1669.859601] xhci_hcd 0000:00:14.0: Looking for event-dma 000000032a3eedf0 trb-start 000000032a3eee00 trb-end 000000032a3eee00 seg-start 000000032a3ee000 seg-end 000000032a3eeff0
[ 1803.744811] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[ 1803.744827] xhci_hcd 0000:00:14.0: Looking for event-dma 000000032a3ee0c0 trb-start 000000032a3ee0d0 trb-end 000000032a3ee0d0 seg-start 000000032a3ee000 seg-end 000000032a3eeff0
lsusb
Bus 002 Device 002: ID 8087:8000 Intel Corp.
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 002: ID 8087:8008 Intel Corp.
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 004 Device 002: ID 0424:5434 Standard Microsystems Corp. Hub
Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 003 Device 003: ID 05c8:0374 Cheng Uei Precision Industry Co., Ltd (Foxlink)
Bus 003 Device 002: ID 138a:003f Validity Sensors, Inc. VFS495 Fingerprint Reader
Bus 003 Device 004: ID 8087:07dc Intel Corp.
Bus 003 Device 009: ID 04e8:6860 Samsung Electronics Co., Ltd Galaxy (MTP)
Bus 003 Device 010: ID 046d:c328 Logitech, Inc.
Bus 003 Device 008: ID 046d:c062 Logitech, Inc. M-UAS144 [LS1 Laser Mouse]
Bus 003 Device 007: ID 046d:0a01 Logitech, Inc. USB Headset
Bus 003 Device 006: ID 0424:5434 Standard Microsystems Corp. Hub
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
lepirlouit (lepirlouit) wrote : | #132 |
ubuntu 18.04
kernel : 4.15.0-29-generic
In Red Hat Bugzilla #1460789, ondrej.kolin (ondrej.kolin-redhat-bugs) wrote : | #147 |
https:/
In Red Hat Bugzilla #1460789, tomastrnka (tomastrnka-redhat-bugs) wrote : | #148 |
The issue is not unique to the integrated NIC in the dock (so the current workaround in r8152 is not sufficient). I have an r8152-based TP-LINK UE300 USB3-to-GigE dongle connected to my TB16 dock and I'm getting the same packet corruption when I don't turn off rx checksum offloading.
usb 4-1.1.1: new SuperSpeed Gen 1 USB device number 5 using xhci_hcd
usb 4-1.1.1: New USB device found, idVendor=2357, idProduct=0601, bcdDevice=30.00
usb 4-1.1.1: New USB device strings: Mfr=1, Product=2, SerialNumber=6
usb 4-1.1.1: Product: USB 10/100/1000 LAN
usb 4-1.1.1: Manufacturer: TP-LINK
usb 4-1.1.1: SerialNumber: 000001000000
/: Bus 04.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 5000M
|__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/7p, 5000M
|__ Port 1: Dev 3, If 0, Class=Hub, Driver=hub/4p, 5000M
|__ Port 1: Dev 5, If 0, Class=Vendor Specific Class, Driver=r8152, 5000M
|__ Port 4: Dev 6, If 0, Class=Hub, Driver=hub/2p, 5000M
|__ Port 2: Dev 4, If 0, Class=Vendor Specific Class, Driver=r8152, 5000M
The dongle is plugged into the internal USB hub in my Dell U2715H screen, which is in turn plugged into the TB16 (latest firmware 1.0.0), connected to my XPS 15 9560 (latest BIOS 1.11.0, Linux 4.18.7-
I've also seen someone mentioning that (some) USB3 ports on the TB16 are in fact Alpine Ridge pass-through. That does not seem to be the case here, all three ports on my TB16 go through the ASMedia host controller:
0e:00.0 USB controller: ASMedia Technology Inc. ASM1042A USB 3.0 Host Controller
The r8152 workaround triggers just fine for the integrated NIC in the dock:
usb 4-1.2: reset SuperSpeed Gen 1 USB device number 4 using xhci_hcd
usb 4-1.2: Dell TB16 Dock, disable RX aggregation
In Red Hat Bugzilla #1460789, mario_limonciello (mariolimonciello-redhat-bugs) wrote : | #149 |
@Tomas,
It sounds like the topology needs to be looked at then for applying this quirk.
Can you connect the dongle to the USB-C port with C-A adapter? That is the AR pass through port.
In Red Hat Bugzilla #1460789, tomastrnka (tomastrnka-redhat-bugs) wrote : | #150 |
Indeed, I found the mention of the pass-through only applying to the USB-C like a minute after I wrote my previous comment. Sorry for the noise.
I don't have a C-A adapter at hand, but I've tried using the Dell DA200 adapter instead (not exactly the same thing as it's an extra hub, but hopefully it helps anyway). So the topology is:
Dongle -> DA200 (hub) -> USB-C port on the TB16 -> AR host controller
/: Bus 06.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 10000M
|__ Port 2: Dev 2, If 0, Class=Hub, Driver=hub/4p, 5000M
|__ Port 1: Dev 5, If 0, Class=Vendor Specific Class, Driver=r8152, 5000M
|__ Port 4: Dev 3, If 0, Class=Vendor Specific Class, Driver=r8152, 5000M
0f:00.0 USB controller: Intel Corporation DSL6540 USB 3.1 Controller [Alpine Ridge]
This setup works fine without any corruption with all offloads on (default).
In Red Hat Bugzilla #1460789, kai.heng.feng (kai.heng.feng-redhat-bugs) wrote : | #151 |
IIRC, I tested this scenario, and I didn't observe the issue on external r8152 dongle over the ASMedia xHC host.
The v1 patch I sent was using topology to check, but maintainers didn't like it.
I'll see if I can come up with a "better" version of it so maintainers will accept it.
patrick brown (mpatalberta) wrote : | #133 |
Hello, I am running into a similar error on Ubuntu 18.10 (Cosmic Cuttlefish) with the USB 3.1 driver.
Which kernel has this code change?
[ 719.522273] xhci_hcd 0000:02:00.0: Looking for event-dma 0000000463f2bdf0 trb-start 00000004650cbfb0 trb-end 00000004650cbfe0 seg-start 00000004650cb000 seg-end 00000004650cbff0
[ 719.522452] xhci_hcd 0000:02:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 1
[ 719.522454] xhci_hcd 0000:02:00.0: Looking for event-dma 0000000463f2be30 trb-start 00000004650cbfb0 trb-end 00000004650cbfe0 seg-start 00000004650cb000 seg-end 00000004650cbff0
[ 719.522602] xhci_hcd 0000:02:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 1
[ 719.522604] xhci_hcd 0000:02:00.0: Looking for event-dma 0000000463f2be70 trb-start 00000004650cbfb0 trb-end 00000004650cbfe0 seg-start 00000004650cb000 seg-end 00000004650cbff0
[ 719.522768] xhci_hcd 0000:02:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 1
[ 719.522770] xhci_hcd 0000:02:00.0: Looking for event-dma 0000000463f2beb0 trb-start 00000004650cbfb0 trb-end 00000004650cbfe0 seg-start 00000004650cb000 seg-end 00000004650cbff0
[ 719.522938] xhci_hcd 0000:02:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 1
[ 719.522940] xhci_hcd 0000:02:00.0: Looking for event-dma 0000000463f2bef0 trb-start 00000004650cbfb0 trb-end 00000004650cbfe0 seg-start 00000004650cb000 seg-end 00000004650cbff0
[ 719.523102] xhci_hcd 0000:02:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 1
[ 719.523104] xhci_hcd 0000:02:00.0: Looking for event-dma 0000000463f2bf30 trb-start 00000004650cbfb0 trb-end 00000004650cbfe0 seg-start 00000004650cb000 seg-end 00000004650cbff0
tnl@tnl-
Linux tnl-NUC8i7HNK 4.18.0-10-generic #11-Ubuntu SMP Thu Oct 11 15:13:55 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #162 |
After upgrading to the 4.20 kernel (I was using 4.19 previously), my USB wifi stick doesn't work until I reboot the system. This happens every time I start my PC (only after a full shutdown; it doesn't happen after rebooting). The wifi driver in use is rt2800usb. I tried restarting NetworkManager, but this didn't change anything.
In Red Hat Bugzilla #1460789, torel (torel-redhat-bugs) wrote : | #152 |
cc
In Red Hat Bugzilla #1460789, torel (torel-redhat-bugs) wrote : | #153 |
Ref. bug # 1600126
I updated r8152 to v2.11 per https:/
# cd /usr/src/
# patch -p1 <./linux-
# more /usr/src/
PACKAGE_
PACKAGE_
BUILT_MODULE_
DEST_MODULE_
AUTOINSTALL="yes"
# ll /var/lib/
lrwxrwxrwx. 1 root root 21 Mar 1 15:22 /var/lib/
# dracut -f
At least my kbd is still working after 30 minutes. A record on kernels above 4.18.18-300.fc29.
In Linux Kernel Bug Tracker #202541, stf_xl (stfxl-linux-kernel-bugs) wrote : | #163 |
Hmm, that's strange perhaps this is some USB host problem. Please provide dmesg of your system.
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #164 |
Created attachment 281677
dmesg output before reboot
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #165 |
Created attachment 281679
dmesg output after reboot
In Linux Kernel Bug Tracker #202541, stf_xl (stfxl-linux-kernel-bugs) wrote : | #166 |
We have this xhci_hcd warning on bad case:
xhci_hcd 0000:15:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state
Not sure where it comes from, but I notice you are using AMD IOMMU, which we have had trouble with in different drivers.
You could try disabling the IOMMU via a kernel boot parameter and check if that improves things. You could also try testing this patch if possible:
https:/
If none of that helps, I will prepare some rt2800 patches to see whether this is caused by one of the v4.19 .. v4.20 rt2800 commits:
0240564430c0 rt2800: flush and txstatus rework for rt2800mmio
adf26a356f13 rt2x00: use different txstatus timeouts when flushing
5022efb50f62 rt2x00: do not check for txstatus timeout every time on tasklet
0b0d556e0ebb rt2800mmio: use txdone/txstatus routines from lib
5c656c71b1bf rt2800: move usb specific txdone/txstatus routines to rt2800lib
f483039cf51a rt2x00: use simple_
But I would rather suspect a problem introduced in the AMD IOMMU or usb/xhci drivers.
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #167 |
I tried disabling iommu, and I also compiled the 4.20.15 kernel from source with that patch applied, but the wifi didn't work in either case.
In Linux Kernel Bug Tracker #202541, stf_xl (stfxl-linux-kernel-bugs) wrote : | #168 |
Created attachment 281711
rt2x00_
Please test this patch and report whether it makes the problem go away.
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #169 |
The problem is still there after applying that patch.
In Linux Kernel Bug Tracker #202541, stf_xl (stfxl-linux-kernel-bugs) wrote : | #170 |
You need to report this bug to the USB maintainers. I'm changing the topic and component, but USB bugs should be reported directly to the mailing list.
In Linux Kernel Bug Tracker #202541, stf_xl (stfxl-linux-kernel-bugs) wrote : | #171 |
Please send bug report to <email address hidden>
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #172 |
I can confirm this issue. I can also confirm that other USB devices are affected, too (mostly if plugged into a USB3 port).
For example:
ID 7392:7710 Edimax Technology Co., Ltd (mt7601u)
WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
dmesg doesn't show IOMMU warnings, so I assume it is a problem introduced in usb/xhci driver.
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #173 |
(In reply to Michael from comment #10)
> I can confirm this issue. Also I can confirm that other USB devices are
> effected, too (mostly if plugged into an USB3 port).
> For example:
> ID 7392:7710 Edimax Technology Co., Ltd (mt7601u)
> WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
>
> dmesg doesn't show IOMMU warnings, so I assume it is a problem introduced in
> usb/xhci driver.
I think this affects only a specific hardware configuration (I've tried using my wifi stick on a different machine and it worked without problems).
Which hardware are you using? Maybe there are some parts we have in common.
My hardware configuration:
CPU: AMD Ryzen 3 2200G, Motherboard: MSI B350 PC MATE
GPU: AMD Radeon RX 580 8GB
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #174 |
@ Bernhard
The parts we have in common : AMD RYZEN
AMD RYZEN 1700 MSI X370 KRAIT, MSI AERO GTX1080Ti, 5.0.6-arch1-1-ARCH (system was also affected by IOMMU issue - but that is fixed)
Affected USB WiFi devices (tested):
ID 148f:3070 Ralink Technology, Corp. RT2870/RT3070 Wireless Adapter (ALFA AWUS036NH - rt2800usb)
ID 148f:3070 Ralink Technology, Corp. RT2870/RT3070 Wireless Adapter (ipTime/ zioncom - rt2800usb)
ID 7392:7710 Edimax Technology Co., Ltd (mt7601u)
ID 7392:a812 Edimax Technology Co., Ltd (Edimax EW-7811USC - rtl88xxau)
ID 148f:761a Ralink Technology, Corp. MT7610U ("Archer T2U" 2.4G+5G WLAN Adapter - mt76x0)
ID 0b05:17d1 ASUSTek Computer, Inc. AC51 802.11a/b/g/n/ac Wireless Adapter [Mediatek MT7610U]
ID 0a12:0001 Cambridge Silicon Radio, Ltd Bluetooth Dongle (HCI mode)
I'm sure there are more.
After fixing some driver / IOMMU issues, Stanislaw found that it could possibly be an xhci driver issue. I share his opinion.
You can read more about the issues here:
https:/
and the fixed IOMMU issue here:
https:/
In Linux Kernel Bug Tracker #202541, stf_xl (stfxl-linux-kernel-bugs) wrote : | #175 |
FTR: I think those two commits could help:
commit 6cbcf596934c8e1
Author: Mathias Nyman <email address hidden>
Date: Fri Mar 22 17:50:15 2019 +0200
xhci: Fix port resume done detection for SS ports with LPM enabled
commit d92f2c59cc2cbca
Author: Mathias Nyman <email address hidden>
Date: Fri Mar 22 17:50:17 2019 +0200
xhci: Don't let USB3 ports stuck in polling state prevent suspend
Also, I'm not sure if the issue was reported to the proper maintainer. If not, and the problem is not already fixed in the latest upstream, either a bisection will be needed to proceed with this bug, or a properly informative bug report should be filed with the proper maintainer.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #176 |
@ Stanislaw, thanks for additional information.
@ Bernhard, have you already sent this bug report to linux-usb mailing list?
can we change affected kernel version from 4.20 to >= 4.20, because 5.0.6 is affected, too?
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #177 |
Yes, I already sent this to the mailing list, but I got no response unfortunately.
I've changed the affected kernel version btw.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #178 |
@ Bernhard, thanks for your answer. So there is no need for me to report this issue, too.
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #179 |
I just tried the two patches Stanislaw mentioned, but the problem is still there.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #180 |
Tried them, too, some days ago, but they didn't solve the issue.
Just downloaded 5.1rc3, but I don't expect a working driver (usb/host), inside.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #181 |
Tested an ASUS X555U (Intel i5-6200 - 5.0.6-arch1-1-ARCH) and that system is affected, if the device is plugged into one of the USB3 ports. The device is working, if plugged into the USB2 port.
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #182 |
I just tried replacing the xhci_ring.c file with the version from the 4.19 kernel, that solved the problem. Then I started patching the code until the problem occurs again.
The change in the function "static int process_
In Linux Kernel Bug Tracker #202541, stf_xl (stfxl-linux-kernel-bugs) wrote : | #183 |
(In reply to Bernhard from comment #20)
> I just tried replacing the xhci_ring.c file with the version from the 4.19
> kernel, that solved the problem. Then I started patching the code until the
> problem occurs again.
> The change in the function "static int process_
> problem, it's part of this patch:
> https:/
> drivers/
Good findings, great. This seems to be part of
commit f8f80be501aa2f1
Author: Mathias Nyman <email address hidden>
Date: Thu Sep 20 19:13:37 2018 +0300
xhci: Use soft retry to recover faster from transaction errors
Just add information you found in the posted linux-usb email and CC "Mathias Nyman <email address hidden>" to make sure he is aware of the problem.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #184 |
The issue isn't fixed in 5.1rc3, so it looks like Mathias Nyman isn't aware of the problem yet.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #185 |
Still present in 5.1.2
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #186 |
This issue is really funny:
running
ID 0b05:17d1 ASUSTek Computer, Inc. AC51 802.11a/b/g/n/ac Wireless Adapter [Mediatek MT7610U]
on kernel
$ uname -r
5.1.7-arch1-1-ARCH
will spam the log after the known WARN
[43163.034783] mt76x0u 1-10.2:1.0 wlp3s0f0u10u2: renamed from wlan0
[43163.351656] usb 1-10.2: USB disconnect, device number 6
[43163.352176] xhci_hcd 0000:03:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
with tons of failed vendor requests:
[43160.683383] mt76x0u 1-10.2:1.0: vendor request req:06 off:c3dc failed:-71
[43160.813398] mt76x0u 1-10.2:1.0: vendor request req:06 off:c3e0 failed:-71
[43160.943415] mt76x0u 1-10.2:1.0: vendor request req:06 off:c3e4 failed:-71
[43161.073440] mt76x0u 1-10.2:1.0: vendor request req:06 off:c3e8 failed:-71
[43161.203439] mt76x0u 1-10.2:1.0: vendor request req:06 off:c3ec failed:-71
[43161.333458] mt76x0u 1-10.2:1.0: vendor request req:06 off:c3f0 failed:-71
[43161.463468] mt76x0u 1-10.2:1.0: vendor request req:06 off:c3f4 failed:-71
[43161.593561] mt76x0u 1-10.2:1.0: vendor request req:06 off:c3f8 failed:-71
[43161.723502] mt76x0u 1-10.2:1.0: vendor request req:06 off:c3fc failed:-71
[43161.853512] mt76x0u 1-10.2:1.0: vendor request req:06 off:108c failed:-71
....
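The `failed:-71` values in the log above are negated Linux errno codes. As a minimal sketch (the function name and the small lookup table are illustrative assumptions, not part of any driver or tool), the following tallies them by symbolic name. The table only covers codes that appear in this thread, using the generic Linux errno numbering.

```python
import re

# Linux errno numbers (asm-generic) for codes seen in this thread.
# Hard-coded on purpose so the result does not depend on the host OS.
LINUX_ERRNO = {71: "EPROTO", 75: "EOVERFLOW", 110: "ETIMEDOUT"}

# Matches the negative code that follows "failed:" in a dmesg line.
FAILED_RE = re.compile(r"failed:\s*(-\d+)")


def decode_failures(lines):
    """Count the negated errno codes found after 'failed:' in dmesg lines."""
    counts = {}
    for line in lines:
        match = FAILED_RE.search(line)
        if not match:
            continue
        code = -int(match.group(1))
        name = LINUX_ERRNO.get(code, f"errno {code}")
        counts[name] = counts.get(name, 0) + 1
    return counts


sample = [
    "[43160.683383] mt76x0u 1-10.2:1.0: vendor request req:06 off:c3dc failed:-71",
    "[43160.813398] mt76x0u 1-10.2:1.0: vendor request req:06 off:c3e0 failed:-71",
]
print(decode_failures(sample))
```

A steady stream of EPROTO results points at transfer-level errors on the bus rather than at a single bad register access, which matches the pattern reported here.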
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #187 |
If the same device is connected to an Intel Core i5-6200 system (USB3 port), the log looks different from the one on the AMD Ryzen system.
[ 204.231872] mt76x0u 1-1:1.0: rx urb failed: -71
[ 204.231901] mt76x0u 1-1:1.0: rx urb failed: -71
[ 204.231940] mt76x0u 1-1:1.0: rx urb failed: -71
[ 204.231980] mt76x0u 1-1:1.0: rx urb failed: -71
[ 204.232020] mt76x0u 1-1:1.0: rx urb failed: -71
[ 204.232188] mt76x0u 1-1:1.0: rx urb failed: -71
[ 204.232226] mt76x0u 1-1:1.0: rx urb failed: -71
[ 204.232275] mt76x0u 1-1:1.0: rx urb failed: -71
[ 204.232304] mt76x0u 1-1:1.0: rx urb failed: -71
[ 204.232345] mt76x0u 1-1:1.0: rx urb failed: -71
[ 204.233284] xhci_hcd 0000:00:14.0: WARN Cannot submit Set TR Deq Ptr
[ 204.233291] xhci_hcd 0000:00:14.0: A Set TR Deq Ptr command is pending.
[ 204.263427] mt76x0u 1-1:1.0: TX DMA did not stop
[ 207.596726] mt76x0u 1-1:1.0: Warning: MAC TX did not stop!
[ 209.650050] mt76x0u 1-1:1.0: Warning: MAC RX did not stop!
[ 209.651133] mt76x0u 1-1:1.0: RX DMA did not stop
Also, I noticed some changes in xhci-ring.c between 5.1.7 and 5.2-rc4. Maybe they'll fix the problem. I haven't tested it yet.
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #188 |
I already tried the 5.2-rc3 kernel and the problem isn't fixed yet. There were no changes in the xhci driver between rc3 and rc4, so it's very unlikely that the problem is gone in the 5.2-rc4 kernel.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #189 |
Thanks for the information. I skipped 5.2rc1 ... rc3.
But with your information, there is no real need for me to run some more tests.
Unfortunately, it looks like the issue has been backported to older kernel versions (4.19), because I got some issue reports here, too:
https:/
and 90% of my devices don't work any longer.
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #190 |
When did it get backported? I'm on 4.19.48 and haven't had a problem with this version...
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #191 |
It's just a guess, because of this post:
https:/
But it looks like the device was working before that post.
I can't test it, because I don't have such a device.
I tested a TP-LINK Archer T2UH and this device is not working on 4.19.46 arm (Raspberry Pi).
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #192 |
Yes, rt2800usb is working fine on 4.19.46.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #193 |
hcxdumptool running on kernel 4.19.46 arm doesn't receive packets on several different devices. In this case:
ID 0b05:17d1 ASUSTek Computer, Inc. AC51 802.11a/b/g/n/ac Wireless Adapter [Mediatek MT7610U]
INFO: cha=1, rx=0, rx(dropped)=0, tx=18, err=0, aps=0 (0 in range)
while a few other devices are still working:
INFO: cha=1, rx=805, rx(dropped)=0, tx=93, err=0, aps=29 (21 in range)
BTW:
I'm only running/testing devices whose driver supports monitor mode and packet injection.
What is very interesting about that ARM kernel is that dmesg doesn't show any WARNs.
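Since dmesg stays silent on that kernel, the status line itself is the only indicator. A minimal sketch of how such a line could be checked automatically follows. The field names come from the two status lines quoted above; the rule "transmitting but receiving nothing" is an assumed heuristic, not something hcxdumptool defines.

```python
import re

# key=value pairs as they appear in the quoted status lines,
# including keys with parentheses such as "rx(dropped)".
FIELD_RE = re.compile(r"([\w()]+)=(\d+)")


def parse_info(line):
    """Turn a status line into a dict of integer counters."""
    return {key: int(value) for key, value in FIELD_RE.findall(line)}


def receive_stalled(line):
    """Assumed heuristic: frames were sent, but none were received."""
    stats = parse_info(line)
    return stats.get("tx", 0) > 0 and stats.get("rx", 0) == 0


broken = "INFO: cha=1, rx=0, rx(dropped)=0, tx=18, err=0, aps=0 (0 in range)"
working = "INFO: cha=1, rx=805, rx(dropped)=0, tx=93, err=0, aps=29 (21 in range)"
print(receive_stalled(broken), receive_stalled(working))
```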
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #194 |
Still no fix?
$ uname -r
5.1.11-arch1-1-ARCH
and most of the USB devices (WiFi, Bluetooth, ...) are still not working:
[32942.700591] usb 1-10.4: new full-speed USB device number 7 using xhci_hcd
[32944.721410] usb 1-10.4: New USB device found, idVendor=0a12, idProduct=0001, bcdDevice=52.76
[32944.721412] usb 1-10.4: New USB device strings: Mfr=0, Product=2, SerialNumber=0
[32945.069015] Bluetooth: hci0: hardware error 0x37
How about kernel 5.2?
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #195 |
Some USB card readers are also affected (connected to USB 3 port):
$ uname -r
5.1.12-arch1-1-ARCH
[ 3510.100114] usb 2-2: new SuperSpeed Gen 1 USB device number 2 using xhci_hcd
[ 3510.134121] usb 2-2: New USB device found, idVendor=058f, idProduct=6387, bcdDevice= 0.02
[ 3510.134126] usb 2-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 3510.134128] usb 2-2: Product: Intenso Ultra Line
[ 3510.134130] usb 2-2: Manufacturer: ALCOR
...
[ 5129.997608] usb 1-1: reset high-speed USB device number 7 using xhci_hcd
[ 5130.218618] sd 9:0:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=
[ 5130.218631] sd 9:0:0:0: [sdb] tag#0 CDB: Read(10) 28 00 00 20 c3 c0 00 00 20 00
[ 5130.218637] print_req_error: I/O error, dev sdb, sector 2147264 flags 80700
I really wonder why this issue hasn't been fixed yet, because a great many devices are affected.
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #196 |
The list of changes for 5.2-rc6 contains these two commits:
Mathias Nyman (2):
usb: xhci: Don't try to recover an endpoint if port is in error state.
xhci: detect USB 3.2 capable host controllers correctly
I think this could be the fix for this issue.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #197 |
Great, thanks for the information. The issue is really ugly, because many USB devices are affected (hdd, card reader, bluetooth, wlan, ... - the list is long).
I'll check 5.2-rc6.
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #198 |
Just tried 5.2-rc6, but unfortunately I still have the same issue.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #199 |
Thanks for the information. I tested 5.2-rc6, too. Even a USB 3.0 HDD isn't working.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #200 |
Now running mainline kernel 5.2 and the issue still exists.
Tested on this device:
ID 148f:3070 Ralink Technology, Corp. RT2870/RT3070 Wireless Adapter
but the same applies to many other devices, too
dmesg after plugging in the device:
[75.482165] usb 1-2: new high-speed USB device number 6 using xhci_hcd
[75.639236] usb 1-2: New USB device found, idVendor=148f, idProduct=3070, bcdDevice= 1.01
[75.639238] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[75.639239] usb 1-2: Product: 802.11 n WLAN
[75.639240] usb 1-2: Manufacturer: Ralink
[75.639241] usb 1-2: SerialNumber: 1.0
[75.952611] usb 1-2: reset high-speed USB device number 6 using xhci_hcd
[76.107232] ieee80211 phy1: rt2x00_set_rt: Info - RT chipset 3070, rev 0201 detected
[76.120228] ieee80211 phy1: rt2x00_set_rf: Info - RF chipset 0005 detected
[76.121079] ieee80211 phy1: Selected rate control algorithm 'minstrel_ht'
[76.130873] usbcore: registered new interface driver rt2800usb
[76.194447] audit: type=1130 audit(156283349
[76.195313] rt2800usb 1-2:1.0 wlp0s20f0u2: renamed from wlan0
[76.216178] ieee80211 phy1: rt2x00lib_
[76.241382] ieee80211 phy1: rt2x00lib_
[76.544022] ieee80211 phy1: rt2x00usb_
[77.562305] ieee80211 phy1: rt2800_
[77.562316] ieee80211 phy1: rt2800usb_
...
followed by this message on access to the interface:
[341.598563] xhci_hcd 0000:00:14.0: WARN Cannot submit Set TR Deq Ptr
[341.598573] xhci_hcd 0000:00:14.0: A Set TR Deq Ptr command is pending.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #201 |
BTW:
The tested device is an ALFA AWUS036NH and I really can't see "Unstable hardware" here.
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #202 |
I don't really think the problem is caused by the WiFi stick itself. Maybe the cause is the xHCI controller on the motherboard? We're both using a 300-series AM4 board (even the same brand), so we probably have the same controller.
Btw, I've already tried the git snapshot of 5.3-rc1, and the problem isn't fixed there either.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #203 |
No, I don't think it's the controller. I'm running three different systems here:
RYZEN 1700, MSI X370 KRAIT
INTEL I5-6200U, ASUS X555U (notebook)
INTEL i7-3930K, ASUS P9X79
and all of them run into the same issue. Also, not all of the tested devices are affected. Some devices are still working as expected (for example TENDA W311U+), while others fail epically (ALFA AWUS036NH). The same applies to several bluetooth devices.
Absolutely new (and really funny) is the error message "Unstable hardware" on 5.2.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #204 |
And 5.2 makes things even worse. Most of my adapters are not working.
EDIMAX EW-7711UAN V2
ID 7392:7710 Edimax Technology Co., Ltd
[ 228.451035] usb 1-2: new high-speed USB device number 53 using xhci_hcd
[ 228.629543] usb 1-2: New USB device found, idVendor=7392, idProduct=7710, bcdDevice= 0.00
[ 228.629548] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 228.629550] usb 1-2: Product: Edimax Wi-Fi
[ 228.629552] usb 1-2: Manufacturer: MediaTek
[ 228.629554] usb 1-2: SerialNumber: 1.0
[ 228.779827] usb 1-2: reset high-speed USB device number 53 using xhci_hcd
[ 229.037761] mt7601u 1-2:1.0: ASIC revision: 76010001 MAC revision: 76010500
[ 229.064654] mt7601u 1-2:1.0: Firmware Version: 0.1.00 Build: 7640 Build time: 201302052146____
[ 230.045089] mt7601u 1-2:1.0: EEPROM ver:0d fae:00
[ 230.055724] mt7601u 1-2:1.0: EEPROM country region 01 (channels 1-13)
[ 230.763955] mt7601u 1-2:1.0: Warning: mt7601u_
[ 231.084339] mt7601u 1-2:1.0: Warning: mt7601u_
[ 231.404311] mt7601u 1-2:1.0: Warning: mt7601u_
[ 231.724294] mt7601u 1-2:1.0: Warning: mt7601u_
[ 232.044298] mt7601u 1-2:1.0: Warning: mt7601u_
[ 232.044301] mt7601u 1-2:1.0: Error: mt7601u_
[ 232.044485] xhci_hcd 0000:03:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[ 232.046810] xhci_hcd 0000:03:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[ 232.197641] mt7601u 1-2:1.0: Vendor request req:07 off:0080 failed:-71
[ 232.347631] mt7601u 1-2:1.0: Vendor request req:02 off:0080 failed:-71
[ 232.497630] mt7601u 1-2:1.0: Vendor request req:02 off:0080 failed:-71
[ 232.497675] mt7601u: probe of 1-2:1.0 failed with error -110
LOGILINK WL0150
ID 148f:5370 Ralink Technology, Corp. RT5370 Wireless Adapter
[ 527.994013] usb 1-2: new high-speed USB device number 86 using xhci_hcd
[ 528.238517] usb 1-2: New USB device found, idVendor=148f, idProduct=5370, bcdDevice= 1.01
[ 528.238519] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 528.238521] usb 1-2: Product: 802.11 n WLAN
[ 528.238522] usb 1-2: Manufacturer: Ralink
[ 528.238523] usb 1-2: SerialNumber: 1.0
[ 528.495914] usb 1-2: reset high-speed USB device number 86 using xhci_hcd
[ 528.747058] ieee80211 phy81: rt2x00_set_rt: Info - RT chipset 5390, rev 0502 detected
[ 529.426163] ieee80211 phy81: rt2x00_set_rf: Info - RF chipset 5370 detected
[ 529.432544] ieee80211 phy81: Selected rate control algorithm 'minstrel_ht'
[ 529.433131] usbcore: registered new interface driver rt2800usb
[ 529.447058] audit: type=1130 audit(156285099
[ 529.447260] rt2800usb 1-2:1.0 wlp3s0f0u2: renamed from wlan0
[ 534.453471] audit: type=1131 audit(156285099
[ 560.993915] ieee80211 phy81: r...
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #205 |
Sorry, copy and paste error of the last dmesg log. Due to several tests, dmesg log was flooded by warnings and error messages.
I'll stop the tests and wait for the next LTS kernel.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #206 |
BTW:
For me the issue started at this point:
https:/
when the Linux kernel's default i386/x86_64 kernel configurations shipped with USB 3.0 support enabled (CONFIG_
In Linux Kernel Bug Tracker #202541, stf_xl (stfxl-linux-kernel-bugs) wrote : | #207 |
It looks like debug tracing was requested, and the request was ignored:
https://<email address hidden>/
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #208 |
I didn't ignore it; I sent it to Mathias Nyman only, and not to the whole mailing list ("Send output to me" didn't sound like I should send it to the whole mailing list, but idk). I have to admit that the first traces weren't really useful, though: when I ran the commands he gave me, the traces started too late (the error happens immediately after system startup, so when I run these commands after startup, the important part is missing).
Then he gave me instructions how to enable tracing at startup, which only resulted in this error: [ 0.172042] Failed to enable trace event: xhci-hcd
and the tracing file was empty afterwards.
Just about one week ago I had another idea for how to get it working, and it actually worked. The solution was to unplug the wifi stick at boot, then enable tracing and plug in the stick again (I don't know why I didn't try that a few months ago, tbh). I've sent the two files (dmesg and tracing) to Mathias Nyman again, but this time he didn't respond (I sent the mail on July 11th).
Should I send the whole tracing file and dmesg log to the mailing list instead? What is the preferred way to send files that are too big for an e-mail (the tracing file is around 17.6 MB)?
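The unplug, enable tracing, replug sequence described above can be sketched roughly as follows. This is only an illustration under assumptions: it presumes tracefs is reachable at the conventional debugfs location `/sys/kernel/debug/tracing`, it must run as root on a real system, and the helper names are made up for this example. Compressing the trace with gzip is just one possible way to shrink a text file of this size before sharing it.

```python
import gzip
import shutil
from pathlib import Path

# Conventional tracefs location under debugfs; an assumption about the setup.
TRACING_ROOT = Path("/sys/kernel/debug/tracing")


def enable_xhci_tracing(root=TRACING_ROOT):
    """Switch on all xhci-hcd trace events (needs root on a real system)."""
    (Path(root) / "events" / "xhci-hcd" / "enable").write_text("1")


def save_trace(dest, root=TRACING_ROOT):
    """Copy the current trace buffer into a gzip-compressed file."""
    with open(Path(root) / "trace", "rb") as src, gzip.open(dest, "wb") as out:
        shutil.copyfileobj(src, out)
    return Path(dest)
```

Typical use would be: unplug the device, call `enable_xhci_tracing()`, plug the device back in, reproduce the failure, then call `save_trace("xhci-trace.gz")` and save `dmesg` alongside it.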
In Linux Kernel Bug Tracker #202541, stf_xl (stfxl-linux-kernel-bugs) wrote : | #209 |
Bernhard, thanks for the update and for providing debug data to the maintainer.
I think you should ping him on the mailing list and ask whether anything else needs to be provided, or how to proceed otherwise. Maybe we can just revert the patch?
This issue is annoying and I see more users running into it (and blaming the mt76x0u or rt2800usb drivers). It should not be hard to fix, since it is a regression (the commit causing it is known) and it is reproducible.
Please also point that changes in process_
tags: | added: cscc |
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #210 |
This issue is affecting nearly everything (like this SAMSUNG Galaxy S3):
[34385.294067] usb 1-2: new high-speed USB device number 6 using xhci_hcd
[34385.465017] usb 1-2: New USB device found, idVendor=18d1, idProduct=4ee7, bcdDevice= 2.26
[34385.465022] usb 1-2: New USB device strings: Mfr=2, Product=3, SerialNumber=4
[34385.465025] usb 1-2: Product: GT-I9300
[34385.465028] usb 1-2: Manufacturer: samsung
...
[35074.182055] xhci_hcd 0000:03:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #211 |
Bernhard, I have been running my Ryzen system for some days and noticed that the xhci issue also affects the USB keyboard and the USB mouse.
At this point, the system had already been running for 2 days:
Aug 24 08:38:41.665376 tux1 kernel: usb 1-12: new low-speed USB device number 19 using xhci_hcd
Aug 24 08:38:42.001609 tux1 kernel: usb 1-12: New USB device found, idVendor=046a, idProduct=0011, bcdDevice= 1.00
Aug 24 08:38:42.001850 tux1 kernel: usb 1-12: New USB device strings: Mfr=0, Product=0, SerialNumber=0
Aug 24 08:38:42.098291 tux1 kernel: hid-generic 0003:046A:
Aug 24 08:38:43.631091 tux1 kernel: usb 1-12: input irq status -75 received
Aug 24 08:38:43.631384 tux1 kernel: usb usb1-port12: disabled by hub (EMI?), re-enabling...
Aug 24 08:38:43.631409 tux1 kernel: usb 1-12: USB disconnect, device number 19
Aug 24 08:38:44.025057 tux1 kernel: usb 1-12: new low-speed USB device number 20 using xhci_hcd
Aug 24 08:38:44.361600 tux1 kernel: usb 1-12: New USB device found, idVendor=046a, idProduct=0011, bcdDevice= 1.00
Aug 24 08:38:44.361839 tux1 kernel: usb 1-12: New USB device strings: Mfr=0, Product=0, SerialNumber=0
Aug 24 08:38:44.401604 tux1 kernel: input: HID 046a:0011 as /devices/
Aug 24 08:38:44.458277 tux1 kernel: hid-generic 0003:046A:
Aug 24 08:38:49.031776 tux1 kernel: usb 1-12: input irq status -75 received
Aug 24 08:38:49.032082 tux1 kernel: usb usb1-port12: disabled by hub (EMI?), re-enabling...
Aug 24 08:38:49.032099 tux1 kernel: usb 1-12: USB disconnect, device number 20
Aug 24 08:38:49.425365 tux1 kernel: usb 1-12: new low-speed USB device number 21 using xhci_hcd
Aug 24 08:39:04.905175 tux1 kernel: usb 1-12: device descriptor read/64, error -110
Aug 24 08:39:20.478280 tux1 kernel: usb 1-12: device descriptor read/64, error -110
Aug 24 08:39:20.774967 tux1 kernel: usb 1-12: new low-speed USB device number 22 using xhci_hcd
Aug 24 08:39:36.331757 tux1 kernel: usb 1-12: device descriptor read/64, error -110
Aug 24 08:39:51.838723 tux1 kernel: usb 1-12: device descriptor read/64, error -110
Aug 24 08:39:51.945370 tux1 kernel: usb usb1-port12: attempt power cycle
Aug 24 08:39:52.588394 tux1 kernel: usb 1-12: new low-speed USB device number 23 using xhci_hcd
Aug 24 08:39:57.415723 tux1 kernel: usb 1-12: Device not responding to setup address.
Aug 24 08:40:02.448295 tux1 kernel: usb 1-12: Device not responding to setup address.
Aug 24 08:40:02.655042 tux1 kernel: usb 1-12: device not accepting address 23, error -71
Aug 24 08:40:02.778269 tux1 kernel: usb 1-12: new low-speed USB device number 24 using xhci_hcd
Aug 24 08:40:07.604975 tux1 kernel: usb 1-12: Device not responding to setup address.
Aug 24 08:40:12.638751 tux1 kernel: usb 1-12: Device not responding to setup address.
Aug 24 08:40:12.845561 tux1 kernel: usb 1-12: device not accepting address 24, error -71
Aug 24 08:40:12.845696 tux1 kernel: usb usb1-port12: unable to enumerate USB device
At this time only hard power o...
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #212 |
Was one of the affected USB devices plugged in and you rebooted to get the wifi working? Or did that happen even without the device plugged in?
I noticed once that even after I rebooted my system to get wifi working, my external HDD didn't work after plugging it in, so I had to reboot again to get that working...
I'm just using the LTS kernel right now, which works fine for me, but because of that bug I'm kinda limited when choosing a distribution since most distros don't offer different kernel versions and I don't really want to recompile my kernel every time.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #213 |
No, it happened without a warning. The keyboard LEDs flashed a few times, in step with the device descriptor errors. This was the first time I noticed something like that, and only on the Ryzen machine.
We talked about that xhci issue in other (git) threads, too:
https:/
BTW:
The LTS kernel (4.19) is still working fine here, too. In my opinion the xhci host has been unstable since 4.20. I notice that every time I test or improve a driver.
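For anyone collecting reports, the "fine on 4.19, unstable since 4.20" observation can be turned into a quick check of a `uname -r` string. This is only a sketch based on the reports in this thread: the 4.20 threshold is an assumption taken from the comments, not an official affected range, and no fixed version is known here.

```python
import re

# First kernel series reported as affected in this thread (assumption).
FIRST_REPORTED_BAD = (4, 20)


def kernel_version(release):
    """Extract (major, minor) from a `uname -r` style release string."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def possibly_affected(release):
    """True if the release is at or after the first reported bad series."""
    return kernel_version(release) >= FIRST_REPORTED_BAD


for rel in ("4.19.48", "5.1.7-arch1-1-ARCH", "5.2.9-arch1-1-ARCH"):
    print(rel, possibly_affected(rel))
```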
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #214 |
I noticed the same behavior. Not on a USB HDD, but on a USB flash drive:
This is an INTENSO USB 2 ALU LINE 64 GB USB stick:
[ 1032.600762] usb 1-11.4: new high-speed USB device number 15 using xhci_hcd
[ 1032.626487] hub 1-11:1.0: hub_ext_port_status failed (err = -71)
[ 1032.629487] usb 1-11-port4: cannot reset (err = -71)
[ 1032.632491] usb 1-11-port4: cannot reset (err = -71)
[ 1032.635486] usb 1-11-port4: cannot reset (err = -71)
[ 1032.638482] usb 1-11-port4: cannot reset (err = -71)
[ 1032.638483] usb 1-11-port4: Cannot enable. Maybe the USB cable is bad?
The stick works fine when plugged into another port:
[ 1465.770379] usb 1-11.4: USB disconnect, device number 23
[ 1708.302214] usb 1-2: new high-speed USB device number 24 using xhci_hcd
[ 1708.471933] usb 1-2: New USB device found, idVendor=058f, idProduct=6387, bcdDevice= 1.ff
[ 1708.471938] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 1708.471940] usb 1-2: Product: Intenso Alu Line
[ 1708.471943] usb 1-2: Manufacturer: 6989
[ 1708.471945] usb 1-2: SerialNumber: 21F84CE8
[ 1708.479111] usb-storage 1-2:1.0: USB Mass Storage device detected
re-plugged in on 1-11-port4:
[ 1776.661289] usb 1-11.4: new high-speed USB device number 25 using xhci_hcd
[ 1776.810678] usb 1-11.4: New USB device found, idVendor=058f, idProduct=6387, bcdDevice= 1.ff
[ 1776.810684] usb 1-11.4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 1776.810687] usb 1-11.4: Product: Intenso Alu Line
[ 1776.810691] usb 1-11.4: Manufacturer: 6989
[ 1776.810694] usb 1-11.4: SerialNumber: 21F84CE8
[ 1776.817710] usb-storage 1-11.4:1.0: USB Mass Storage device detected
That leads me to the assumption that the xhci host is unstable, at least in combination with my controller:
[ 1.325164] xhci_hcd 0000:03:00.0: hcc params 0x0200ef81 hci version 0x110 quirks 0x0000000008000410
[ 1.325319] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 5.02
[ 1.325321] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 1.325322] usb usb1: Product: xHCI Host Controller
[ 1.325323] usb usb1: Manufacturer: Linux 5.2.9-arch1-1-ARCH xhci-hcd
[ 1.325323] usb usb1: SerialNumber: 0000:03:00.0
[ 1.325428] hub 1-0:1.0: USB hub found
[ 1.325443] hub 1-0:1.0: 14 ports detected
[ 1.325922] xhci_hcd 0000:03:00.0: xHCI Host Controller
[ 1.325925] xhci_hcd 0000:03:00.0: new USB bus registered, assigned bus number 2
[ 1.325927] xhci_hcd 0000:03:00.0: Host supports USB 3.1 Enhanced SuperSpeed
[ 1.325958] usb usb2: We don't know the algorithms for LPM for this host, disabling LPM.
[ 1.325974] usb usb2: New USB device found, idVendor=1d6b, idProduct=0003, bcdDevice= 5.02
[ 1.325976] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 1.325977] usb usb2: Product: xHCI Host Controller
[ 1.325978] usb usb2: Manufacturer: Linux 5.2.9-arch1-1-ARCH xhci-hcd
[ 1.325979] usb usb2: SerialNumber: 0000:03:00.0
[ 1.326046] hub 2-0:1.0: USB hub found
[ 1.326057] hub 2-0:1.0: 8 ports detected
[ 1.326289] usb: port power management may be unreliable
[ 1.326451] xhci_hcd 0000:25:00.0: xHCI Host Controller
[ 1.326454] xhci_hcd 000...
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #215 |
I tested another USB controller (this time USB 3.1) and the results are even worse than on USB 3.0:
USB controller: Advanced Micro Devices, Inc. [AMD] X370 Series Chipset USB 3.1 xHCI Controller (rev 02)
and
TENDA W311U+
ID 148f:3070 Ralink Technology, Corp. RT2870/RT3070 Wireless Adapter
This device is one of the few that work on a USB 3.0 controller
Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh)
but it failed epically on USB 3.1:
[ 1213.285622] rt2800usb 5-3.1.2:1.0 wlp39s0f3u3u1u2: renamed from wlan0
[ 1218.918384] ieee80211 phy6: rt2x00lib_
[ 1218.918427] ieee80211 phy6: rt2x00lib_
[ 1219.222282] device wlp39s0f3u3u1u2 entered promiscuous mode
[ 1220.797413] rt2800usb_
[ 1220.797417] ieee80211 phy6: rt2800usb_
[ 1220.797452] ieee80211 phy6: rt2800usb_
[ 1220.797531] ieee80211 phy6: rt2800usb_
[ 1220.797611] ieee80211 phy6: rt2800usb_
[ 1220.797692] ieee80211 phy6: rt2800usb_
[ 1220.797772] ieee80211 phy6: rt2800usb_
[ 1220.797851] ieee80211 phy6: rt2800usb_
[ 1220.797931] ieee80211 phy6: rt2800usb_
[ 1220.798011] ieee80211 phy6: rt2800usb_
[ 1220.798091] ieee80211 phy6: rt2800usb_
[ 1220.814661] xhci_hcd 0000:27:00.3: WARN Cannot submit Set TR Deq Ptr
[ 1220.814663] xhci_hcd 0000:27:00.3: A Set TR Deq Ptr command is pending.
[ 1221.378769] ieee80211 phy6: rt2x00queue_
[ 1221.409201] device wlp39s0f3u3u1u2 left promiscuous mode
I really hope it will be fixed before we reach the next LTS kernel.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #216 |
and more and more devices are affected:
https:/
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #217 |
@ Stanislaw Gruszka
We once talked about a rt2800usb issue (rt2800usb stops receiving) here:
https:/
Now I'm not sure whether it is related to this xhci issue or not, because I sometimes get it on kernel 4.19, too.
After doing setsockopt PACKET_MR_PROMISC:
https:/
dmesg will show this warning (in this case running a USB 2.0 controller):
[ 1687.106514] device wlp3s0f0u2 entered promiscuous mode
[ 1687.106551] audit: type=1700 audit(156793211
[ 1718.525815] ieee80211 phy0: rt2x00queue_
[ 1718.558846] device wlp3s0f0u2 left promiscuous mode
[ 1718.558888] audit: type=1700 audit(156793214
The adapter stops working until it is unplugged and plugged in again:
[ 1722.950110] usb 1-2: USB disconnect, device number 5
If you think it is not related to this issue, I can open a new rt2800usb issue.
In Linux Kernel Bug Tracker #202541, k.j.vanmierlo (k.j.vanmierlo-linux-kernel-bugs) wrote : | #218 |
Hi,
A Google search led me here. I'm getting the same error on my Lenovo Thinkpad X220 running Kubuntu 19.04. Every time I plug in a USB memory stick or an SD card I get the following messages in dmesg:
[ 9649.078958] xhci_hcd 0000:05:00.0: WARN Cannot submit Set TR Deq Ptr
[ 9649.078966] xhci_hcd 0000:05:00.0: A Set TR Deq Ptr command is pending.
Linux koen-ThinkPad-X220 5.0.0-29-generic #31-Ubuntu SMP Thu Sep 12 13:05:32 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Koen
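To compare reports like the one above across machines, it can help to tally the xhci WARN lines per controller. A minimal sketch, assuming the `xhci_hcd <PCI address>: WARN <message>` layout seen in the logs quoted in this thread (the function name is made up for illustration):

```python
import re
from collections import Counter

# "xhci_hcd 0000:05:00.0: WARN <message>" as it appears in the quoted logs.
XHCI_WARN_RE = re.compile(r"xhci_hcd (\S+): WARN (.+)$")


def count_xhci_warnings(lines):
    """Count WARN messages per (PCI address, message) pair."""
    counts = Counter()
    for line in lines:
        match = XHCI_WARN_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2).strip())] += 1
    return counts


sample = [
    "[ 9649.078958] xhci_hcd 0000:05:00.0: WARN Cannot submit Set TR Deq Ptr",
    "[ 9649.078966] xhci_hcd 0000:05:00.0: A Set TR Deq Ptr command is pending.",
    "[ 232.044485] xhci_hcd 0000:03:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.",
    "[ 232.046810] xhci_hcd 0000:03:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.",
]
for (address, message), n in count_xhci_warnings(sample).items():
    print(address, n, message)
```

Feeding it the output of `dmesg` line by line gives one row per controller and warning text, which makes it easier to see whether a given device triggers the "Cannot submit" or the "cmd failed" variant.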
In Linux Kernel Bug Tracker #202541, doug16k (doug16k-linux-kernel-bugs) wrote : | #219 |
Got this issue on 5.0.0-29-generic, host hardware is Ryzen 2700X on B350 chipset (Asus Prime B350-Plus).
USB Device is Samsung Galaxy A5, Model SM-A520W, Android 8.0
[57460.411327] usb 1-4.1.4: USB disconnect, device number 10
[57460.411566] xhci_hcd 0000:02:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[57460.685963] usb 1-4.1.4: new high-speed USB device number 11 using xhci_hcd
[57460.830379] usb 1-4.1.4: New USB device found, idVendor=04e8, idProduct=6860, bcdDevice= 4.00
[57460.830382] usb 1-4.1.4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[57460.830383] usb 1-4.1.4: Product: SAMSUNG_Android
[57460.830385] usb 1-4.1.4: Manufacturer: SAMSUNG
[57460.830386] usb 1-4.1.4: SerialNumber: **withheld**
doug@doug-dt:~$ sudo lspci -s 2:0.0 -vvvvvv
[sudo] password for doug:
02:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset USB 3.1 xHCI Controller (rev 02) (prog-if 30 [XHCI])
Subsystem: ASMedia Technology Inc. 300 Series Chipset USB 3.1 xHCI Controller
I have this kernel parameter to prevent other USB issues: usbcore.
Linux doug-dt 5.0.0-29-generic #31~18.04.1-Ubuntu SMP Thu Sep 12 18:29:21 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #220 |
This issue still exists on
$ uname -r
5.3.1-arch1-1-ARCH
Sep 24 08:14:00.374050 tux1 kernel: device wlp3s0f0u2 entered promiscuous mode
Sep 24 08:14:39.757848 tux1 kernel: xhci_hcd 0000:03:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
Sep 24 08:14:39.758158 tux1 kernel: xhci_hcd 0000:03:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
Sep 24 08:14:39.770950 tux1 kernel: mt7601u 1-2:1.0: Warning: TX DMA did not stop!
The xhci host becomes completely unstable after the first warning:
WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state
If this warning is ignored, the whole system eventually freezes. At that point only a "hard" power-off will help.
BTW:
Shouldn't we increase the importance? The next kernel will be an LTS release, and this issue will then reach the major distributions.
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #221 |
I have noticed that I don't get that error ("WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state") anymore, even though I still have the same USB issues (maybe something in the rt2800usb driver changed, idk). I've even tried applying all the patches in the "for-usb-linus" branch from Mathias Nyman's git repo, but I still have the same issue.
Maybe more people should send a message to the usb kernel mailing list (<email address hidden>)? I didn't get a response the last time, but maybe they will address this issue if they see that more users are affected by this regression.
BTW @Michael:
There is a commit in the for-usb-linus branch that could fix the system freezes you've encountered: https:/
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #222 |
@Bernhard.
Thanks. I'll check it. Also thanks for setting prio to high.
Until the system freezes, I receive the funniest warnings from the xhci system: bad cable, bad device, firmware not loaded, ...
Whether "WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state" appears also depends on the device:
Running a
148f:3070 Ralink Technology, Corp. RT2870/RT3070 Wireless Adapter
I got no "WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state"
Running
148f:5370 Ralink Technology, Corp. RT5370 Wireless Adapter
I got the warning.
Both of them use the rt2800usb driver.
That and the different warnings lead me to assume that the xhci host is completely unstable, especially when hcxdumptool generates a high workload.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #223 |
It seems that the commit is working - no freeze, up to now.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #224 |
Nope, it doesn't work as expected. No freezes, but:
[ 2914.285601] ieee80211 phy77: Atheros AR9271 Rev:1
[ 2914.286229] ath9k_htc 1-3:1.0 wlp0s20f0u3: renamed from wlan0
[ 2914.389748] usb 1-3: USB disconnect, device number 83
[ 2914.749819] ath: phy77: Failed to wakeup in 500us
[ 2914.760221] ath: phy77: Failed to wakeup in 500us
[ 2914.770309] ath: phy77: Failed to wakeup in 500us
[ 2914.780411] ath: phy77: Failed to wakeup in 500us
[ 2915.283332] usb 1-3: ath9k_htc: Firmware ath9k_htc/
[ 2915.531824] usb 1-3: ath9k_htc: Firmware - ath9k_htc/
[ 2915.532206] usb 1-3: ath9k_htc: USB layer deinitialized
[ 2928.339410] ------------[ cut here ]------------
[ 2928.339505] WARNING: CPU: 1 PID: 704 at net/mac80211/
[ 2928.339506] Modules linked in: ath9k_htc ath9k_common ath9k_hw ath nfnetlink_queue nfnetlink_log nfnetlink ccm uas usb_storage rt2800usb rt2x00usb rt2800lib rt2x00lib fuse nls_iso8859_1 nls_cp437 vfat fat nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) uvcvideo snd_soc_skl videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 snd_soc_hdac_hda snd_hda_codec_hdmi videobuf2_common snd_hda_ext_core videodev snd_soc_skl_ipc snd_hda_
.....
This time xhci crashed a TP-LINK TL-WN722N v1.
And that device worked before...
xhci is still running completely unstable and the delivered warnings are unpredictable.
Felix Moreno (felix-justdust) wrote : | #134 |
Same problem with 19.04 and usb 3.1 pci and some 10 usb icy box 10 drives.
Felix Moreno (felix-justdust) wrote : | #135 |
Bus 002 Device 004: ID 174c:55aa ASMedia Technology Inc. Name: ASM1051E SATA 6Gb
In Linux Kernel Bug Tracker #202541, viniciuspython (viniciuspython-linux-kernel-bugs) wrote : | #225 |
Just providing some information that could be helpful to debug the issue. It is also affecting me.
Kernel version:
# uname -a
Linux arch 5.3.1-arch1-1-ARCH #1 SMP PREEMPT Sat Sep 21 11:33:49 UTC 2019 x86_64 GNU/Linux
Hardware specs: AMD Ryzen 5 2400G
The issue happens when I plug in an Alfa AWUS036NH (148f:3070 Ralink Technology, Corp. RT2870/RT3070) - It uses the module rt2800usb
Below you can find my dmesg output when I plug in the Alfa device:
---
[ 1130.410091] usb 1-10: new high-speed USB device number 5 using xhci_hcd
[ 1130.653103] usb 1-10: New USB device found, idVendor=148f, idProduct=3070, bcdDevice= 1.01
[ 1130.653108] usb 1-10: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 1130.653111] usb 1-10: Product: 802.11 n WLAN
[ 1130.653113] usb 1-10: Manufacturer: Ralink
[ 1130.653114] usb 1-10: SerialNumber: 1.0
[ 1130.864470] usb 1-10: reset high-speed USB device number 5 using xhci_hcd
[ 1131.110058] ieee80211 phy1: rt2x00_set_rt: Info - RT chipset 3070, rev 0201 detected
[ 1131.788103] ieee80211 phy1: rt2x00_set_rf: Info - RF chipset 0005 detected
[ 1131.794331] ieee80211 phy1: Selected rate control algorithm 'minstrel_ht'
[ 1131.833798] rt2800usb 1-10:1.0 wlp1s0f0u10: renamed from wlan0
[ 1131.834234] audit: type=1130 audit(156989634
[ 1131.867763] ieee80211 phy1: rt2x00lib_
[ 1131.867797] ieee80211 phy1: rt2x00lib_
[ 1136.117228] xhci_hcd 0000:01:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[ 1136.840084] audit: type=1131 audit(156989635
---
I don't know if this is useful, but I do have another USB WiFi that uses another module but doesn't trigger the issue when I plug in:
lsusb output: 2357:010c TP-Link TL-WN722N v2
Below is the dmesg output when I plug in the TP-LINK:
---
[ 1697.619576] usb 1-7: new high-speed USB device number 9 using xhci_hcd
[ 1697.846601] usb 1-7: New USB device found, idVendor=2357, idProduct=010c, bcdDevice= 0.00
[ 1697.846603] usb 1-7: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 1697.846605] usb 1-7: Product: 802.11n NIC
[ 1697.846606] usb 1-7: Manufacturer: Realtek
[ 1697.846607] usb 1-7: SerialNumber: 00E04C0001
[ 1697.858603] Chip Version Info: CHIP_8188E_
[ 1698.262531] r8188eu 1-7:1.0 wlp1s0f0u7: renamed from wlan0
[ 1711.847379] MAC Address = c0:25:e9:1f:5c:3c
[ 1712.075372] R8188EU: indicate disassoc
---
Additionally, I see the warning when I plug in a Samsung Galaxy S5 device, but the warning appears only when I select certain "USB modes" in Android. Below you can see the dmesg log for each one of the USB modes:
--- dmesg log for "No data transfer" USB mode ---
[ 2523.666729] usb 1-7: USB disconnect, device number 32
[ 2524.157919] usb 1-7: new high-speed U...
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #226 |
@Vinicius
Which motherboard do you have?
Maybe the issue is related to 300-series motherboards...
In Linux Kernel Bug Tracker #202541, viniciuspython (viniciuspython-linux-kernel-bugs) wrote : | #227 |
My motherboard is a Biostar B350GT3.
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #228 |
I've sent another mail to the kernel usb mailing list, this time I got a response. I sent them kernel debugging logs/traces from xhci, unfortunately I have one of the devices where the error "WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state." doesn't get shown anymore, which makes it harder to find the cause for the problem.
@Michael
Could you do the following steps, upload the dmesg log and trace file somewhere, and post the link to the files here (or send them directly to the mailing list yourself, if you prefer that)? Please use one of the devices where the error actually gets shown.
1. start the PC with an affected kernel, but without the affected device plugged in, then run the following commands as root
2. mount -t debugfs none /sys/kernel/debug
3. echo 'module xhci_hcd =p' >/sys/kernel/
4. echo 'module usbcore =p' >/sys/kernel/
5. echo 81920 > /sys/kernel/
6. echo 1 > /sys/kernel/
7. Plug in the affected device
8. Send output of dmesg and the /sys/kernel/
Thanks in advance
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #229 |
Here it goes:
https:/
ALFA AWUS036NH connected to USB 3.x port running stress test using hcxdumptool.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #230 |
Once the error has occurred, xhci is unusable for all other devices:
[20480.414467] usb 1-2: new full-speed USB device number 6 using xhci_hcd
[20480.717690] usb 1-2: New USB device found, idVendor=1546, idProduct=01a7, bcdDevice= 1.00
[20480.717695] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[20480.717698] usb 1-2: Product: u-blox 7 - GPS/GNSS Receiver
[20480.717700] usb 1-2: Manufacturer: u-blox AG - www.u-blox.com
[20480.726485] cdc_acm 1-2:1.0: ttyACM0: USB ACM device
[20480.760327] audit: type=1130 audit(157027496
[20486.732259] usb 1-2: USB disconnect, device number 6
[20486.746846] audit: type=1131 audit(157027496
[20487.027593] usb 1-2: new full-speed USB device number 7 using xhci_hcd
[20487.244298] usb 1-2: device descriptor read/64, error -71
[20487.540954] usb 1-2: device descriptor read/64, error -71
[20487.837571] usb 1-2: new full-speed USB device number 8 using xhci_hcd
[20487.991378] usb 1-2: device descriptor read/64, error -71
[20488.287616] usb 1-2: device descriptor read/64, error -71
[20488.394301] usb usb1-port2: attempt power cycle
[20489.037910] usb 1-2: new full-speed USB device number 9 using xhci_hcd
[20489.065424] usb 1-2: Device not responding to setup address.
[20489.271605] usb 1-2: Device not responding to setup address.
[20489.477900] usb 1-2: device not accepting address 9, error -71
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #231 |
Good News!
After reading a bit in the xhci spec sheet I've figured out what the problem is. I've already created a patch and sent it to the mailing list, so it will hopefully be fixed in 5.4.
If you want to see or try the patch, you can find it here: https:/
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #232 |
Never mind, I misunderstood something in the xhci spec sheet: apparently the xhci slot id isn't the same as the "TT Hub slot id".
In Linux Kernel Bug Tracker #202541, mathias.nyman (mathias.nyman-linux-kernel-bugs) wrote : | #233 |
Created attachment 285501
Patch adding doorbell tracing
Patch that adds even more tracing, this will show if xhci driver
correctly rings endpoint doorbell to start endpoint after soft retry
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #234 |
Created attachment 285505
Dmesg log and trace file
Not sure how useful the logs from my device are, because the error "WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state" only gets shown after unplugging the device, but it looks like the error messages are mostly the same.
There are still some differences compared to Michael's device though:
Dmesg from him (the one I've also sent to the mailing list):
[ 96.789306] xhci_hcd 0000:03:00.0: Resetting device with slot ID 4
[ 96.789313] xhci_hcd 0000:03:00.0: // Ding dong!
[ 96.791053] xhci_hcd 0000:03:00.0: Completed reset device command.
[ 96.791111] xhci_hcd 0000:03:00.0: Successful reset device command.
compared to mine:
[ 91.777887] xhci_hcd 0000:15:00.0: Resetting device with slot ID 4
[ 91.777892] xhci_hcd 0000:15:00.0: // Ding dong!
[ 91.777940] xhci_hcd 0000:15:00.0: Completed reset device command.
[ 91.777950] xhci_hcd 0000:15:00.0: Can't reset device (slot ID 4) in default state
[ 91.777951] xhci_hcd 0000:15:00.0: Not freeing device rings.
[ 91.777956] xhci_hcd 0000:15:00.0: // Ding dong!
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #235 |
Sometimes the warning doesn't appear. Instead, the driver crashes:
$ dmidecode
Manufacturer: ASUSTeK COMPUTER INC.
Product Name: X555UB
$ cat /proc/cpuinfo
model name : Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz
device connected to USB 3:
[10799.155340] usb 1-2: reset high-speed USB device number 12 using xhci_hcd
[10799.310446] ieee80211 phy5: rt2x00_set_rt: Info - RT chipset 3070, rev 0201 detected
[10799.364982] ieee80211 phy5: rt2x00_set_rf: Info - RF chipset 0005 detected
[10799.365842] ieee80211 phy5: Selected rate control algorithm 'minstrel_ht'
[10799.412456] rt2800usb 1-2:1.0 wlp0s20f0u2: renamed from wlan0
[10799.432236] ieee80211 phy5: rt2x00lib_
[10799.432263] ieee80211 phy5: rt2x00lib_
[10799.728051] ieee80211 phy5: rt2x00usb_
[10800.745185] ieee80211 phy5: rt2800_
[10800.745197] ieee80211 phy5: rt2800usb_
...
[11237.887923] xhci_hcd 0000:00:14.0: WARN Cannot submit Set TR Deq Ptr
[11237.887929] xhci_hcd 0000:00:14.0: A Set TR Deq Ptr command is pending.
xhci is unstable - not the hardware.
The same device, connected to the same notebook, but to a USB 2 port:
[11243.042957] usb 1-3: reset high-speed USB device number 13 using xhci_hcd
[11243.197261] ieee80211 phy6: rt2x00_set_rt: Info - RT chipset 3070, rev 0201 detected
[11243.251969] ieee80211 phy6: rt2x00_set_rf: Info - RF chipset 0005 detected
[11243.253036] ieee80211 phy6: Selected rate control algorithm 'minstrel_ht'
[11243.272919] rt2800usb 1-3:1.0 wlp0s20f0u3: renamed from wlan0
[11243.293056] ieee80211 phy6: rt2x00lib_
[11243.293082] ieee80211 phy6: rt2x00lib_
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #236 |
@Michael
Could you apply the patch from Mathias (comment 71) to the kernel, enable xhci tracing (steps in comment 66), and upload the dmesg and trace file?
The patch adds more tracing which will make it easier to find the exact issue.
In Linux Kernel Bug Tracker #202541, mathias.nyman (mathias.nyman-linux-kernel-bugs) wrote : | #237 |
@Bernhard
Logs with added tracing show that driver does ring the endpoint doorbell, so
host controller should start processing the pending requests. Endpoint is
in stopped state as it should after endpoint reset, before we ring the doorbell.
So this part looks like hardware isn't doing its part.
When the class driver starts cancelling transfer requests after some timeout, we can see that the endpoint is in halted state. The host controller didn't issue any
event when the endpoint turned into halted state, so the driver is unaware of this state.
There is also a bug in how the driver handles the error later. After the timeout, when the class driver starts cancelling transfers and the xhci driver tries to stop the endpoint to cancel transfers, it should react to the context state error,
and check endpoint state, and handle the halted endpoint p
The driver should react to this: it should detect and handle the halted endpoint before attempting to set a new dequeue pointer. Right now it just bluntly tries to set
a new dequeue pointer, and fails.
Details:
* We get a transaction error event, for transfer request (TRB) at 0xf61a0000
96.985254: xhci_handle_event: EVENT: TRB 00000000f61a0000 status 'USB Transaction Error' len 3860 slot 4 ep 3 type 'Transfer Event' flags e:C
96.985262: xhci_handle_
* We issue a Reset endpoint command to resolve the halted endpoint
(move endpoint from halted to stopped state)
96.985264: xhci_queue_trb: CMD: Reset Endpoint Command: ctx 0000000000000000 slot 4 ep 3 flags C
96.985265: xhci_inc_enq: CMD 0000000090dd7572: enq 0x00000000fff7e
96.985266: xhci_ring_
96.985268: xhci_inc_deq: EVENT 000000005715d3fc: enq 0x00000000fff7c
* Reset endpoint command completes successfully, endpoint state is now "stopped"
96.985395: xhci_handle_event: EVENT: TRB 00000000fff7e540 status 'Success' len 0 slot 4 ep 0 type 'Command Completion Event' flags e:C
96.985396: xhci_handle_
96.985397: xhci_handle_
trb len 0
* We ring the doorbell, xHC hardware should start processing events on ring,
96.985402: xhci_ring_
* but nothing happens, this endpoint is silent until the class driver starts cancelling transfers ~25 seconds later
122.813121: xhci_urb_dequeue: ep1in-bulk: urb 00000000790ce3f7 pipe 3221259648 slot 4 length 0/3860 sgs 0/0 stream 0 flags 00010200
122.813134: xhci_dbg_
* stop the endpoint to cancel the pending transfers
122.813137: xhci_queue_trb: CMD: Stop Ring Command: slot 4 sp 0 ep 3 flags C
122.813137: xhci_inc_enq: CMD 0000000090dd7572: enq 0x00000000fff7e
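The xhci_handle_event lines quoted in the walkthrough above have a regular shape, so they can be picked apart mechanically when going through a long trace file. A minimal Python sketch, assuming the trace lines look exactly like the complete ones quoted in this comment; the names `parse_handle_event` and `transaction_errors` are made up for illustration and are not part of any kernel tooling:

```python
import re

# Matches complete trace lines of the shape quoted in this comment, e.g.
# "96.985254: xhci_handle_event: EVENT: TRB <hex> status '<text>' len N
#  slot N ep N type '<text>' flags ..."
EVENT_RE = re.compile(
    r"^\s*(?P<ts>\d+\.\d+): xhci_handle_event: EVENT: TRB (?P<trb>[0-9a-f]+) "
    r"status '(?P<status>[^']*)' len (?P<len>\d+) slot (?P<slot>\d+) "
    r"ep (?P<ep>\d+) type '(?P<type>[^']*)'"
)


def parse_handle_event(line):
    """Return the fields of one xhci_handle_event line, or None if it does not match."""
    m = EVENT_RE.match(line)
    if m is None:
        return None
    return {
        "ts": float(m["ts"]),        # trace timestamp in seconds
        "trb": int(m["trb"], 16),    # TRB address as printed in the trace
        "status": m["status"],       # completion status text
        "len": int(m["len"]),
        "slot": int(m["slot"]),
        "ep": int(m["ep"]),
        "type": m["type"],           # event type text
    }


def transaction_errors(lines):
    """Return only the events whose status is 'USB Transaction Error'."""
    events = (parse_handle_event(line) for line in lines)
    return [ev for ev in events if ev and ev["status"] == "USB Transaction Error"]
```

Feeding the trace file line by line into `transaction_errors` lists the slot/ep pairs that hit a transaction error, which can then be compared with the later xhci_urb_dequeue timestamps by hand.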
In Linux Kernel Bug Tracker #202541, mathias.nyman (mathias.nyman-linux-kernel-bugs) wrote : | #238 |
You could try to flush the endpoint ringing PCI write, and see if it helps
starting the endpoint, but I don't have high hopes for this; a PCI write should
be flushed anyway, especially in 25 seconds.
Maybe also add a trace to re-read the endpoint state after flushing the PCI write:
(untested)
diff --git a/drivers/
index e74518e7de6a.
--- a/drivers/
+++ b/drivers/
@@ -408,6 +408,7 @@ void xhci_ring_
+ readl(db_addr);
/* The CPU has better things to do at this point than wait for a
* write-posting flush. It'll get there soon enough.
*/
@@ -1176,6 +1177,8 @@ static void xhci_handle_
/* if this was a soft reset, then restart */
if ((le32_
+
+ trace_xhci_
}
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #239 |
Created attachment 285527
Logs after flushing endpoint
I've applied the patch, but it seems like the endpoint doesn't get started even after flushing the endpoint.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #240 |
@Bernhard, I can't do any further tests at the moment, because I'm on vacation until November.
In Linux Kernel Bug Tracker #202541, mathias.nyman (mathias.nyman-linux-kernel-bugs) wrote : | #241 |
Created attachment 285709
Patch handling halted endpoints at completion of stop endpoint command
Patch to handle a context state error at stop endpoint completion
where an endpoint's TRB processing had an error/stall, and hardware halted the
endpoint just before completing a normal stop endpoint command.
This won't fix the initial issue about endpoint not restarting after
soft retry, but it should resolve the flood of "WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state" messages
Code is completely untested as I can't trigger this codepath manually.
It requires hardware halting an endpoint just before completing a stop
endpoint command.
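To make the ordering described in this thread easier to follow (clear the halt first, then move the dequeue pointer), here is a toy Python model. It is only a sketch of the decision as described in the comments, not the code of the attached patch; the function name `commands_after_stop_endpoint` and the command strings are hypothetical, while the numeric endpoint state values are the ones defined by the xHCI specification:

```python
from enum import Enum


class EpState(Enum):
    # Endpoint context state values as defined by the xHCI specification.
    DISABLED = 0
    RUNNING = 1
    HALTED = 2
    STOPPED = 3
    ERROR = 4


def commands_after_stop_endpoint(context_state_error, ep_state):
    """Toy model: commands to queue once a Stop Endpoint command has completed.

    context_state_error: True if the command completed with a context state error.
    ep_state: endpoint state read back from the endpoint context afterwards.
    """
    if not context_state_error:
        # Normal case: the endpoint is stopped, the dequeue pointer can be moved.
        return ["Set TR Dequeue Pointer"]
    if ep_state is EpState.HALTED:
        # The endpoint halted just before the stop completed. Reset it first
        # (halted -> stopped), and only then move the dequeue pointer.
        return ["Reset Endpoint", "Set TR Dequeue Pointer"]
    # Other states are not modelled in this sketch.
    return []
```

Queuing "Set TR Dequeue Pointer" directly while the endpoint is still halted is the situation that produces the "WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state" message discussed above.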
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #242 |
Created attachment 285713
Logs after applying the patch
After applying the patch the "WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state" messages are indeed gone, and the issue is (as expected) still there.
In Linux Kernel Bug Tracker #202541, mathias.nyman (mathias.nyman-linux-kernel-bugs) wrote : | #243 |
(In reply to Bernhard from comment #80)
> Created attachment 285713 [details]
> Logs after applying the patch
Did you by mistake attach some old logs?
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #244 |
Created attachment 285717
Logs after applying the patch
Yes, looks like I've uploaded the zip file from the wrong folder. The new file should be the right one.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #245 |
The "WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state" doesn't flood the log file. The message appears only if the device is disconnected (after xhci died):
Connected the device:
[42407.193511] usb 1-2: ath9k_htc: USB layer deinitialized
[42410.956671] usb 1-2: new high-speed USB device number 9 using xhci_hcd
[42411.214091] usb 1-2: New USB device found, idVendor=0cf3, idProduct=9271, bcdDevice= 1.08
[42411.214095] usb 1-2: New USB device strings: Mfr=16, Product=32, SerialNumber=48
[42411.214098] usb 1-2: Product: USB2.0 WLAN
[42411.214100] usb 1-2: Manufacturer: ATHEROS
[42411.214102] usb 1-2: SerialNumber: 12345
[42411.232116] usb 1-2: ath9k_htc: Firmware ath9k_htc/
[42412.308181] usb 1-2: ath9k_htc: Transferred FW: ath9k_htc/
[42412.558320] ath9k_htc 1-2:1.0: ath9k_htc: HTC initialized with 33 credits
[42412.784721] ath9k_htc 1-2:1.0: ath9k_htc: FW Version: 1.4
[42412.784724] ath9k_htc 1-2:1.0: FW RMW support: On
[42412.784726] ath: EEPROM regdomain: 0x809c
[42412.784727] ath: EEPROM indicates we should expect a country code
[42412.784728] ath: doing EEPROM country->regdmn map search
[42412.784729] ath: country maps to regdmn code: 0x52
[42412.784730] ath: Country alpha2 being used: CN
[42412.784731] ath: Regpair used: 0x52
[42412.788460] ieee80211 phy2: Atheros AR9271 Rev:1
[42412.791852] ath9k_htc 1-2:1.0 wlp3s0f0u2: renamed from wlan0
and everything is looking fine.
after running the device for a few minutes
[42445.806367] device wlp3s0f0u2 entered promiscuous mode
we receive the first indication that xhci died
[42911.706734] ath: phy2: Unable to set channel
and the device stops working. There are absolutely no other error messages shown by dmesg or the running application (in this case hcxdumptool).
Now we disconnect the device and get the final warning:
[43082.759737] usb 1-2: USB disconnect, device number 9
[43082.760434] xhci_hcd 0000:03:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[43082.760607] xhci_hcd 0000:03:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[43082.764275] xhci_hcd 0000:03:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[43082.784722] device wlp3s0f0u2 left promiscuous mode
At this point xhci is dead. No other device connected to the same port is working.
In Linux Kernel Bug Tracker #202541, mathias.nyman (mathias.nyman-linux-kernel-bugs) wrote : | #246 |
(In reply to Michael from comment #83)
> The "WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state"
> doesn't flood the log file. The message appear only if the device is
> disconnected (after xhci died):
>
Could you take full logs and traces of this:
mount -t debugfs none /sys/kernel/debug
echo 'module xhci_hcd =p' >/sys/kernel/
echo 'module usbcore =p' >/sys/kernel/
echo 81920 > /sys/kernel/
echo 1 > /sys/kernel/
< Trigger the issue >
Send output of dmesg
Send content of /sys/kernel/
In Bernhard's case there were issues both with hardware not starting the
ring after soft retry, and software not handling the context state error when stopping an endpoint. The second issue can be fixed in the driver.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #247 |
I'll try to trigger it. That isn't so easy, because different devices show different behavior and the occurrence of the issue is totally random. Sometimes it happens immediately after connecting the device, and sometimes it happens after a while or after heavily stressing the device.
BTW:
mount -t debugfs none /sys/kernel/debug
is done by default here.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #248 |
I'm doing several runs, using different devices. So we have the chance to compare them against each other.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #249 |
Here you go.
https:/
Unfortunately it looks like this stress test was too heavy for dmesg's ring buffer.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #250 |
After several tests, I assume that this warning:
"rt2x00queue_
is also related to the xhci issue. I don't think that the issue is related to powermanagement (https:/
affected: rt2800usb
[ 7384.825764] usb 1-2: new high-speed USB device number 8 using xhci_hcd
[ 7385.069208] usb 1-2: New USB device found, idVendor=148f, idProduct=3070, bcdDevice= 1.01
[ 7385.069211] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 7385.069214] usb 1-2: Product: 802.11 n WLAN
[ 7385.069216] usb 1-2: Manufacturer: Ralink
[ 7385.069217] usb 1-2: SerialNumber: 1.0
[ 7385.280539] usb 1-2: reset high-speed USB device number 8 using xhci_hcd
[ 7385.526260] ieee80211 phy3: rt2x00_set_rt: Info - RT chipset 3070, rev 0201 detected
[ 7386.204480] ieee80211 phy3: rt2x00_set_rf: Info - RF chipset 0005 detected
[ 7386.210679] ieee80211 phy3: Selected rate control algorithm 'minstrel_ht'
[ 7386.227147] rt2800usb 1-2:1.0 wlp3s0f0u2: renamed from wlan0
[ 7386.227812] audit: type=1130 audit(157261043
[ 7387.737404] ieee80211 phy3: rt2x00lib_
[ 7387.737440] ieee80211 phy3: rt2x00lib_
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #251 |
The bad thing about this issue is that it isn't detectable by an application while the device is plugged in. The device doesn't start, or stops working, without any warning. The application says everything is fine and dmesg shows absolutely no warning.
Only when the device is unplugged do we get a bunch of warnings, depending on the device (tested on INTEL and AMD systems):
"WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state"
"rt2x00queue_
"rx urb failed: -71"
"A Set TR Deq Ptr command is pending."
and more (bad cable, hardware error, ....).
BTW:
I'm running kernel 4.19.80 in parallel and everything is fine here. This issue appeared for the first time on 4.20.
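Since the warnings only show up in dmesg after the device has been unplugged, a small filter can help when comparing logs from several runs. A minimal Python sketch, assuming dmesg lines in the "[timestamp] xhci_hcd <pci address>: WARN ..." form quoted in this thread; the name `count_xhci_warnings` is made up for illustration, and lines without the WARN prefix (such as "A Set TR Deq Ptr command is pending.") are deliberately not counted:

```python
import re
from collections import Counter

# "[43082.760434] xhci_hcd 0000:03:00.0: WARN Set TR Deq Ptr cmd failed ..."
XHCI_WARN_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\] xhci_hcd (?P<pci>[0-9a-f:.]+): WARN (?P<msg>.*)$"
)


def count_xhci_warnings(dmesg_lines):
    """Count xhci_hcd WARN messages per (PCI address, message text)."""
    counts = Counter()
    for line in dmesg_lines:
        m = XHCI_WARN_RE.match(line.strip())
        if m:
            counts[(m["pci"], m["msg"].strip())] += 1
    return counts
```

The input can be any iterable of lines, for example a saved dmesg log opened as a text file. This only summarises what dmesg already contains; it cannot detect the stall while the device is still plugged in.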
In Linux Kernel Bug Tracker #202541, mathias.nyman (mathias.nyman-linux-kernel-bugs) wrote : | #252 |
It seems that it was a known issue that xHCI on AMD platforms can fail to restart an endpoint if it wasn't running when the stop command was issued. This also applies to Bernhard's case, where the endpoint stop command raced with an error halting the endpoint.
See patch:
commit 28a2369f7d72ece
Author: Shyam Sundar S K <email address hidden>
Date: Thu Jul 20 14:48:28 2017 +0300
usb: xhci: Issue stop EP command only when the EP state is running
on AMD platforms with SNPS 3.1 USB controller if stop endpoint command is
issued the controller does not respond, when the EP is not in running
state. HW completes the command execution and reports
"Context State Error" completion code. This is as per the spec. However
HW on receiving the second command additionally marks EP to Flow control
state in HW which is RTL bug. This bug causes the HW not to respond
to any further doorbells that are rung by the driver. This makes the EP
to not functional anymore and causes gross functional failures.
As a workaround, not to hit this problem, it's better to check the EP state
and issue a stop EP command only when the EP is in running state.
As a sidenote, even with this patch there is still a possibility of
triggering the RTL bug if the context state races with the stop endpoint
command as described in xHCI spec 4.6.9
[code simplification and reworded sidenote in commit message -Mathias]
Signed-off-by: Shyam Sundar S K <email address hidden>
Signed-off-by: Nehal Shah <email address hidden>
Signed-off-by: Mathias Nyman <email address hidden>
Signed-off-by: Greg Kroah-Hartman <email address hidden>
Does anybody have a link to that errata?
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #253 |
I can't confirm that, because this issue happens on all platforms if the device is connected to a USB 3 port:
RYZEN 1700, MSI X370 KRAIT
INTEL I5-6200U, ASUS X555U (notebook)
INTEL i7-3930K, ASUS P9X79
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #254 |
The only systems which are running without this issue are my Raspberry Pi's:
$ uname -r
4.19.80-2-ARCH
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #255 |
Generating a lot of traffic on the socket causes xhci to die very early.
Here it happened on an AMD RYZEN system, running hcxdumptool:
[ 8316.184018] device wlp3s0f0u2 entered promiscuous mode
[ 8372.392206] ath: phy0: Unable to remove monitor interface at idx: 0
[ 8374.525500] ath: phy0: Unable to remove station entry for monitor mode
[ 8381.692889] usb 1-2: USB disconnect, device number 5
[ 8381.693576] xhci_hcd 0000:03:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or
and here on an INTEL notebook running NetworkManager:
[ 166.174157] usb 1-1: new high-speed USB device number 8 using xhci_hcd
[ 166.330703] usb 1-1: New USB device found, idVendor=148f, idProduct=761a, bcdDevice= 1.00
[ 166.330713] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 166.330719] usb 1-1: Product: WiFi
[ 166.330725] usb 1-1: Manufacturer: MediaTek
[ 166.330729] usb 1-1: SerialNumber: 1.0
[ 166.458249] usb 1-1: reset high-speed USB device number 8 using xhci_hcd
[ 166.607874] usb 1-1: ASIC revision: 76100002 MAC revision: 76502000
[ 167.669762] usb 1-1: EEPROM ver:02 fae:01
[ 203.846465] mt76u_complete_rx: 13 callbacks suppressed
[ 203.846479] usb 1-1: rx urb failed: -71
[ 203.846552] usb 1-1: rx urb failed: -71
[ 203.846614] usb 1-1: rx urb failed: -71
[ 203.846667] usb 1-1: rx urb failed: -71
[ 203.846712] usb 1-1: rx urb failed: -71
[ 203.846799] usb 1-1: rx urb failed: -71
[ 203.846874] usb 1-1: rx urb failed: -71
[ 203.846924] usb 1-1: rx urb failed: -71
[ 203.846998] usb 1-1: rx urb failed: -71
[ 203.847069] usb 1-1: rx urb failed: -71
[ 203.848249] usb 1-1: USB disconnect, device number 8
[ 203.850032] xhci_hcd 0000:00:14.0: WARN Cannot submit Set TR Deq Ptr
[ 203.850040] xhci_hcd 0000:00:14.0: A Set TR Deq Ptr command is pending.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #256 |
Running really heavy traffic, first xhci caused the driver to crash, then the whole system crashed:
System: ASUS X555UB (INTEL)
[ 1564.588784] mt7601u 1-2:1.0: Error: TSSI upper saturation
[ 1614.221860] ------------[ cut here ]------------
[ 1614.221923] WARNING: CPU: 1 PID: 0 at net/mac80211/
[ 1614.221924] Modules linked in: nfnetlink_queue nfnetlink_log nfnetlink uas usb_storage ccm mt7601u hid_generic usbhid fuse nls_iso8859_1 nls_cp437 vfat fat nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev snd_soc_skl x86_pkg_
[ 1614.221947] intel_rapl_msr agpgart ecdh_generic crypto_simd sparse_keymap i2c_hid cryptd rfkill iTCO_wdt mei_hdcp hid snd_hwdep glue_helper syscopyarea ecc sysfillrect iTCO_vendor_support sysimgblt fb_sys_fops snd_pcm pcspkr intel_cstate intel_uncore mxm_wmi intel_rapl_perf input_leds elan_i2c tpm_crb snd_timer tpm_tis snd tpm_tis_core tpm int3403_thermal soundcore intel_xhci_
[ 1614.221975] CPU: 1 PID: 0 Comm: swapper/1 Tainted: P W OE 5.3.8-arch1-1 #1
[ 1614.221976] Hardware name: ASUSTeK COMPUTER INC. X555UB/X555UB, BIOS X555UB.301 02/20/2017
[ 1614.221993] RIP: 0010:ieee80211_
[ 1614.221994] Code: 38 48 81 c1 70 04 00 00 48 81 c6 38 01 00 00 e8 0a 40 a1 d1 b8 01 00 00 00 e9 26 4b fb ff 48 c7 c7 60 7b c1 c0 e8 b7 53 4f d1 <0f> 0b 48 89 ef e8 7f 28 b4 d1 e9 d1 5b fb ff 48 c7 c7 60 7b c1 c0
[ 1614.221995] RSP: 0018:ffffa50840
[ 1614.221996] RAX: 0000000000000024 RBX: ffff92206bc407a0 RCX: 0000000000000000
[ 1614.221997] RDX: 0000000000000000 RSI: ffff92207ba97708 RDI: 00000000ffffffff
[ 1614.221998] RBP: ffff922034510400 R08: 0000000000001137 R09: 0000000000000001
[ 1614.221998] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[ 1614.221999] R13: 0000000000000001 R14: 0000000000000006 R15: 0000000000000000
[ 1614.222000] FS: 000000000000000
In Linux Kernel Bug Tracker #202541, mathias.nyman (mathias.nyman-linux-kernel-bugs) wrote : | #257 |
Michael, I've been looking at the traces and can't find anything xhci related in your logs that could cause this. xhci isn't dying, crashing or causing other drivers to crash in the above logs either. It doesn't seem related to Bernhard's case.
Have you tried bisecting what patch causes the problems between 4.19 and 4.20 kernels?
The "WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state" is related to unplugging of the device. In short, while unplugging the device we get a transaction error for each running endpoint before the hub thread notices the disconnect, so the xhci driver tries to recover the endpoint before everything is torn down and returned for the device. It should be harmless at this stage.
There are several disconnect events, either initiated by the device or caused by an actual physical
disconnect. Could they be related to firmware loading?
Traces also show many bulk-in urbs being queued but none completed until they are cancelled at disconnect, so we are waiting 49 seconds to get data from the device before disconnect.
URB b2383f4 TRB is queued from ep4in, waiting for data from device:
13714.468994: xhci_urb_enqueue: ep4in-bulk: urb 000000000b2383f4 pipe 3221360512 slot 14 length 0/4096 sgs 1/1 stream 0 flags 00040200
13714.468996: xhci_queue_trb: BULK: Buffer 00000000ff5df000 length 4096 TD size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:c
13714.468996: xhci_inc_enq: BULK 0000000096dfdec9: enq 0x00000000feaec
49 seconds later transaction error on ep4in on disconnect:
13763.472759: xhci_handle_event: EVENT: TRB 00000000feaec000 status 'USB Transaction Error' len 4096 slot 14 ep 9 type 'Transfer Event' flags e:c
...
13763.472787: xhci_handle_event: EVENT: TRB 000000000a000000 status 'Success' len 0 slot 0 ep 0 type 'Port Status Change Event' flags e:c
13763.472792: xhci_handle_
After this urb b2383f4 is canceled and given back:
13763.474221: xhci_urb_dequeue: ep4in-bulk: urb 000000000b2383f4 pipe 3221360512 slot 14 length 0/4096 sgs 1/1 stream 0 flags 00040200
13763.474225: xhci_dbg_
...
13763.474673: xhci_urb_giveback: ep4in-bulk: urb 000000000b2383f4 pipe 3221360512 slot 14 length 0/4096 sgs 1/1 stream 0 flags 00040200
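The 49 second gap above can be checked across a whole trace file by pairing each xhci_urb_enqueue line with the matching xhci_urb_giveback line. A minimal Python sketch, assuming the trace lines have the shape of the complete ones quoted in this comment; the function name `urb_wait_times` is made up for illustration, and the urb value printed in the trace is treated as an opaque key that may be reused over time:

```python
import re

# "13714.468994: xhci_urb_enqueue: ep4in-bulk: urb 000000000b2383f4 pipe ..."
URB_RE = re.compile(
    r"^\s*(?P<ts>\d+\.\d+): xhci_urb_(?P<what>enqueue|dequeue|giveback): "
    r"(?P<ep>\S+): urb (?P<urb>[0-9a-f]+) "
)


def urb_wait_times(lines):
    """Map each urb value to the seconds between its enqueue and its giveback."""
    enqueued = {}  # urb value -> enqueue timestamp, while still outstanding
    waits = {}     # urb value -> most recent enqueue-to-giveback time
    for line in lines:
        m = URB_RE.match(line)
        if m is None:
            continue
        urb, ts = m["urb"], float(m["ts"])
        if m["what"] == "enqueue":
            enqueued[urb] = ts
        elif m["what"] == "giveback" and urb in enqueued:
            # pop() so that a reused urb value starts a fresh measurement
            waits[urb] = ts - enqueued.pop(urb)
    return waits
```

For the urb followed in this comment, the enqueue and giveback lines quoted above give a wait of roughly 49 seconds. URBs that were never given back stay out of the result, so a large number of missing entries is itself a hint that transfers are stuck.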
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #258 |
Mathias, it is really hard to find the cause of that issue. dmesg is showing nothing until something crashed. I'm not able to detect the cause:
https:/
At this point, I know:
- the driver stops working (independent of the driver - rt2800usb as well as mt76)
- no warning, no error message
- the system becomes unstable (AMD as well as INTEL)
- kernel 4.20 up to 5.3
It is very unlikely that the driver caused this, because it doesn't happen on USB2 and it happens on different drivers and different systems.
I can try to bisect to identify the patch, but that will take a while.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #259 |
"WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state" only appeared when something went wrong.
If everything's fine and I unplug the device, this warning is not shown.
Here are the results from another device
ID 148f:3070 Ralink Technology, Corp. RT2870/RT3070 Wireless Adapter
running on an INTEL system.
dmesg output if everything is ok:
[14492.749187] usb 1-1: New USB device found, idVendor=148f, idProduct=3070, bcdDevice= 1.01
[14492.749197] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[14492.749203] usb 1-1: Product: 802.11 n WLAN
[14492.749208] usb 1-1: Manufacturer: Ralink
[14492.749213] usb 1-1: SerialNumber: 1.0
[14492.881097] usb 1-1: reset high-speed USB device number 20 using xhci_hcd
[14493.035766] ieee80211 phy11: rt2x00_set_rt: Info - RT chipset 3070, rev 0201 detected
[14493.090480] ieee80211 phy11: rt2x00_set_rf: Info - RF chipset 0005 detected
[14493.091489] ieee80211 phy11: Selected rate control algorithm 'minstrel_ht'
[14493.113656] rt2800usb 1-1:1.0 wlp0s20f0u1: renamed from wlan0
[14493.116525] audit: type=1130 audit(157322759
[14493.141430] ieee80211 phy11: rt2x00lib_
[14493.141456] ieee80211 phy11: rt2x00lib_
[14498.126056] audit: type=1131 audit(157322759
[14506.300174] usb 1-1: USB disconnect, device number 20
[14506.463603] audit: type=1130 audit(157322760
dmesg output if the device stops working and something went wrong:
[14565.489976] usb 1-1: new high-speed USB device number 21 using xhci_hcd
[14565.648114] usb 1-1: New USB device found, idVendor=148f, idProduct=3070, bcdDevice= 1.01
[14565.648124] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[14565.648130] usb 1-1: Product: 802.11 n WLAN
[14565.648135] usb 1-1: Manufacturer: Ralink
[14565.648140] usb 1-1: SerialNumber: 1.0
[14565.773934] usb 1-1: reset high-speed USB device number 21 using xhci_hcd
[14565.927986] ieee80211 phy12: rt2x00_set_rt: Info - RT chipset 3070, rev 0201 detected
[14565.982385] ieee80211 phy12: rt2x00_set_rf: Info - RF chipset 0005 detected
[14565.983295] ieee80211 phy12: Selected rate control algorithm 'minstrel_ht'
[14566.002249] rt2800usb 1-1:1.0 wlp0s20f0u1: renamed from wlan0
[14566.004829] audit: type=1130 audit(157322766
[14566.018308] ieee80211 phy12: rt2x00lib_
[14566.018335] ieee80211 phy12: rt2x00lib_
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #260 |
(In reply to Michael from comment #96)
> I can try to bisect to identify the patch, but that will take a while.
Tbh I would try reverting the commit that caused the problem for me first, just to make sure you're not spending multiple hours bisecting this issue and then find out that you're affected by the same commit.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #261 |
Bernhard, that will be great. I'm not at home and my ASUS notebook is really too slow to perform a bisect.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #262 |
@Bernhard, @Mathias
I'm not sure anymore whether the issue is related to xhci, because of the latest WARNINGs and traces.
I tested a PCIe card
Network controller: Realtek Semiconductor Co., Ltd. RTL8821AE 802.11ac PCIe Wireless Network Adapter
and running into similar issues:
[12506.901197] wlp3s0: deauthenticating from 00:24:d4:9e:e8:c4 by local choice (Reason: 3=DEAUTH_LEAVING)
[12506.902535] ------------[ cut here ]------------
[12506.902589] WARNING: CPU: 1 PID: 15941 at net/mac80211/
[12506.902590] Modules linked in: nfnetlink_queue nfnetlink_log nfnetlink mt7601u ccm fuse nls_iso8859_1 nls_cp437 vfat fat nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc snd_soc_skl snd_soc_hdac_hda snd_hda_ext_core snd_soc_skl_ipc snd_soc_sst_ipc rtsx_usb_ms snd_soc_sst_dsp rtl8821ae snd_soc_
[12506.902624] asus_wmi drm sparse_keymap mei_hdcp snd_hda_core intel_cstate mxm_wmi intel_uncore intel_rapl_perf intel_gtt agpgart ecdh_generic snd_hwdep pcspkr rfkill syscopyarea snd_pcm sysfillrect ecc sysimgblt fb_sys_fops tpm_crb input_leds snd_timer elan_i2c tpm_tis tpm_tis_core snd int3403_thermal tpm i2c_i801 evdev rng_core soundcore processor_
[12506.902660] CPU: 1 PID: 15941 Comm: Netlink Monitor Tainted: P W OE 5.3.8-arch1-1 #1
[12506.902661] Hardware name: ASUSTeK COMPUTER INC. X555UB/X555UB, BIOS X555UB.301 02/20/2017
[12506.902684] RIP: 0010:ieee80211_
[12506.902687] Code: 38 48 81 c1 70 04 00 00 48 81 c6 38 01 00 00 e8 0a 10 77 dc b8 01 00 00 00 e9 26 4b fb ff 48 c7 c7 60 ab eb c0 e8 b7 23 25 dc <0f> 0b 48 89 ef e8 7f f8 89 dc e9 d1 5b fb ff 48 c7 c7 60 ab eb c0
[12506.902688] RSP: 0000:ffffb624c0
[12506.902690] RAX: 0000000000000024 RBX: ffff8ee22cae07a0 RCX: 0000000000000000
[12506.902691] RDX: 0000000000000000 RSI: ffff8ee23ba97708 RDI: 00000000ffffffff
[12506.902692] RBP: ffff8ee1ab8b8400 R08: 00000000000014eb R09: 0000000000000001
[12506.902692] R10: 0000000000000000 R1...
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #263 |
Here is a new log (dmesg and trace):
https:/
Device: ALFA AWUS036NH
ID 148f:3070 Ralink Technology, Corp. RT2870/RT3070 Wireless Adapter
The device is connected and entered promiscuous mode
[76538.089897] xhci_hcd 0000:03:00.0: Waiting for status stage event
[76541.048223] xhci_hcd 0000:03:00.0: Transfer error for slot 23 ep 2 on endpoint
[76541.048233] xhci_hcd 0000:03:00.0: // Ding dong!
[76541.048356] xhci_hcd 0000:03:00.0: Ignoring reset ep completion code of 1
[76542.194353] device wlp3s0f0u2 entered promiscuous mode
...
we do not receive data via AF_PACKET socket.
...
[76542.194385] audit: type=1700 audit(157363940
[76554.680919] xhci_hcd 0000:03:00.0: Cancel URB 00000000e8c9ee79, dev 2, ep 0x81, starting at offset 0xff05d000
[76554.680929] xhci_hcd 0000:03:00.0: // Ding dong!
I can't find anything that caused it, except for the transfer error at 76541.048223.
If we connect the device to a USB2 port, everything is fine:
https:/
we receive data via AF_PACKET socket.
The device is working as expected.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #264 |
Still present in kernel 5.4:
https:/
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #265 |
Still present running
$ uname -r
5.5.2-arch1-1
[16300.890097] mt76x0u 5-3.1.2:1.0: ASIC revision: 76100002 MAC revision: 76502000
[16301.239555] mt76x0u 5-3.1.2:1.0: EEPROM ver:02 fae:01
[16301.578393] ieee80211 phy6: Selected rate control algorithm 'minstrel_ht'
[16301.595805] mt76x0u 5-3.1.2:1.0 wlp39s0f3u3u1u2: renamed from wlan0
[16316.881303] device wlp39s0f3u3u1u2 entered promiscuous mode
[16316.881347] audit: type=1700 audit(158115863
[16316.882150] mt76x0u 5-3.1.2:1.0: tx urb failed: -71
[16316.882187] mt76u_complete_rx: 1989 callbacks suppressed
[16316.882190] mt76x0u 5-3.1.2:1.0: rx urb failed: -71
[16316.882227] mt76x0u 5-3.1.2:1.0: tx urb failed: -71
[16316.882267] mt76x0u 5-3.1.2:1.0: rx urb failed: -71
[16316.882346] mt76x0u 5-3.1.2:1.0: rx urb failed: -71
[16316.882426] mt76x0u 5-3.1.2:1.0: rx urb failed: -71
[16316.882505] mt76x0u 5-3.1.2:1.0: rx urb failed: -71
[16316.882586] mt76x0u 5-3.1.2:1.0: rx urb failed: -71
[16316.882666] mt76x0u 5-3.1.2:1.0: rx urb failed: -71
[16316.882745] mt76x0u 5-3.1.2:1.0: rx urb failed: -71
[16316.882825] mt76x0u 5-3.1.2:1.0: rx urb failed: -71
[16316.882905] mt76x0u 5-3.1.2:1.0: rx urb failed: -71
[16316.911559] usb 5-3.1.2: USB disconnect, device number 8
[16316.911980] xhci_hcd 0000:27:00.3: WARN Cannot submit Set TR Deq Ptr
[16316.911982] xhci_hcd 0000:27:00.3: A Set TR Deq Ptr command is pending.
[16316.921294] mt76x0u 5-3.1.2:1.0: mac specific condition occurred
[16316.948240] device wlp39s0f3u3u1u2 left promiscuous mode
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #266 |
From now on this USB port is unusable.
E.g. connecting a USB memory stick to the same USB port
ID 13fe:6300 Kingston Technology Company Inc. USB DISK 3.0
spams dmesg log:
[16924.494936] usb 1-2: device descriptor read/8, error -71
[16924.625947] usb 1-2: device descriptor read/8, error -71
[16925.060354] usb 1-2: new high-speed USB device number 10 using xhci_hcd
[16925.339024] usb 1-2: device descriptor read/8, error -71
[16925.439343] usb usb2-port2: config error
[16925.469057] usb 1-2: device descriptor read/8, error -71
[16925.573848] usb usb1-port2: attempt power cycle
[16926.217012] usb 1-2: new high-speed USB device number 11 using xhci_hcd
[16926.890380] usb 1-2: device descriptor read/64, error -71
[16927.837037] usb 1-2: device descriptor read/64, error -71
[16928.067117] usb 1-2: new high-speed USB device number 12 using xhci_hcd
[16928.390350] usb usb2-port2: config error
[16928.783690] usb 1-2: device descriptor read/64, error -71
[16929.730336] usb 1-2: device descriptor read/64, error -71
I noticed this behavior only on AMD RYZEN systems.
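The negative numbers in these logs (-71, -110, and elsewhere in the thread -62, -5 and -13) are negated Linux errno values. As a minimal sketch for reading them, the helper below looks them up in a small hand-written table. The table covers only the codes that appear in this thread and assumes the generic Linux errno numbering; the function name `decode_errno` is hypothetical.

```python
import re

# Linux errno values seen in this thread's logs (generic Linux numbering).
# This table is an illustration, not a complete list.
LINUX_ERRNO = {
    5: ("EIO", "I/O error"),
    13: ("EACCES", "Permission denied"),
    62: ("ETIME", "Timer expired"),
    71: ("EPROTO", "Protocol error"),
    110: ("ETIMEDOUT", "Connection timed out"),
}

# Matches forms such as "error -71", "failed: -71", "failed:-71", "code = -13".
ERR_RE = re.compile(r"(?:error|failed:?|code =)\s*-(\d+)", re.IGNORECASE)


def decode_errno(line):
    """Return (number, name, text) for the first errno in a log line, or None."""
    m = ERR_RE.search(line)
    if m is None:
        return None
    num = int(m.group(1))
    name, text = LINUX_ERRNO.get(num, ("UNKNOWN", "not in table"))
    return num, name, text


print(decode_errno("usb 1-2: device descriptor read/8, error -71"))
# → (71, 'EPROTO', 'Protocol error')
```

In the USB stack, EPROTO generally points at a low-level bus or protocol failure rather than a driver-level problem, which fits the observation that the port stays unusable for unrelated devices.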
crlshn (carlos-collart) wrote : | #136 |
ccollart@pop-os:~$ uname -a
Linux pop-os 5.3.0-7625-generic #27~1576774560~
ccollart@pop-os:~$ dmesg
[ 2857.871464] CIFS VFS: cifs_mount failed w/return code = -13
[ 2902.938295] xhci_hcd 0000:0c:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 2902.938305] xhci_hcd 0000:0c:00.0: Looking for event-dma 00000003c412daf0 trb-start 00000003c412dad0 trb-end 00000003c412dad0 seg-start 00000003c412d000 seg-end 00000003c412dff0
[ 2902.938375] xhci_hcd 0000:0c:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 2902.938379] xhci_hcd 0000:0c:00.0: Looking for event-dma 00000003c412db00 trb-start 00000003c412dad0 trb-end 00000003c412dad0 seg-start 00000003c412d000 seg-end 00000003c412dff0
[ 2902.938460] xhci_hcd 0000:0c:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 2902.938465] xhci_hcd 0000:0c:00.0: Looking for event-dma 00000003c412db10 trb-start 00000003c412dad0 trb-end 00000003c412dad0 seg-start 00000003c412d000 seg-end 00000003c412dff0
[ 2902.938547] xhci_hcd 0000:0c:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 2902.938550] xhci_hcd 0000:0c:00.0: Looking for event-dma 00000003c412db20 trb-start 00000003c412dad0 trb-end 00000003c412dad0 seg-start 00000003c412d000 seg-end 00000003c412dff0
[ 2902.938633] xhci_hcd 0000:0c:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 2902.938636] xhci_hcd 0000:0c:00.0: Looking for event-dma 00000003c412db30 trb-start 00000003c412dad0 trb-end 00000003c412dad0 seg-start 00000003c412d000 seg-end 00000003c412dff0
[ 2902.938719] xhci_hcd 0000:0c:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 2902.938723] xhci_hcd 0000:0c:00.0: Looking for event-dma 00000003c412db40 trb-start 00000003c412dad0 trb-end 00000003c412dad0 seg-start 00000003c412d000 seg-end 00000003c412dff0
[ 2902.938805] xhci_hcd 0000:0c:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 2902.938809] xhci_hcd 0000:0c:00.0: Looking for event-dma 00000003c412db50 trb-start 00000003c412dad0 trb-end 00000003c412dad0 seg-start 00000003c412d000 seg-end 00000003c412dff0
[ 2902.938891] xhci_hcd 0000:0c:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 2902.938895] xhci_hcd 0000:0c:00.0: Looking for event-dma 00000003c412db60 trb-start 00000003c412dad0 trb-end 00000003c412dad0 seg-start 00000003c412d000 seg-end 00000003c412dff0
[ 2902.938977] xhci_hcd 0000:0c:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 2902.938981] xhci_hcd 0000:0c:00.0: Looking for event-dma 00000003c412db70 trb-start 00000003c412dad0 trb-end 00000003c412dad0 seg-start 00000003c412d000 seg-end 00000003c412dff0
[ 2902.939063] xhci_hcd 0000:0c:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 2902.939067] xhci_hcd 0000:0c:00.0: Looking for event-dma 00000003c412db80 trb-start 00000003c412dad0 trb-end 00000003c412dad0 seg-start 00000...
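The "Looking for event-dma" lines can be read mechanically. The sketch below parses one such line and reports whether the event pointer lies inside the current TD, whether it lies inside the ring segment, and how many TRBs it is away from trb-end. It assumes the 16-byte TRB size defined by xHCI; the function names are hypothetical and this is only a reading aid, not the driver's logic.

```python
import re

TRB_SIZE = 16  # an xHCI TRB is 16 bytes

FIELD_RE = re.compile(
    r"(event-dma|trb-start|trb-end|seg-start|seg-end) ([0-9a-f]+)"
)


def parse_lookup(line):
    """Extract the five hex addresses from a 'Looking for event-dma' line."""
    return {key: int(value, 16) for key, value in FIELD_RE.findall(line)}


def describe(line):
    """Return (in_td, in_segment, trbs_from_td_end).

    trbs_from_td_end is the signed distance from trb-end in TRBs,
    or None when the event pointer is outside the segment.
    """
    f = parse_lookup(line)
    ev = f["event-dma"]
    in_td = f["trb-start"] <= ev <= f["trb-end"]
    in_seg = f["seg-start"] <= ev <= f["seg-end"]
    offset = (ev - f["trb-end"]) // TRB_SIZE if in_seg else None
    return in_td, in_seg, offset


line = ("xhci_hcd 0000:0c:00.0: Looking for event-dma 00000003c412daf0 "
        "trb-start 00000003c412dad0 trb-end 00000003c412dad0 "
        "seg-start 00000003c412d000 seg-end 00000003c412dff0")
print(describe(line))  # → (False, True, 2)
```

Applied to the log above, the event pointer advances by one TRB per message (2, 3, 4, ... TRBs past trb-end) while trb-start and trb-end stay fixed.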
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #267 |
A Garmin eTrex 30 connected to a USB 3.0 port of an AMD RYZEN system shows the same behavior:
[23803.507473] usb 1-2: new high-speed USB device number 14 using xhci_hcd
[23803.547562] usb 1-2: New USB device found, idVendor=05e3, idProduct=0727, bcdDevice= 2.50
[23803.547566] usb 1-2: New USB device strings: Mfr=3, Product=4, SerialNumber=2
[23803.547568] usb 1-2: Product: USB Storage
[23803.547570] usb 1-2: Manufacturer: Generic
[23803.547572] usb 1-2: SerialNumber: 000000000250
[23803.554609] usb-storage 1-2:1.0: USB Mass Storage device detected
[23803.554796] scsi host9: usb-storage 1-2:1.0
[23804.580523] scsi 9:0:0:0: Direct-Access Generic STORAGE DEVICE 0250 PQ: 0 ANSI: 0
[23804.580860] sd 9:0:0:0: Attached scsi generic sg2 type 0
[23804.818580] sd 9:0:0:0: [sdb] 30392320 512-byte logical blocks: (15.6 GB/14.5 GiB)
[23804.820914] sd 9:0:0:0: [sdb] Write Protect is off
[23804.820918] sd 9:0:0:0: [sdb] Mode Sense: 0b 00 00 08
[23804.822987] sd 9:0:0:0: [sdb] No Caching mode page found
[23804.822991] sd 9:0:0:0: [sdb] Assuming drive cache: write through
[23804.849969] sdb: sdb1
[23804.854844] sd 9:0:0:0: [sdb] Attached SCSI removable disk
[24257.645365] usb 1-1: new full-speed USB device number 15 using xhci_hcd
[24257.862068] usb 1-1: device descriptor read/64, error -71
Connected to a USB 2.0 port or to an INTEL system (using the same cable!), everything is fine.
In Linux Kernel Bug Tracker #202541, sapier (sapier-linux-kernel-bugs) wrote : | #268 |
Hello,
I found this bug through a Google search for the error message. I see quite similar behaviour when trying to wipe an IDE disk by writing urandom data to it. I'm using a USB<->IDE converter. It works fine on the USB 2.0 ports but fails with the error message above in most USB 3.1 port scenarios.
System:
Vanilla Kernel 5.4.21 (Debian bullseye configuration)
Ryzen 7 1800X
Gigabyte AX370 Gaming 5
- X370 Series Chipset USB 3.1 xHCI Controller (rev 02)
- ASMedia Technology Inc. ASM1143 USB 3.1 Host Controller (doesn't work at all)
- Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) USB 3.0 Host Controller
The issue seems to have some sort of "cable" component, as some scenarios seem to work even on USB 3.0 ports. I'll just write down what I observed.
Case 1 USB 3.1:
USB<->SATA device -> USB3.1 Port -> worked at least once
Case 2 USB 3.1:
USB<->SATA device -> 4-Port USB hub (10cm cable) -> worked at least once
Case 3 USB 3.1:
USB<->SATA device -> 4-Port USB hub (40cm cable) -> USB 3.1 Port --> never managed to clean disk
Case 4 USB 3.1:
USB<->SATA device -> 2m usb cable -> USB 3.1 Port --> never managed to clean disk
Case 5 USB 2.0:
USB<->SATA device -> USB 2.0 Port -> works
Case 6 USB 2.0:
USB<->SATA device -> 4-Port USB hub (10cm cable) -> USB 2.0 Port -> works
Case 7 USB 2.0:
USB<->SATA device -> 4-Port USB hub (40cm cable) -> USB 2.0 Port -> works
Case 8 USB 2.0:
USB<->SATA device -> 2m cable -> USB 2.0 Port -> works
Case 9 USB 2.0:
USB<->SATA device -> 2m cable -> 4-Port USB hub (40cm cable) -> USB 2.0 Port -> works
All tested USB hubs are 2.0 hubs. The ASMedia USB controller doesn't work at all (its port is dead right after booting), but this seems to be unrelated to the issue here.
To me this looks like the 3.1 ports are extremely sensitive to cable issues.
In Red Hat Bugzilla #1460789, timur.kristof (timur.kristof-redhat-bugs) wrote : | #154 |
The same issue still happens to me on kernel 5.5.6-201.
Hardware is a Dell XPS 13 9370 with a Lenovo Thunderbolt 3 dock. My dmesg is full of these messages:
[12696.189484] r8152 6-1:1.0 enp10s0u1: Tx timeout
[12702.333456] r8152 6-1:1.0 enp10s0u1: Tx timeout
[12707.965422] r8152 6-1:1.0 enp10s0u1: Tx timeout
[12713.085385] r8152 6-1:1.0 enp10s0u1: Tx timeout
[12718.205360] r8152 6-1:1.0 enp10s0u1: Tx timeout
[12724.349321] r8152 6-1:1.0 enp10s0u1: Tx timeout
[12729.981295] r8152 6-1:1.0 enp10s0u1: Tx timeout
[12735.101256] r8152 6-1:1.0 enp10s0u1: Tx timeout
[12740.221235] r8152 6-1:1.0 enp10s0u1: Tx timeout
[12746.365199] r8152 6-1:1.0 enp10s0u1: Tx timeout
[12751.997171] r8152 6-1:1.0 enp10s0u1: Tx timeout
[12757.117155] r8152 6-1:1.0 enp10s0u1: Tx timeout
In Linux Kernel Bug Tracker #202541, biopsin (biopsin-linux-kernel-bugs) wrote : | #269 |
5.4.25_1 - ROG STRIX B450-I GAMING (RYZEN)
Hi,
The bug is still present with an external HDD connected via a DELTACO USB3.0 TO SATAII + 3.5*IDE cable.
dmesg output:
[ 9540.086599] usb 2-3: device descriptor read/8, error -110
[ 9545.717826] usb 2-3: device descriptor read/8, error -110
[ 9551.350614] usb 2-3: device descriptor read/8, error -110
[ 9556.982468] usb 2-3: device descriptor read/8, error -110
[ 9562.614549] usb 2-3: device descriptor read/8, error -110
[ 9568.246248] usb 2-3: device descriptor read/8, error -110
[ 9573.878494] usb 2-3: device descriptor read/8, error -110
[ 9579.510536] usb 2-3: device descriptor read/8, error -110
[ 9579.658663] blk_update_request: I/O error, dev sdc, sector 319807200 op 0x0:(READ) flags 0x80700 phys_seg 2 prio class 0
[ 9579.658779] blk_update_request: I/O error, dev sdc, sector 319807456 op 0x0:(READ) flags 0x80700 phys_seg 2 prio class 0
[ 9579.658809] blk_update_request: I/O error, dev sdc, sector 319807200 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[ 9579.658842] blk_update_request: I/O error, dev sdc, sector 2048 op 0x1:(WRITE) flags 0x100000 phys_seg 1 prio class 0
[ 9579.658846] Buffer I/O error on dev sdc1, logical block 0, lost async page write
[ 9580.671946] EXT4-fs error (device sdc1): __ext4_
[ 9580.674408] EXT4-fs error (device sdc1): __ext4_
[ 9585.142455] usb 2-3: device descriptor read/8, error -110
[ 9590.773921] usb 2-3: device descriptor read/8, error -110
[ 9590.773935] xhci_hcd 0000:02:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[ 9592.045019] Buffer I/O error on dev sdc1, logical block 30441472, lost sync page write
[ 9592.045023] JBD2: Error -5 detected when updating journal superblock for sdc1-8.
[ 9592.045025] Buffer I/O error on dev sdc1, logical block 30441472, lost sync page write
[ 9592.045026] JBD2: Error -5 detected when updating journal superblock for sdc1-8.
[ 9596.406564] usb 2-3: device descriptor read/8, error -110
In Linux Kernel Bug Tracker #202541, erickperez (erickperez-linux-kernel-bugs) wrote : | #270 |
Hello,
This bug is present on an ARM64 SBC system too.
uname -r
5.4.26-rockchip64 (Ubuntu 18.04.4 LTS)
Device: Realtek Ethernet 8152 USB 3.0 Gigabit adapter
dmesg:
[11519.368679] xhci-hcd xhci-hcd.1.auto: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[11519.997949] usb 8-1: reset SuperSpeed Gen 1 USB device number 2 using xhci-hcd
[11522.779552] xhci-hcd xhci-hcd.1.auto: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[11523.389784] usb 8-1: reset SuperSpeed Gen 1 USB device number 2 using xhci-hcd
[11528.980290] xhci-hcd xhci-hcd.1.auto: xHCI host not responding to stop endpoint command.
[11528.993885] xhci-hcd xhci-hcd.1.auto: xHCI host controller not responding, assume dead
[11528.994627] xhci-hcd xhci-hcd.1.auto: HC died; cleaning up
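Several reports in this thread follow the same progression: repeated "Set TR Deq Ptr" warnings, then a controller that is declared dead. A rough sketch for summarizing such a log is shown below. It counts identical xHCI messages per controller and flags controllers that reached the "HC died" or "assume dead" stage. The function name and the choice of marker strings are assumptions made for illustration; the regular expression accepts both the `xhci_hcd` and `xhci-hcd` spellings seen above.

```python
import re
from collections import Counter

# Controller name followed by the message, e.g.
# "xhci_hcd 0000:03:00.0: HC died; cleaning up"
XHCI_RE = re.compile(r"xhci[_-]hcd (\S+): (.*)")

# Substrings taken from the logs in this thread that indicate a dead controller.
FATAL_MARKERS = ("HC died", "assume dead")


def summarize(lines):
    """Return (message counts keyed by (controller, message), dead controllers)."""
    counts = Counter()
    dead = set()
    for line in lines:
        m = XHCI_RE.search(line)
        if m is None:
            continue  # not an xHCI controller message
        ctrl, msg = m.groups()
        counts[(ctrl, msg)] += 1
        if any(marker in msg for marker in FATAL_MARKERS):
            dead.add(ctrl)
    return counts, dead
```

Feeding it the seven lines above would report the "Set TR Deq Ptr cmd failed" warning twice for `xhci-hcd.1.auto` and mark that controller as dead.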
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #271 |
Just a small notice: bug is still alive running
$ uname -r
5.6.3-arch1-1
[21762.874883] usb 1-2: new high-speed USB device number 5 using xhci_hcd
[21762.928994] usb 1-2: New USB device found, idVendor=148f, idProduct=5370, bcdDevice= 1.01
[21762.928997] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[21762.929000] usb 1-2: Product: 802.11 n WLAN
[21762.929002] usb 1-2: Manufacturer: Ralink
[21762.929003] usb 1-2: SerialNumber: 1.0
[21763.223013] usb 1-2: reset high-speed USB device number 5 using xhci_hcd
[21763.266041] ieee80211 phy0: rt2x00_set_rt: Info - RT chipset 5390, rev 0502 detected
[21763.944109] ieee80211 phy0: rt2x00_set_rf: Info - RF chipset 5370 detected
[21763.950281] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
[21763.950768] usbcore: registered new interface driver rt2800usb
[21763.962966] rt2800usb 1-2:1.0 wlp3s0f0u2: renamed from wlan0
[21807.879713] ieee80211 phy0: rt2x00lib_
[21807.879968] ieee80211 phy0: rt2x00lib_
[21811.980951] device wlp3s0f0u2 entered promiscuous mode
[21831.708072] xhci_hcd 0000:03:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[21831.813186] device wlp3s0f0u2 left promiscuous mode
In Linux Kernel Bug Tracker #202541, mathias.nyman (mathias.nyman-linux-kernel-bugs) wrote : | #272 |
Created attachment 288441
Patch v2 1/2 handling halted endpoints at completion of stop endpoint command
patch 1/2 of two patch series to fix this issue
In Linux Kernel Bug Tracker #202541, mathias.nyman (mathias.nyman-linux-kernel-bugs) wrote : | #273 |
Created attachment 288443
Patch v2 2/2 handling halted endpoints at completion of stop endpoint command
patch 2/2 of series to fix this issue
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #274 |
Thanks for the patches. Added both of them and the issue is still present:
[ 21.783543] usb 1-1: new high-speed USB device number 5 using xhci_hcd
[ 21.837511] usb 1-1: New USB device found, idVendor=148f, idProduct=5370, bcdDevice= 1.01
[ 21.837515] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 21.837518] usb 1-1: Product: 802.11 n WLAN
[ 21.837520] usb 1-1: Manufacturer: Ralink
[ 21.837522] usb 1-1: SerialNumber: 1.0
[ 22.165094] usb 1-1: reset high-speed USB device number 5 using xhci_hcd
[ 22.207584] ieee80211 phy0: rt2x00_set_rt: Info - RT chipset 5390, rev 0502 detected
[ 22.886244] ieee80211 phy0: rt2x00_set_rf: Info - RF chipset 5370 detected
[ 22.891860] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
[ 22.892493] usbcore: registered new interface driver rt2800usb
[ 22.904005] rt2800usb 1-1:1.0 wlp3s0f0u1: renamed from wlan0
[ 43.634949] device wlp3s0f0u1 entered promiscuous mode
[ 43.635031] audit: type=1700 audit(158687172
[ 65.039630] xhci_hcd 0000:03:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[ 65.091107] device wlp3s0f0u1 left promiscuous mode
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #275 |
ID 148f:5370 Ralink Technology, Corp. RT5370 Wireless Adapter
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #276 |
Not all devices are affected in the same way.
Same USB3 port and not affected:
ID 148f:3070 Ralink Technology, Corp. RT2870/RT3070 Wireless Adapter
[ 765.080527] usb 1-1: new high-speed USB device number 7 using xhci_hcd
[ 765.133195] usb 1-1: New USB device found, idVendor=148f, idProduct=3070, bcdDevice= 1.01
[ 765.133199] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 765.133201] usb 1-1: Product: 802.11 n WLAN
[ 765.133203] usb 1-1: Manufacturer: Ralink
[ 765.133204] usb 1-1: SerialNumber: 1.0
[ 765.345171] usb 1-1: reset high-speed USB device number 7 using xhci_hcd
[ 765.388223] ieee80211 phy2: rt2x00_set_rt: Info - RT chipset 3070, rev 0201 detected
[ 766.066341] ieee80211 phy2: rt2x00_set_rf: Info - RF chipset 0005 detected
[ 766.072528] ieee80211 phy2: Selected rate control algorithm 'minstrel_ht'
[ 766.091676] audit: type=1130 audit(158687244
[ 766.097778] rt2800usb 1-1:1.0 wlp3s0f0u1: renamed from wlan0
[ 771.664919] ieee80211 phy2: rt2x00lib_
[ 771.664959] ieee80211 phy2: rt2x00lib_
[ 775.893631] device wlp3s0f0u1 entered promiscuous mode
[ 775.893663] audit: type=1700 audit(158687245
[ 777.876925] device wlp3s0f0u1 left promiscuous mode
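To compare devices on the same port, it can help to pull the USB ID out of each dmesg capture and note whether the warning appeared in it. The following is a small sketch under that assumption; the helper names are hypothetical, and the warning check only tests whether the message occurs anywhere in the capture, not which device triggered it.

```python
import re

ID_RE = re.compile(r"idVendor=([0-9a-f]{4}), idProduct=([0-9a-f]{4})")
WARNING = "Set TR Deq Ptr cmd failed"


def device_ids(lines):
    """Return the 'vvvv:pppp' IDs announced in a dmesg capture, in order."""
    return [f"{vid}:{pid}" for vid, pid in ID_RE.findall("\n".join(lines))]


def saw_warning(lines):
    """True if the capture contains the Set TR Deq Ptr warning."""
    return any(WARNING in line for line in lines)
```

For the two captures above, this would yield `148f:5370` with the warning present and `148f:3070` without it.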
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #277 |
I just applied both patches; the xHCI error message actually went away, BUT the device still didn't work.
Logs after unplugging & plugging in the device with the patches:
[ 72.648791] usb 1-4: USB disconnect, device number 3
[ 72.650675] ieee80211 phy0: rt2x00usb_
[ 72.753779] wlan0: deauthenticating from cc:ce:1e:99:77:ed by local choice (Reason: 3=DEAUTH_LEAVING)
[ 72.781608] audit: type=1130 audit(158688356
[ 72.793722] audit: type=1130 audit(158688356
[ 73.317939] audit: type=1131 audit(158688356
[ 77.799300] audit: type=1131 audit(158688357
[ 80.933744] usb 1-4: new high-speed USB device number 6 using xhci_hcd
[ 80.988187] usb 1-4: New USB device found, idVendor=148f, idProduct=5370, bcdDevice= 1.01
[ 80.988190] usb 1-4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 80.988191] usb 1-4: Product: 802.11 n WLAN
[ 80.988192] usb 1-4: Manufacturer: Ralink
[ 80.988193] usb 1-4: SerialNumber: 1.0
[ 81.131897] usb 1-4: reset high-speed USB device number 6 using xhci_hcd
[ 81.174210] ieee80211 phy1: rt2x00_set_rt: Info - RT chipset 5390, rev 0502 detected
[ 81.852225] ieee80211 phy1: rt2x00_set_rf: Info - RF chipset 5370 detected
[ 81.858386] ieee80211 phy1: Selected rate control algorithm 'minstrel_ht'
[ 81.920689] ieee80211 phy1: rt2x00lib_
[ 81.920711] ieee80211 phy1: rt2x00lib_
Compared to the output without patches:
[ 67.093338] usb 1-4: USB disconnect, device number 3
[ 67.093964] xhci_hcd 0000:15:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[ 67.096166] ieee80211 phy0: rt2x00usb_
[ 67.168604] wlan0: deauthenticating from cc:ce:1e:99:77:ed by local choice (Reason: 3=DEAUTH_LEAVING)
[ 67.179973] audit: type=1130 audit(158688325
[ 67.231226] audit: type=1130 audit(158688325
[ 72.236839] audit: type=1131 audit(158688325
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #278 |
Tested another device against the applied patches to make sure the issue isn't related to the combination rt2800usb - usb host:
ID 7392:7710 Edimax Technology Co., Ltd Edimax Wi-Fi
[ 68.126337] usb 1-2: new high-speed USB device number 17 using xhci_hcd
[ 68.181565] usb 1-2: New USB device found, idVendor=7392, idProduct=7710, bcdDevice= 0.00
[ 68.181568] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 68.181571] usb 1-2: Product: Edimax Wi-Fi
[ 68.181573] usb 1-2: Manufacturer: MediaTek
[ 68.181575] usb 1-2: SerialNumber: 1.0
[ 68.398420] usb 1-2: reset high-speed USB device number 17 using xhci_hcd
[ 68.446602] mt7601u 1-2:1.0: ASIC revision: 76010001 MAC revision: 76010500
[ 68.473662] mt7601u 1-2:1.0: Firmware Version: 0.1.00 Build: 7640 Build time: 201302052146____
[ 69.461098] mt7601u 1-2:1.0: EEPROM ver:0d fae:00
[ 69.472103] mt7601u 1-2:1.0: EEPROM country region 01 (channels 1-13)
[ 70.152995] mt7601u 1-2:1.0: Warning: mt7601u_
[ 70.472567] mt7601u 1-2:1.0: Warning: mt7601u_
[ 70.792966] mt7601u 1-2:1.0: Warning: mt7601u_
[ 71.112927] mt7601u 1-2:1.0: Warning: mt7601u_
[ 71.432909] mt7601u 1-2:1.0: Warning: mt7601u_
[ 71.432913] mt7601u 1-2:1.0: Error: mt7601u_
[ 71.433388] xhci_hcd 0000:03:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[ 71.435442] xhci_hcd 0000:03:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[ 71.582930] mt7601u 1-2:1.0: Vendor request req:07 off:0080 failed:-71
[ 71.729217] mt7601u 1-2:1.0: Vendor request req:02 off:0080 failed:-71
After the device is unplugged, dmesg log is spammed:
[ 363.312561] mt7601u 1-2:1.0: Vendor request req:07 off:0730 failed:-71
[ 363.479252] mt7601u 1-2:1.0: Vendor request req:07 off:0730 failed:-71
[ 363.649243] mt7601u 1-2:1.0: Vendor request req:07 off:0730 failed:-71
...
[ 380.069000] mt7601u 1-2:1.0: Vendor request req:02 off:0080 failed:-71
[ 380.069055] mt7601u: probe of 1-2:1.0 failed with error -110
[ 380.069272] usb 1-2: USB disconnect, device number 90
@Bernhard: I can confirm that the error message is missing on some devices, too. Those devices are still not working.
CK Cameron (ckcameron) wrote : | #137 |
Bug still exists in 20.04
[14624.728999] xhci_hcd 0000:08:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 1
[14624.729000] xhci_hcd 0000:08:00.0: Looking for event-dma 00000000ff8d8050 trb-start 00000000ff8d7fe0 trb-end 00000000ff8d7fe0 seg-start 00000000ff8d7000 seg-end 00000000ff8d7ff0
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #279 |
Just got a new variant of that issue:
[39799.493322] usb usb1-port9: disabled by hub (EMI?), re-enabling...
[39799.493328] usb 1-9: USB disconnect, device number 5
[39799.833153] usb 1-9: new low-speed USB device number 6 using xhci_hcd
[39815.286287] usb 1-9: device descriptor read/64, error -110
[39826.307239] xhci_hcd 0000:03:00.0: ERROR Transfer event pointed to bad slot 4
[39826.307247] xhci_hcd 0000:03:00.0: @00000000dffed510 dff3d720 00000000 03000005 04038001
[39826.307267] xhci_hcd 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000f address=0x60 flags=0x0020]
[39826.307395] xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 0 comp_code 3
[39826.307399] xhci_hcd 0000:03:00.0: Looking for event-dma 00000000dfeff000 trb-start 00000000dfeff0f0 trb-end 00000000dfeff110 seg-start 00000000dfeff000 seg-end 00000000dfeffff0
Affected: CHERRY RS 6000 USB ON keyboard and Logitech RX1000 mouse stopped working - reboot required.
$ uname -r
5.6.15-arch1-1
In Linux Kernel Bug Tracker #202541, oyvind (oyvind-linux-kernel-bugs) wrote : | #280 |
My Linux server just crashed rather hard with these errors:
juni 02 19:26:17 nori kernel: xhci_hcd 0000:04:00.0: WARN Cannot submit Set TR Deq Ptr
juni 02 19:26:17 nori kernel: xhci_hcd 0000:04:00.0: A Set TR Deq Ptr command is pending.
juni 02 19:26:17 nori kernel: usb 4-1: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd
juni 02 19:26:37 nori kernel: xhci_hcd 0000:04:00.0: WARN Cannot submit Set TR Deq Ptr
juni 02 19:26:37 nori kernel: xhci_hcd 0000:04:00.0: A Set TR Deq Ptr command is pending.
juni 02 19:26:37 nori kernel: usb 4-1: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd
This affected a USB3-connected hard drive (Bus 004 Device 002: ID 059f:1057 LaCie, Ltd), which became unresponsive, and several processes hung, blocked on I/O to the drive. The drive itself has zero logged SMART errors, so it's likely not failing. Another USB2-connected drive was also affected, but not in an unrecoverable fashion, i.e. hung processes could be killed. The server has been stable for several years, but this one forced me to do a hard power-off because a soft reboot could not complete.
Running Ubuntu 18.04.4.
[ 0.000000] Linux version 5.3.0-53-generic (buildd@
With USB controller:
04:00.0 USB controller: ASMedia Technology Inc. ASM1042 SuperSpeed USB Host Controller (prog-if 30 [XHCI])
Subsystem: Holco Enterprise Co, Ltd/Shuttle Computer ASM1042 SuperSpeed USB Host Controller
Flags: bus master, fast devsel, latency 0, IRQ 18
Memory at f7c00000 (64-bit, non-prefetchable) [size=32K]
Kernel driver in use: xhci_hcd
In Red Hat Bugzilla #1460789, d.bz-redhat (d.bz-redhat-redhat-bugs) wrote : | #155 |
This seems to help for me (Dell XPS13 2-in-1 7390 , kernel 5.6.15-
https:/
# echo 0bda:8153:k > /sys/module/
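The string written in the command above has the form vendor:product:flags. As a sketch for reading such entries, the helper below splits one and names the flag. It assumes the entry follows the kernel's documented usbcore quirks format, in which the letter `k` stands for `USB_QUIRK_NO_LPM` (no Link Power Management); only that one flag is listed, and `parse_quirk` is a hypothetical name.

```python
# Flag letter -> kernel quirk name. Only the flag used in the comment above
# is listed; other letters are reported as unknown by this sketch.
QUIRK_FLAGS = {"k": "USB_QUIRK_NO_LPM"}


def parse_quirk(entry):
    """Split a 'VID:PID:flags' quirk entry into its parts."""
    vid, pid, flags = entry.strip().split(":")
    # Raises ValueError if the IDs are not hexadecimal.
    int(vid, 16)
    int(pid, 16)
    return {
        "vendor": vid.lower(),
        "product": pid.lower(),
        "flags": [QUIRK_FLAGS.get(c, "unknown:" + c) for c in flags],
    }


print(parse_quirk("0bda:8153:k"))
# → {'vendor': '0bda', 'product': '8153', 'flags': ['USB_QUIRK_NO_LPM']}
```

The ID 0bda:8153 corresponds to the Realtek RTL8153 adapter named in the original report, so the entry would disable Link Power Management for that device only.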
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #281 |
Definitely a weak xHCI host system (this time with a Logitech keyboard connected to a USB 3 port):
[ 9241.775664] usb 1-9: new low-speed USB device number 7 using xhci_hcd
[ 9242.327943] xhci_hcd 0000:03:00.0: ERROR unknown event type 2
[ 9246.875619] xhci_hcd 0000:03:00.0: ERROR mismatched command completion event
[ 9249.008917] xhci_hcd 0000:03:00.0: Timeout while waiting for setup device command
[ 9264.462045] xhci_hcd 0000:03:00.0: Abort failed to stop command ring: -110
[ 9264.462080] xhci_hcd 0000:03:00.0: xHCI host controller not responding, assume dead
[ 9264.462093] xhci_hcd 0000:03:00.0: HC died; cleaning up
[ 9264.462128] xhci_hcd 0000:03:00.0: Timeout while waiting for setup device command
[ 9264.668691] usb 1-9: device not accepting address 7, error -62
[ 9264.668723] usb usb1-port9: couldn't allocate usb_device
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #282 |
Still present on
$ uname -r
5.7.2-arch1-1
Jeremy Akers (irwinr12) wrote : | #138 |
I'm also seeing this in 20.04:
[ 110.467608] xhci_hcd 0000:08:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 110.467613] xhci_hcd 0000:08:00.0: Looking for event-dma 000000086900cfd0 trb-start 000000086900cfb0 trb-end 000000086900cfb0 seg-start 000000086900c000 seg-end 000000086900cff0
[ 110.478406] xhci_hcd 0000:08:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 110.478412] xhci_hcd 0000:08:00.0: Looking for event-dma 000000086900cfe0 trb-start 000000086900cfb0 trb-end 000000086900cfb0 seg-start 000000086900c000 seg-end 000000086900cff0
[ 110.479937] xhci_hcd 0000:08:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 110.479942] xhci_hcd 0000:08:00.0: Looking for event-dma 0000000861c06000 trb-start 000000086900cfb0 trb-end 000000086900cfb0 seg-start 000000086900c000 seg-end 000000086900cff0
[ 110.482654] xhci_hcd 0000:08:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 110.482660] xhci_hcd 0000:08:00.0: Looking for event-dma 0000000861c06010 trb-start 000000086900cfb0 trb-end 000000086900cfb0 seg-start 000000086900c000 seg-end 000000086900cff0
[ 110.499173] xhci_hcd 0000:08:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 110.499178] xhci_hcd 0000:08:00.0: Looking for event-dma 0000000861c06020 trb-start 000000086900cfb0 trb-end 000000086900cfb0 seg-start 000000086900c000 seg-end 000000086900cff0
[ 110.505613] xhci_hcd 0000:08:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 110.505618] xhci_hcd 0000:08:00.0: Looking for event-dma 0000000861c06030 trb-start 000000086900cfb0 trb-end 000000086900cfb0 seg-start 000000086900c000 seg-end 000000086900cff0
[ 110.505676] xhci_hcd 0000:08:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 110.505678] xhci_hcd 0000:08:00.0: Looking for event-dma 0000000861c06040 trb-start 000000086900cfb0 trb-end 000000086900cfb0 seg-start 000000086900c000 seg-end 000000086900cff0
[ 110.505764] xhci_hcd 0000:08:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 110.505766] xhci_hcd 0000:08:00.0: Looking for event-dma 0000000861c06050 trb-start 000000086900cfb0 trb-end 000000086900cfb0 seg-start 000000086900c000 seg-end 000000086900cff0
[ 110.507398] xhci_hcd 0000:08:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 110.507405] xhci_hcd 0000:08:00.0: Looking for event-dma 0000000861c06060 trb-start 000000086900cfb0 trb-end 000000086900cfb0 seg-start 000000086900c000 seg-end 000000086900cff0
[ 110.509353] xhci_hcd 0000:08:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 110.509359] xhci_hcd 0000:08:00.0: Looking for event-dma 0000000861c06070 trb-start 000000086900cfb0 trb-end 000000086900cfb0 seg-start 000000086900c000 seg-end 000000086900cff0
[ 110.510017] xhci_hcd 0000:08:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 110.510021] xhci_hcd 0000:08:00.0: Looking for event-dma ...
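For anyone trying to read the "Looking for event-dma" lines above, here is a minimal Python sketch that compares the reported event address against the current TD range and the ring segment range printed on the same line. It assumes the log format exactly as quoted in this thread; the function name is hypothetical, and the check is a simplification (it ignores TDs that wrap across segments), not the kernel's own logic.

```python
import re

# Field layout of the "Looking for event-dma ..." line as quoted in this thread.
LOOKING_RE = re.compile(
    r"event-dma (?P<event>[0-9a-f]+) "
    r"trb-start (?P<trb_start>[0-9a-f]+) "
    r"trb-end (?P<trb_end>[0-9a-f]+) "
    r"seg-start (?P<seg_start>[0-9a-f]+) "
    r"seg-end (?P<seg_end>[0-9a-f]+)"
)


def classify_event_dma(line):
    """Hypothetical helper: say where the event address lies relative to the TD.

    Simplifying assumption: the TD does not wrap around the end of a segment.
    Returns None for lines that do not match (for example truncated ones).
    """
    match = LOOKING_RE.search(line)
    if match is None:
        return None
    addr = {name: int(value, 16) for name, value in match.groupdict().items()}
    if addr["trb_start"] <= addr["event"] <= addr["trb_end"]:
        return "inside current TD"
    if addr["seg_start"] <= addr["event"] <= addr["seg_end"]:
        return "same segment, outside current TD"
    return "outside current segment"
```

Fed with the lines above, the first two events land in the same segment but past the TD the driver expects, and the later ones (0000000861c06000 onward) fall outside that segment entirely.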
In Red Hat Bugzilla #1460789, jeremy.akers (jeremy.akers-redhat-bugs) wrote : | #156 |
Seeing a similar issue on a Dell XPS 9300 (2020) with Linux 5.4:
[ 110.467608] xhci_hcd 0000:08:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 110.467613] xhci_hcd 0000:08:00.0: Looking for event-dma 000000086900cfd0 trb-start 000000086900cfb0 trb-end 000000086900cfb0 seg-start 000000086900c000 seg-end 000000086900cff0
[ 110.478406] xhci_hcd 0000:08:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 110.478412] xhci_hcd 0000:08:00.0: Looking for event-dma 000000086900cfe0 trb-start 000000086900cfb0 trb-end 000000086900cfb0 seg-start 000000086900c000 seg-end 000000086900cff0
[ 110.479937] xhci_hcd 0000:08:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 110.479942] xhci_hcd 0000:08:00.0: Looking for event-dma 0000000861c06000 trb-start 000000086900cfb0 trb-end 000000086900cfb0 seg-start 000000086900c000 seg-end 000000086900cff0
[ 110.482654] xhci_hcd 0000:08:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 110.482660] xhci_hcd 0000:08:00.0: Looking for event-dma 0000000861c06010 trb-start 000000086900cfb0 trb-end 000000086900cfb0 seg-start 000000086900c000 seg-end 000000086900cff0
[ 110.499173] xhci_hcd 0000:08:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 110.499178] xhci_hcd 0000:08:00.0: Looking for event-dma 0000000861c06020 trb-start 000000086900cfb0 trb-end 000000086900cfb0 seg-start 000000086900c000 seg-end 000000086900cff0
[ 110.505613] xhci_hcd 0000:08:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 110.505618] xhci_hcd 0000:08:00.0: Looking for event-dma 0000000861c06030 trb-start 000000086900cfb0 trb-end 000000086900cfb0 seg-start 000000086900c000 seg-end 000000086900cff0
[ 110.505676] xhci_hcd 0000:08:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 110.505678] xhci_hcd 0000:08:00.0: Looking for event-dma 0000000861c06040 trb-start 000000086900cfb0 trb-end 000000086900cfb0 seg-start 000000086900c000 seg-end 000000086900cff0
[ 110.505764] xhci_hcd 0000:08:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 110.505766] xhci_hcd 0000:08:00.0: Looking for event-dma 0000000861c06050 trb-start 000000086900cfb0 trb-end 000000086900cfb0 seg-start 000000086900c000 seg-end 000000086900cff0
[ 110.507398] xhci_hcd 0000:08:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 110.507405] xhci_hcd 0000:08:00.0: Looking for event-dma 0000000861c06060 trb-start 000000086900cfb0 trb-end 000000086900cfb0 seg-start 000000086900c000 seg-end 000000086900cff0
[ 110.509353] xhci_hcd 0000:08:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 110.509359] xhci_hcd 0000:08:00.0: Looking for event-dma 0000000861c06070 trb-start 000000086900cfb0 trb-end 000000086900cfb0 seg-start 000000086900c000 seg-end 000000086900cff0
[ 110.510017] xhci_hcd 0000:08:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 13
[ 110.510021] xhci_hcd 00...
In Linux Kernel Bug Tracker #202541, himanshu.xt (himanshu.xt-linux-kernel-bugs) wrote : | #283 |
Have the same error
$ dmesg | grep xhci
has
[40557.207677] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state
In Linux Kernel Bug Tracker #202541, R.E.Wolff (r.e.wolff-linux-kernel-bugs) wrote : | #284 |
I'm using "stock Ubuntu 20.04"
I have this happening on kernel 5.4.0.
[ 4063.051692] usb 3-10.4: New USB device found, idVendor=0483, idProduct=5740, bcdDevice= 2.00
[ 4063.051695] usb 3-10.4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 4063.051696] usb 3-10.4: Product: ChibiOS/RT Virtual COM Port
[ 4063.051698] usb 3-10.4: Manufacturer: STMicroelectronics
[ 4063.051699] usb 3-10.4: SerialNumber: 400
[ 4063.058680] cdc_acm 3-10.4:1.0: ttyACM1: USB ACM device
[ 4073.043695] xhci_hcd 0000:00:14.0: WARN Cannot submit Set TR Deq Ptr
[ 4073.043697] xhci_hcd 0000:00:14.0: A Set TR Deq Ptr command is pending.
[ 4073.059819] xhci_hcd 0000:00:14.0: WARN Cannot submit Set TR Deq Ptr
[ 4073.059822] xhci_hcd 0000:00:14.0: A Set TR Deq Ptr command is pending.
I plugged in my development board, which provides a virtual COM port. I then hit the boot and reset buttons on the board to make it boot into DFU bootloader mode.
This has worked the last decade or so. I didn't read everything above, but I saw something about AMD... I have:
00:14.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI (rev 05)
as the USB controller: Not AMD. -> Not hardware vendor related imho.
In Linux Kernel Bug Tracker #202541, sixerjman (sixerjman-linux-kernel-bugs) wrote : | #285 |
Happening at ~30 second intervals on Debian kernel 5.7.0-1-amd64 with Dell XHCI Controller and USB 3.0 hub:
xhci_hcd 0000:00:10.0: WARN Cannot submit Set TR Deq Ptr
Jul 4 05:02:32 hostname kernel: [33164.415980] xhci_hcd 0000:00:10.0: A Set TR Deq Ptr command is pending.
Jul 4 05:02:32 hostname kernel: [33164.497202] usb 3-3.1: reset SuperSpeed Gen 1 USB device number 3 using xhci_hcd
In Linux Kernel Bug Tracker #202541, R.E.Wolff (r.e.wolff-linux-kernel-bugs) wrote : | #286 |
I've now managed to get a workable situation for myself:
Using an old USB2-hub instead of the USB3-hub that I was using before.
The DFU download takes ages when there is no hub between my computer and the STM32. Using the USB3-hub worked a few months ago when I was still on Ubuntu 16.04.
[ 5382.799225] usb 3-4.1: Product: ChibiOS/RT Virtual COM Port
[ 5382.799226] usb 3-4.1: Manufacturer: STMicroelectronics
[ 5382.799227] usb 3-4.1: SerialNumber: 400
[ 5382.807282] cdc_acm 3-4.1:1.0: ttyACM3: USB ACM device
[ 5387.003761] usb 3-4: clear tt 1 (91a1) error -32
About 12 identical messages in the same millisecond deleted.
[ 5387.004976] usb 3-4: clear tt 1 (91a1) error -32
[ 5387.222030] usb 3-4.1: USB disconnect, device number 22
[ 5387.224061] cdc_acm 3-4.1:1.0: failed to set dtr/rts
[ 5387.522299] usb 3-4.1: new full-speed USB device number 23 using xhci_hcd
[ 5387.627345] usb 3-4.1: New USB device found, idVendor=0483, idProduct=df11, bcdDevice=22.00
[ 5387.627348] usb 3-4.1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 5387.627350] usb 3-4.1: Product: STM32 BOOTLOADER
[ 5387.627352] usb 3-4.1: Manufacturer: STMicroelectronics
[ 5387.627353] usb 3-4.1: SerialNumber: FFFFFFFEFFFF
So this is a mostly normal switchover from the user code running the ACM USB code to the bootloader, through a USB2 hub.
Alejandro Mery (amery) wrote : | #139 |
- sudo lshw -short (8.5 KiB, text/plain)
I've got this problem with a shiny new TB3 dock/eGPU enclosure (Razer Core X Chroma) on 20.04. It has 3 ASM1142 controllers.
Kai-Heng Feng (kaihengfeng) wrote : | #140 |
The issue was raised to ASMedia and they confirm this can only be fixed by firmware upgrade. Please push the hardware/
In Linux Kernel Bug Tracker #202541, mail2lawi (mail2lawi-linux-kernel-bugs) wrote : | #287 |
I was getting similar freezes with HP ENVY x360 Convertible 15 running OpenSUSE Leap 15.2.
This laptop model doesn't come with an RJ45 (LAN) port so I use a Type C USB ethernet adapter. And it was exhibiting the same problems, after some time it would just fail to work and initially I thought it was the NetworkManager.
For me even running the command 'ip add' would lock up, yet most other commands and even the desktop manager would still be working fine. But I couldn't get network back or even switch to WiFi. Basically every time this happened I had to restart the laptop.
Anyway, after going through the accounts here and other sites I found one which suggested that the issue could be with power management suspending the USB device.
So I added that particular USB device to the TLP blacklist to prevent it from being suspended, and so far I've gone 24 hrs without the lockup.
Link to forum: https:/
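To check whether runtime power management is allowed to suspend a given USB device (the suspicion raised in the comment above), the kernel exposes a `power/control` attribute per device in sysfs: `auto` means the device may be autosuspended, `on` means it is kept active. The following is a minimal Python sketch that lists this for every device; the function name and the returned structure are hypothetical, and it only reads state, it does not change any TLP or kernel setting.

```python
from pathlib import Path


def usb_autosuspend_report(root="/sys/bus/usb/devices"):
    """Hypothetical helper: map each USB device to (vendor:product, power/control).

    Entries without idVendor/idProduct (interfaces such as "1-2:1.0")
    are skipped.
    """
    report = {}
    for dev in sorted(Path(root).iterdir()):
        control = dev / "power" / "control"
        vendor = dev / "idVendor"
        product = dev / "idProduct"
        if not (control.is_file() and vendor.is_file() and product.is_file()):
            continue
        usb_id = f"{vendor.read_text().strip()}:{product.read_text().strip()}"
        report[dev.name] = (usb_id, control.read_text().strip())
    return report


if __name__ == "__main__":
    for name, (usb_id, control) in usb_autosuspend_report().items():
        print(f"{name}\t{usb_id}\tpower/control={control}")
```

A device that shows `auto` here is a candidate for the kind of exclusion described above; whether that actually avoids the lockup on a given system is something only testing can tell.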
In Red Hat Bugzilla #1460789, arcadiy (arcadiy-redhat-bugs) wrote : | #157 |
There is Dell TB19 firmware available that is installable via fwupdmgr on Linux: https:/
In Red Hat Bugzilla #1460789, arcadiy (arcadiy-redhat-bugs) wrote : | #158 |
Install via: sudo fwupdmgr install ~/Downloads/
In Red Hat Bugzilla #1460789, alex.gronholm (alex.gronholm-redhat-bugs) wrote : | #159 |
Thanks for the info (I own a WD19TB dock too) but that hardly helps with the TB16 problem. The WD19 series docks have working USB controllers, unlike TB16.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #288 |
Still present on:
$ uname -r
5.8.5-arch1-1
device connected to USB3 port:
[15005.134111] usb 1-2: new high-speed USB device number 13 using xhci_hcd
[15005.311803] usb 1-2: New USB device found, idVendor=148f, idProduct=5370, bcdDevice= 1.01
[15005.311807] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[15005.311810] usb 1-2: Product: 802.11 n WLAN
[15005.311812] usb 1-2: Manufacturer: Ralink
[15005.311814] usb 1-2: SerialNumber: 1.0
[15005.602591] usb 1-2: reset high-speed USB device number 13 using xhci_hcd
[15005.834856] ieee80211 phy0: rt2x00_set_rt: Info - RT chipset 5390, rev 0502 detected
[15006.513400] ieee80211 phy0: rt2x00_set_rf: Info - RF chipset 5370 detected
[15006.519415] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
[15006.520103] usbcore: registered new interface driver rt2800usb
[15006.532103] rt2800usb 1-2:1.0 wlp3s0f0u2: renamed from wlan0
...
[15062.425086] Bluetooth: Core ver 2.22
[15062.425100] NET: Registered protocol family 31
[15062.425101] Bluetooth: HCI device and connection manager initialized
[15062.425103] Bluetooth: HCI socket layer initialized
[15062.425105] Bluetooth: L2CAP socket layer initialized
[15062.425107] Bluetooth: SCO socket layer initialized
[15068.677302] xhci_hcd 0000:03:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
Any idea why "HCI device and connection manager" gets initialized?
The device doesn't have BT:
$ lsusb
Bus 001 Device 013: ID 148f:5370 Ralink Technology, Corp. RT5370 Wireless Adapter
device connected to USB2 port:
[15317.960151] usb 5-1.1.2: new high-speed USB device number 6 using xhci_hcd
[15318.186487] usb 5-1.1.2: New USB device found, idVendor=148f, idProduct=5370, bcdDevice= 1.01
[15318.186492] usb 5-1.1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[15318.186495] usb 5-1.1.2: Product: 802.11 n WLAN
[15318.186497] usb 5-1.1.2: Manufacturer: Ralink
[15318.186498] usb 5-1.1.2: SerialNumber: 1.0
[15318.376954] usb 5-1.1.2: reset high-speed USB device number 6 using xhci_hcd
[15318.593739] ieee80211 phy1: rt2x00_set_rt: Info - RT chipset 5390, rev 0502 detected
[15318.603488] ieee80211 phy1: rt2x00_set_rf: Info - RF chipset 5370 detected
[15318.603596] ieee80211 phy1: Selected rate control algorithm 'minstrel_ht'
[15318.626103] rt2800usb 5-1.1.2:1.0 wlp39s0f3u1u1u2: renamed from wlan0
...
[15336.958194] ieee80211 phy1: rt2x00lib_
[15336.958238] ieee80211 phy1: rt2x00lib_
everything is fine!
In Linux Kernel Bug Tracker #202541, mkj (mkj-linux-kernel-bugs) wrote : | #289 |
Experiencing the same kinda issues with my local Gentoo system on linux 5.4.60 running on my AMD Threadripper ASUS X399-A system.
[59417.351322] Bluetooth: HCI device and connection manager initialized
[59417.351326] Bluetooth: HCI socket layer initialized
[59417.351327] Bluetooth: L2CAP socket layer initialized
[59417.351330] Bluetooth: SCO socket layer initialized
[59417.356567] usbcore: registered new interface driver btusb
[59417.401190] Bluetooth: BNEP (Ethernet Emulation) ver 1.3
[59417.401191] Bluetooth: BNEP filters: protocol multicast
[59417.401195] Bluetooth: BNEP socket layer initialized
[59746.085438] debugfs: File 'le_min_key_size' in directory 'hci0' already present!
[59746.085445] debugfs: File 'le_max_key_size' in directory 'hci0' already present!
[59746.085450] debugfs: File 'force_bredr_smp' in directory 'hci0' already present!
[59814.624020] input: WH-1000XM2 (AVRCP) as /devices/
[59902.644488] snd_hda_intel 0000:0b:00.3: Too big adjustment 128
[59962.175286] snd_hda_intel 0000:0b:00.3: Too big adjustment 128
[60729.936646] Bluetooth: RFCOMM TTY layer initialized
[60729.936651] Bluetooth: RFCOMM socket layer initialized
[60729.936655] Bluetooth: RFCOMM ver 1.11
[61319.535471] usb 1-3: USB disconnect, device number 20
[61319.536250] xhci_hcd 0000:01:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[61333.523244] usb 1-3: new full-speed USB device number 21 using xhci_hcd
[61333.947814] usb 1-3: New USB device found, idVendor=0a12, idProduct=0001, bcdDevice=88.91
[61333.947818] usb 1-3: New USB device strings: Mfr=0, Product=2, SerialNumber=0
[61333.947819] usb 1-3: Product: CSR8510 A10
[61454.719213] usb 1-3: USB disconnect, device number 21
[61454.719807] xhci_hcd 0000:01:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
In Linux Kernel Bug Tracker #202541, mkj (mkj-linux-kernel-bugs) wrote : | #290 |
For my system, it seems to trigger after resuming from suspend.
John Dop (vbx) wrote : | #160 |
@arcadiy
FYI, this still happens on an XPS 9300 with Ubuntu 20.04/Linux 5.4.0-47-generic and up-to-date WD19TB firmware:
sudo fwupdmgr install ~/Downloads/
Decompressing… [******
All updatable firmware is already installed
│ ├─Package level of Dell dock:
│ │ Device ID: xxx
│ │ Summary: A representation of dock update status
│ │ Current version: 01.00.14.01
│ │ Vendor: Dell Inc. (USB:0x413C)
│ │ Install Duration: 5 seconds
│ │ GUID: xxx
│ │ Device Flags: • Updatable
│ │ • Supported on remote server
│ │ • Device can recover flash failures
│ │ • Device is usable for the duration of the update
excerpt from kernel log:
[ 3896.481019] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 13 comp_code 1
[ 3896.481025] xhci_hcd 0000:00:14.0: Looking for event-dma 00000000ffe012d0 trb-start 00000000ffe012e0 trb-end 00000000ffe012e0 seg-start 00000000ffe01000 seg-end 00000000ffe01ff0
[ 3896.481028] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 15 comp_code 1
[ 3896.481029] xhci_hcd 0000:00:14.0: Looking for event-dma 00000000fffc9500 trb-start 00000000fffc9510 trb-end 00000000fffc9510 seg-start 00000000fffc9000 seg-end 00000000fffc9ff0
[ 3896.481990] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 13 comp_code 1
[ 3896.481992] xhci_hcd 0000:00:14.0: Looking for event-dma 00000000ffe012e0 trb-start 00000000ffe012f0 trb-end 00000000ffe012f0 seg-start 00000000ffe01000 seg-end 00000000ffe01ff0
[ 3896.481994] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 15 comp_code 1
[ 3896.481995] xhci_hcd 0000:00:14.0: Looking for event-dma 00000000fffc9510 trb-start 00000000fffc9520 trb-end 00000000fffc9520 seg-start 00000000fffc9000 seg-end 00000000fffc9ff0
[ 3896.482020] usb 3-4.3.4: cannot submit urb (err = -19)
[ 3896.483067] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 13 comp_code 1
[ 3896.483070] xhci_hcd 0000:00:14.0: Looking for event-dma 00000000ffe012f0 trb-start 00000000ffe01300 trb-end 00000000ffe01300 seg-start 00000000ffe01000 seg-end 00000000ffe01ff0
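The `ep_index` and `comp_code` fields in these ERROR lines can be decoded by hand. The sketch below is a minimal Python helper, assuming the message format quoted in this thread: it maps `ep_index` to an endpoint number and direction (xHCI device context index = `ep_index` + 1) and names a few completion codes from the xHCI specification. The table is deliberately partial, the function name is hypothetical, and only endpoint 0 is treated as a control endpoint.

```python
import re

# Partial table of xHCI completion codes; codes not listed are reported as unknown.
COMPLETION_CODES = {
    1: "Success",
    4: "USB Transaction Error",
    6: "Stall Error",
    13: "Short Packet",
    19: "Context State Error",
}

EVENT_RE = re.compile(r"ep_index (\d+) comp_code (\d+)")


def decode_transfer_event(line):
    """Hypothetical helper: decode ep_index/comp_code from an xhci_hcd ERROR line."""
    match = EVENT_RE.search(line)
    if match is None:
        return None
    ep_index, comp_code = int(match.group(1)), int(match.group(2))
    dci = ep_index + 1  # device context index
    if dci == 1:
        direction = "control"  # assumption: only ep0 is a control endpoint
    else:
        direction = "IN" if dci % 2 else "OUT"
    return {
        "endpoint": dci // 2,
        "direction": direction,
        "completion": COMPLETION_CODES.get(comp_code, f"unknown ({comp_code})"),
    }
```

Read this way, `ep_index 4 comp_code 13` from the earlier logs is a Short Packet on endpoint 2 IN, while the `comp_code 1` lines above report Success on OUT endpoints 7 and 8, yet still point at a TRB the driver was not expecting.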
In Linux Kernel Bug Tracker #202541, koparebu (koparebu-linux-kernel-bugs) wrote : | #291 |
Hi,
I'd like to add some information here about this issue, which on one of my computers happens when using a Software Defined Radio (SDR) device on some of the motherboard USB ports:
* The device works OK when using the back panel USB 3.2 Gen 1 ports
* It doesn't work OK when using the back panel USB 3.2 Gen 2 ports, or when using the front headers (either USB 3.2 Gen 1 or USB 2.0)
When I finish using the device, the following message gets logged. I have to disconnect it and plug it in again in order to use it:
xhci_hcd 0000:02:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
This motherboard is an ASUS Prime B550M.
Kernel 5.4.0-45-generic (from Ubuntu 20.04).
I've tried booting with "intel_iommu=off" or "iommu=off" just for testing, but the result is the same.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #292 |
Just received an AMD notebook to run some tests on 5.8.12-arch1-1:
ASUS TUF Gaming FX505D
05:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Raven USB 3.1
WiFi adapter:
ID 148f:3070 Ralink Technology, Corp. RT2870/RT3070 Wireless Adapter
[ 308.749192] usb 1-2: new high-speed USB device number 46 using xhci_hcd
[ 308.909139] usb 1-2: New USB device found, idVendor=148f, idProduct=3070, bcdDevice= 1.01
[ 308.909145] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 308.909148] usb 1-2: Product: 802.11 n WLAN
[ 308.909150] usb 1-2: Manufacturer: Ralink
[ 308.909153] usb 1-2: SerialNumber: 1.0
[ 309.032719] usb 1-2: reset high-speed USB device number 46 using xhci_hcd
[ 309.188373] ieee80211 phy40: rt2x00_set_rt: Info - RT chipset 3070, rev 0201 detected
[ 309.276422] ieee80211 phy40: rt2x00_set_rf: Info - RF chipset 0005 detected
[ 309.277177] ieee80211 phy40: Selected rate control algorithm 'minstrel_ht'
[ 309.297319] rt2800usb 1-2:1.0 wlp5s0f3u2: renamed from wlan0
[ 309.338896] ieee80211 phy40: rt2x00lib_
[ 309.338980] ieee80211 phy40: rt2x00lib_
[ 310.619294] usb 1-2: USB disconnect, device number 46
[ 311.185849] ieee80211 phy40: rt2x00queue_
Then xhci died.
In Linux Kernel Bug Tracker #202541, github (github-linux-kernel-bugs) wrote : | #293 |
Hi,
I have the same problem with an Argus KVM switch. The mouse stopped working after the first click.
uname -a
Linux sysiphus 5.8.0-2-amd64 #1 SMP Debian 5.8.10-1 (2020-09-19) x86_64 GNU/Linux
Debian Sid
❯ lsusb
Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 003 Device 004: ID 046d:c245 Logitech, Inc. G400 Optical Mouse
Bus 003 Device 003: ID 046d:c326 Logitech, Inc. Washable Keyboard K310
Bus 003 Device 002: ID 1a86:8072 QinHeng Electronics
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
journald
Okt 07 19:29:43 sysiphus kernel: usb 3-2: USB disconnect, device number 2
Okt 07 19:29:43 sysiphus kernel: usb 3-2.1: USB disconnect, device number 3
Okt 07 19:29:43 sysiphus acpid[587]: input device has been disconnected, fd 5
Okt 07 19:29:43 sysiphus kernel: xhci_hcd 0000:08:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
Okt 07 19:29:43 sysiphus acpid[587]: input device has been disconnected, fd 6
Okt 07 19:29:43 sysiphus acpid[587]: input device has been disconnected, fd 12
Okt 07 19:29:43 sysiphus kernel: usb 3-2.2: USB disconnect, device number 4
Okt 07 19:29:43 sysiphus kernel: xhci_hcd 0000:08:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
dmesg with debug
[ 2539.095820] xhci_hcd 0000:00:14.0: // Ding dong!
[ 2539.095824] xhci_hcd 0000:00:14.0: // Ding dong!
[ 2539.095840] xhci_hcd 0000:00:14.0: Slot 17 output ctx = 0x3eac96000 (dma)
[ 2539.095841] xhci_hcd 0000:00:14.0: Slot 17 input ctx = 0x3ef030000 (dma)
[ 2539.095843] xhci_hcd 0000:00:14.0: Set slot id 17 dcbaa entry 000000003007eed2 to 0x3eac96000
[ 2539.173763] usb 1-10.2: new full-speed USB device number 18 using xhci_hcd
[ 2539.173772] xhci_hcd 0000:00:14.0: Set root hub portnum to 10
[ 2539.173775] xhci_hcd 0000:00:14.0: Set fake root hub portnum to 10
[ 2539.173778] xhci_hcd 0000:00:14.0: udev->tt = 000000002e0e39e0
[ 2539.173781] xhci_hcd 0000:00:14.0: udev->ttport = 0xa
[ 2539.173789] xhci_hcd 0000:00:14.0: // Ding dong!
[ 2539.174915] xhci_hcd 0000:00:14.0: Successful setup address command
[ 2539.174922] xhci_hcd 0000:00:14.0: Op regs DCBAA ptr = 0x000003ead2c000
[ 2539.174928] xhci_hcd 0000:00:14.0: Slot ID 17 dcbaa entry @000000003007eed2 = 0x000003eac96000
[ 2539.174933] xhci_hcd 0000:00:14.0: Output Context DMA address = 0x3eac96000
[ 2539.174937] xhci_hcd 0000:00:14.0: Internal device address = 17
[ 2539.198845] xhci_hcd 0000:00:14.0: Max Packet Size for ep 0 changed.
[ 2539.198851] xhci_hcd 0000:00:14.0: Max packet size in usb_device = 8
[ 2539.198854] xhci_hcd 0000:00:14.0: Max packet size in xHCI HW = 64
[ 2539.198857] xhci_hcd 0000:00:14.0: Issuing evaluate context command.
[ 2539.198867] xhci_hcd 0000:00:14.0: // Ding dong!
[ 2539.198886] xhci_hcd 0000:00:14.0: Successful evaluate context command
[ 2539.200465] usb 1-10.2: no configurations
[ 2539.200472] usb 1-10.2: can't read configurations, error -22
[ 2539.200634] xhci_hcd 0000:00:14.0: // Ding dong!
[ 2539.200641] usb 1-10-port2: unable to enumerate USB device
In Linux Kernel Bug Tracker #202541, ehoffman (ehoffman-linux-kernel-bugs) wrote : | #294 |
I have the same issue with a HackRF SDR, and there's a bug on their side
https:/
I connect device:
[ 428.013129] usb 1-10: USB disconnect, device number 4
[ 2163.462098] usb 1-10: new high-speed USB device number 6 using xhci_hcd
[ 2163.699532] usb 1-10: New USB device found, idVendor=1d50, idProduct=6089, bcdDevice= 1.02
[ 2163.699535] usb 1-10: New USB device strings: Mfr=1, Product=2, SerialNumber=4
[ 2163.699536] usb 1-10: Product: HackRF One
[ 2163.699538] usb 1-10: Manufacturer: Great Scott Gadgets
[ 2163.699539] usb 1-10: SerialNumber: 000000000000000
I run test once, and after device close:
[ 2187.589321] xhci_hcd 0000:02:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
Then, if I try to run other tests, the device no longer responds until I reset the device (it has a reset button) or disconnect/
Here, I have the same setup as some other people reporting the issue, a B350 chipset (Ryzen 2700X), on ASUS ROG STRIX B350-F GAMING motherboard.
Kernel version: uname -a
Linux lx-ryzen 5.4.0-53-generic #59-Ubuntu SMP Wed Oct 21 09:38:44 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
The issue occurs on both USB3 and USB2 ports.
This same device works fine on another computer (a non-Ryzen laptop), same kernel.
In Linux Kernel Bug Tracker #202541, ehoffman (ehoffman-linux-kernel-bugs) wrote : | #295 |
Same result with B450 chipset, same kernel.
ASUS PRIME B450M-A motherboard.
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #296 |
Could this issue be caused by USB Controllers / Chipsets made by ASMedia?
B350 / X370 / B450 / X470 Chipsets are all manufactured by ASMedia.
And @Michael mentioned that one of his Intel systems (I'm assuming the one with a Asus P9X79 motherboard?) has issues with USB 3, but not with USB 2. So I looked up that particular mobo and guess what:
ASMedia® USB 3.0 controller :
4 x USB 3.1 Gen 1 port(s) (4 at back panel, blue)
Intel® X79 chipset :
14 x USB 2.0 port(s) (6 at back panel, black+white, 8 at mid-board)
Only the USB 3.1 Gen 1 Ports are using the ASMedia controller.
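To make it easier to compare reports against the ASMedia hypothesis above, here is a minimal Python sketch that lists the xHCI controllers on a system together with their PCI vendor, read from sysfs. It assumes the standard `class` and `vendor` attributes under `/sys/bus/pci/devices`; the function name is hypothetical and the vendor table only covers the three vendors mentioned in this thread.

```python
from pathlib import Path

XHCI_CLASS = 0x0C0330  # PCI class code: serial bus, USB, xHCI programming interface
VENDOR_NAMES = {0x8086: "Intel", 0x1022: "AMD", 0x1B21: "ASMedia"}


def list_xhci_controllers(root="/sys/bus/pci/devices"):
    """Hypothetical helper: return [(pci_address, vendor_name)] for xHCI controllers."""
    found = []
    for dev in sorted(Path(root).iterdir()):
        class_file = dev / "class"
        vendor_file = dev / "vendor"
        if not (class_file.is_file() and vendor_file.is_file()):
            continue
        if int(class_file.read_text().strip(), 16) != XHCI_CLASS:
            continue
        vendor_id = int(vendor_file.read_text().strip(), 16)
        # Vendors outside the small table are shown as their hex ID.
        found.append((dev.name, VENDOR_NAMES.get(vendor_id, hex(vendor_id))))
    return found


if __name__ == "__main__":
    for address, vendor in list_xhci_controllers():
        print(address, vendor)
```

Note that this only reports the vendor ID of each controller as seen on the PCI bus, so it is a starting point for comparison rather than proof of who made the silicon.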
In Linux Kernel Bug Tracker #202541, github (github-linux-kernel-bugs) wrote : | #297 |
My board has two controllers and both show the same behaviour.
lspci|grep "USB controller"
00:14.0 USB controller: Intel Corporation 200 Series/Z370 Chipset Family USB 3.0 xHCI Controller
08:00.0 USB controller: ASMedia Technology Inc. ASM1142 USB 3.1 Host Controller
Asus PRIME Z270-K, BIOS 1207 06/22/2018
In Linux Kernel Bug Tracker #202541, wgh (wgh-linux-kernel-bugs) wrote : | #298 |
The same problem with HackRF: it stops working after using it once (presumably due to transfers being cancelled on exit). hackrf_info still detects it though, which is probably because only the bulk transfer endpoint becomes broken.
Kernel 5.9.11
ASRock B550 Extreme4, BIOS P1.20 08/13/2020
01:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] Device 43ee
0c:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller
The debug messages on stopping the tool:
[ 3944.894483] xhci_hcd 0000:01:00.0: Transfer error for slot 28 ep 2 on endpoint
[ 3944.894494] xhci_hcd 0000:01:00.0: // Ding dong!
[ 3944.894602] xhci_hcd 0000:01:00.0: Ignoring reset ep completion code of 1
[ 3945.396550] xhci_hcd 0000:01:00.0: Cancel URB 000000008085c3f5, dev 5, ep 0x81, starting at offset 0x1fa7ae1c90
[ 3945.396558] xhci_hcd 0000:01:00.0: // Ding dong!
[ 3945.396690] xhci_hcd 0000:01:00.0: Removing canceled TD starting at 0x1fa7ae1c90 (dma).
[ 3945.396710] xhci_hcd 0000:01:00.0: Cancel URB 000000000098c2b5, dev 5, ep 0x81, starting at offset 0x1fa7ae1990
[ 3945.396712] xhci_hcd 0000:01:00.0: // Ding dong!
[ 3945.396836] xhci_hcd 0000:01:00.0: Removing canceled TD starting at 0x1fa7ae1990 (dma).
[ 3945.396839] xhci_hcd 0000:01:00.0: Finding endpoint context
[ 3945.396841] xhci_hcd 0000:01:00.0: Cycle state = 0x1
[ 3945.396843] xhci_hcd 0000:01:00.0: New dequeue segment = 000000008a0bf921 (virtual)
[ 3945.396845] xhci_hcd 0000:01:00.0: New dequeue pointer = 0x1fa7ae1a90 (DMA)
[ 3945.396847] xhci_hcd 0000:01:00.0: Set TR Deq Ptr cmd, new deq seg = 000000008a0bf921 (0x1fa7ae1000 dma), new deq ptr = 00000000e5711e6d (0x1fa7ae1a90 dma), new cycle = 1
[ 3945.396851] xhci_hcd 0000:01:00.0: // Ding dong!
[ 3945.396869] xhci_hcd 0000:01:00.0: Cancel URB 000000008b1032dd, dev 5, ep 0x81, starting at offset 0x1fa7ae1a90
[ 3945.396871] xhci_hcd 0000:01:00.0: // Ding dong!
[ 3945.396904] xhci_hcd 0000:01:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[ 3945.396909] xhci_hcd 0000:01:00.0: Slot state = 3, EP state = 2
[ 3945.397028] xhci_hcd 0000:01:00.0: Removing canceled TD starting at 0x1fa7ae1a90 (dma).
[ 3945.397040] xhci_hcd 0000:01:00.0: Cancel URB 00000000b1b43562, dev 5, ep 0x81, starting at offset 0x1fa7ae1b90
[ 3945.397042] xhci_hcd 0000:01:00.0: // Ding dong!
[ 3945.397172] xhci_hcd 0000:01:00.0: Removing canceled TD starting at 0x1fa7ae1b90 (dma).
Messages on attempts to use the device again:
[ 4076.243019] xhci_hcd 0000:01:00.0: WARN halted endpoint, queueing URB anyway.
[ 4076.243029] xhci_hcd 0000:01:00.0: WARN halted endpoint, queueing URB anyway.
[ 4076.243044] xhci_hcd 0000:01:00.0: WARN halted endpoint, queueing URB anyway.
[ 4076.243051] xhci_hcd 0000:01:00.0: WARN halted endpoint, queueing URB anyway.
[ 4077.749450] xhci_hcd 0000:01:00.0: Cancel URB 0000000063c2cde4, dev 5, ep 0x81, starting at offset 0x1fa7ae1d90
[ 4077.749456] xhci_hcd 0000:01:00.0: // Ding dong!
[ 4077.749592] xhci_hcd 0000:01:00.0: Removing canceled TD starting at 0x1fa7ae1d90 (dma).
[ 4077.749620] xhci_hcd 0000:01:00.0: Cancel URB 00000000564ffbd2, dev 5, ep 0x81, starting at offset 0x1fa7ae1e90
[ 4077.749622] xhci_hcd 0000:01...
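The "Slot state = 3, EP state = 2" line in the debug output above can be read against the state encodings in the xHCI specification. A minimal Python sketch, assuming that message format; the function name is hypothetical. Per the specification, a Set TR Dequeue Pointer command is only valid while the endpoint is Stopped or in Error, which the last element of the result reflects.

```python
import re

# State encodings from the xHCI slot and endpoint contexts.
SLOT_STATES = {0: "Disabled/Enabled", 1: "Default", 2: "Addressed", 3: "Configured"}
EP_STATES = {0: "Disabled", 1: "Running", 2: "Halted", 3: "Stopped", 4: "Error"}

# Endpoint states in which Set TR Dequeue Pointer is accepted.
SET_DEQ_ALLOWED = {"Stopped", "Error"}

STATE_RE = re.compile(r"Slot state = (\d+), EP state = (\d+)")


def decode_state_line(line):
    """Hypothetical helper: return (slot_state, ep_state, set_deq_allowed)."""
    match = STATE_RE.search(line)
    if match is None:
        return None
    slot = SLOT_STATES.get(int(match.group(1)), "Reserved")
    ep = EP_STATES.get(int(match.group(2)), "Reserved")
    return slot, ep, ep in SET_DEQ_ALLOWED
```

For the line above this gives a Configured slot with a Halted endpoint, which is consistent with the later "WARN halted endpoint, queueing URB anyway." messages.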
In Linux Kernel Bug Tracker #202541, ehoffman (ehoffman-linux-kernel-bugs) wrote : | #299 |
I think I've narrowed it down to a minimum.
Using libusb, if you happen to call libusb_
Ex:
...
libusb_
...
libusb_
...
libusb_
...
Or, simply
...
libusb_
...
libusb_
...
libusb_
...
The thing is that in the example above, even though there's an application error, the behavior differs between chipset drivers.
In the case of HackRF application (which I mentioned above), the application called libusb_
In Linux Kernel Bug Tracker #202541, sandro.stross (sandro.stross-linux-kernel-bugs) wrote : | #300 |
Same problems here:
ASUS X470-F Gaming
Ryzen 2700x
Most of my problems are with the rt2800usb driver.
Any progress?
In Linux Kernel Bug Tracker #202541, mathias.nyman (mathias.nyman-linux-kernel-bugs) wrote : | #301 |
rewritten URB cancel, endpoint stop and set trb deq can be found in my tree
in rewrite_
git://git.
https:/
Does that help?
In Linux Kernel Bug Tracker #202541, t.clastres (t.clastres-linux-kernel-bugs) wrote : | #302 |
Created attachment 294739
Patch from Nyman's rewrite_
In Linux Kernel Bug Tracker #202541, t.clastres (t.clastres-linux-kernel-bugs) wrote : | #303 |
(In reply to Mathias Nyman from comment #139)
> rewritten URB cancel, endpoint stop and set trb deq can be found in my tree
> in rewrite_
>
> git://git.
> rewrite_
>
> https:/
> ?h=rewrite_
>
> Does that help?
Just created the corresponding patch to easily apply your changes to linux 5.10.y.
I don't get "WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state." anymore, but the problem is still here.
After connecting my android phone, I start to get `hub 2-8:1.0: hub_ext_port_status failed (err = -110)` and `device descriptor read/8, error -110` spammed for a while.
The immediate issue is the usb port in question not working but what's worrying is the issue seems to propagate to other usb ports like the ones used by my mouse or keyboard. I guess it's because they are part of the same hub?
Maybe this problem is unrelated but in any case let me know.
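The negative numbers in messages such as "error -110" or "(err = -19)" are Linux errno values. A minimal Python sketch for decoding the ones that show up in this thread, assuming the Linux numbering and the meanings given in the kernel's USB error code documentation; the table is partial and the function name is hypothetical.

```python
import re

# Linux errno values seen in this thread, with their usual meaning for USB.
USB_ERRNO = {
    19: ("ENODEV", "device was removed"),
    22: ("EINVAL", "invalid argument"),
    32: ("EPIPE", "endpoint stalled"),
    62: ("ETIME", "no response within the bus turn-around time"),
    110: ("ETIMEDOUT", "transfer or control message timed out"),
}

ERR_RE = re.compile(r"(?:error|err =) -(\d+)")


def decode_usb_error(line):
    """Hypothetical helper: return (errno_name, meaning) for a kernel log line."""
    match = ERR_RE.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    return USB_ERRNO.get(code, (f"errno {code}", "not in this table"))
```

So the -110 messages above are timeouts: the hub or device did not answer in time, as opposed to -32, which indicates a stalled endpoint.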
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #304 |
Thanks for your effort. Unfortunately it doesn't fix the issue.
Tested two WiFi devices on
$ uname -r
5.10.7-arch1-1
First device:
$ lsusb
ID 148f:5370 Ralink Technology, Corp. RT5370 Wireless Adapter
[ 109.165827] usb 1-2: new high-speed USB device number 4 using xhci_hcd
[ 109.410190] usb 1-2: New USB device found, idVendor=148f, idProduct=5370, bcdDevice= 1.01
[ 109.410195] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 109.410197] usb 1-2: Product: 802.11 n WLAN
[ 109.410199] usb 1-2: Manufacturer: Ralink
[ 109.410201] usb 1-2: SerialNumber: 1.0
[ 109.624366] usb 1-2: reset high-speed USB device number 4 using xhci_hcd
[ 109.858679] ieee80211 phy1: rt2x00_set_rt: Info - RT chipset 5390, rev 0502 detected
[ 110.536313] ieee80211 phy1: rt2x00_set_rf: Info - RF chipset 5370 detected
[ 110.542761] ieee80211 phy1: Selected rate control algorithm 'minstrel_ht'
[ 110.555906] rt2800usb 1-2:1.0 wlp3s0f0u2: renamed from wlan0
[ 117.628420] ieee80211 phy1: rt2x00lib_
[ 117.628459] ieee80211 phy1: rt2x00lib_
Now running WiFi driver test (we use monitor mode to produce heavy workload):
$ sudo hcxdumptool -i wlp3s0f0u2 --check_driver
[ 121.752121] device wlp3s0f0u2 entered promiscuous mode
[ 121.771509] xhci_hcd 0000:03:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[ 121.822851] device wlp3s0f0u2 left promiscuous mode
Second device:
$ lsusb
ID 148f:5572 Ralink Technology, Corp. RT5572 Wireless Adapter
[ 419.565208] usb 1-2: new high-speed USB device number 5 using xhci_hcd
[ 419.741196] usb 1-2: New USB device found, idVendor=148f, idProduct=5572, bcdDevice= 1.01
[ 419.741201] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 419.741204] usb 1-2: Product: 802.11 n WLAN
[ 419.741206] usb 1-2: Manufacturer: Ralink
[ 419.741208] usb 1-2: SerialNumber: 1.0
[ 419.950046] usb 1-2: reset high-speed USB device number 5 using xhci_hcd
[ 420.181669] ieee80211 phy2: rt2x00_set_rt: Info - RT chipset 5592, rev 0222 detected
[ 420.859692] ieee80211 phy2: rt2x00_set_rf: Info - RF chipset 000f detected
[ 420.868375] ieee80211 phy2: Selected rate control algorithm 'minstrel_ht'
[ 420.887633] rt2800usb 1-2:1.0 wlp3s0f0u2: renamed from wlan0
[ 434.285018] ieee80211 phy2: rt2x00lib_
[ 434.285066] ieee80211 phy2: rt2x00lib_
Now running WiFi driver test (we use monitor mode to produce heavy workload):
$ sudo hcxdumptool -i wlp3s0f0u2 --check_driver
[ 463.468004] device wlp3s0f0u2 entered promiscuous mode
[ 537.382571] xhci_hcd 0000:03:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[ 537.384485] ieee80211 phy2: rt2x00usb_
[ 537.411446] device wlp3s0f0u2 left promiscuous mode
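For anyone comparing runs before and after a patch, a small script can count how often the warning shows up per controller. This is only a sketch in Python: the function name is made up for illustration, and it assumes the log lines look like the ones pasted above.

```python
import re
from collections import Counter

# Matches the warning quoted in this thread and captures the PCI address
# of the controller that printed it.
WARN_RE = re.compile(r"xhci_hcd (\S+): WARN Set TR Deq Ptr cmd failed")


def count_set_deq_warnings(log_text):
    """Return a Counter mapping PCI address to number of warnings seen."""
    return Counter(m.group(1) for m in WARN_RE.finditer(log_text))
```

It can be fed the saved output of `dmesg`, for example `count_set_deq_warnings(open("dmesg.txt").read())`.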
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #305 |
The message "WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state." is gone for me, but the device still doesn't work after unplugging it and plugging it in again.
After unplugging I get this message in dmesg:
ieee80211 phy1: rt2x00usb_
In Linux Kernel Bug Tracker #202541, sandro.stross (sandro.stross-linux-kernel-bugs) wrote : | #306 |
I am on Kali Linux 2020.4 and tried to apply the patch that @Mathias Nyman released, but it failed.
Does anyone know of a tutorial on how to do this on Kali?
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #307 |
An ASUS TUF Gaming notebook (AMD Ryzen 5 3550H), showing a different behavior on the same device:
$ lsusb
ID 148f:5572 Ralink Technology, Corp. RT5572 Wireless Adapter
$ uname -r
5.10.7-arch1-1
$ lspci
00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Root Complex
00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 IOMMU
00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
00:01.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 PCIe GPP Bridge [6:0]
00:01.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 PCIe GPP Bridge [6:0]
00:01.3 PCI bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 PCIe GPP Bridge [6:0]
00:01.7 PCI bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 PCIe GPP Bridge [6:0]
00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Internal PCIe GPP Bridge 0 to Bus A
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 61)
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 51)
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 0
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 1
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 2
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 3
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 4
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 5
00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 6
00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 7
01:00.0 VGA compatible controller: NVIDIA Corporation TU117M [GeForce GTX 1650 Mobile / Max-Q] (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 10fa (rev a1)
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
03:00.0 Non-Volatile memory controller: Micron Technology Inc Device 5410 (rev 01)
04:00.0 Network controller: Realtek Semiconductor Co., Ltd. RTL8821CE 802.11ac PCIe Wireless Network Adapter
05:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Picasso (rev c2)
05:00.2 Encryption controller: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) Platform Security Processor
05:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Raven USB 3.1
05:00.4 USB controller: Advanced Micro Devices, Inc. [AMD] Raven USB 3.1
05:00.6 Audio device: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) HD Audio Controller
We end up in an endless " rt2x00queue_
[ 61.139415] usb 3-2: new high-speed USB device number 3 using xhci_hcd
[ 61.297702] usb 3-2: New USB device found, idVendor=148f, idProduct=5572, bcdDevice= 1.01
[ 61.297708] ...
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #308 |
@ S4ndm4n KALI is not the best choice to do module tests. It is designed to perform penetration tests and many modules are either patched or third party modules.
In Linux Kernel Bug Tracker #202541, stf_xl (stfxl-linux-kernel-bugs) wrote : | #309 |
Since the issue mostly affects rt2800usb devices, maybe we can add a quirk to xhci to restore the pre-4.20 behaviour for endpoints that are used by rt2800usb.
Please check whether a patch like this makes the problem go away:
diff --git a/drivers/
index dfa61de7c83f.
--- a/drivers/
+++ b/drivers/
@@ -2568,6 +2568,8 @@ static int process_
case COMP_USB_
+ if (1) /* this will be quirk for disable Soft Retry */
+ break;
if ((ep_ring-
If it does, I could then prepare a patch that changes this part only for rt2800usb.
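The idea behind the test patch above (skip Soft Retry when a transaction error is reported) can be modelled roughly as follows. This is a simplified Python sketch, not the kernel code: the function name, the return labels and the retry limit of 3 are assumptions made for illustration, and any other conditions in the real handler are left out.

```python
MAX_SOFT_RETRY = 3  # assumed limit, for illustration only


def handle_transaction_error(err_count, no_soft_retry=False,
                             max_soft_retry=MAX_SOFT_RETRY):
    """Decide what to do with one transaction error on an endpoint.

    err_count is the number of errors already seen on this endpoint ring.
    Returns "soft_retry" to retry the transfer, or "give_back" to stop
    retrying and hand the error back to the device driver.
    """
    if no_soft_retry:
        # The "if (1) break;" in the test patch: never attempt Soft Retry.
        return "give_back"
    if err_count > max_soft_retry:
        # Too many errors already, give up retrying.
        return "give_back"
    return "soft_retry"
```

With `no_soft_retry=True` the model always gives the error back, which is the pre-4.20 style behaviour the quirk is meant to restore.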
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #310 |
This patch fixes the issue for me
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #311 |
Doesn't work on
$ uname -r
5.10.8-arch1-1
and
ID 148f:5572 Ralink Technology, Corp. RT5572 Wireless Adapter
The warning "WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state." is gone, but the interface freezes.
To reproduce the issue:
$ hcxdumptool -I
wlan interfaces:
dc4ef4086e71 wlp3s0f0u2 (rt2800usb)
$ sudo hcxdumptool -i wlp3s0f0u2 --check_injection
initialization...
The interface freezes and must be disconnected.
Expected result (using a chipset known to work regardless of the xhci issue, connected to the same USB3 port):
ID 148f:3070 Ralink Technology, Corp. RT2870/RT3070 Wireless Adapter
$ hcxdumptool -I
wlan interfaces:
c83a35cb08e3 wlp3s0f0u2 (rt2800usb)
$ sudo hcxdumptool -i wlp3s0f0u2 --check_injection
initialization...
starting antenna test and packet injection test (that can take up to two minutes)...
available channels: 1,2,3,4,
packet injection is working on 2.4GHz!
injection ratio: 72% (BEACON: 87 PROBERESPONSE: 63)
your injection ratio is good
antenna ratio: 83% (NETWORK: 24 PROBERESPONSE: 20)
your antenna ratio is excellent, let's ride!
4 driver errors encountered during the test
terminating...
BTW:
Don't worry about the 4 driver errors. The first received packets (via raw socket) after entering monitor mode don't contain a radiotap header. This could be driver related.
BTW:
Now we connect this device
ID 148f:5572 Ralink Technology, Corp. RT5572 Wireless Adapter
to a USB2 port:
$ sudo hcxdumptool -i wlp39s0f3u1u1u2 --check_injection
initialization...
starting antenna test and packet injection test (that can take up to two minutes)...
available channels: 1,2,3,4,
packet injection is working on 2.4GHz!
injection ratio: 31% (BEACON: 16 PROBERESPONSE: 5)
your injection ratio is average, but there is still room for improvement
antenna ratio: 66% (NETWORK: 3 PROBERESPONSE: 2)
your antenna ratio is good
terminating...
No warning, no disconnect, everything is fine.
In Linux Kernel Bug Tracker #202541, stf_xl (stfxl-linux-kernel-bugs) wrote : | #312 |
Michael, you are experiencing a different problem than Bernhard. Perhaps you can bisect this, or just check whether it ever worked on some older kernel (in this broken case of RT5572 + USB3).
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #313 |
Stanislaw, are you sure?
Same device, tested on Intel system (xhci unpatched):
Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz
06:00.0 USB controller: ASMedia Technology Inc. ASM1042 SuperSpeed USB Host Controller
07:00.0 USB controller: ASMedia Technology Inc. ASM1042 SuperSpeed USB Host Controller
ASM1042 should be USB3, too - but I'm not sure.
$ uname -r
5.10.8-arch1-1
and
ID 148f:5572 Ralink Technology, Corp. RT5572 Wireless Adapter
$ sudo hcxdumptool -I
wlan interfaces:
dc4ef4086e71 wlp0s26u1u2 (rt2800usb)
$ sudo hcxdumptool -i wlp0s26u1u2 --check_injection
initialization...
starting antenna test and packet injection test (that can take up to two minutes)...
available channels: 1,2,3,4,
packet injection is working on 2.4GHz!
injection ratio: 35% (BEACON: 176 PROBERESPONSE: 62)
your injection ratio is average, but there is still room for improvement
antenna ratio: 35% (NETWORK: 20 PROBERESPONSE: 7)
your antenna ratio is average, but there is still room for improvement
2 driver errors encountered during the test
terminating...
BTW:
5GHz injection not shown as working, because I haven't set up a 5GHz ACCESS POINT to respond to hcxdumptool
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #314 |
I forgot to mention for the RYZEN system:
03:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] X370 Series Chipset USB 3.1 xHCI Controller (rev 02)
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #315 |
Stanislaw, the difference between them is
using ehci-pci on the Intel system
vs
using xhci_hcd on the RYZEN.
Thanks for pointing me in this direction.
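To see which host controller driver each bus is using, the root hub lines of `lsusb -t` can be parsed. A rough Python sketch follows; the function name is hypothetical, and the expected line layout (`/:  Bus NN.Port N: Dev N, Class=root_hub, Driver=.../Np, speed`) is an assumption that may differ between usbutils versions.

```python
import re

# One root hub line per bus; the driver name is followed by "/<ports>p".
ROOT_HUB_RE = re.compile(
    r"^/:\s+Bus (\d+)\.Port \d+: Dev \d+, Class=root_hub, Driver=([^/,\s]+)",
    re.MULTILINE,
)


def host_drivers(lsusb_tree):
    """Map bus number to host controller driver name."""
    return {int(bus): drv for bus, drv in ROOT_HUB_RE.findall(lsusb_tree)}
```

A device listed under a bus whose driver is ehci-pci never touches the xhci code path, which would explain a difference like the one described above.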
In Linux Kernel Bug Tracker #202541, stf_xl (stfxl-linux-kernel-bugs) wrote : | #316 |
> Stanislaw, are you sure?
Well, I asked you to check whether the hardware combination that is now broken for you ever worked. In Bernhard's case it worked on 4.19 and stopped working on 4.20, and he was able to identify the offending commit.
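Bisecting is just a binary search between a known-good and a known-bad kernel. A minimal sketch of the idea in Python is shown here; the function name is invented, and in practice `git bisect` does this over commits, with "build, boot and test" as the check.

```python
def first_bad(versions, is_bad):
    """Find the first bad entry in a list ordered oldest to newest.

    Assumes versions[0] is known good and versions[-1] is known bad.
    """
    lo, hi = 0, len(versions) - 1  # lo is always good, hi is always bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid
        else:
            lo = mid
    return versions[hi]
```

Each step halves the remaining range, so even a few thousand commits between two releases need only a dozen or so test boots.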
In Linux Kernel Bug Tracker #202541, stf_xl (stfxl-linux-kernel-bugs) wrote : | #317 |
Created attachment 294785
rt2800usb_
This is a patch that restores the old xhci behaviour only for rt2800usb. I use urb->transfer_flags to add a "quirk" flag. Mathias, do you think it's OK to avoid Soft Retry this way, or do you have a better idea for a solution?
Bernhard, please test whether it still fixes the issue for you.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #318 |
/home/zerobeat/
217 | urb->transfer_flags |= URB_SOFT_
| ^~~~~~~
| URB_SHORT_NOT_OK
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #319 |
As Bernhard, I can confirm that this patch
https:/
is working for me.
This device
https:/
now is working after applying that patch.
Also, I can confirm that the issue on the RT5572 is related to USB3 and that it is a new one.
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #320 |
When applying the patch from https:/
[ 194.130691] usb 1-3: BOGUS urb flags, 1000200 --> 200
[ 194.130704] WARNING: CPU: 0 PID: 113 at drivers/
[ 194.130705] Modules linked in: rt2800usb rt2x00usb rt2800lib rt2x00lib snd_usb_audio mac80211 btusb btrtl btbcm snd_usbmidi_lib btintel snd_rawmidi snd_seq_device bluetooth xpad libarc4 mc joydev ff_memless mousedev ecdh_generic ecc cfg80211 ccm algif_aead cbc des_generic libdes ecb edac_mce_amd kvm_amd algif_skcipher rfkill kvm cmac md4 algif_hash af_alg ppdev wmi_bmof irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd snd_hda_
[ 194.130748] fb_sys_fops soundcore ip6t_rt wmi parport_pc parport pinctrl_amd gpio_amdpt video gpio_generic mac_hid nf_log_ipv4 nf_log_common ipt_REJECT nf_reject_ipv4 xt_LOG xt_limit xt_addrtype xt_tcpudp xt_conntrack ip6table_filter ip6_tables nf_conntrack_
[ 194.130769] CPU: 0 PID: 113 Comm: kworker/u32:8 Tainted: G W 5.10.8-arch1-1 #3
[ 194.130770] Hardware name: Micro-Star International Co., Ltd. MS-7A34/B350 PC MATE (MS-7A34), BIOS A.J0 01/23/2019
[ 194.130774] Workqueue: phy1 rt2x00usb_
[ 194.130776] RIP: 0010:usb_
[ 194.130777] Code: bc 24 a0 00 00 00 48 89 54 24 08 e8 41 c1 f3 ff 48 8b 54 24 08 45 89 f0 44 89 f9 48 89 c6 48 c7 c7 60 49 ff 96 e8 d1 9c 2c 00 <0f> 0b 83 e3 01 0f 85 f1 00 00 00 8b 74 24 04 48 83 c4 18 48 89 ef
[ 194.130778] RSP: 0018:ffffb97d00
[ 194.130779] RAX: 0000000000000000 RBX: 0000000000000002 RCX: ffff9a5d0ec18bb8
[ 194.130780] RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffff9a5d0ec18bb0
[ 194.130780] RBP: ffff9a5a337980c0 R08: 0000000000000000 R09: ffffb97d00777ba8
[ 194.130781] R10: ffffb97d00777ba0 R11: ffffffff976ca568 R12: ffff9a5a191a1800
[ 194.130782] R13: 0000000000000002 R14: 0000000000000200 R15: 0000000001000200
[ 194.130783] FS: 000000000000000
[ 194.130783] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 194.130784] CR2: 00007fb1c6c7e7c8 CR3: 000000011c504000 CR4: 00000000003506f0
[ 194.130784] Call Trace:
[ 194.130788] rt2x00usb_
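The two hex numbers in the "BOGUS urb flags" line show which flag bit was rejected. The following Python sketch works that out, assuming the first value is the flags as submitted by the driver and the second is what was left after masking with the allowed set; the function name is made up.

```python
import re

BOGUS_RE = re.compile(r"BOGUS urb flags, ([0-9a-f]+) --> ([0-9a-f]+)")


def rejected_bits(log_line):
    """Return the flag bits present in the first value but not the second."""
    m = BOGUS_RE.search(log_line)
    if m is None:
        raise ValueError("not a BOGUS urb flags line")
    submitted, kept = (int(v, 16) for v in m.groups())
    return submitted & ~kept
```

For the line above, `hex(rejected_bits("BOGUS urb flags, 1000200 --> 200"))` gives `0x1000000`, i.e. a single extra bit that the submit path does not accept.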
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #321 |
@ Stanislaw, may I ask a question?
I purchased the RT5572 adapter several days ago. It never worked due to the xhci issue. Now (after your help) it is working, and we (with lots of your help) encountered a new issue. The affected combination is exclusively
RT5572 - USB3 - xhci - rt2800usb
Should I report this issue?
If yes, is it an xhci issue or an rt2800usb issue?
@ Bernhard, I didn't notice these messages in combination with monitor mode, but I did notice them when running in managed mode with NetworkManager.
In Linux Kernel Bug Tracker #202541, stf_xl (stfxl-linux-kernel-bugs) wrote : | #322 |
Created attachment 294799
rt2800usb_
This one should make the "BOGUS urb flags" messages go away. Please test.
The patch is for 5.11-rc; for 5.10 it may require some changes.
In Linux Kernel Bug Tracker #202541, stf_xl (stfxl-linux-kernel-bugs) wrote : | #323 |
(In reply to Michael from comment #159)
> @ Stanislaw, may I ask a question?
> I purchased the RT5572 adapter several days before. It never worked due to
> the xhci issue. Now (after your help), it is working and we (with lots of
> your help) encountered a new issue. Affected combination exclusively
> RT5572 - USB3 - xhci - rt2800usb
> Should I report this issue?
> If yes, is it a xhci issue or a rt2800usb issue?
Given that this happens only on some particular hardware, it can be a driver, firmware or even hardware issue (on either the rt2800usb or the USB host side). If you can find out whether this worked on some older kernel version and bisect it, you could report the issue; otherwise (without bisection) I do not see any chance to fix this problem.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #324 |
@ Stanislaw, thanks for your reply.
At the moment, too many screws are being turned at once.
First I'll wait until the "WARN Set TR Deq Ptr cmd failed" issue has received a final fix. Then I'll dive into the driver code to find out what is going wrong there.
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #325 |
The last patch fixes the issue for me and the BOGUS messages are now gone too. Thanks
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #326 |
Patch v2 causes monitor mode to crash (on ioctl() system calls):
[ 602.100650] usb 1-2: BOGUS urb flags, 208 --> 200
[ 602.100691] WARNING: CPU: 10 PID: 15060 at drivers/
[ 602.100692] Modules linked in: mt7601u rt2800usb(OE) rt2x00usb(OE) rt2800lib(OE) rt2x00lib(OE) mac80211 libarc4 nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) cfg80211 snd_hda_
[ 602.100876] crc16 mbcache jbd2 crc32c_intel sr_mod xhci_pci cdrom xhci_pci_renesas
[ 602.100879] CPU: 10 PID: 15060 Comm: hcxdumptool Tainted: P W OE 5.10.9-arch1-1 #1
[ 602.100880] Hardware name: Micro-Star International Co., Ltd. MS-7A33/X370 KRAIT GAMING (MS-7A33), BIOS 1.F0 11/06/2018
[ 602.100881] RIP: 0010:usb_
[ 602.100882] Code: bc 24 a0 00 00 00 48 89 54 24 08 e8 01 c1 f3 ff 48 8b 54 24 08 45 89 f0 44 89 f9 48 89 c6 48 c7 c7 f8 47 bf b9 e8 51 99 2c 00 <0f> 0b 83 e3 01 0f 85 f1 00 00 00 8b 74 24 04 48 83 c4 18 48 89 ef
[ 602.100882] RSP: 0018:ffffb69648
[ 602.100883] RAX: 0000000000000000 RBX: 0000000000000002 RCX: ffff88a90ee98bb8
[ 602.100884] RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffff88a90ee98bb0
[ 602.100884] RBP: ffff88a60f3fb500 R08: 0000000000000000 R09: ffffb69648f5f948
[ 602.100926] R10: ffffb69648f5f940 R11: ffffffffba2c0500 R12: ffff88a6152db800
[ 602.100926] R13: 0000000000000002 R14: 0000000000000200 R15: 0000000000000208
[ 602.100927] FS: 00007fef23ab028
[ 602.101008] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 602.101009] CR2: 00007fef2403eff8 CR3: 000000014b076000 CR4: 00000000003506e0
[ 602.101050] Call Trace:
[ 602.101132] rt2x00usb_
[ 602.101175] rt2x00queue_
[ 602.101257] rt2x00lib_
[ 602.101300] rt2x00lib_
[ 602.101391] drv_start+
[ 602.101444] ieee80211_
[ 602.101536] ? ieee80211_
[ 602.101577] __dev_open+
[ 602.101658] __dev_change_
[ 602.101699] ? enqueue_
[ 602.101780] dev_change_
[ 602.101821] devinet_
[ 602.101823] ? preempt_
In Linux Kernel Bug Tracker #202541, stf_xl (stfxl-linux-kernel-bugs) wrote : | #327 |
(In reply to Michael from comment #164)
> patch v2 causes monitor mode to crash (on ioctl() system calls:
>
> [ 602.100650] usb 1-2: BOGUS urb flags, 208 --> 200
> [ 602.100691] WARNING: CPU: 10 PID: 15060 at drivers/
> usb_submit_
[snip]
> [ 602.100879] CPU: 10 PID: 15060 Comm: hcxdumptool Tainted: P W OE
> 5.10.9-arch1-1 #1
Those are the same "BOGUS urb flags" messages as reported before by Bernhard. I think you did not correctly apply the v2 patch on top of 5.10. Please double-check that this hunk is present in your backported patch:
diff --git a/drivers/
index 357b149b20d3.
--- a/drivers/
+++ b/drivers/
@@ -495,7 +495,7 @@ int usb_submit_
/* Check against a simple/standard policy */
allowed = (URB_NO_
- URB_FREE_BUFFER);
+ URB_SOFT_
switch (xfertype) {
case USB_ENDPOINT_
case USB_ENDPOINT_
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #328 |
The patch was applied to urb.c (5.10.9):
/* Check against a simple/standard policy */
allowed = (URB_NO_
URB_
switch (xfertype) {
case USB_ENDPOINT_
case USB_ENDPOINT_
At this moment, I don't know what exactly went wrong. I'll try to identify the issue.
In Linux Kernel Bug Tracker #202541, stf_xl (stfxl-linux-kernel-bugs) wrote : | #329 |
Maybe the USB layer was compiled into the kernel and you only reloaded the modules.
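One way to check this is to look for the module in the kernel's modules.builtin file, which normally lives in /lib/modules/$(uname -r)/. A small Python sketch, with a hypothetical function name, that takes the file contents as text:

```python
def is_builtin(modules_builtin_text, module_name):
    """True if module_name is listed in the contents of modules.builtin."""
    wanted = module_name.replace("-", "_")
    for line in modules_builtin_text.splitlines():
        # Entries look like "kernel/drivers/usb/core/usbcore.ko".
        file_name = line.strip().rsplit("/", 1)[-1]
        stem = file_name.split(".ko", 1)[0]
        # Dashes and underscores are interchangeable in module names.
        if stem.replace("-", "_") == wanted:
            return True
    return False
```

If usbcore turns out to be built in, a change to drivers/usb/core only takes effect after booting the rebuilt kernel image; reloading rt2800usb or xhci modules is not enough.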
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #330 |
Thanks. Now the modules are loaded correctly and the BOGUS messages disappeared.
Unfortunately monitor mode is not working with v2:
ID 148f:5370 Ralink Technology, Corp. RT5370 Wireless Adapter
before:
$ sudo hcxdumptool -i wlp3s0f0u2 --check_injection
initialization...
starting antenna test and packet injection test (that can take up to two minutes)...
available channels: 1,2,3,4,
packet injection is working on 2.4GHz!
injection ratio: 54% (BEACON: 123 PROBERESPONSE: 67)
your injection ratio is good
antenna ratio: 45% (NETWORK: 20 PROBERESPONSE: 9)
your antenna ratio is average, but there is still room for improvement
terminating...
after v2:
$ sudo hcxdumptool -i wlp3s0f0u2 --check_injection
initialization...
starting antenna test and packet injection test (that can take up to two minutes)...
available channels: 1,2,3,4,
warning: no PROBERESPONSE received - packet injection is probably not working!
8 driver errors encountered during the test
terminating...
In Linux Kernel Bug Tracker #202541, stf_xl (stfxl-linux-kernel-bugs) wrote : | #331 |
Michael, at this point I really doubt the reliability of your testing.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #332 |
Stanislaw, and you're not the only one. I doubt it, too.
Maybe I patched my kernel to death and it is time for me to compile a fresh one.
But anyway, thanks for your effort and for your patience.
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #333 |
Stanislaw, short notice for you. Now, I'm running the fresh kernel (the RYZEN is really fast compiling it). Patch v2 is applied.
Everything is working fine and all Bogus messages are gone.
Thanks again.
In Linux Kernel Bug Tracker #202541, wgh (wgh-linux-kernel-bugs) wrote : | #334 |
(In reply to Mathias Nyman from comment #139)
> rewritten URB cancel, endpoint stop and set trb deq can be found in my tree
> in rewrite_
>
> git://git.
> rewrite_
>
> https:/
> ?h=rewrite_
>
> Does that help?
I applied the patch to 5.10.11-gentoo, and it did help with my HackRF One (see comment #136 for details and hardware)! No ill effects so far.
In Linux Kernel Bug Tracker #202541, stf_xl (stfxl-linux-kernel-bugs) wrote : | #335 |
After discussion on my posted patch here:
https://<email address hidden>/t/#u
it was concluded that this should rather be an xhci quirk instead of an rt2800usb driver flag.
If the change from comment 147 helps with the problem for you, please provide the PCI ID of your xHCI controller. This can be done with the command:
lspci -k -nn | grep -B2 xhci
If you have more than one xHCI controller, please make sure you provide the PCI IDs of the one that actually has the problem (the 'lspci -t' command can be useful as well).
In Linux Kernel Bug Tracker #202541, stf_xl (stfxl-linux-kernel-bugs) wrote : | #336 |
(In reply to Stanislaw Gruszka from comment #173)
> If you have more than one xHCI controller please assure you provide PCI-id's
> of one that actually has the problem ('lspci -t' command can be useful as
> well)
I meant 'lsusb -t'
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #337 |
USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] X370 Series Chipset USB 3.1 xHCI Controller [1022:43b9] (rev 02)
Subsystem: ASMedia Technology Inc. Device [1b21:1142]
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci
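For collecting these reports, the vendor:device and subsystem IDs can be pulled out of pasted `lspci -k -nn` output with a few lines of Python. This is a sketch only; the function name is invented and it assumes the bracketed `[vvvv:dddd]` layout seen in the output above.

```python
import re

ID_RE = re.compile(r"\[([0-9a-f]{4}):([0-9a-f]{4})\]")


def usb_controller_ids(lspci_text):
    """Return (vendor:device, subsystem vendor:device) per USB controller."""
    result = []
    in_usb = False
    for line in lspci_text.splitlines():
        m = ID_RE.search(line)
        if m is None:
            continue  # e.g. "Kernel driver in use: xhci_hcd"
        pair = f"{m.group(1)}:{m.group(2)}"
        if line.strip().startswith("Subsystem:"):
            if in_usb:
                result[-1][1] = pair
        else:
            # A new device line; remember whether it is a USB controller.
            in_usb = "USB controller" in line
            if in_usb:
                result.append([pair, None])
    return [tuple(entry) for entry in result]
```

For the output above this yields one entry, `("1022:43b9", "1b21:1142")`.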
In Linux Kernel Bug Tracker #202541, stf_xl (stfxl-linux-kernel-bugs) wrote : | #338 |
Created attachment 295055
0001-usb-
This is the next proposed fix. It is supposed to disable Soft Retry for affected xHCI controllers. Currently only for the xHCI device reported by Michael:
PCI_VENDOR_ID_AMD = 0x1022 , PCI_DEVICE_
If you want to test and have a different xHCI host, you need to add your PCI IDs to
drivers/
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #339 |
@Stanislaw, I followed the discussion you mentioned here:
https:/
Other devices than rt2800usb devices are affected, too.
Tested this one before applying your patch:
ID 7392:7710 Edimax Technology Co., Ltd Edimax Wi-Fi
and running into the same xhci issue on USB controller mentioned here:
https:/
[10214.423508] usb 1-2: new high-speed USB device number 3 using xhci_hcd
[10214.602833] usb 1-2: New USB device found, idVendor=7392, idProduct=7710, bcdDevice= 0.00
[10214.602838] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[10214.602841] usb 1-2: Product: Edimax Wi-Fi
[10214.602843] usb 1-2: Manufacturer: MediaTek
[10214.602845] usb 1-2: SerialNumber: 1.0
[10214.931553] usb 1-2: reset high-speed USB device number 3 using xhci_hcd
[10215.102895] mt7601u 1-2:1.0: ASIC revision: 76010001 MAC revision: 76010500
[10215.132670] mt7601u 1-2:1.0: Firmware Version: 0.1.00 Build: 7640 Build time: 201302052146____
[10216.101346] mt7601u 1-2:1.0: EEPROM ver:0d fae:00
[10216.111983] mt7601u 1-2:1.0: EEPROM country region 01 (channels 1-13)
[10217.189574] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
[10217.190361] usbcore: registered new interface driver mt7601u
[10217.199429] mt7601u 1-2:1.0 wlp3s0f0u2: renamed from wlan0
[10296.419053] xhci_hcd 0000:03:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[10296.419228] xhci_hcd 0000:03:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
In Linux Kernel Bug Tracker #202541, jg.staffel (jg.staffel-linux-kernel-bugs) wrote : | #340 |
The same problem (with ID 04a9:220d Canon, Inc. CanoScan N670U/N676U/LiDE 20):
Feb 03 09:48:54 [kernel] [34974.104606] xhci_hcd 0000:01:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
Feb 03 09:49:49 [kernel] [35029.419748] usb 1-6: USB disconnect, device number 3
Feb 03 09:49:52 [kernel] [35031.994403] usb 1-6: new full-speed USB device number 6 using xhci_hcd
Feb 03 09:50:45 [kernel] [35085.400634] xhci_hcd 0000:01:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
Feb 03 09:50:45 [kernel] [35085.404278] xhci_hcd 0000:01:00.0: WARN Successful completion on short TX
Feb 03 09:50:45 [kernel] [35085.404398] xhci_hcd 0000:01:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 4 comp_code 1
Feb 03 09:50:45 [kernel] [35085.404401] xhci_hcd 0000:01:00.0: Looking for event-dma 00000008146ff050 trb-start 00000008146ff060 trb-end 00000008146ff060 seg-start 00000008146ff000 seg-end 00000008146ffff0
$ lspci -k -nn | grep -B2 xhci
00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 6 [1022:1466]
00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 7 [1022:1467]
01:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset USB 3.1 XHCI Controller [1022:43d5] (rev 01)
Subsystem: ASMedia Technology Inc. 400 Series Chipset USB 3.1 XHCI Controller [1b21:1142]
Kernel driver in use: xhci_hcd
--
09:00.2 USB controller [0c03]: NVIDIA Corporation TU116 USB 3.1 Host Controller [10de:1aec] (rev a1)
Subsystem: NVIDIA Corporation TU116 USB 3.1 Host Controller [10de:139d]
Kernel driver in use: xhci_hcd
--
0a:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Zeppelin USB 3.0 Host controller [1022:145f]
Subsystem: Advanced Micro Devices, Inc. [AMD] Zeppelin USB 3.0 Host controller [1022:7914]
Kernel driver in use: xhci_hcd
$ uname -a
Linux Gentoo 5.4.92-gentoo #1 SMP PREEMPT Thu Jan 28 20:45:52 MSK 2021 x86_64 AMD Ryzen 5 2600 Six-Core Processor AuthenticAMD GNU/Linux
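The "Looking for event-dma" line above can be checked by hand: the event address lies inside the ring segment but just before the start of the current TD. A Python sketch of that arithmetic follows. The function names are made up, the 16-byte TRB size is the standard xHCI TRB size, and the check assumes the TD does not wrap around the end of the segment.

```python
import re

TRB_SIZE = 16  # bytes per TRB

FIELD_RE = re.compile(
    r"(event-dma|trb-start|trb-end|seg-start|seg-end) ([0-9a-f]+)"
)


def parse_lookup(line):
    """Parse the hex addresses out of a 'Looking for event-dma' line."""
    return {name: int(value, 16) for name, value in FIELD_RE.findall(line)}


def describe(line):
    """Return (in_segment, in_td, trbs_before_td_start)."""
    f = parse_lookup(line)
    in_segment = f["seg-start"] <= f["event-dma"] <= f["seg-end"]
    in_td = f["trb-start"] <= f["event-dma"] <= f["trb-end"]
    trbs_before = (f["trb-start"] - f["event-dma"]) // TRB_SIZE
    return in_segment, in_td, trbs_before
```

For the line above this gives `(True, False, 1)`: the event refers to the TRB one slot before the TD the driver currently expects.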
In Linux Kernel Bug Tracker #202541, stf_xl (stfxl-linux-kernel-bugs) wrote : | #341 |
(In reply to Michael from comment #177)
> Other devices than rt2800usb devices are affected, too.
> Tested this one before applying your patch:
> ID 7392:7710 Edimax Technology Co., Ltd Edimax Wi-Fi
> and running into the same xhci issue on USB controller mentioned here:
> https:/
Ok, so it makes sense to disable Soft Retry per xHCI.
In Linux Kernel Bug Tracker #202541, stf_xl (stfxl-linux-kernel-bugs) wrote : | #342 |
(In reply to alpir from comment #178)
> The same problem (with ID 04a9:220d Canon, Inc. CanoScan N670U/N676U/LiDE
> 20):
>
> Feb 03 09:48:54 [kernel] [34974.104606] xhci_hcd 0000:01:00.0: WARN Set TR
> Deq Ptr cmd failed due to incorrect slot or ep state.
alpir, does the change from comment 147 help for you ?
In Linux Kernel Bug Tracker #202541, stf_xl (stfxl-linux-kernel-bugs) wrote : | #343 |
alpir, you have a different device ID than Michael, but you both have the same subsystem device: ASMedia 1b21:1142. So perhaps the patch should be based on subdevice IDs. Let's wait for other users' reports regarding their xHCI controllers; then we will see.
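The difference between matching on device ID and matching on subsystem ID can be illustrated with the two controllers reported so far. This is a Python model only; the type, the table contents and the function names are invented for illustration and do not mirror the xhci-pci.c quirk code.

```python
from typing import NamedTuple


class PciIds(NamedTuple):
    vendor: int
    device: int
    subvendor: int
    subdevice: int


# The two controllers reported in this thread.
X370 = PciIds(0x1022, 0x43B9, 0x1B21, 0x1142)
SERIES_400 = PciIds(0x1022, 0x43D5, 0x1B21, 0x1142)


def match_by_device(dev, table):
    """Quirk applies if (vendor, device) is listed in the table."""
    return (dev.vendor, dev.device) in table


def match_by_subsystem(dev, table):
    """Quirk applies if (subvendor, subdevice) is listed in the table."""
    return (dev.subvendor, dev.subdevice) in table
```

A device table containing only `(0x1022, 0x43B9)` covers the X370 controller but not the 400 Series one, while a subsystem table containing `(0x1B21, 0x1142)` covers both.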
In Linux Kernel Bug Tracker #202541, jg.staffel (jg.staffel-linux-kernel-bugs) wrote : | #344 |
I tried the patch from comment 147. The error "WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state" is gone, but the USB 3.1 behavior is still the same.
Why I started looking into the strange behavior of the USB ports in the first place: my two JetFlash Transcend 8GB flash drives, when connected to a USB3 port, are sometimes not detected by the system as mountable (fat32). When I run a disk check (8 GB) with the command badblocks -nvs /dev/sdd, the check ends after a while with the following error: Pass completed, 5662144 bad blocks found. (5662144/0/0 errors). This happens with both flash drives.
But if I connect them to USB2, there are no errors at all.
At the same time, when looking at the logs, I found the errors: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
Now, after the patch, I get the following in the logs:
Feb 03 17:47:14 [kernel] [ 52.603587] usb 2-3: new SuperSpeed Gen 1 USB device number 2 using xhci_hcd
Feb 03 17:47:14 [kernel] [ 52.636130] usb-storage 2-3:1.0: USB Mass Storage device detected
Feb 03 17:47:14 [kernel] [ 52.636242] scsi host11: usb-storage 2-3:1.0
Feb 03 17:47:14 [kernel] [ 52.651996] usbcore: registered new interface driver uas
Feb 03 17:47:16 [kernel] [ 54.013780] scsi 11:0:0:0: Direct-Access JetFlash Transcend 8GB 1100 PQ: 0 ANSI: 6
Feb 03 17:47:16 [kernel] [ 54.014688] sd 11:0:0:0: [sdd] 15425536 512-byte logical blocks: (7.90 GB/7.36 GiB)
Feb 03 17:47:16 [kernel] [ 54.015150] sd 11:0:0:0: [sdd] Write Protect is off
Feb 03 17:47:16 [kernel] [ 54.015156] sd 11:0:0:0: [sdd] Mode Sense: 43 00 00 00
Feb 03 17:47:16 [kernel] [ 54.015625] sd 11:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Feb 03 17:47:16 [kernel] [ 54.028165] sdd: sdd1
Feb 03 17:47:16 [kernel] [ 54.045687] sd 11:0:0:0: [sdd] Attached SCSI removable disk
Feb 03 17:48:04 [kernel] [ 102.221862] usb 2-3: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd
Feb 03 17:51:52 [kernel] [ 330.009696] usb 2-3: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd
Feb 03 17:55:55 [kernel] [ 573.644576] usb 2-3: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd
Feb 03 17:56:01 [kernel] [ 579.149875] usb 2-3: device descriptor read/8, error -110
Feb 03 17:56:01 [kernel] [ 579.254204] usb 2-3: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd
Feb 03 17:56:06 [kernel] [ 584.781836] usb 2-3: device descriptor read/8, error -110
Feb 03 17:56:07 [kernel] [ 585.073435] usb 2-3: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd
Feb 03 17:56:12 [kernel] [ 590.413816] usb 2-3: device descriptor read/8, error -110
Feb 03 17:56:12 [kernel] [ 590.518146] usb 2-3: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd
Feb 03 17:56:18 [kernel] [ 596.046034] usb 2-3: device descriptor read/8, error -110
Feb 03 17:56:18 [kernel] [ 596.336445] usb 2-3: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd
Feb 03 17:56:23 [kernel] [ 601.677932] usb 2-3: device descriptor read/8, error -110
Feb 03 17:56:23 [kernel] [ 601.782091] usb 2-3: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd
Feb 03 17:56:29 [kernel] [ 607.309722] usb 2-3: device descr...
In Linux Kernel Bug Tracker #202541, bernhard.gebetsberger (bernhard.gebetsberger-linux-kernel-bugs) wrote : | #345 |
My controller has the PCI ID 43bb, so I've added "PCI_DEVICE_
In Linux Kernel Bug Tracker #202541, ZeroBeat (zerobeat-linux-kernel-bugs) wrote : | #346 |
@Stanislaw, I'm running an older mobo and a RYZEN 1700.
I don't need CPU power - GPU power is more important for me (crypto analysis).
In Linux Kernel Bug Tracker #202541, biopsin (biopsin-linux-kernel-bugs) wrote : | #347 |
[Continuing my first report in comment:https:/
$ lspci -k -nn | grep -B2 xhci
02:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset USB 3.1 XHCI Controller [1022:43d5] (rev 01)
Subsystem: ASMedia Technology Inc. Device [1b21:1142]
Kernel driver in use: xhci_hcd
I have adapted the patch by Mr. Gruszka [https:/
$ uname -a
Linux voidx 5.4.95_1 #1 SMP PREEMPT 1612063540 x86_64 GNU/Linux
If someone has some spare time to glance at it or comment on my error ;)
(diff available for 30 days) @
https:/
In Linux Kernel Bug Tracker #202541, stf_xl (stfxl-linux-kernel-bugs) wrote : | #348 |
(In reply to alpir from comment #182)
> I tried patch from comment 147. The error "WARN Set TR Deq Ptr cmd failed
> due to incorrect slot or ep state" has gone. But behavior USDB3.1 still the
> same.
[snip]
> But if you connect them to USB2, then there are no errors at all.
alpir, I think you are experiencing a different issue that cannot be solved by simply disabling Soft Retry. Some more fixes are possibly needed for handling your xHCI/USB hardware. Maybe you can try the patch from comment 139? If this is a regression, maybe you can bisect to find the offending commit? In any case, your problems will most likely require the expertise of Mathias Nyman, the xhci driver maintainer.
In Linux Kernel Bug Tracker #202541, stf_xl (stfxl-linux-kernel-bugs) wrote : | #349 |
(In reply to biopsin from comment #185)
> [Continuing my first report in
> comment:https:/
As in alpir's case, this will most likely require some different fixes, but you can check whether disabling Soft Retry works. You can disable it as shown in comment 147.
> $ lspci -k -nn | grep -B2 xhci
> 02:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] 400 Series
> Chipset USB 3.1 XHCI Controller [1022:43d5] (rev 01)
> Subsystem: ASMedia Technology Inc. Device [1b21:1142]
> Kernel driver in use: xhci_hcd
>
[snip]
> If someone has some spare time to glance at it or comment on my error ;)
> (diff availible for 30 days) @
> https:/
ASMedia is subsystem_
diff --git a/drivers/
index 906a0e08821e.
--- a/drivers/
+++ b/drivers/
@@ -102,6 +102,9 @@ static void xhci_pci_
id = pci_match_
+ printk("vendor: 0x%04x device 0x%04x subvendor 0x%04x subdevice 0x%04x\n",
+ pdev->vendor, pdev->device, pdev->subsystem
+
if (id && id->driver_data) {
If those are indeed subsystem IDs, I think there is a bug in the existing xhci-pci.c quirks code:
if (pdev->vendor == PCI_VENDOR_
if (pdev->vendor == PCI_VENDOR_
if (pdev->vendor == PCI_VENDOR_
and those checks should be replaced by pdev->subsystem
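To make the vendor versus subsystem-vendor distinction concrete, here is a minimal Python sketch (not kernel code). The `PciIds` class and both helper functions are hypothetical names made up for this illustration; the IDs come from the lspci output quoted in this thread, where 1022 is AMD and 1b21 is ASMedia. Whether the real checks should be replaced or extended is a decision for the actual patch; the sketch only shows why a check on the primary vendor ID alone cannot match this controller.

```python
from dataclasses import dataclass

AMD = 0x1022      # primary vendor ID from the lspci output quoted above
ASMEDIA = 0x1b21  # subsystem vendor ID from the same lspci output


@dataclass(frozen=True)
class PciIds:
    """Hypothetical stand-in for the ID fields of a PCI device."""
    vendor: int
    device: int
    subsystem_vendor: int
    subsystem_device: int


def matches_by_vendor(dev: PciIds, vendor: int) -> bool:
    # Mirrors a quirk check written against the primary vendor ID only.
    return dev.vendor == vendor


def matches_by_vendor_or_subsystem(dev: PciIds, vendor: int) -> bool:
    # Also accepts devices whose subsystem vendor is the one we look for.
    return dev.vendor == vendor or dev.subsystem_vendor == vendor


# The controller reported as [1022:43d5] with Subsystem [1b21:1142].
controller = PciIds(AMD, 0x43D5, ASMEDIA, 0x1142)

print(matches_by_vendor(controller, ASMEDIA))               # False
print(matches_by_vendor_or_subsystem(controller, ASMEDIA))  # True
```

With the primary-vendor check, an ASMedia quirk is never applied to this AMD-branded controller; only the second form sees the ASMedia subsystem vendor.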
In Linux Kernel Bug Tracker #202541, stf_xl (stfxl-linux-kernel-bugs) wrote : | #350 |
Created attachment 295065
asmedia_
This patch applies the existing xhci ASMedia quirks to ASMedia subdevices as well.
Looking at the changelog history, those quirks helped with some USB disk issues, so perhaps the patch could help with the disk issues reported here, i.e. the alpir and biopsin cases. Please test.
In Linux Kernel Bug Tracker #202541, jg.staffel (jg.staffel-linux-kernel-bugs) wrote : | #351 |
None of the patches (comments 139, 147, 188) solved my problem.
In Linux Kernel Bug Tracker #202541, biopsin (biopsin-linux-kernel-bugs) wrote : | #352 |
@Gruszka
Your patch [https:/
I'm currently testing it with my setup and kernel 5.4.95_x86_64.
Tested against one PATA and one SATA drive. So far I see no ill effects, but I also can't confirm or deny that it does anything in this short timespan, and much has changed since my initial post last year. I will at least continue applying it now and then throughout this year and report anything newsworthy. Thank you for your time and help!
In Linux Kernel Bug Tracker #202541, raulvior.bcn (raulvior.bcn-linux-kernel-bugs) wrote : | #353 |
Created attachment 295151
Dmesg of a Toshiba USB 3.0 HDD connected to USB 3.0 front port and back port.
I am having this error on Linux 5.10.10-051010 while trying to connect a USB 3.0 hard disk, Toshiba Touro 4TB (HitachiGST). If I connect the disk to a USB 2.0 port it works flawlessly.
The kernel shows a different kind of error depending on whether I connect the HDD to the front or back USB 3.0 ports of the motherboard MSI X470 Gaming Plus MAX.
lspci -vnnt:
> -[0000:00]-+-00.0 Advanced Micro Devices, Inc. [AMD] Family 17h (Models
> 00h-0fh) Root Complex [1022:1450]
> +-00.2 Advanced Micro Devices, Inc. [AMD] Family 17h (Models
> 00h-0fh) I/O Memory Management Unit [1022:1451]
> +-01.0 Advanced Micro Devices, Inc. [AMD] Family 17h (Models
> 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
> +-01.1-[01]----00.0 Samsung Electronics Co Ltd NVMe SSD
> Controller SM981/PM981/PM983 [144d:a808]
> +-01.3-
> [1022:43d0]
> | +-00.1 Advanced Micro Devices, Inc. [AMD] 400
> Series Chipset SATA Controller [1022:43c8]
> | \-00.2-
> | +-01.0-[22]----00.0 Realtek
> Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit
> Ethernet Controller [10ec:8168]
> | +-02.0-[23]--
> | +-03.0-[24]--
> | +-04.0-[25]--
> | \-08.0-[26]----00.0 ASMedia
> Technology Inc. ASM1142 USB 3.1 Host Controller [1b21:1242]
> +-02.0 Advanced Micro Devices, Inc. [AMD] Family 17h (Models
> 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
> +-03.0 Advanced Micro Devices, Inc. [AMD] Family 17h (Models
> 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
> +-03.1-[27]--+-00.0 Advanced Micro Devices, Inc. [AMD/ATI]
> Ellesmere [Radeon RX 470/480/
> | \-00.1 Advanced Micro Devices, Inc. [AMD/ATI]
> Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590] [1002:aaf0]
> +-04.0 Advanced Micro Devices, Inc. [AMD] Family 17h (Models
> 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
> +-07.0 Advanced Micro Devices, Inc. [AMD] Family 17h (Models
> 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
> +-07.1-[28]--+-00.0 Advanced Micro Devices, Inc. [AMD]
> Zeppelin/
> | +-00.2 Advanced Micro Devices, Inc. [AMD] Family 17h
> (Models 00h-0fh) Platform Security Processor [1022:1456]
> | \-00.3 Advanced Micro Devices, Inc. [AMD] Zeppelin
> USB 3.0 Host controller [1022:145f]
> +-08.0 Advanced Micro Devices, Inc. [AMD] Family 17h (Models
> 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
> +-08.1-[29]--+-00.0 Advance...
In Linux Kernel Bug Tracker #202541, raulvior.bcn (raulvior.bcn-linux-kernel-bugs) wrote : | #354 |
Created attachment 295183
Dmesg of a OnePlus 7 Pro connecting in USB 3.1 gen1 mode. No errors.
(In reply to raul from comment #191)
Connecting a OnePlus 7 Pro smartphone does not show any error. This phone has a USB 3.1 gen1 port and connects in that mode without errors. I can navigate the filesystem as one would expect.
Changed in linux: | |
importance: | Unknown → High |
status: | Unknown → Confirmed |
In Linux Kernel Bug Tracker #202541, tisaak (tisaak-linux-kernel-bugs) wrote : | #355 |
Same issue with a Seagate Portable 4 TB USB 3.0 drive that I connect with usb-storage quirks as its UAS implementation is problematic. Random hangs that flood dmesg with errors.
lsusb -tv
/: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 5000M
ID 1d6b:0003 Linux Foundation 3.0 root hub
|__ Port 3: Dev 2, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
ID 0bc2:231a Seagate RSS LLC Expansion Portable
Errors in dmesg start like this...
xhci_hcd 0000:00:10.0: WARN Cannot submit Set TR Deq Ptr
xhci_hcd 0000:00:10.0: A Set TR Deq Ptr command is pending.
usb 3-3: reset SuperSpeed Gen 1 USB device number 3 using xhci_hcd
sd 5:0:0:0: [sdd] tag#0 FAILED Result: hostbyte=
sd 5:0:0:0: [sdd] tag#0 CDB: Read(16) 88 00 00 00 00 00 a4 01 ed 78 00 00 00 10 00 00
After that:
task:usb-storage state:D stack: 0 pid: 286 ppid: 2 flags:0x00004000
Call Trace:
__schedule+
? usleep_
schedule+
schedule_
? __prepare_
__wait_
usb_sg_
usb_stor_
usb_stor_
usb_stor_
? __prepare_
? __wait_
usb_stor_
? storage_
kthread+
? __kthread_
ret_from_
Søren Rasmussen (sorenrasmussen) wrote : | #161 |
The same happens for me on a Dell XPS15 9560, Ubuntu 20.04/Linux 5.11.10-
The affected devices are connected to a USB3 hub on my monitor (not via a dock)
[ +0.001348] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[ +0.000017] xhci_hcd 0000:00:14.0: Looking for event-dma 00000001dec92360 trb-start 00000001dec92370 trb-end 00000001dec92370 seg-start 00000001dec92000 seg-end 00000001dec92ff0
[ +0.002982] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[ +0.000016] xhci_hcd 0000:00:14.0: Looking for event-dma 00000001dec92370 trb-start 00000001dec92380 trb-end 00000001dec92380 seg-start 00000001dec92000 seg-end 00000001dec92ff0
[ +0.000016] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[ +0.000009] xhci_hcd 0000:00:14.0: Looking for event-dma 00000001dec92380 trb-start 00000001dec92390 trb-end 00000001dec92390 seg-start 00000001dec92000 seg-end 00000001dec92ff0
[ +0.000012] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[ +0.000007] xhci_hcd 0000:00:14.0: Looking for event-dma 00000001dec92390 trb-start 00000001dec923a0 trb-end 00000001dec923a0 seg-start 00000001dec92000 seg-end 00000001dec92ff0
[ +0.001943] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[ +0.000013] xhci_hcd 0000:00:14.0: Looking for event-dma 00000001dec923a0 trb-start 00000001dec923b0 trb-end 00000001dec923b0 seg-start 00000001dec92000 seg-end 00000001dec92ff0
[ +0.000014] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[ +0.000008] xhci_hcd 0000:00:14.0: Looking for event-dma 00000001dec923b0 trb-start 00000001dec923c0 trb-end 00000001dec923c0 seg-start 00000001dec92000 seg-end 00000001dec92ff0
[ +0.002992] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[ +0.000010] xhci_hcd 0000:00:14.0: Looking for event-dma 00000001dec923c0 trb-start 00000001dec923d0 trb-end 00000001dec923d0 seg-start 00000001dec92000 seg-end 00000001dec92ff0
[ +0.000008] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[ +0.000005] xhci_hcd 0000:00:14.0: Looking for event-dma 00000001dec923d0 trb-start 00000001dec923e0 trb-end 00000001dec923e0 seg-start 00000001dec92000 seg-end 00000001dec92ff0
[ +0.000005] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[ +0.000004] xhci_hcd 0000:00:14.0: Looking for event-dma 00000001dec923e0 trb-start 00000001dec923f0 trb-end 00000001dec923f0 seg-start 00000001dec92000 seg-end 00000001dec92ff0
[ +0.001967] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[ +0.000010] xhci_hcd 0000:00:14.0: Looking for event-dma 00000001dec923f0 trb-start 00000001dec92400 trb-end 00000001dec92400 seg-start 00000001dec92000 seg-end 00000001dec92ff0
[ +0.000009] xhci_hcd 0000:00:14.0: WARN Event TRB for slot 44 ep 1 w...
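The "Looking for event-dma" lines above can be read mechanically. The following Python sketch parses one such line and reports how far the event pointer lies from `trb-start`, measured in TRBs (an xHCI TRB is 16 bytes). The function name and regular expression are made up for this illustration, and the result only restates what the log shows; it is not a diagnosis.

```python
import re

TRB_SIZE = 16  # an xHCI Transfer Request Block is 16 bytes

LOOKING_RE = re.compile(
    r"event-dma (?P<event>[0-9a-f]+) "
    r"trb-start (?P<start>[0-9a-f]+) "
    r"trb-end (?P<end>[0-9a-f]+) "
    r"seg-start (?P<seg_start>[0-9a-f]+) "
    r"seg-end (?P<seg_end>[0-9a-f]+)"
)


def event_offset_in_trbs(line):
    """Return the distance from trb-start to event-dma, in TRBs.

    A negative value means the event points before the TD the driver
    was looking at. Returns None for lines that are not a
    "Looking for event-dma" message.
    """
    m = LOOKING_RE.search(line)
    if m is None:
        return None
    event = int(m.group("event"), 16)
    start = int(m.group("start"), 16)
    return (event - start) // TRB_SIZE


sample = (
    "xhci_hcd 0000:00:14.0: Looking for event-dma 00000001dec92360 "
    "trb-start 00000001dec92370 trb-end 00000001dec92370 "
    "seg-start 00000001dec92000 seg-end 00000001dec92ff0"
)
print(event_offset_in_trbs(sample))  # -1
```

Applied to the complete lines quoted above, every event pointer sits exactly one TRB before `trb-start`.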
In Linux Kernel Bug Tracker #202541, mathias.nyman (mathias.nyman-linux-kernel-bugs) wrote : | #356 |
(In reply to Zak from comment #193)
>
>
> Errors in dmesg start like this...
>
> xhci_hcd 0000:00:10.0: WARN Cannot submit Set TR Deq Ptr
> xhci_hcd 0000:00:10.0: A Set TR Deq Ptr command is pending.
There are recent major changes in this area in the xhci driver.
The above message no longer exists, new message in this case is
"Set TR Deq already pending, don't submit for x"
Can you try this on a 5.12-rc kernel?
Thanks
Mathias
In Linux Kernel Bug Tracker #202541, mlkcampion (mlkcampion-linux-kernel-bugs) wrote : | #357 |
Created attachment 296259
xhci no soft retry for Intel xhci 8086:06ed and 8086:31a8
Hi
I am having this issue on 2 systems when I plug in
a Hoco Hub HB16. The Hoco Hub HB16 is a 6 in 1 adapter that
includes
Type-C to USB3.0 x3
Type-C to HDMI
Type-C to RJ45 Ethernet (RealTek RTL8153, linux loads driver rtl8153b-2)
Type-C to Type-C(PD2.0)
USB Billboard device
Also when the device is plugged into a Windows10 machine
for the first time it presents a disk that contains the RTL8153
drivers, the user is provided with an option to install these. This
"disk" is not visible later.
The 2 systems where this device failed both reported
"WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state."
Both systems have Ubuntu Mate 20.10
$ uname -a
5.8.0-48-generic #54-Ubuntu SMP Fri Mar 19 14:25:20 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
1. Dell XPS 9500 (Intel(R) Core(TM) i5-10300H CPU @ 2.50GHz)
$ sudo lspci -k -nn | grep -B2 xhci
00:14.0 USB controller [0c03]: Intel Corporation Comet Lake USB 3.1 xHCI Host Controller [8086:06ed]
Subsystem: Dell Comet Lake USB 3.1 xHCI Host Controller [1028:097d]
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci
--
7:00.0 USB controller [0c03]: Intel Corporation JHL7540 Thunderbolt 3 USB Controller [Titan Ridge 4C 2018] [8086:15ec] (rev 06)
Subsystem: Dell JHL7540 Thunderbolt 3 USB Controller [Titan Ridge 4C 2018] [1028:097d]
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci
2. Seed Studio Odyssey J4105 (Intel(R) Celeron(R) J4105 CPU @ 1.50GHz)
$ sudo lspci -k -nn | grep -B3 xhci
00:15.0 USB controller [0c03]: Intel Corporation Device [8086:31a8] (rev 03)
DeviceName: Onboard - Other
Subsystem: Intel Corporation Device [8086:7270]
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci
I applied the changes in Stanislaw's patch at comment 176, I added the
PCI IDs to match both my systems.
I can confirm that with the patch applied both systems no longer reported the
issue "WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state."
Just to note that on the Dell XPS I use the Dell DA20 Adapter which is a Type-C
to USB and HDMI adapter. This appears to have an ASIX Elec. Corp. AX88179
USB 3.0 to Gigabit Ethernet which I don't have any issues with.
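For anyone who wants to add their own controller to such a quirk patch, the vendor:device pair has to be read from the `lspci -k -nn` output. The Python sketch below pulls those pairs out for every device bound to xhci_hcd. The function name and regular expressions are assumptions made for this example, and it expects the header, Subsystem and "Kernel driver in use" lines in the layout shown in this thread.

```python
import re

# "00:14.0 " or "0000:00:14.0 " at the start of a device header line.
HEADER_RE = re.compile(r"^(?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]\s")
# "[8086:06ed]" style vendor:device pair printed because of -nn.
ID_RE = re.compile(r"\[([0-9a-f]{4}):([0-9a-f]{4})\]")


def xhci_controller_ids(lspci_text):
    """Return (vendor, device) hex-string pairs of xhci_hcd devices."""
    found = []
    current = None
    for raw in lspci_text.splitlines():
        line = raw.strip()
        if HEADER_RE.match(line):
            # The class code such as [0c03] has no colon, so the last
            # match on the header line is the vendor:device pair.
            ids = ID_RE.findall(line)
            current = ids[-1] if ids else None
        elif line.startswith("Kernel driver in use: xhci_hcd") and current:
            found.append(current)
            current = None
    return found


sample = """\
00:14.0 USB controller [0c03]: Intel Corporation Comet Lake USB 3.1 xHCI Host Controller [8086:06ed]
\tSubsystem: Dell Comet Lake USB 3.1 xHCI Host Controller [1028:097d]
\tKernel driver in use: xhci_hcd
\tKernel modules: xhci_pci
"""
print(xhci_controller_ids(sample))  # [('8086', '06ed')]
```

The Subsystem IDs are deliberately ignored here, because the patch described above matches on the controller's own vendor and device ID.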
In Linux Kernel Bug Tracker #202541, luke-jr+linuxbugs (luke-jr+linuxbugs-linux-kernel-bugs) wrote : | #358 |
Encountered this with a PCI-e card using ASMedia Technology Inc. ASM1142 USB 3.1 Host Controller
Moved to my native "Intel Corporation Device a3af" USB bus, this error disappeared (though other problems remain in my case)
Linux 5.10.33
Possibly noteworthy: when I got my Talos II, I tried to move this ASMedia USB PCI-e card to it, and found it was immediately shut down by the IOMMU whenever I tried to use it at all. It seems the firmware is garbage.
IIRC, someone was getting close to an open source firmware replacement without those issues... would be interesting to see if it helps with this bug as well.
In Linux Kernel Bug Tracker #202541, dront78 (dront78-linux-kernel-bugs) wrote : | #359 |
same problem
5.12.12-arch1-1 #1 SMP PREEMPT Fri, 18 Jun 2021 21:59:22 +0000 x86_64 GNU/Linux
GPD Pocket
00:00.0 Host bridge [0600]: Intel Corporation Atom/Celeron/
Subsystem: Intel Corporation Device [8086:7270]
Kernel driver in use: iosf_mbi_pci
00:02.0 VGA compatible controller [0300]: Intel Corporation Atom/Celeron/
DeviceName: Onboard IGD
Subsystem: Intel Corporation Device [8086:7270]
Kernel driver in use: i915
Kernel modules: i915
00:0b.0 Signal processing controller [1180]: Intel Corporation Atom/Celeron/
Subsystem: Intel Corporation Device [8086:7270]
Kernel driver in use: proc_thermal
Kernel modules: processor_
00:14.0 USB controller [0c03]: Intel Corporation Atom/Celeron/
Subsystem: Intel Corporation Device [8086:7270]
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci
00:1a.0 Encryption controller [1080]: Intel Corporation Atom/Celeron/
Subsystem: Intel Corporation Device [8086:7270]
Kernel modules: mei_txe
00:1c.0 PCI bridge [0604]: Intel Corporation Atom/Celeron/
Kernel driver in use: pcieport
00:1f.0 ISA bridge [0601]: Intel Corporation Atom/Celeron/
Subsystem: Intel Corporation Device [8086:7270]
Kernel modules: lpc_ich
01:00.0 Network controller [0280]: Broadcom Inc. and subsidiaries BCM4356 802.11ac Wireless Network Adapter [14e4:43ec] (rev 02)
Subsystem: Gemtek Technology Co., Ltd Device [17f9:0036]
Kernel driver in use: brcmfmac
Kernel modules: brcmfmac
# dmidecode 3.3
Getting SMBIOS data from sysfs.
SMBIOS 3.0.0 present.
Table at 0x5B8DE000.
Handle 0x0000, DMI type 0, 24 bytes
BIOS Information
Vendor: American Megatrends Inc.
Version: 5.11
Release Date: 06/28/2017
Address: 0xF0000
Runtime Size: 64 kB
ROM Size: 4 MB
Characteristics:
PCI is supported
BIOS is upgradeable
BIOS shadowing is allowed
Boot from CD is supported
Selectable boot is supported
BIOS ROM is socketed
EDD is supported
5.25"/1.2 MB floppy services are supported (int 13h)
3.5"/720 kB floppy services are supported (int 13h)
3.5"/2.88 MB floppy services are supported (int 13h)
Print screen service is supported (int 5h)
Serial services are supported (int 14h)
Printer services are supported (int 17h)
ACPI is supported
USB legacy is supported
BIOS boot specification is supported
Targeted content distribution is supported
UEFI is supported
BIOS Revision: 5.11
Handle 0x0001, DMI type 1, 27 bytes
System Information
Manufacturer: Default string
Product Name: Default string
Version: Default string
Serial Number: Default string
UUID: 03000200-
Wake-up ...
In Linux Kernel Bug Tracker #202541, antdev66 (antdev66-linux-kernel-bugs) wrote : | #360 |
I have the same problem with kernels 5.13.12 and 5.14.0-rc7:
dmesg:
xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
journalctl:
ago 24 18:38:40 SERVER kernel: sd 4:0:0:0: [sda] tag#3 FAILED Result: hostbyte=
In Linux Kernel Bug Tracker #202541, stulluk (stulluk-linux-kernel-bugs) wrote : | #361 |
I also experience exactly the same issue on multiple USB devices (USB-WIFI or a USB-Webcam), but only on my brand new AMD mainboard (ASRock model: B550M-HDV).
I tried both ubuntu focal and hirsute with latest kernels on my OldPC (ASUSTeK model: M5A78L-M LX3) and on my IntelNUC (NUC8BEB) and this issue does not happen (Tried with same USB-WIFI and USB-Webcam devices).
Issue is easily reproducible by inserting USB-WIFI and then executing "ip a" on a shell.
In Linux Kernel Bug Tracker #202541, dion (dion-linux-kernel-bugs) wrote : | #362 |
I also have exactly the same problem, but with slightly different hardware.
In my case it's a USB DAC branded as "Qudelix-5K". As far as I understand, it's a USB1 device.
[ 174.358189] usb 5-2.3.2.2.1.1: new full-speed USB device number 17 using xhci_hcd
[ 174.475229] usb 5-2.3.2.2.1.1: New USB device found, idVendor=0a12, idProduct=4025, bcdDevice=19.70
[ 174.475232] usb 5-2.3.2.2.1.1: New USB device strings: Mfr=1, Product=8, SerialNumber=3
[ 174.475233] usb 5-2.3.2.2.1.1: Product: Qudelix-5K USB DAC/MIC 48KHz
[ 174.475234] usb 5-2.3.2.2.1.1: Manufacturer: QTIL
[ 174.475235] usb 5-2.3.2.2.1.1: SerialNumber: ABCDEF0123456789
It produces corrupted sound (actually just noise) after a few seconds of playback if connected to a Dell WD19TB Thunderbolt docking station. The issue happens with the USB-A ports on the dock plus one Type-C port (front). The second Type-C port (labelled "Type-C with Thunderbolt 3 port") works.
When the noise happens I get the following in dmesg:
xhci_hcd 0000:3a:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 5 comp_code 1
xhci_hcd 0000:3a:00.0: Looking for event-dma 00000000ffe940f0 trb-start 00000000ffe94100 trb-end 00000000ffe94100 seg-start 00000000ffe94000 seg-end 00000000ffe94ff0
xhci_hcd 0000:3a:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 5 comp_code 1
xhci_hcd 0000:3a:00.0: Looking for event-dma 00000000ffe949b0 trb-start 00000000ffe949c0 trb-end 00000000ffe949c0 seg-start 00000000ffe94000 seg-end 00000000ffe94ff0
I've tried adding and removing extra USB hubs (originally the Qudelix was plugged into the internal USB3 hub of a monitor). But even when plugged directly into the dock, it produces corrupted sound.
Another important thing: this dock has built-in Ethernet with the r8153 chipset mentioned above.
After reading the comments here I tried to disable soft retry using the following patch:
diff --git a/drivers/
index 1c9a7957c45c.
--- a/drivers/
+++ b/drivers/
@@ -189,10 +189,11 @@ static void xhci_pci_
if (pdev->vendor == PCI_VENDOR_
+ xhci->quirks |= XHCI_NO_SOFT_RETRY;
}
if (pdev->vendor == PCI_VENDOR_ID_INTEL &&
And it completely fixed the issue for me. The DAC produces clear sound even when connected through a chain of two hubs!
PS.
lspci -k -nn | grep -B2 xhci
00:14.0 USB controller [0c03]: Intel Corporation Comet Lake PCH-LP USB 3.1 xHCI Host Controller [8086:02ed]
Subsystem: Hewlett-Packard Company Comet Lake PCH-LP USB 3.1 xHCI Host Controller [103c:8724]
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci
--
37:00.0 USB controller [0c03]: Intel Corporation JHL7540 Thunderbolt 3 USB Controller [Titan Ridge 4C 2018] [8086:15ec] (rev 06)
Subsystem: Hewlett-P...
In Linux Kernel Bug Tracker #202541, raulvior.bcn (raulvior.bcn-linux-kernel-bugs) wrote : | #363 |
Turns out the problem was the cable: it was too long. A shorter USB 3.0 cable (1.8 m) allowed a stable connection. On the same Linux 5.13 (the previous dmesg was on Linux 5.10), the longer 3 m cable kept failing, while with the 1.8 m cable the HDD works without issue.
(In reply to raul from comment #191)
In Linux Kernel Bug Tracker #202541, S.Braendlin (s.braendlin-linux-kernel-bugs) wrote : | #364 |
Hi,
I have also issues with USB3 on my Debian 10 with kernel 5.10.0-
Aug 6 13:20:14 media-server kernel: [ 964.069355] scsi host17: uas_eh_
Aug 6 13:20:14 media-server kernel: [ 964.197532] usb 2-1: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd
Aug 6 13:20:14 media-server kernel: [ 964.219053] scsi host17: uas_eh_
Aug 6 13:20:18 media-server kernel: [ 968.137601] task:sync state:D stack: 0 pid:12237 ppid: 11291 flags:0x00004324
Aug 6 13:20:18 media-server kernel: [ 968.137607] Call Trace:
Aug 6 13:20:18 media-server kernel: [ 968.137621] __schedule+
Aug 6 13:20:18 media-server kernel: [ 968.137630] schedule+0x3c/0xa0
Aug 6 13:20:18 media-server kernel: [ 968.137635] io_schedule+
Aug 6 13:20:18 media-server kernel: [ 968.137644] wait_on_
Aug 6 13:20:18 media-server kernel: [ 968.137651] ? __page_
Aug 6 13:20:18 media-server kernel: [ 968.137657] wait_on_
Aug 6 13:20:18 media-server kernel: [ 968.137663] __filemap_
Aug 6 13:20:18 media-server kernel: [ 968.137673] ? sync_inodes_
Aug 6 13:20:18 media-server kernel: [ 968.137679] filemap_
Aug 6 13:20:18 media-server kernel: [ 968.137684] iterate_
Aug 6 13:20:18 media-server kernel: [ 968.137691] ksys_sync+0x7c/0xb0
Aug 6 13:20:18 media-server kernel: [ 968.137697] __do_sys_
Aug 6 13:20:18 media-server kernel: [ 968.137704] do_syscall_
Aug 6 13:20:18 media-server kernel: [ 968.137709] entry_SYSCALL_
Aug 6 13:20:18 media-server kernel: [ 968.137714] RIP: 0033:0x7fc4ec0529aa
Aug 6 13:20:18 media-server kernel: [ 968.137717] RSP: 002b:00007ffcdd
Aug 6 13:20:18 media-server kernel: [ 968.137723] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc4ec0529aa
Aug 6 13:20:18 media-server kernel: [ 968.137725] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 00000000a8002000
Aug 6 13:20:18 media-server kernel: [ 968.137728] RBP: 0000000000000000 R08: 0000555ba9703dcf R09: 00007ffcddf4afe2
Aug 6 13:20:18 media-server kernel: [ 968.137730] R10: 00007fc4ec01a201 R11: 0000000000000246 R12: 0000000000000001
Aug 6 13:20:18 media-server kernel: [ 968.137733] R13: 0000000000000001 R14: 00007ffcddf49158 R15: 0000000000000000
In Linux Kernel Bug Tracker #202541, pupilla (pupilla-linux-kernel-bugs) wrote : | #365 |
Hello everyone,
I encountered the problem with kernel 6.0.0-rc3 on a Lenovo T470 laptop and a USB3 ASIX network card. The system was started with the parameter intel_idle.
I have another similar setup (same laptop and same USB3 network card, but with Linux 6.0.0-rc2) that has been running for 8 days and was started without the parameter intel_idle.
The distribution is Slackware 15 (64 bit).
This is the full output of dmesg.
Any feedback is welcome.
Marco
[ 0.000000] Linux version 6.0.0-rc3 (root@Cherepakha) (gcc (GCC) 11.2.0, GNU ld version 2.37-slack15) #1 SMP PREEMPT_DYNAMIC Tue Aug 30 16:07:18 CEST 2022
[ 0.000000] Command line: auto BOOT_IMAGE=Linux ro root=10303 intel_idle.
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x008: 'MPX bounds registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x010: 'MPX CSR'
[ 0.000000] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
[ 0.000000] x86/fpu: xstate_offset[3]: 832, xstate_sizes[3]: 64
[ 0.000000] x86/fpu: xstate_offset[4]: 896, xstate_sizes[4]: 64
[ 0.000000] x86/fpu: Enabled xstate features 0x1f, context size is 960 bytes, using 'compacted' format.
[ 0.000000] signal: max sigframe size: 1616
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000
[ 0.000000] BIOS-e820: [mem 0x000000000009d
[ 0.000000] BIOS-e820: [mem 0x00000000000e0
[ 0.000000] BIOS-e820: [mem 0x0000000000100
[ 0.000000] BIOS-e820: [mem 0x0000000040000
[ 0.000000] BIOS-e820: [mem 0x0000000040400
[ 0.000000] BIOS-e820: [mem 0x000000008b79c
[ 0.000000] BIOS-e820: [mem 0x0000000090653
[ 0.000000] BIOS-e820: [mem 0x0000000090654
[ 0.000000] BIOS-e820: [mem 0x000000009b52d
[ 0.000000] BIOS-e820: [mem 0x000000009b59a
[ 0.000000] BIOS-e820: [mem 0x000000009b5ff
[ 0.000000] BIOS-e820: [mem 0x00000000f0000
[ 0.000000] BIOS-e820: [mem 0x00000000fd000
[ 0.000000] BIOS-e820: [mem 0x00000000fec00
[ 0.000000] BIOS-e820: [mem 0x00000000fed00
[ 0.000000] BIOS-e820: [mem 0x00000000fed10
[ 0.000000] BIOS-e820: [mem 0x00000000fed84
[ 0.000000] BIOS-e820: [mem 0x00000000fee00
[ 0.000000] BIOS-e820: [mem 0x00...
In Linux Kernel Bug Tracker #202541, pupilla (pupilla-linux-kernel-bugs) wrote : | #366 |
Hello everyone,
unfortunately it happened again (system started without parameters):
[ 9.561808] br0: port 2(eth1) entered forwarding state
[95735.974041] usb 2-1: USB disconnect, device number 2
[95735.974215] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[95735.974439] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[95735.974471] ax88179_178a 2-1:1.0 eth1: unregister 'ax88179_178a' usb-0000:00:14.0-1, ASIX AX88179 USB 3.0 Gigabit Ethernet
[95735.974523] ax88179_178a 2-1:1.0 eth1: Failed to read reg index 0x0002: -19
[95735.974532] ax88179_178a 2-1:1.0 eth1: Failed to write reg index 0x0002: -19
[95735.974595] br0: port 2(eth1) entered disabled state
[95735.974783] device eth1 left promiscuous mode
[95735.974790] br0: port 2(eth1) entered disabled state
[95735.992489] ax88179_178a 2-1:1.0 eth1 (unregistered): Failed to write reg index 0x0002: -19
[95735.992503] ax88179_178a 2-1:1.0 eth1 (unregistered): Failed to write reg index 0x0001: -19
[95735.992510] ax88179_178a 2-1:1.0 eth1 (unregistered): Failed to write reg index 0x0002: -19
[95736.215301] usb 2-1: new SuperSpeed USB device number 4 using xhci_hcd
[95736.566562] ax88179_178a 2-1:1.0 eth1: register 'ax88179_178a' at usb-0000:00:14.0-1, ASIX AX88179 USB 3.0 Gigabit Ethernet, 00:0e:c6:81:79:01
Marco
In Linux Kernel Bug Tracker #202541, ske5074 (ske5074-linux-kernel-bugs) wrote : | #367 |
I also have the issue. Using Proxmox 7.2 (Debian Bullseye) with a Lenovo M910q core-i7-7700T, using two TPLink UE300 (RTL8153) USB to 1GbE Ethernet adapters. Each one is stable in a lower USB slot. Swapping the adapters does not change the behavior; the problem only affects the USB device in the higher slot. Moving to different ports makes no difference.
Easily reproducible with the following commands. Basically I'm trying to plumb bond0 again, which works initially; then I get the xhci_hcd warning and the link goes down again. System details are also below.
root@higgins:~# dmesg -C ; ifup -a ; ip link | grep enx ; \
> dmesg -H ; dmesg -C ; sleep 70 ; \
> ip link | grep enx ; dmesg -H
3: enxd03745be5afc: <BROADCAST,
16: enx54af9786ab11: <BROADCAST,
[Sep 3 11:05] device enx54af9786ab11 entered promiscuous mode
[ +0.001236] bond0: (slave enx54af9786ab11): Enslaving as a backup interface with a down link
[ +0.006363] vmbr0: the hash_elasticity option has been deprecated and is always 16
[ +0.013972] r8152 2-4:1.0 enx54af9786ab11: Promiscuous mode enabled
[ +0.001344] r8152 2-4:1.0 enx54af9786ab11: carrier on
3: enxd03745be5afc: <BROADCAST,
17: enx54af9786ab11: <BROADCAST,
[Sep 3 11:05] bond0: (slave enx54af9786ab11): link status definitely up, 1000 Mbps full duplex
[Sep 3 11:06] usb 2-4: USB disconnect, device number 12
[ +0.001544] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[ +0.001435] bond0: (slave enx54af9786ab11): Releasing backup interface
[ +0.029081] device enx54af9786ab11 left promiscuous mode
[ +0.316190] usb 2-4: new SuperSpeed USB device number 13 using xhci_hcd
[ +0.022053] usb 2-4: New USB device found, idVendor=2357, idProduct=0601, bcdDevice=30.00
[ +0.001297] usb 2-4: New USB device strings: Mfr=1, Product=2, SerialNumber=6
[ +0.001337] usb 2-4: Product: USB 10/100/1000 LAN
[ +0.001261] usb 2-4: Manufacturer: TP-Link
[ +0.001208] usb 2-4: SerialNumber: 000001
[ +0.137200] usb 2-4: reset SuperSpeed USB device number 13 using xhci_hcd
[ +0.049197] r8152 2-4:1.0: load rtl8153a-4 v2 02/07/20 successfully
[ +0.030905] r8152 2-4:1.0 eth0: v1.12.12
[ +0.007834] r8152 2-4:1.0 enx54af9786ab11: renamed from eth0
root@higgins:~#
-------
System Details
-------
root@higgins:~# uname -a
Linux higgins 5.15.39-4-pve #1 SMP PVE 5.15.39-4 (Mon, 08 Aug 2022 15:11:15 +0200) x86_64 GNU/Linux
root@higgins:~# lspci -k -nn | grep -B2 xhci
00:14.0 USB controller [0c03]: Intel Corporation 200 Series/Z370 Chipset Family USB 3.0 xHCI Controller [8086:a2af]
Subsystem: Lenovo 200 Series/Z370 Chipset Family USB 3.0 xHCI Controller [17aa:310b]
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci
root@higgins:~# lsusb -tv
/: Bus 02.Port 1: D...
In Linux Kernel Bug Tracker #202541, ske5074 (ske5074-linux-kernel-bugs) wrote : | #368 |
(In reply to Sean Kennedy from comment #205)
> I also have the issue. Using Proxmox 7.2 (Debian Bullseye) with a Lenovo
> M910q core-i7-7700T, using two TPLink UE300 (RTL8153) USB to 1Gbe Ethernet
> adapters. Each one is stable in a lower USB slot. Swapping the adapters does
> not change the behavior and only impacts the USB device in the higher slot.
> Changes to different ports without change.
Update: I tried a different dongle (2.5GbE) and have two hard drives attached to the system. It doesn't matter where the 2.5GbE dongle is attached; it eventually errors with "WARN Set TR Deq Ptr cmd failed". The error rate is only around six times a day right now:
8156 Realtek Semiconductor Corp. USB 10/100/1G/2.5G LAN
# dmesg -T | grep xhci
[Tue Sep 6 13:37:13 2022] xhci_hcd 0000:00:14.0: xHCI Host Controller
[Tue Sep 6 13:37:13 2022] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 1
[Tue Sep 6 13:37:13 2022] xhci_hcd 0000:00:14.0: hcc params 0x200077c1 hci version 0x100 quirks 0x0000000000009810
[Tue Sep 6 13:37:13 2022] usb usb1: Manufacturer: Linux 5.15.39-4-pve xhci-hcd
[Tue Sep 6 13:37:13 2022] xhci_hcd 0000:00:14.0: xHCI Host Controller
[Tue Sep 6 13:37:13 2022] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 2
[Tue Sep 6 13:37:13 2022] xhci_hcd 0000:00:14.0: Host supports USB 3.0 SuperSpeed
[Tue Sep 6 13:37:13 2022] usb usb2: Manufacturer: Linux 5.15.39-4-pve xhci-hcd
[Tue Sep 6 13:37:13 2022] usb 2-1: new SuperSpeed USB device number 2 using xhci_hcd
[Tue Sep 6 13:37:14 2022] usb 2-3: new SuperSpeed USB device number 3 using xhci_hcd
[Tue Sep 6 13:37:14 2022] usb 2-4: new SuperSpeed USB device number 4 using xhci_hcd
[Tue Sep 6 14:39:22 2022] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[Tue Sep 6 14:39:22 2022] usb 2-4: new SuperSpeed USB device number 5 using xhci_hcd
[Tue Sep 6 18:44:01 2022] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[Tue Sep 6 18:44:01 2022] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[Tue Sep 6 18:44:02 2022] usb 2-4: new SuperSpeed USB device number 6 using xhci_hcd
[Tue Sep 6 22:19:06 2022] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[Tue Sep 6 22:19:07 2022] usb 2-4: new SuperSpeed USB device number 7 using xhci_hcd
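A rough way to check the failure rate against such a log is to count the distinct timestamps that carry the warning, grouped by day. The Python sketch below does that for `dmesg -T` style lines as quoted above; the function name is made up for this example, and lines whose timestamp is not in the `dmesg -T` format are simply skipped.

```python
import re
from collections import defaultdict
from datetime import datetime

STAMP_RE = re.compile(r"^\[([^\]]+)\]\s+(.*)$")
WARNING = "WARN Set TR Deq Ptr cmd failed"


def warning_events_per_day(lines):
    """Count distinct warning timestamps per calendar day."""
    stamps = defaultdict(set)
    for line in lines:
        m = STAMP_RE.match(line.strip())
        if m is None or WARNING not in m.group(2):
            continue
        # Collapse runs of spaces so "Sep  6" and "Sep 6" parse alike.
        text = " ".join(m.group(1).split())
        try:
            when = datetime.strptime(text, "%a %b %d %H:%M:%S %Y")
        except ValueError:
            continue  # not a "dmesg -T" timestamp
        stamps[when.date()].add(when)
    return {day: len(times) for day, times in stamps.items()}


log = [
    "[Tue Sep 6 14:39:22 2022] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.",
    "[Tue Sep 6 14:39:22 2022] usb 2-4: new SuperSpeed USB device number 5 using xhci_hcd",
    "[Tue Sep 6 18:44:01 2022] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.",
    "[Tue Sep 6 18:44:01 2022] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.",
    "[Tue Sep 6 22:19:06 2022] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.",
]
for day, count in warning_events_per_day(log).items():
    print(day, count)  # 2022-09-06 3
```

Repeated warnings within the same second are treated as one event, which matches the one re-enumeration per failure seen in the log above.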
Since this drops the device from the system and takes the link offline, I created a simple script, run via cron once a minute, that detects when zero Ethernet devices are UP and runs ifup -a. It's clunky but works.
crontab:
# m h dom mon dow command
* * * * * /root/fixnet.sh >/dev/null 2>&1
fixnet.sh:
#!/bin/sh
# Count enx* (USB Ethernet) interfaces that ip reports as UP.
STATE=$(ip link | grep " enx" | grep -c UP)
if [ "$STATE" -gt 0 ]; then
    # All good. Exit.
    exit 0
fi
# No interface is up: try to bring everything back up.
/usr/sbin/ifup -a
sleep 20
if ping -c 1 10.0.0.1 | grep -q "1 received"; then
    # Network looks good. Exit.
    exit 0
fi
# Wait a few more minutes before resorting to a reboot.
sleep 310
if ! ping -c 1 10.0.0.1 | grep -q "1 received"; then
    # The network is still down.
    systemctl reboot
fi
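The interface check in fixnet.sh greps the text output of ip link. As a possible alternative, the sketch below reads the kernel's operstate file for each enx* interface under /sys/class/net. The function name count_up_enx and its optional directory argument are assumptions made for illustration (the argument only exists so the function can be tried against a scratch directory); they are not part of the original script.

```shell
#!/bin/sh
# Hypothetical helper (not from the original script): count enx* interfaces
# whose operstate is "up". $1 optionally overrides the sysfs directory.
count_up_enx() {
    sysfs_net="${1:-/sys/class/net}"
    count=0
    for dev in "$sysfs_net"/enx*; do
        # If nothing matches, the glob stays literal and this test skips it.
        [ -r "$dev/operstate" ] || continue
        read -r state < "$dev/operstate"
        if [ "$state" = "up" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

In the watchdog it could stand in for the first test, for example: `[ "$(count_up_enx)" -gt 0 ] && exit 0`. Unlike grepping for UP, this only counts interfaces that actually have a working link, which may or may not be the behaviour wanted here.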
In Linux Kernel Bug Tracker #202541, james (james-linux-kernel-bugs) wrote : | #369 |
I'm using a 2.5 Gb USB Ethernet device and getting this error intermittently (a dozen times per day).
$ uname -a
Linux hephaestus 5.4.0-135-generic #152-Ubuntu SMP Wed Nov 23 20:19:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
$ lsusb
<snip>
Bus 003 Device 016: ID 0bda:8156 Realtek Semiconductor Corp. USB 10/100/1G/2.5G
This is what plays out via /var/log/syslog each time:
Dec 21 10:26:47 hephaestus kernel: [346923.166782] usb 3-4: USB disconnect, device number 15
Dec 21 10:26:47 hephaestus kernel: [346923.166913] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
Dec 21 10:26:47 hephaestus kernel: [346923.166927] cdc_ncm 3-4:2.0 eth1: unregister 'cdc_ncm' usb-0000:00:14.0-4, CDC NCM
Dec 21 10:26:47 hephaestus kernel: [346923.167071] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
Dec 21 10:26:47 hephaestus kernel: [346923.170644] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
Dec 21 10:26:47 hephaestus dhclient[320734]: receive_packet failed on eth1: Network is down
Dec 21 10:26:47 hephaestus systemd[1]: Stopping ifup for eth1...
Dec 21 10:26:47 hephaestus dhclient[325522]: Killed old client process
Dec 21 10:26:47 hephaestus ifdown[325522]: Killed old client process
Dec 21 10:26:47 hephaestus kernel: [346923.478913] usb 3-4: new SuperSpeed Gen 1 USB device number 16 using xhci_hcd
Dec 21 10:26:47 hephaestus kernel: [346923.499567] usb 3-4: New USB device found, idVendor=0bda, idProduct=8156, bcdDevice=31.00
Dec 21 10:26:47 hephaestus kernel: [346923.499573] usb 3-4: New USB device strings: Mfr=1, Product=2, SerialNumber=6
Dec 21 10:26:47 hephaestus kernel: [346923.499577] usb 3-4: Product: USB 10/100/1G/2.5G LAN
Dec 21 10:26:47 hephaestus kernel: [346923.499580] usb 3-4: Manufacturer: Realtek
Dec 21 10:26:47 hephaestus kernel: [346923.499583] usb 3-4: SerialNumber: 001000001
Dec 21 10:26:47 hephaestus kernel: [346923.523736] cdc_ncm 3-4:2.0: MAC-Address: xx:xx:xx:xx:xx:xx
Dec 21 10:26:47 hephaestus kernel: [346923.523742] cdc_ncm 3-4:2.0: setting rx_max = 16384
Dec 21 10:26:47 hephaestus kernel: [346923.523836] cdc_ncm 3-4:2.0: setting tx_max = 16384
Dec 21 10:26:47 hephaestus kernel: [346923.524578] cdc_ncm 3-4:2.0 eth1: register 'cdc_ncm' at usb-0000:00:14.0-4, CDC NCM, xx:xx:xx:xx:xx:xx
Dec 21 10:26:47 hephaestus systemd-
Dec 21 10:26:47 hephaestus systemd-
Dec 21 10:26:47 hephaestus systemd[1]: Found device USB_10_
(then things start back up and the ethernet link goes live again after about 10 seconds)
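To put a number on "a dozen times per day", the disconnect lines in a log like the one above can be tallied by date. This is only a sketch: the function name events_per_day is made up for illustration, it assumes the traditional syslog prefix ("Mon DD HH:MM:SS host ...") seen in this log, and the default pattern counts every USB disconnect, not only those of this adapter.

```shell
#!/bin/sh
# Hypothetical helper: count log lines matching a pattern, grouped by the
# "Mon DD" date prefix of a traditional syslog file.
# $1: log file path; $2: optional awk regex (default: USB disconnect lines).
events_per_day() {
    awk -v pat="${2:-USB disconnect, device number}" '
        $0 ~ pat { n[$1 " " $2]++ }              # key on month and day
        END      { for (d in n) print d, n[d] }  # one "Mon DD count" line per day
    ' "$1" | sort
}
```

For example, `events_per_day /var/log/syslog` prints one line per day with the number of disconnects, and passing `'WARN Set TR Deq Ptr'` as the second argument counts the xhci warnings instead.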
In Linux Kernel Bug Tracker #202541, james (james-linux-kernel-bugs) wrote : | #370 |
FYI: I built a 5.4 kernel with the patch discussed earlier in this thread, and I still get the error multiple times per day.
(In reply to James H from comment #207)
Chris Adams (fkmjo) wrote : | #371 |
I am also having this issue with an ASMedia controller, and I think this is a duplicate bug https:/
In Linux Kernel Bug Tracker #202541, svmohr (svmohr-linux-kernel-bugs) wrote : | #372 |
I also get random disconnects on kernel 6.3.0-7-generic with a Samsung T7 Shield external SSD. Unfortunately this error is hard to reproduce; it usually takes hours before it first occurs.
System:
Kernel: 6.3.0-7-generic arch: x86_64 bits: 64 compiler: N/A Console: pty pts/10 Distro: Ubuntu
23.10 (Mantic Minotaur)
Machine:
Type: Server System: Supermicro product: C9Z390-PGW v: 0123456789 serial: <filter>
Mobo: Supermicro model: C9Z390-PGW v: 1.01A serial: <filter> UEFI: American Megatrends v: 1.3
date: 06/03/2020
CPU:
Info: 8-core model: Intel Core i9-9900K bits: 64 type: MT MCP arch: Coffee Lake rev: D cache:
L1: 512 KiB L2: 2 MiB L3: 16 MiB
Speed (MHz): avg: 3687 high: 5002 min/max: 800/5000 cores: 1: 5002 2: 3600 3: 3600 4: 3600
5: 3600 6: 3600 7: 3600 8: 3600 9: 3600 10: 3600 11: 3600 12: 3600 13: 3600 14: 3600 15: 3600
16: 3600 bogomips: 115200
Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
/: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=
ID 1d6b:0003 Linux Foundation 3.0 root hub
|__ Port 4: Dev 10, If 0, Class=Mass Storage, Driver=uas, 10000M
ID 04e8:61fb Samsung Electronics Co., Ltd
BOOT_IMAGE=
io-pci vfio_pci.
[349280.239403] usb 2-4: USB disconnect, device number 9
[349280.239689] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
[349280.239695] usb 2-4: cmd cmplt err -108
[349280.239702] sd 9:0:0:0: [sdh] tag#13 uas_zap_pending 0 uas-tag 1 inflight: CMD
[349280.239705] sd 9:0:0:0: [sdh] tag#13 CDB: Write(16) 8a 00 00 00 00 00 d3 28 e4 00 00 00 00 d8 00 00
[349280.239724] sd 9:0:0:0: [sdh] tag#13 FAILED Result: hostbyte=
[349280.239726] sd 9:0:0:0: [sdh] tag#13 CDB: Write(16) 8a 00 00 00 00 00 d3 28 e4 00 00 00 00 d8 00 00
[349280.239728] I/O error, dev sdh, sector 3542672384 op 0x1:(WRITE) flags 0x8800 phys_seg 27 prio class 2
[349280.239741] device offline error, dev sdh, sector 3542674432 op 0x1:(WRITE) flags 0x8800 phys_seg 35 prio class 2
[349280.239747] device offline error, dev sdh, sector 3542672640 op 0x1:(WRITE) flags 0x8800 phys_seg 24 prio class 2
[349280.239750] device offline error, dev sdh, sector 3542677504 op 0x1:(WRITE) flags 0x8800 phys_seg 45 prio class 2
[349280.239753] device offline error, dev sdh, sector 3542680576 op 0x1:(WRITE) flags 0x8800 phys_seg 41 prio class 2
[349280.239788] device offline error, dev sdh, sector 3542663168 op 0x1:(WRITE) flags 0x8800 phys_seg 35 prio class 2
[349280.239793] device offline error, dev sdh, sector 3542663680 op 0x1:(WRITE) flags 0x8800 phys_seg 29 prio class 2
[349280.239799] device offline error, dev sdh, sector 3542663936 op 0x1:(WRITE) flags 0x8800 phys_seg 26 prio class 2
[349280.299534] sd 9:0:0:0: [sdh] Synchronizing SCSI cache
[349280.523475] sd 9:0:0:0: [sdh] Synchronize Cache(10) failed: Result: hostbyte=DID_ERROR driverbyte=DRIVE...