xhci_hcd: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13

Bug #1667750 reported by l3iggs on 2017-02-24
This bug affects 40 people
Affects             Status     Importance  Assigned to    Milestone
HWE Next                       Undecided   Unassigned
linux (Arch Linux)  New        Undecided   Unassigned
linux (Debian)      New        Undecided   Unassigned
linux (Fedora)      Confirmed  Undecided
linux (Ubuntu)                 Medium      Kai-Heng Feng
Xenial                         Undecided   Unassigned
Zesty                          Undecided   Unassigned
Artful                         Medium      Kai-Heng Feng

Bug Description

[SRU Justification]

[Impact]
The Dell TB16 docking station has trouble with gigabit Ethernet: the Ethernet
link disconnects unless its speed is forced down to 100Mb/s.

[Test Case]
Download some big files from the web.
User confirms the patch fixes the issue.

[Regression Potential]
This patch only affects ASMedia's ASM1042A.
The regression potential is low and limited to that specific device.

---

My system contains a Realtek Semiconductor Corp. RTL8153 Gigabit Ethernet Adapter which is on usb3 bus in my docking station (Dell TB16) which is attached to my laptop (Dell XPS9550) via Thunderbolt 3.

I get usb related kernel error messages when I initiate a high speed transfer (by issuing wget http://cdimage.ubuntu.com/daily-live/current/zesty-desktop-amd64.iso) and the download fails.

This does not happen when the Ethernet adapter is connected to a 100Mb/s switch, only when it is connected to a 1000Mb/s one. It also does not happen with slow traffic (e.g. web page browsing). This is not a new bug with kernel 4.10, but has been going on since at least 4.7 and maybe (probably?) since forever. I'm aware of several others with this configuration (RTL8153 on usb3 behind thunderbolt 3) that have the same issue. This bug is also not specific to Ubuntu; I also get it on Arch Linux. I've also tested and seen this bug with several different models of thunderbolt 3 docks.

Here are the relevant kernel log messages:

Feb 24 16:42:38 ubuntu kernel: xhci_hcd 0000:0e:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Feb 24 16:42:38 ubuntu kernel: xhci_hcd 0000:0e:00.0: Looking for event-dma 00000004777d9010 trb-start 0000000475a14fe0 trb-end 0000000475a14fe0 seg-start 0000000475a14000 seg-end 0000000475a14ff0
Feb 24 16:42:38 ubuntu kernel: xhci_hcd 0000:0e:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Feb 24 16:42:38 ubuntu kernel: xhci_hcd 0000:0e:00.0: Looking for event-dma 00000004777d9020 trb-start 0000000475a14fe0 trb-end 0000000475a14fe0 seg-start 0000000475a14000 seg-end 0000000475a14ff0
Feb 24 16:42:38 ubuntu kernel: xhci_hcd 0000:0e:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Feb 24 16:42:38 ubuntu kernel: xhci_hcd 0000:0e:00.0: Looking for event-dma 00000004777d9030 trb-start 0000000475a14fe0 trb-end 0000000475a14fe0 seg-start 0000000475a14000 seg-end 0000000475a14ff0
Feb 24 16:42:38 ubuntu kernel: xhci_hcd 0000:0e:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Feb 24 16:42:38 ubuntu kernel: xhci_hcd 0000:0e:00.0: Looking for event-dma 00000004777d9040 trb-start 0000000475a14fe0 trb-end 0000000475a14fe0 seg-start 0000000475a14000 seg-end 0000000475a14ff0
Feb 24 16:42:38 ubuntu kernel: xhci_hcd 0000:0e:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Feb 24 16:42:38 ubuntu kernel: xhci_hcd 0000:0e:00.0: Looking for event-dma 00000004777d9050 trb-start 0000000475a14fe0 trb-end 0000000475a14fe0 seg-start 0000000475a14000 seg-end 0000000475a14ff0
Feb 24 16:42:38 ubuntu kernel: xhci_hcd 0000:0e:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Feb 24 16:42:38 ubuntu kernel: xhci_hcd 0000:0e:00.0: Looking for event-dma 00000004777d9060 trb-start 0000000475a14fe0 trb-end 0000000475a14fe0 seg-start 0000000475a14000 seg-end 0000000475a14ff0
Feb 24 16:42:39 ubuntu kernel: xhci_hcd 0000:0e:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Feb 24 16:42:39 ubuntu kernel: xhci_hcd 0000:0e:00.0: Looking for event-dma 00000004777d9070 trb-start 0000000475a14fe0 trb-end 0000000475a14fe0 seg-start 0000000475a14000 seg-end 0000000475a14ff0
Feb 24 16:42:39 ubuntu kernel: xhci_hcd 0000:0e:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Feb 24 16:42:39 ubuntu kernel: xhci_hcd 0000:0e:00.0: Looking for event-dma 00000004777d9080 trb-start 0000000475a14fe0 trb-end 0000000475a14fe0 seg-start 0000000475a14000 seg-end 0000000475a14ff0
Feb 24 16:43:06 ubuntu kernel: r8152 4-1.2:1.0 enx204747f8f471: Tx timeout
Feb 24 16:43:06 ubuntu kernel: r8152 4-1.2:1.0 enx204747f8f471: Tx status -2
Feb 24 16:43:06 ubuntu kernel: r8152 4-1.2:1.0 enx204747f8f471: Tx status -2
Feb 24 16:43:06 ubuntu kernel: r8152 4-1.2:1.0 enx204747f8f471: Tx status -2
Feb 24 16:43:06 ubuntu kernel: r8152 4-1.2:1.0 enx204747f8f471: Tx status -2
Feb 24 16:43:09 ubuntu kernel: usb 4-1.2: reset SuperSpeed USB device number 3 using xhci_hcd
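For anyone comparing their own logs: a minimal Python sketch that parses one of the "Looking for event-dma" lines above and checks whether the reported event address lies inside the seg-start..seg-end range. The regular expression is derived only from the log format shown here, and the function names are made up for illustration. It is not based on the xhci_hcd source.

```python
import re

# Pattern derived from the "Looking for event-dma ..." lines quoted above.
LOOKING_RE = re.compile(
    r"event-dma (?P<event>[0-9a-f]+) "
    r"trb-start (?P<trb_start>[0-9a-f]+) "
    r"trb-end (?P<trb_end>[0-9a-f]+) "
    r"seg-start (?P<seg_start>[0-9a-f]+) "
    r"seg-end (?P<seg_end>[0-9a-f]+)"
)


def parse_looking_line(line):
    """Return the five DMA addresses of one log line as ints, or None if no match."""
    match = LOOKING_RE.search(line)
    if match is None:
        return None
    return {name: int(value, 16) for name, value in match.groupdict().items()}


def event_in_segment(addrs):
    """True if the event address lies within [seg_start, seg_end]."""
    return addrs["seg_start"] <= addrs["event"] <= addrs["seg_end"]


# First "Looking for" line from the log above.
sample = (
    "xhci_hcd 0000:0e:00.0: Looking for event-dma 00000004777d9010 "
    "trb-start 0000000475a14fe0 trb-end 0000000475a14fe0 "
    "seg-start 0000000475a14000 seg-end 0000000475a14ff0"
)
addrs = parse_looking_line(sample)
print("event inside segment:", event_in_segment(addrs))
```

For the sample line the event address is above seg-end, so the check reports that it is outside the segment. That is at least consistent with the "not part of current TD" wording of the error.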

I can't seem to make this bug appear with any other type of USB traffic. I've reported it to the realtek kernel dev team and they don't think their RTL8153 driver (in this case the r8152 module) is to blame, but instead that it's an xhci_hcd issue.

If you look through the dmesg log attached here, you'll see that at 45.967025 I plugged the thunderbolt 3 cable from my dock into my laptop.

ProblemType: Bug
DistroRelease: Ubuntu 17.04
Package: linux-image-4.10.0-8-generic 4.10.0-8.10
ProcVersionSignature: Ubuntu 4.10.0-8.10-generic 4.10.0-rc8
Uname: Linux 4.10.0-8-generic x86_64
ApportVersion: 2.20.4-0ubuntu2
Architecture: amd64
CasperVersion: 1.380
CurrentDesktop: Unity:Unity7
Date: Fri Feb 24 16:53:35 2017
LiveMediaBuild: Ubuntu 17.04 "Zesty Zapus" - Alpha amd64 (20170224)
MachineType: Dell Inc. XPS 15 9550
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/casper/vmlinuz.efi file=/cdrom/preseed/username.seed boot=casper quiet splash ---
RelatedPackageVersions:
 linux-restricted-modules-4.10.0-8-generic N/A
 linux-backports-modules-4.10.0-8-generic N/A
 linux-firmware 1.163
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 12/22/2016
dmi.bios.vendor: Dell Inc.
dmi.bios.version: 1.2.19
dmi.board.name: 0N7TVV
dmi.board.vendor: Dell Inc.
dmi.board.version: A00
dmi.chassis.type: 9
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvr1.2.19:bd12/22/2016:svnDellInc.:pnXPS159550:pvr:rvnDellInc.:rn0N7TVV:rvrA00:cvnDellInc.:ct9:cvr:
dmi.product.name: XPS 15 9550
dmi.sys.vendor: Dell Inc.

l3iggs (l3iggs) wrote :

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
l3iggs (l3iggs) on 2017-02-24
description: updated
Joseph Salisbury (jsalisbury) wrote :

Did this issue start happening after an update/upgrade? Was there a prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.10 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.10

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
l3iggs (l3iggs) wrote :

Hi Joseph. Are you a robot?

I believe I've answered your questions in my bug report (when I wrote that this bug has been going on since forever). Also, this bug report was made with linux-image-4.10.0-8-generic so it seems that your request to test with a newer kernel does not apply.

l3iggs (l3iggs) on 2017-02-25
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
l3iggs (l3iggs) on 2017-02-25
tags: added: kernel-bug-exists-upstream

+1, same configuration, same behavior, 16.10.

A bit more info, as I've gotten hold of a separate usb network card with the same chip in it (Realtek 8153). Plugging it into the ASMedia ASM1042A USB controller found in the TB16 docking station gives the same result as reported here. But plugging it directly into the computer (Intel Sunrise Point-H USB controller) works without problems.

So the problem seems to be in the combination of the Realtek and ASMedia chips.

David Ibarra (dibarra) wrote :

Hey all, just +1'ing this- seeing this on ubuntu 16.10, and Fedora 25 (kernel 4.9). TB16 dock and Dell Precision 5510.

André Düwel (aduewel) wrote :

+1, Dell XPS15 9550 + Dell TB16 + Ubuntu 16.10.

Workaround:
Limiting the connection speed to 100MBit FDX via "ethtool -s eth..... speed 100 duplex full autoneg on" also circumvents the problem.

On Windows 10 it's working without issues at full speed (Gigabit).
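A small Python sketch of the speed-limit workaround, in case someone wants to script it. The helper name is hypothetical, and the interface name is the one from the original report's log, so substitute your own. It only builds and prints the `ethtool -s` command. Actually applying it needs root and the dock attached.

```python
import shlex


def ethtool_limit_speed_cmd(iface, speed_mbps=100):
    """Build the ethtool argv that advertises only the given speed at full duplex."""
    return [
        "ethtool", "-s", iface,
        "speed", str(speed_mbps),
        "duplex", "full",
        "autoneg", "on",
    ]


# Interface name taken from the kernel log in the bug description (an example only).
cmd = ethtool_limit_speed_cmd("enx204747f8f471")
print(shlex.join(cmd))

# To apply it for real (as root), something like:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

The setting does not persist across replugging the dock, so it has to be re-run whenever the interface reappears.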

André Düwel (aduewel) wrote :

Reloading the Realtek kernel module r8152 and restarting network-manager also fixes the problem temporarily:
sudo rmmod r8152
sudo modprobe r8152
sudo service network-manager restart

Kaz Wolfe (kazwolfe) wrote :

Seems related to Bug #1663975. Same problem, I'd think...

Also, just for the sake of completeness, yet another error log: http://pastebin.com/z8U9usDY

4.8.0-41-generic #44~16.04.1-Ubuntu, HWE because reasons. Kernel is tainted (NVIDIA, VirtualBox), but this issue seems to exist anyways.

Posted same comment over on the other bug report, sorry for any spam that may report.

Hordur Heidarsson (hordur-z) wrote :

+1, Dell Precision M5510 + Dell TB16 + Ubuntu 16.10 + 4.8.0-41-generic #44~16.04.1-Ubuntu SMP

@aduewel: thanks for the speed downgrade workaround!

Karlyn Fielding (karlyn) wrote :

I can confirm that I have the same issue being reported here. I have a Dell XPS13 Developer Edition (9360) with Ubuntu 16.04, TB16 Dock and an upgraded Ubuntu Mainline build kernel of 4.10.4.

Additionally, I tried downloading the source code from Realtek for their v2.08.0 r8152 driver. I compiled that driver and manually removed and re-inserted the new kernel module into my running system. With that driver running, I see the exact same behavior described here.

I'd be happy to volunteer for any testing on my hardware that might help debug the issue.

I can also confirm that changing the connection speed to 100Mb with ethtool provides a workaround for the problem.

Also, I have some kernel output I have saved while experiencing the issue if there is interest in it.

l3iggs (l3iggs) wrote :

I'm starting to think this issue stems not from our Realtek RTL8153 Ethernet chip but rather from something upstream of it.

My best guess now is that there's something wrong with the handling of the "ASMedia Technology Inc. ASM1042A USB 3.0 Host Controller" that's in our docks. This is a usb3.0 <--> PCIe bridge which the RTL8153 hangs off of.

Robert Sandberg (srobban) wrote :

Can confirm same issue on Dell Precision 5520 + TB16 running pre-installed Ubuntu 16.04 LTS.

Same issue on KDE Neon with different kernels 4.8.x, 4.10.x

I also experience other USB issues when connecting various devices to the TB16, e.g. all other USB devices freeze.

l3iggs (l3iggs) wrote :

Also, I wonder if this could somehow be a thunderbolt 3 bandwidth allocation issue.
This is pure uneducated speculation though ;-)

l3iggs (l3iggs) wrote :

By the way, removing and reinserting the r8152 module as suggested above does not seem to prevent or work around this issue.

l3iggs (l3iggs) wrote :

Higher transfer rates seem to have some impact here:

wget http://cdimage.ubuntu.com/daily-live/current/zesty-desktop-amd64.iso
errors out in a few seconds

wget --limit-rate=10k http://cdimage.ubuntu.com/daily-live/current/zesty-desktop-amd64.iso
might run for a few 10s of seconds before erroring out

wget --limit-rate=1k http://cdimage.ubuntu.com/daily-live/current/zesty-desktop-amd64.iso
continues to work until I run out of patience (forever?). I've not waited the 18 days required for this to complete though :P
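As a rough check of the "18 days" figure, here is the arithmetic as a short Python sketch. The ISO size of about 1.5 GiB is an assumption (the actual size of that daily image is not stated in this thread), as is treating wget's `k` suffix as 1024 bytes per second.

```python
SECONDS_PER_DAY = 86400


def download_days(size_bytes, rate_bytes_per_s):
    """Days needed to transfer size_bytes at a constant rate."""
    return size_bytes / rate_bytes_per_s / SECONDS_PER_DAY


# Assumption: the zesty desktop ISO is about 1.5 GiB; the thread does not give its size.
ISO_SIZE_BYTES = 1.5 * 1024**3

# Assumption: wget's "k" suffix means 1024 bytes per second.
for label, rate in (("10k", 10 * 1024), ("1k", 1024)):
    days = download_days(ISO_SIZE_BYTES, rate)
    print(f"--limit-rate={label}: about {days:.1f} days")
```

With those assumptions the 1k case comes out at roughly 18 days, which matches the estimate above.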

l3iggs (l3iggs) wrote :

I've attached a trace for the 4.11 kernel.

First
echo xhci-hcd >> /sys/kernel/debug/tracing/set_event

Then initiate network transport to create the bug.

/sys/kernel/debug/tracing/trace (as 4.11.trace.txt)
and
dmesg (as 4.11.dmesg.txt) are attached.

l3iggs (l3iggs) wrote :

4.11.dmesg.txt

Alex Shchagin (qalex) wrote :

+1 here Precision 5510 + TB16 + 16.10 4.8.0-41
However, I don't think this is an r8152 bug either. When I plug a Dell USB-C Ethernet adapter into the TB16, it works fine at full speed with the same r8152 module.

l3iggs (l3iggs) wrote :

Alex, what's the model number of that Dell USB-C Ethernet adapter?

Alex Shchagin (qalex) wrote :

@l3iggs Nothing is written on it, but it seems to be 470-ABQJ. It came with my Precision.

I think the culprit is the USB host controller 'ASM1042A USB 3.0 Host Controller' embedded in the TB16. See here:
> lshw -short
...
/0/100/1d.6/0 bridge DSL6340 Thunderbolt 3 Bridge [Alpine Ridge 2C 2015]
/0/100/1d.6/0/0 bridge DSL6340 Thunderbolt 3 Bridge [Alpine Ridge 2C 2015]
/0/100/1d.6/0/0/0 generic DSL6340 Thunderbolt 3 NHI [Alpine Ridge 2C 2015]
/0/100/1d.6/0/1 bridge DSL6340 Thunderbolt 3 Bridge [Alpine Ridge 2C 2015]
/0/100/1d.6/0/1/0 bridge DSL6540 Thunderbolt 3 Bridge [Alpine Ridge 4C 2015]
/0/100/1d.6/0/1/0/1 bridge DSL6540 Thunderbolt 3 Bridge [Alpine Ridge 4C 2015]
/0/100/1d.6/0/1/0/4 bridge DSL6540 Thunderbolt 3 Bridge [Alpine Ridge 4C 2015]
/0/100/1d.6/0/1/0/4/0 bridge DSL6540 Thunderbolt 3 Bridge [Alpine Ridge 4C 2015]
/0/100/1d.6/0/1/0/4/0/1 bridge DSL6540 Thunderbolt 3 Bridge [Alpine Ridge 4C 2015]
/0/100/1d.6/0/1/0/4/0/1/0 bus ASM1042A USB 3.0 Host Controller
/0/100/1d.6/0/1/0/4/0/1/0/0 usb3 bus xHCI Host Controller
/0/100/1d.6/0/1/0/4/0/1/0/0/1 bus USB2137B
/0/100/1d.6/0/1/0/4/0/1/0/0/1/5 multimedia USB Audio
/0/100/1d.6/0/1/0/4/0/1/0/1 usb4 bus xHCI Host Controller
/0/100/1d.6/0/1/0/4/0/1/0/1/1 bus USB5537B
/0/100/1d.6/0/1/0/4/0/1/0/1/1/2 generic USB 10/100/1000 LAN <<-- EMBEDDED, NOT WORKING
/0/100/1d.6/0/1/0/4/0/4 bridge DSL6540 Thunderbolt 3 Bridge [Alpine Ridge 4C 2015]
/0/100/1d.6/0/1/0/4/0/4/0 bus DSL6540 USB 3.1 Controller [Alpine Ridge]
/0/100/1d.6/0/1/0/4/0/4/0/0 usb5 bus xHCI Host Controller
/0/100/1d.6/0/1/0/4/0/4/0/1 usb6 bus xHCI Host Controller
/0/100/1d.6/0/1/0/4/0/4/0/1/2 generic USB 10/100/1000 LAN <<-- EXTERNAL, WORKING
...

By the way, Dell listed some special driver for Windows at the TB16 page for this ASM controller.
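To make the "different controllers" reasoning explicit, here is a minimal Python sketch that walks up the hardware paths from the lshw listing above and reports which PCI-side USB controller each LAN device hangs off. The input is a hand-copied subset of that listing, and the parsing (first field is the path, the rest is description) and the function names are simplifying assumptions for illustration, not a general lshw parser.

```python
# Subset of the `lshw -short` output above, copied by hand.
LSHW_LINES = """\
/0/100/1d.6/0/1/0/4/0/1/0 bus ASM1042A USB 3.0 Host Controller
/0/100/1d.6/0/1/0/4/0/1/0/1 usb4 bus xHCI Host Controller
/0/100/1d.6/0/1/0/4/0/1/0/1/1 bus USB5537B
/0/100/1d.6/0/1/0/4/0/1/0/1/1/2 generic USB 10/100/1000 LAN
/0/100/1d.6/0/1/0/4/0/4/0 bus DSL6540 USB 3.1 Controller [Alpine Ridge]
/0/100/1d.6/0/1/0/4/0/4/0/1 usb6 bus xHCI Host Controller
/0/100/1d.6/0/1/0/4/0/4/0/1/2 generic USB 10/100/1000 LAN
""".splitlines()


def parse_entries(lines):
    """Map each hardware path (as a tuple of components) to the rest of its line."""
    entries = {}
    for line in lines:
        path, _, rest = line.partition(" ")
        entries[tuple(path.strip("/").split("/"))] = rest
    return entries


def usb_controller_for(entries, device_path):
    """Walk towards the root and return the first ancestor naming a controller,
    skipping the generic 'xHCI Host Controller' root hub entries."""
    for depth in range(len(device_path) - 1, 0, -1):
        desc = entries.get(device_path[:depth])
        if desc and "Controller" in desc and "xHCI" not in desc:
            return desc
    return None


entries = parse_entries(LSHW_LINES)
controllers = {
    "/" + "/".join(path): usb_controller_for(entries, path)
    for path, desc in entries.items()
    if desc.endswith("LAN")
}
for path, controller in controllers.items():
    print(path, "->", controller)
```

On this subset, the embedded LAN device resolves to the ASM1042A and the external adapter resolves to the DSL6540 (Alpine Ridge) USB 3.1 controller.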

Karlyn Fielding (karlyn) wrote :

I can confirm that the issue remains with the latest 4.10.7 mainline kernel build for Ubuntu.

Same specs as before:
Dell XPS13 DE (9360)
TB16 Dock
Ubuntu 16.04 ( installed as shipped from Dell )

imperia (imperia777) wrote :

I have the same problem with my USB 3.1 controller:
ASMedia Technology Inc. ASM1142 USB 3.1 Host Controller.

I pass the controller through to a Xen VM and then connect a USB TV tuner card to it.
I am using VDR, TV software that scans for new channels when it is not in use.

Sometimes after a few hours, sometimes after a few days, it crashes with the following error:
[131382.068144] xhci_hcd 0000:00:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 0 comp_code 3
[131382.068182] xhci_hcd 0000:00:00.0: Looking for event-dma 0000000210196600 trb-start 0000000210196740 trb-end 0000000210196760 seg-start 0000000210196000 seg-end 0000000210196ff0

The same problem is not present with the onboard USB 3.0 controller.
If I pass that one through to the Xen VM, it works without any problems.

So this must be some problem with USB 3.1 driver (not 3.0) or ASMedia firmware.

I can provide whatever information is necessary to fix this bug.
I can provide shell account to my VM also if somebody wants to debug it.

Li Dongyang (dongyang-li) wrote :

Could someone try:
ethtool --offload <eth interface> tx off
ethtool --offload <eth interface> rx off

And then see if it works?

Robert Sandberg (srobban) wrote :

I've tried:
ethtool --offload <eth interface> tx off
ethtool --offload <eth interface> rx off

But the issue remains.

The only workaround that works is to limit speed to 100, as suggested previously.

André Düwel (aduewel) wrote :

Since I upgraded to Ubuntu 17.04 (fresh install), I can confirm that this bug also affects the (now) current release and therefore kernel version 4.10.0-19-generic.

I have now also implemented another "workaround" and bought a 7€ USB3->1Gb Ethernet dongle, which works without issues.

Additional Information:
lsusb
Bus 004 Device 004: ID 0bda:8153 Realtek Semiconductor Corp.
Bus 004 Device 003: ID 0bda:8153 Realtek Semiconductor Corp.
Bus 004 Device 002: ID 0424:5537 Standard Microsystems Corp.
Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 003 Device 007: ID 03f0:094a Hewlett-Packard Optical Mouse [672662-001]
Bus 003 Device 006: ID 046d:c31c Logitech, Inc. Keyboard K120
Bus 003 Device 005: ID 2109:2811 VIA Labs, Inc. Hub
Bus 003 Device 004: ID 2109:2811 VIA Labs, Inc. Hub
Bus 003 Device 003: ID 0bda:4014 Realtek Semiconductor Corp.
Bus 003 Device 002: ID 0424:2137 Standard Microsystems Corp.
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 003: ID 04f3:21d5 Elan Microelectronics Corp.
Bus 001 Device 002: ID 0a5c:6410 Broadcom Corp.
Bus 001 Device 004: ID 0c45:6713 Microdia
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

lshw --short: (see attachment)

André Düwel (aduewel) wrote :

sorry, attached wrong file in last comment. here is the right one

André Düwel (aduewel) wrote :

I need to correct myself: I'm having issues under high load on the USB3 Ethernet adapter, too.

Only workaround is limiting to 100MBit.

Bram Biesbrouck (b-m) wrote :

André,

I followed your advice and bought an inexpensive USB3 1gbit ethernet adapter and noticed the same drops and corruptions as the built-in ethernet port. However, when I plug it in the USB-C port of the dock (using a little USB3 to USB-C cable), everything seems to work correctly.

B.

Alex Shchagin (qalex) wrote :

Bram,

This is because USB3 and USB-C ports in TB16 are connected to different controllers. See my lshw output here - https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1667750/comments/23 - I've marked Ethernet cards with <<--. Working one is USB-C and it is under this one:
/0/100/1d.6/0/1/0/4/0/4 bridge DSL6540 Thunderbolt 3 Bridge [Alpine Ridge 4C 2015]
/0/100/1d.6/0/1/0/4/0/4/0 bus DSL6540 USB 3.1 Controller [Alpine Ridge]

Alex

André Düwel (aduewel) wrote :

Ohh okay, thanks for this advice. I will order an adapter and try it out.

This seems to verify that the problem exists somewhere in the usb3 controller/driver (ASM1042A) in the TB16 and not in the Ethernet controller/driver itself.

Bram Biesbrouck (b-m) wrote :

Ah, cool, didn't know that, thanks!

zwigno (zwigno) wrote :

I have the same issue. I'm using a Dell Precision 5510 with the Dell TB16 Dock. I'm running Ubuntu 17.04 with kernel version 4.10.0-21-generic. What I first noticed is that some SSL-enabled websites failed to load with errors like "SSL_ERROR_BAD_MAC_READ". Setting the speed of the r8152 to 100Mb fixes the issue.

Does anyone have a solution for setting the speed to 100Mb upon plugin of the Thunderbolt connector or when the interface comes up? Setting it at boot time isn't ideal because I don't often have the dock plugged in at first boot.

Zhenfang Wei (kopkop) wrote :

the same issue, xps9360 + ubuntu 16.04 with kernel 4.4.78 + tb16

Mario Limonciello (superm1) wrote :

This is an issue with the host controller. The vendor (ASMedia) has submitted a patch here that fixes the issue:
http://www.spinics.net/lists/linux-usb/msg157958.html

Kai-Heng Feng (kaihengfeng) wrote :

I can confirm the patch works on the TB15 I have on hand. Can you guys try the patched 4.11 kernel [1] on the TB16?

I applied the patch to 4.11 - the patch cannot be cleanly applied to Xenial/Yakkety/Zesty kernel.

I'll do the proper backport once the patch is accepted by the upstream maintainers.

[1] http://people.canonical.com/~khfeng/lp1667750/

André Düwel (aduewel) wrote :

Hi Kai-Heng,

I can confirm your Kernel is working on my XPS 15 9550 + TB16 running Zesty and it fixes the Ethernet issue.

But, the whole TB16 USB3 Controller including Keyboard, Ethernet and other USB devices are still not working when connected during system start. I need to disconnect and reconnect it after booting.

Thanks again! :)

Kai-Heng Feng (kaihengfeng) wrote :

Sounds like another issue. Can you file another bug?


This is a Dell XPS 13 connected to the network via the TB16 dock.
Kernel is: Linux ag13.local 4.12.0-0.rc3.git0.2.fc27.x86_64 #1 SMP Tue May 30 19:36:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Host controller of the dock:
09:00.0 USB controller: ASMedia Technology Inc. ASM1042A USB 3.0 Host Controller

USB network interface in the dock:
/: Bus 04.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 5000M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/7p, 5000M
        |__ Port 2: Dev 3, If 0, Class=Vendor Specific Class, Driver=r8152, 5000M

[32930.573816] usb 4-1.2: new SuperSpeed USB device number 3 using xhci_hcd
[32930.591744] usb 4-1.2: New USB device found, idVendor=0bda, idProduct=8153
[32930.591752] usb 4-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=6
[32930.591757] usb 4-1.2: Product: USB 10/100/1000 LAN
[32930.591761] usb 4-1.2: Manufacturer: Realtek
[32930.591766] usb 4-1.2: SerialNumber: 000001000000
[32930.739428] usb 4-1.2: reset SuperSpeed USB device number 3 using xhci_hcd

I *sometimes* get the following in the log and with that the ethernet port stops working.
Jun 12 19:00:04 ag13.local kernel: xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Jun 12 19:00:04 ag13.local kernel: xhci_hcd 0000:09:00.0: Looking for event-dma 00000001c3eec010 trb-start 00000001c3eebfe0 trb-end 00000001c3eebfe0 seg-start 00000001c3eeb000 seg-end 00000001c3eebff0
Jun 12 19:00:04 ag13.local kernel: xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Jun 12 19:00:04 ag13.local kernel: xhci_hcd 0000:09:00.0: Looking for event-dma 00000001c3eec020 trb-start 00000001c3eebfe0 trb-end 00000001c3eebfe0 seg-start 00000001c3eeb000 seg-end 00000001c3eebff0
Jun 12 19:00:04 ag13.local kernel: xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Jun 12 19:00:04 ag13.local kernel: xhci_hcd 0000:09:00.0: Looking for event-dma 00000001c3eec030 trb-start 00000001c3eebfe0 trb-end 00000001c3eebfe0 seg-start 00000001c3eeb000 seg-end 00000001c3eebff0
Jun 12 19:00:04 ag13.local kernel: xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Jun 12 19:00:04 ag13.local kernel: xhci_hcd 0000:09:00.0: Looking for event-dma 00000001c3eec040 trb-start 00000001c3eebfe0 trb-end 00000001c3eebfe0 seg-start 00000001c3eeb000 seg-end 00000001c3eebff0
Jun 12 19:00:04 ag13.local kernel: xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Jun 12 19:00:04 ag13.local kernel: xhci_hcd 0000:09:00.0: Looking for event-dma 00000001c3eec050 trb-start 00000001c3eebfe0 trb-end 00000001c3eebfe0 seg-start 00000001c3eeb000 seg-end 00000001c3eebff0
Jun 12 19:00:04 ag13.local kernel: xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Jun 12 19:00:04 ag13.local kernel: xhci_hcd 0000:09:00.0: Looking for event-dma 00000001c3eec060 trb-start 00000001c3eebfe0 trb-end 00000001c3eebfe0 seg-start 00000001c3eeb000 seg-end 00000001c3eebff0
Jun 12 19:00:04 ag13.local kernel: xhci_hcd 0000:09...


There is an upstream patch for the ASM1042A host controller[1] that has been reported to help with the issue (see corresponding launchpad issue[2]).

[1] http://www.spinics.net/lists/linux-usb/msg157958.html
[2] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1667750

AceLan Kao (acelankao) on 2017-06-22
tags: added: originate-from-1696057 somerville
Changed in linux (Fedora):
importance: Undecided → Unknown
status: New → Unknown
Changed in linux (Ubuntu):
assignee: nobody → Kai-Heng Feng (kaihengfeng)
description: updated
Seth Forshee (sforshee) on 2017-07-24
Changed in linux (Ubuntu Artful):
status: Confirmed → Fix Committed

After an initial hiccup with the LAN cable in the dock (and plugging it into a different socket), the performance is now much better (not sure if I can say it's perfect, yet) using the patched kernel.
Thanks!

Changed in linux (Ubuntu Xenial):
status: New → In Progress
status: In Progress → Fix Committed
Changed in linux (Ubuntu Zesty):
status: New → Fix Committed
Bram Biesbrouck (b-m) on 2017-08-07
Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
Changed in linux (Ubuntu Xenial):
status: Fix Released → Fix Committed
Changed in linux (Ubuntu Artful):
status: Fix Committed → Fix Released
tags: added: verification-needed-xenial
tags: added: verification-needed-zesty
Corey Schuhen (cschuhen) on 2017-08-17
tags: added: verification-done-zesty
removed: verification-needed-zesty
tags: added: verification-done-xenial
removed: verification-needed-xenial

For future reference, the mentioned patch got merged upstream, as commit 9da5a1092b13468839b1a864b126cacfb72ad016
It also made it into stable, 4.12.4 I believe, as 5cc9b698a494827b15f74ef70a31d7911d00e52a

So I think this should be fixed (or at least better) in F26, because we currently ship 4.12.5-300.fc26.x86_64

(In reply to Christian Kellner from comment #5)
> For future reference, the mentioned patch got merged upstream, as commit
> 9da5a1092b13468839b1a864b126cacfb72ad016
> It also made it into stable, 4.12.4 I believe, as
> 5cc9b698a494827b15f74ef70a31d7911d00e52a
>
> So I think this should be fixed (or at least better) in F26, because we
> currently ship 4.12.5-300.fc26.x86_64

The network works, but sadly it corrupts packets. Martin says that because of this he has difficulty downloading things and connecting to services.

@Jiri,

Are you sure that's a result of this patch? This is the first report i've heard of that.

@Mario, I think what Jiri means is that without the patch it doesn't work well at all, but even with the patch the situation is not perfect. Let me cc Benjamin; maybe we can add a test for that to our Fedora hardware test suite. We still have the TB16 dock in Munich right now, so maybe we can be of help.

I'll let Martin speak for himself because it was him who complained about it to me.
I've been using kernel 4.12.8 which should have the patch included since the morning and haven't experienced any noticeable problems with the network.

Yes, for me, the Ethernet on the Docks is pretty broken. For example, when downloading a whole Koji build with about 13 packages, each time the download got broken at about 4th or 5th package, with (I think) a SSL handshake error. Also when downloading a Fedora ISO 4 times in a row, each of them got corrupted (md5 check just didn't pass).

Also, the USB performance of the dock is terrible, I'm not sure if this is related to the issue the patch in question is supposed to solve but after updating the laptop firmware to 1.2.1.0, my mouse and keyboard get disconnected very often. On the other hand, dock audio works just fine and one would assume all of these devices are on the same USB hub.

I'm currently working around this by plugging a USB-C adapter with ethernet into the Thunderbolt port on the docking station.

Martin, could you maybe try disabling RX checksum offloading and see if that helps? Then the corrupted packets should be discarded by the kernel (even if they are only corrupted during the transfer over USB). i.e. try again after running:

  ethtool --offload $DEVICE rx off

@Martin

Just to make sure - this is a TB16 not TB15 right? This is sounding suspiciously like a hardware problem to me.

(In reply to Mario Limonciello from comment #12)
> @Martin
>
> Just to make sure - this is a TB16 not TB15 right? This is sounding
> suspiciously like a hardware problem to me.

It's TB16.
You mean the ethernet or USB problem? I think we've started mixing two (most likely) unrelated problems. I have not been able to reproduce the ethernet problem for the whole day. Martin also has Windows 10 installed on his XPS 13, so he could try it there and if the problem still occurs it's very likely a hardware problem.

The USB one doesn't seem like a hardware problem because I'm affected by that, too, after the last firmware update. Devices connected to the USB ports don't work at all or just for a short period of time after they're plugged in.

Well i'm not sure if they're related, but since the Ethernet device is a USB device on the hub, I would suspect them to be.

Can you please clarify which XPS machines you guys are seeing this on? There are at least 4 different XPS models that support the TB16.
Please also mention your last working and last failing BIOS versions.

We both have XPS 13 9360. I had problems with Ethernet from the very beginning until I used a patched kernel. But after updating the firmware to 1.3.7 USB devices stopped working*. Now we're on 2.1.0 and they still don't work, no matter if we use the kernel patch or not. I have to have a USB hub connected directly to the laptop. The last working firmware for me was 1.3.5.

* It really depends on the type of the device. The mouse and keyboard don't work at all or just for a very short time after plugging in. I also have a USB sound card. It seems to work, the system identifies the sound card as an audio output, it plays sound, but there are audible corruptions (cracks etc) which don't occur when the sound card is connected directly to the laptop. What I'm experiencing with sound may be similar to what Martin is experiencing with the Ethernet.

Ah OK thanks. I just poked around the Dell forums a little bit and you guys aren't the first ones reporting this on 9360 after upgrade.

http://en.community.dell.com/support-forums/laptop/f/3518/t/20017063?pi41097=1

I'll poke some of the Dell support guys to look at this, it sounds like it might have slipped through the cracks.

I also checked internally on what went into 1.3.6/1.3.7.
At least 1.3.6 had some tweaks for addressing noise, which would be the most suspicious to me as a possible cause.

For now, can you two downgrade to 1.3.5? Fwupd probably won't let you, but you can place the .EXE file on a FAT32 partition and do it from F12 menu at POST I expect.

We'll try to downgrade for the time being. BTW I also reported the issue to @DellCaresPRO like Barton George instructed me on Twitter. They said 10 days ago they had people looking into it, but there hasn't been any update since then, so I have no idea if someone is really looking into it and if they've made any progress, and who is "they".

I won't be able to shortcut the process by pinging people, but I understand this is being investigated, it will just take some time.

(In reply to Benjamin Berg from comment #11)
> Martin, could you maybe try disabling RX checksum offloading and see if that
> helps? Then the corrupted packets should be discarded by the kernel (even
> if they are only corrupted during the transfer over USB). i.e. try again
> after running:
>
> ethtool --offload $DEVICE rx off

With this, it seems to work alright, thanks! Kernel 4.13.0-0.rc5.git1.1.fc27.x86_64 BTW.

(In reply to Mario Limonciello from comment #16)
> For now, can you two downgrade to 1.3.5? Fwupd probably won't let you, but
> you can place the .EXE file on a FAT32 partition and do it from F12 menu at
> POST I expect.

I'm able to function this way so I'll probably not go for that - unless it'll be necessary to verify it actually happened between the mentioned versions.
I'd rather track if there's a new release and then upgrade when it's out and see if it fixes the USB problem.

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
Changed in linux (Ubuntu Zesty):
status: Fix Committed → Fix Released

Every now and then (especially when downloading large files), the ethernet simply stops working with the following log in dmesg.
Unloading the r8152 module results in gnome-shell dying. After reloading it, ethernet still doesn't work. Disconnecting the Dock in this state kills everything from GDM down to my user session.

[159642.248648] pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
[159642.248666] pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e0(Transmitter ID)
[159642.248680] pcieport 0000:00:1c.0: device [8086:9d10] error status/mask=00001000/00002000
[159642.248690] pcieport 0000:00:1c.0: [12] Replay Timer Timeout
[159661.087306] xhci_hcd 0000:0a:00.0: port 1 resume PLC timeout
[159667.687492] xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
[159667.687514] xhci_hcd 0000:09:00.0: Looking for event-dma 00000004694bc010 trb-start 00000003a0cfefe0 trb-end 00000003a0cfefe0 seg-start 00000003a0cfe000 seg-end 00000003a0cfeff0
[159667.687610] xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
[159667.687627] xhci_hcd 0000:09:00.0: Looking for event-dma 00000004694bc020 trb-start 00000003a0cfefe0 trb-end 00000003a0cfefe0 seg-start 00000003a0cfe000 seg-end 00000003a0cfeff0
[159667.687722] xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
[159667.687735] xhci_hcd 0000:09:00.0: Looking for event-dma 00000004694bc030 trb-start 00000003a0cfefe0 trb-end 00000003a0cfefe0 seg-start 00000003a0cfe000 seg-end 00000003a0cfeff0
[159667.687829] xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
[159667.687838] xhci_hcd 0000:09:00.0: Looking for event-dma 00000004694bc040 trb-start 00000003a0cfefe0 trb-end 00000003a0cfefe0 seg-start 00000003a0cfe000 seg-end 00000003a0cfeff0
[159667.687971] xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
[159667.687988] xhci_hcd 0000:09:00.0: Looking for event-dma 00000004694bc050 trb-start 00000003a0cfefe0 trb-end 00000003a0cfefe0 seg-start 00000003a0cfe000 seg-end 00000003a0cfeff0
[159667.723135] xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
[159667.723158] xhci_hcd 0000:09:00.0: Looking for event-dma 00000004694bc060 trb-start 00000003a0cfefe0 trb-end 00000003a0cfefe0 seg-start 00000003a0cfe000 seg-end 00000003a0cfeff0
[159667.723202] xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
[159667.723219] xhci_hcd 0000:09:00.0: Looking for event-dma 00000004694bc070 trb-start 00000003a0cfefe0 trb-end 00000003a0cfefe0 ...


As I understand it, the problem introduced in BIOS 1.3.6/1.3.7 comes from an adjustment to a voltage regulator (made to fix something else; this was an unanticipated, undiscovered regression). For now I would recommend downgrading to 1.3.5 until a fixed BIOS is issued.

Mario Limonciello (superm1) wrote :

FYI there are two separate issues. The first is the poor performance of the ethernet on the TB16. That's the original reason this bug was opened and has been fixed in kernel upgrades.

There is a second issue: a BIOS update causes problems with USB on the TB16 (such as corrupted packets). Dell support is aware of it and a new BIOS is on its way out very soon to resolve it. You can track the status of that with Dell here:
http://en.community.dell.com/techcenter/os-applications/f/4613/p/20018487/21017133#21017133

Corey Schuhen (cschuhen) wrote :

I do not see the same behaviour as Johann:

cschuhen@loriel:/tmp$ for i in 1 2 3 4; do curl -s http://de.releases.ubuntu.com/zesty/ubuntu-17.04-server-amd64.img -o $i.iso; md5sum $i.iso; done
4672ce371fb3c1170a9e71bc4b2810b9 1.iso
4672ce371fb3c1170a9e71bc4b2810b9 2.iso
4672ce371fb3c1170a9e71bc4b2810b9 3.iso
4672ce371fb3c1170a9e71bc4b2810b9 4.iso
cschuhen@loriel:/tmp$ uname -a
Linux loriel 4.10.0-33-generic #37-Ubuntu SMP Fri Aug 11 10:55:28 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

This is using the port on TB16.

André Düwel (aduewel) wrote :

Thanks @Mario for this clarification. I recently updated my BIOS, too. Since then, under Linux my mouse and keyboard (connected to the USB ports of the TB16) stop working from time to time. In Windows the mouse and keyboard are just sometimes laggy but don't stop completely. I will try to downgrade the BIOS today to an earlier version.

André Düwel (aduewel) wrote :

So instead of downgrading my BIOS, I've updated it from 1.2.29 to version 1.3.0, which was released two days ago by Dell for the XPS 15 9550. The USB ports on the TB16 are now working without issues again.

I don't get any checksum errors either:
for i in 1 2 3 4; do curl -s http://de.releases.ubuntu.com/zesty/ubuntu-17.04-server-amd64.img -o $i.iso; md5sum $i.iso; done
4672ce371fb3c1170a9e71bc4b2810b9 1.iso
4672ce371fb3c1170a9e71bc4b2810b9 2.iso
4672ce371fb3c1170a9e71bc4b2810b9 3.iso
4672ce371fb3c1170a9e71bc4b2810b9 4.iso


It got really annoying lately. How do I downgrade to 1.3.5, please? I can't find it on the Dell website and fwupd doesn't provide anything either.

Lance Parsons (lparsons) wrote :

I had similar issues with checksum errors on Precision 5520. Updating to the recently released BIOS version 1.4 has resolved those issues. Finally, all is well with Ubuntu, TB16, and Precision 5520. Thanks all.


Running kernel-4.13.0-1.fc27.x86_64.

BIOS 2.2.1 finally hit the Dell website. I can confirm that with this, the USB overall experience is now much much better (except the occasional mouse stutter but that may as well be on the OS side). There seems to be no problem at all with the dock Ethernet adapter.

On 4.13.4-300.fc27.x86_64, I still experience the SSL errors when downloading larger amounts of data, like git repositories and such. It gets fixed after disabling RX checksum offloading with the ethtool command you provided before.

Georgi Boiko (pandasauce) wrote :

I have the same checksum issues on a Precision 5520 with BIOS 1.4 using the TB16. This is on Ubuntu 16.04.3.

$ for i in 1 2 3 4; do curl -s de.releases.ubuntu.com/.../ubuntu-17.04-server-amd64.img -o $i.iso; md5sum $i.iso; done
116b2649ec67507736517eb503e43fbf 1.iso
fd81a7fda3fcf5a7cbf313c7e54fcc06 2.iso
4672ce371fb3c1170a9e71bc4b2810b9 3.iso
4672ce371fb3c1170a9e71bc4b2810b9 4.iso

BIOS 1.3.2:

$ for i in 1 2 3 4; do curl -s de.releases.ubuntu.com/.../ubuntu-17.04-server-amd64.img -o $i.iso; md5sum $i.iso; done
4672ce371fb3c1170a9e71bc4b2810b9 1.iso
00683eb3f831c1179b242738131ddabd 2.iso
4672ce371fb3c1170a9e71bc4b2810b9 3.iso
4672ce371fb3c1170a9e71bc4b2810b9 4.iso
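To summarise a run like the ones above without comparing digests by eye, a small helper can count how many distinct digests the downloads produced. This is only a sketch: `distinct_digests` is a made-up name, and it simply assumes `md5sum`-style input (digest in the first column) on stdin.

```shell
# Count the distinct digests in md5sum/sha1sum-style output read from stdin.
# A result of 1 means every copy matched; anything higher means at least
# one download was corrupted.
distinct_digests() {
    awk 'NF { print $1 }' | sort -u | awk 'END { print NR }'
}

# Example, after the download loop has written 1.iso to 4.iso:
#   md5sum 1.iso 2.iso 3.iso 4.iso | distinct_digests
```

Fed with the four BIOS 1.4 lines above it reports 3, and with the BIOS 1.3.2 lines it reports 2. It does not say which digest is the correct one; that still has to be checked against the published checksum.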

Luis Alvarado (luisalvarado) wrote :

If I can add: the Logitech G930 headset gets a similar error just before it drops its connection (it uses a USB dongle that connects to the headset wirelessly, over its own 2.4 GHz link rather than Bluetooth):

[12278.974880] perf: interrupt took too long (2512 > 2500), lowering kernel.perf_event_max_sample_rate to 79500
[16639.737569] perf: interrupt took too long (3165 > 3140), lowering kernel.perf_event_max_sample_rate to 63000
[19369.252915] usb 1-5: USB disconnect, device number 3
[19369.254678] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[19369.254685] xhci_hcd 0000:00:14.0: Looking for event-dma 000000101b3d3b60 trb-start 000000101b3d3b70 trb-end 000000101b3d3b70 seg-start 000000101b3d3000 seg-end 000000101b3d3ff0
[19369.255673] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[19369.255679] xhci_hcd 0000:00:14.0: Looking for event-dma 000000101b3d3b70 trb-start 000000101b3d3b80 trb-end 000000101b3d3b80 seg-start 000000101b3d3000 seg-end 000000101b3d3ff0
[19369.256674] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[19369.256681] xhci_hcd 0000:00:14.0: Looking for event-dma 000000101b3d3b80 trb-start 000000101b3d3b90 trb-end 000000101b3d3b90 seg-start 000000101b3d3000 seg-end 000000101b3d3ff0
[19369.257672] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[19369.257677] xhci_hcd 0000:00:14.0: Looking for event-dma 000000101b3d3b90 trb-start 000000101b3d3ba0 trb-end 000000101b3d3ba0 seg-start 000000101b3d3000 seg-end 000000101b3d3ff0
[19374.558284] usb 1-7: new full-speed USB device number 8 using xhci_hcd
[19375.274480] usb 1-7: New USB device found, idVendor=046d, idProduct=0a1f
[19375.274483] usb 1-7: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[19375.274485] usb 1-7: Product: Logitech G930 Headset
[19375.274487] usb 1-7: Manufacturer: Logitech
[19375.292170] input: Logitech Logitech G930 Headset as /devices/pci0000:00/0000:00:14.0/usb1/1-7/1-7:1.3/0003:046D:0A1F.0008/input/input23
[19375.354891] hid-generic 0003:046D:0A1F.0008: input,hiddev1,hidraw1: USB HID v1.01 Device [Logitech Logitech G930 Headset] on usb-0000:00:14.0-7/input3
[22613.309243] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
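When comparing logs like this one with the TB16 logs earlier in the thread, it can help to check where `event-dma` falls relative to the other addresses on the "Looking for" line. The sketch below is based on my reading of that message (`trb-start`/`trb-end` as the first and last TRB of the current TD, `seg-start`/`seg-end` as the bounds of the ring segment), so treat the interpretation as an assumption. It also assumes the TD does not wrap around the end of the segment. `classify_event_dma` is a made-up helper name.

```shell
# classify_event_dma EVENT_DMA TRB_START TRB_END SEG_START SEG_END
# All arguments are hex strings without a 0x prefix, as printed by the kernel.
classify_event_dma() {
    ev=$((0x$1)); ts=$((0x$2)); te=$((0x$3)); ss=$((0x$4)); se=$((0x$5))
    if [ "$ev" -lt "$ss" ] || [ "$ev" -gt "$se" ]; then
        # The reported address is not in this ring segment at all.
        echo "outside-segment"
    elif [ "$ev" -ge "$ts" ] && [ "$ev" -le "$te" ]; then
        echo "inside-td"
    else
        # In the segment but not in the current TD; show the byte offset
        # from trb-start (negative means the event is behind the TD).
        echo "in-segment offset=$((ev - ts))"
    fi
}

# Example with the first G930 line above:
#   classify_event_dma 000000101b3d3b60 000000101b3d3b70 000000101b3d3b70 \
#                      000000101b3d3000 000000101b3d3ff0
```

With the values posted in this thread, the G930 lines come out as in-segment with an offset of -16 bytes (the size of one TRB), while the TB16 lines come out as outside-segment. That difference is at least consistent with these being two separate problems that happen to print the same error.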

Changed in linux (Fedora):
importance: Unknown → Undecided
status: Unknown → Confirmed
Luis Alvarado (luisalvarado) wrote :

This bug is still present in the following kernels I have tested:

4.13.7
4.13.8
4.13.9
4.13.10
4.14-RC6

The error that typically shows is:

[12115.066777] retire_capture_urb: 23 callbacks suppressed
[12183.003052] usb 1-7: USB disconnect, device number 3
[12183.004602] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[12183.004604] xhci_hcd 0000:00:14.0: Looking for event-dma 00000010243ef7d0 trb-start 00000010243ef7e0 trb-end 00000010243ef7e0 seg-start 00000010243ef000 seg-end 00000010243efff0
[12183.005603] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[12183.005605] xhci_hcd 0000:00:14.0: Looking for event-dma 00000010243ef7e0 trb-start 00000010243ef7f0 trb-end 00000010243ef7f0 seg-start 00000010243ef000 seg-end 00000010243efff0
[12183.006602] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[12183.006604] xhci_hcd 0000:00:14.0: Looking for event-dma 00000010243ef7f0 trb-start 00000010243ef800 trb-end 00000010243ef800 seg-start 00000010243ef000 seg-end 00000010243efff0
[12183.007603] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[12183.007606] xhci_hcd 0000:00:14.0: Looking for event-dma 00000010243ef800 trb-start 00000010243ef810 trb-end 00000010243ef810 seg-start 00000010243ef000 seg-end 00000010243efff0
[12183.008601] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[12183.008602] xhci_hcd 0000:00:14.0: Looking for event-dma 00000010243ef810 trb-start 00000010243ef820 trb-end 00000010243ef820 seg-start 00000010243ef000 seg-end 00000010243efff0
[12183.009599] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[12183.009600] xhci_hcd 0000:00:14.0: Looking for event-dma 00000010243ef820 trb-start 00000010243ef830 trb-end 00000010243ef830 seg-start 00000010243ef000 seg-end 00000010243efff0
[12183.010626] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[12183.010628] xhci_hcd 0000:00:14.0: Looking for event-dma 00000010243ef830 trb-start 00000010243ef840 trb-end 00000010243ef840 seg-start 00000010243ef000 seg-end 00000010243efff0
[12186.007322] usb 1-4: new full-speed USB device number 7 using xhci_hcd
[12186.723745] usb 1-4: New USB device found, idVendor=046d, idProduct=0a1f
[12186.723746] usb 1-4: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[12186.723747] usb 1-4: Product: Logitech G930 Headset
[12186.723747] usb 1-4: Manufacturer: Logitech
[12186.740338] input: Logitech Logitech G930 Headset as /devices/pci0000:00/0000:00:14.0/usb1/1-4/1-4:1.3/0003:046D:0A1F.0008/input/input22
[12186.799495] hid-generic 0003:046D:0A1F.0008: input,hiddev1,hidraw3: USB HID v1.01 Device [Logitech Logitech G930 Headset] on usb-0000:00:14.0-4/input3

My hardware is:

 sudo lshw -sanitize -numeric
computer
    description: Desktop Computer
    product: System Product Name (SKU)
    vendor: System manufacturer
    version: System Version
    serial: [REMOVED]
  ...

Changed in hwe-next:
status: New → Fix Released

This is happening even on my 9560 with 4.13.9 vanilla; when running a background rsync backup job, packages downloaded in a Debian docker build frequently do not match their checksum and need multiple runs to succeed.

And just to illustrate my point, on 4.14.0 vanilla:

while true; do
dd if=/nfsmount/debian-live-9.1.0-amd64-xfce+nonfree.iso bs=16M iflag=direct 2>/dev/null | sha1sum; done

With rx offload on (default):

489ed92b17aa9a4582899356d3123621b5d92189 -
742462292c76189f63fc3e7af1acc9dec56c0a8d -
f11ba5f624dbab5a52319801c28a7032cc9b5100 -
742462292c76189f63fc3e7af1acc9dec56c0a8d -
e925ff013c99a1b732a99aeaf5d3f1f02c8dfa40 -

With rx offload off:

742462292c76189f63fc3e7af1acc9dec56c0a8d -
742462292c76189f63fc3e7af1acc9dec56c0a8d -
742462292c76189f63fc3e7af1acc9dec56c0a8d -
742462292c76189f63fc3e7af1acc9dec56c0a8d -
742462292c76189f63fc3e7af1acc9dec56c0a8d -

Applied to 4.14.14. Offload:

tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
rx-vlan-offload: on
tx-vlan-offload: on

dd | sha1sum loop:

742462292c76189f63fc3e7af1acc9dec56c0a8d -
742462292c76189f63fc3e7af1acc9dec56c0a8d -
742462292c76189f63fc3e7af1acc9dec56c0a8d -
742462292c76189f63fc3e7af1acc9dec56c0a8d -
742462292c76189f63fc3e7af1acc9dec56c0a8d -
742462292c76189f63fc3e7af1acc9dec56c0a8d -

Ran for 10 minutes, so it looks like that patch works (doing around 90 Mbit/s of traffic).

Luis Alvarado (luisalvarado) wrote :

If this helps, this is happening in Ubuntu 17.10 with the 4.13 and 4.15 Kernels. I am using a Logitech G930

This was the output with dmesg

[ 15.303655] logitech-hidpp-device 0003:046D:4069.000A: HID++ 4.5 device connected.
[ 49.671534] usb 3-2: USB disconnect, device number 2
[ 49.672965] xhci_hcd 0000:07:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[ 49.672975] xhci_hcd 0000:07:00.0: Looking for event-dma 000000101f51eec0 trb-start 000000101f51eed0 trb-end 000000101f51eed0 seg-start 000000101f51e000 seg-end 000000101f51eff0
[ 49.673966] xhci_hcd 0000:07:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[ 49.673980] xhci_hcd 0000:07:00.0: Looking for event-dma 000000101f51eed0 trb-start 000000101f51eee0 trb-end 000000101f51eee0 seg-start 000000101f51e000 seg-end 000000101f51eff0
[ 49.674932] xhci_hcd 0000:07:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[ 49.674943] xhci_hcd 0000:07:00.0: Looking for event-dma 000000101f51eee0 trb-start 000000101f51eef0 trb-end 000000101f51eef0 seg-start 000000101f51e000 seg-end 000000101f51eff0
[ 49.675952] xhci_hcd 0000:07:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[ 49.675962] xhci_hcd 0000:07:00.0: Looking for event-dma 000000101f51eef0 trb-start 000000101f51ef00 trb-end 000000101f51ef00 seg-start 000000101f51e000 seg-end 000000101f51eff0
[ 49.676939] xhci_hcd 0000:07:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
[ 49.676947] xhci_hcd 0000:07:00.0: Looking for event-dma 000000101f51ef00 trb-start 000000101f51ef10 trb-end 000000101f51ef10 seg-start 000000101f51e000 seg-end 000000101f51eff0
[ 52.527220] usb 1-3: new full-speed USB device number 7 using xhci_hcd
[ 53.240955] usb 1-3: New USB device found, idVendor=046d, idProduct=0a1f
[ 53.240960] usb 1-3: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[ 53.240963] usb 1-3: Product: Logitech G930 Headset
[ 53.240965] usb 1-3: Manufacturer: Logitech
[ 53.259535] input: Logitech Logitech G930 Headset as /devices/pci0000:00/0000:00:14.0/usb1/1-3/1-3:1.3/0003:046D:0A1F.000C/input/input24
[ 53.316838] hid-generic 0003:046D:0A1F.000C: input,hiddev0,hidraw0: USB HID v1.01 Device [Logitech Logitech G930 Headset] on usb-0000:00:14.0-3/input3

Georgi Boiko (pandasauce) wrote :

Update to my October post:

Dell Precision 5520 and BIOS 1.7 using TB16. This is on Ubuntu 16.04.3, kernel 4.13.0

The issue is still present. I tried limiting the bandwidth using `ethtool -s eth0 speed 100 duplex full autoneg on` and also as described in this blog post: http://mark.koli.ch/slowdown-throttle-bandwidth-linux-network-interface and it *seems* to be making the issue less apparent, but still present.

$ for i in 1 2 3 4; do curl -s http://old-releases.ubuntu.com/releases/17.04/ubuntu-17.04-server-amd64.img -o $i.iso; md5sum $i.iso; done
2641b55ed2e203861fb6f642bb05b8f7 1.iso
63f41e8b8e4e5ad1909637dbd2efd849 2.iso
^C%

$ sudo ethtool -s eth0 speed 100 duplex full autoneg on

$ for i in 1 2 3 4; do curl -s http://old-releases.ubuntu.com/releases/17.04/ubuntu-17.04-server-amd64.img -o $i.iso; md5sum $i.iso; done
4672ce371fb3c1170a9e71bc4b2810b9 1.iso
4672ce371fb3c1170a9e71bc4b2810b9 2.iso
4672ce371fb3c1170a9e71bc4b2810b9 3.iso
4672ce371fb3c1170a9e71bc4b2810b9 4.iso

$ for i in 1 2 3 4; do curl -s http://old-releases.ubuntu.com/releases/17.04/ubuntu-17.04-server-amd64.img -o $i.iso; md5sum $i.iso; done
ed13e9c6c45f027f686000eccce42254 1.iso
4672ce371fb3c1170a9e71bc4b2810b9 2.iso
^C%

Next, I tried disabling offloading as described above. I haven't reset the device to 1 Gbps before doing so. It seems to be working fine so far. I will leave it running for an hour over lunch today to be completely sure.

$ sudo ethtool --offload eth0 tx off
Actual changes:
tx-checksumming: off
    tx-checksum-ipv4: off
    tx-checksum-ipv6: off
tcp-segmentation-offload: off
    tx-tcp-segmentation: off [requested on]
    tx-tcp6-segmentation: off [requested on]

$ sudo ethtool --offload eth0 rx off

$ for i in 1 2 3 4 5 6; do curl -s http://old-releases.ubuntu.com/releases/17.04/ubuntu-17.04-server-amd64.img -o $i.iso; md5sum $i.iso; done
4672ce371fb3c1170a9e71bc4b2810b9 1.iso
4672ce371fb3c1170a9e71bc4b2810b9 2.iso
4672ce371fb3c1170a9e71bc4b2810b9 3.iso
4672ce371fb3c1170a9e71bc4b2810b9 4.iso
4672ce371fb3c1170a9e71bc4b2810b9 5.iso
4672ce371fb3c1170a9e71bc4b2810b9 6.iso

In about a week I will be able to test this on a 2017 XPS 9560 (non-DE) too.

Kai-Heng Feng (kaihengfeng) wrote :

It's another bug. Please refer to LP: #1729674.
