Renesas / NEC - µPD720202 needs XHCI_TRUST_TX_LENGTH quirk? Logitech C920 Webcam/Zoneminder/Startech

Bug #1710548 reported by Jeffrey Miller on 2017-08-14
This bug affects 3 people
Affects: linux (Ubuntu)
Status: Confirmed
Importance: Medium
Assigned to: Unassigned

Bug Description

Building a zoneminder server. I think I stumbled on a kernel bug.

The suggestion is that the popular Renesas / NEC - µPD720202 may need a "quirk" in the kernel.

Wiped the drive.
Loaded 17.04 Server.
Now getting hundreds of messages in DMESG complaining of thousands of errors:
[96415.044917] handle_tx_event: 516 callbacks suppressed
[96415.044924] xhci_hcd 0000:0c:00.0: WARN Successful completion on short TX: needs XHCI_TRUST_TX_LENGTH quirk?
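
For anyone who wants to count these warnings per controller rather than scroll through them, here is a minimal sketch in C. It assumes only the line format shown above; the function name parse_trust_tx_warning is made up for illustration and is not part of any kernel or distro tool.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if `line` is the xhci short-transfer
 * warning and copies the PCI address (e.g. "0000:0c:00.0") into `addr`.
 * Returns 0 for any other line, or if `addr` is too small.
 */
int parse_trust_tx_warning(const char *line, char *addr, size_t addr_len)
{
    static const char marker[] = "needs XHCI_TRUST_TX_LENGTH quirk?";
    static const char driver[] = "xhci_hcd ";
    const char *start, *end;
    size_t n;

    assert(line != NULL && addr != NULL && addr_len > 0);

    /* Only the quirk warning is of interest. */
    if (strstr(line, marker) == NULL)
        return 0;

    /* The PCI address follows the driver name... */
    start = strstr(line, driver);
    if (start == NULL)
        return 0;
    start += strlen(driver);

    /* ...and ends at the first colon that is followed by a space. */
    end = strstr(start, ": ");
    if (end == NULL)
        return 0;

    n = (size_t)(end - start);
    if (n >= addr_len)
        return 0;

    memcpy(addr, start, n);
    addr[n] = '\0';
    return 1;
}
```

Feeding each DMESG line through this and tallying by address would show whether every warning comes from the add-in card or from some other controller as well.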

Messages suggest a "quirk" is the solution.
a) Researched USB 3.0 "quirk".
b) Discovered it's really a "thing".
c) Couldn't find a recipe for patching the kernel with the NEC/Renesas "quirk" fix.

[EDIT]: Note the C920 webcam is USB 2.0.
Note the DMESG errors disappear when the cam is plugged into the motherboard USB 2.0 ports.

Anecdotes suggest NEC/Renesas firmware is to blame.
a) Kernel indicates 2024 firmware.
b) Startech website has 2026 firmware.
c) SIDEBAR: The Windows-centric GUI instructions do not mention converting decimal to hex. So if your Startech is on channels 9, 10, 11, and 12 from the perspective of Windows, you must translate these into 09, 0A, 0B, and 0C when editing the runfile so that the firmware update reaches all 4 controllers. Yes, for the 4-channel controller, count them: 4 update iterations.
d) Firmware updated.
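
The decimal-to-hex translation in the sidebar can be sketched in a few lines of C. This is only an illustration of the number conversion; the function name to_hex_byte is hypothetical, and nothing here is taken from the Startech updater itself.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: writes the two-digit uppercase hex form of a
 * decimal number (0-255) into `out`, e.g. 10 becomes "0A".
 */
void to_hex_byte(unsigned int n, char out[3])
{
    assert(n <= 0xFF);
    /* %02X pads to two digits and uses uppercase A-F. */
    snprintf(out, 3, "%02X", n);
}
```

Calling it for 9, 10, 11, and 12 gives 09, 0A, 0B, and 0C, the same values listed in the sidebar.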

Rebooted with the new Startech/NEC/Renesas 2026 firmware.
SAME OLD PLETHORA OF ERRORS

I give up.

It may well be worth noting that there are no errors from DMESG until some seconds after the first webcam is attached to the Startech while zoneminder is running.

I think the attached file represents DMESG immediately before the errors start going nuts. My theory is that ZM polls lazily for missing cams, but once it gets hold of a cam and starts sucking data, the quirk bug surfaces.

ProblemType: Bug
DistroRelease: Ubuntu 17.04,17.10,18.04
Package: linux-image-4.10.0-32-generic 4.10.0-32.36 also in 4.13 and 4.15
ProcVersionSignature: Ubuntu 4.10.0-32.36-generic 4.10.17
Uname: Linux 4.10.0-32-generic x86_64
ApportVersion: 2.20.4-0ubuntu4.5
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: jeff 12776 F.... pulseaudio
 /dev/snd/controlC3: jeff 12776 F.... pulseaudio
 /dev/snd/controlC2: jeff 12776 F.... pulseaudio
 /dev/snd/controlC1: jeff 12776 F.... pulseaudio
Date: Sun Aug 13 16:48:41 2017
MachineType: Supermicro X9SRE/X9SRE-3F/X9SRi/X9SRi-3F
ProcEnviron:
 LANGUAGE=en_US
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB:
 0 mgadrmfb
 1 nouveaufb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.10.0-32-generic root=UUID=4f802241-4254-4d61-9231-d7e16de36a3d ro quiet splash vt.handoff=7
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
 linux-restricted-modules-4.10.0-32-generic N/A
 linux-backports-modules-4.10.0-32-generic N/A
 linux-firmware 1.164.1
RfKill:

SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 08/31/2015
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 3.2a
dmi.board.asset.tag: To be filled by O.E.M.
dmi.board.name: X9SRE/X9SRE-3F/X9SRi/X9SRi-3F
dmi.board.vendor: Supermicro
dmi.board.version: 0123456789
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 17
dmi.chassis.vendor: Supermicro
dmi.chassis.version: 0123456789
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr3.2a:bd08/31/2015:svnSupermicro:pnX9SRE/X9SRE-3F/X9SRi/X9SRi-3F:pvr0123456789:rvnSupermicro:rnX9SRE/X9SRE-3F/X9SRi/X9SRi-3F:rvr0123456789:cvnSupermicro:ct17:cvr0123456789:
dmi.product.name: X9SRE/X9SRE-3F/X9SRi/X9SRi-3F
dmi.product.version: 0123456789
dmi.sys.vendor: Supermicro

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.13 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.13-rc4

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete

Thank you, Joseph!

I did test upstream at v4.13-rc4 as you suggested. Same complaints. I tried v4.13-rc4 "lowlatency" as well, same complaints.

What seems strange to me is that I don't think I was getting these complaints under 16.04 LTS Desktop with Studio (i.e. lowlatency)/LAMP/ZoneMinder overlays. This same machine, same H/W, was subject to intense scrutiny under the 16.04 setup just described for at least 3 months after installing the NEC/Renesas 27xxx2 based Startech card.

Very hard to believe I didn't pull a DMESG in all that time.

But stranger things have happened.

Will try to follow on this report as suggested.

Placeholder for attachment

Postscript: these warnings do not crash or otherwise interfere with system operation. ZoneMinder, in particular, seems to function just fine.

It's just... annoying, and it seems the fix should be simple. Either this chipset needs the quirk, or it doesn't.

Does that sound crazy?

Will sit, and think, and talk to you soon.

Dear Kai-Heng Feng,

Sure enough you fixed it. No complaints now.

I'd be very curious to look over the source code changes to improve my linux-fu.

But that's secondary.

For now I hope an administrator can mark up my bug report to ensure that the changes you have made get propagated.

Thank you very much for taking my bug report seriously.

-Jeff

Placeholder for DMESG attachment after the fix incorporated by Kai-Heng Feng in

http://people.canonical.com/~khfeng/linux-image-4.13.0-rc5+_4.13.0-rc5+-2_amd64.deb

Note well the kernel complaints such as:

[96415.044917] handle_tx_event: 516 callbacks suppressed
[96415.044924] xhci_hcd 0000:0c:00.0: WARN Successful completion on short TX: needs XHCI_TRUST_TX_LENGTH quirk?

..are gone.

Placeholder for DMESG attachment after the fix, sorry: file extension collision.

Kai-Heng Feng (kaihengfeng) wrote :

You can check the patch here: https://lkml.org/lkml/2017/8/18/4

tags: added: kernel-bug-exists-upstream
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
summary: - Nec/Renasas 27xxx2 May also need Quirk in 17.04, no problem in 16.04?
- Logitech C920 Webcam/Zoneminder/Startech
+ Renesas / NEC - µPD720202 May also need Quirk in 17.04, no problem in
+ 16.04? Logitech C920 Webcam/Zoneminder/Startech
description: updated

Title and post edited to reflect actual chipset name Renesas / NEC - µPD720202 not 27xxx2 as initially and mistakenly reported.

Do not duplicate as new bug.

Curious as to when this fix will propagate into mainline. Haven't looked into it very deeply, but it seems that kernel/"ubuntu base" updates overwrite the fix.

summary: - Renesas / NEC - µPD720202 May also need Quirk in 17.04, no problem in
- 16.04? Logitech C920 Webcam/Zoneminder/Startech
+ Renesas / NEC - µPD720202 needs XHCI_TRUST_TX_LENGTH quirk? Logitech
+ C920 Webcam/Zoneminder/Startech

Bug still exists in the 4.15 kernel/Bionic Beaver.

description: updated
description: updated
Kai-Heng Feng (kaihengfeng) wrote :

This was my attempt to send a quirk patch:
https://lkml.org/lkml/2017/9/6/516

Hmm, having resigned myself to building my own kernels for the foreseeable future, I downloaded the source tree for bionic and took a peek at xhci-pci.c. It looks like someone has tried to incorporate the patch of interest. I find:

if (pdev->vendor == PCI_VENDOR_ID_RENESAS &&
    pdev->device == 0x0014)
        xhci->quirks |= XHCI_TRUST_TX_LENGTH;
if (pdev->vendor == PCI_VENDOR_ID_RENESAS &&
    pdev->device == 0x0015)
        xhci->quirks |= XHCI_RESET_ON_RESUME;

So I recompiled as-is, though common sense suggested I was already running the exact same (bionic) kernel.

The build seemed to succeed but the errors in dmesg persist.

Could the 14 in:

pdev->device == 0x0014)

.. represent a typo?

I guess I'll change that tonight and try recompiling.

-Jeff

Yeah it's a typo.
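
To make the effect of the device ID visible outside the kernel, here is a rough user-space model of the check quoted above. It is a sketch under assumptions, not the actual patch: the quirk bit values are arbitrary placeholders, 0x1912 is assumed to be the value of PCI_VENDOR_ID_RENESAS, the card in this report is assumed to identify as device 0x0015, and both function names are hypothetical.

```c
#include <stdint.h>

#define PCI_VENDOR_ID_RENESAS 0x1912     /* assumed value */
#define XHCI_TRUST_TX_LENGTH  (1u << 0)  /* placeholder bit, not the kernel's */
#define XHCI_RESET_ON_RESUME  (1u << 1)  /* placeholder bit, not the kernel's */

/* Quirk selection exactly as in the snippet quoted from xhci-pci.c. */
uint32_t quirks_as_quoted(uint16_t vendor, uint16_t device)
{
    uint32_t quirks = 0;

    if (vendor == PCI_VENDOR_ID_RENESAS && device == 0x0014)
        quirks |= XHCI_TRUST_TX_LENGTH;
    if (vendor == PCI_VENDOR_ID_RENESAS && device == 0x0015)
        quirks |= XHCI_RESET_ON_RESUME;
    return quirks;
}

/*
 * Hypothetical variant: additionally sets XHCI_TRUST_TX_LENGTH for
 * device 0x0015. This is one possible reading of the change discussed
 * in this thread, not a copy of it.
 */
uint32_t quirks_with_0015(uint16_t vendor, uint16_t device)
{
    uint32_t quirks = quirks_as_quoted(vendor, device);

    if (vendor == PCI_VENDOR_ID_RENESAS && device == 0x0015)
        quirks |= XHCI_TRUST_TX_LENGTH;
    return quirks;
}
```

Under these assumptions, a 0x0015 device never gets XHCI_TRUST_TX_LENGTH from the quoted code, which would be consistent with the warnings persisting after an as-is rebuild.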


 Hi,
It looks like someone attempted a fix, but it was beset by a typo. I fixed the typo and compiled the kernel and the issue was resolved. I detailed the typo in the original thread. Can you run this up the flagpole for me? I don't know how to get in touch with the devs.
Thank you,
-Jeff
    On Sunday, March 25, 2018, 9:36:06 PM PDT, Kai-Heng Feng <email address hidden> wrote:

 This was my attempt to send a quirk patch:
https://lkml.org/lkml/2017/9/6/516


DARRIN VALLIS (dvallis) wrote :

This may be related.

I am trying to use USB-PCIe cards based on the Renesas 720202 chipset with an Intel dual Xeon 4114 server. 720202 devices are visible in lspci, but do not recognize USB device plug events.

dmesg attached.

Anyone familiar with this bug? I really need to get it resolved.

Brad Figg (brad-figg) on 2019-07-24
tags: added: cscc