virtualbox+usb 3.0 breaks boot, -28 kernel works

Bug #1604058 reported by Lubos Kosco
This bug affects 6 people
Affects          Status         Importance  Assigned to    Milestone
linux (Ubuntu)   In Progress    High        Kamal Mostafa
Xenial           Fix Released   High        Kamal Mostafa

Bug Description

Hi guys,
the linux-image-4.4.0-31-generic kernel started crashing on boot in xhci_pci_probe
when USB 3.0 is enabled.
The -28 kernel worked.

More details and kernel logs are at
https://forums.virtualbox.org/viewtopic.php?f=3&t=78656

Thanks,
Lubos

Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1604058

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Changed in linux (Ubuntu):
importance: Undecided → High
status: Incomplete → Triaged
Changed in linux (Ubuntu Xenial):
status: New → Triaged
importance: Undecided → High
tags: added: kernel-da-key regression-release xenial
Changed in linux (Ubuntu):
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Xenial):
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu):
status: Triaged → In Progress
Changed in linux (Ubuntu Xenial):
status: Triaged → In Progress
Joseph Salisbury (jsalisbury) wrote :

This may be due to the following commit:
6ad3777 xhci: Cleanup only when releasing primary hcd

I built a test kernel with this commit reverted. It can be downloaded from:

http://kernel.ubuntu.com/~jsalisbury/lp1604058/

Can you test this kernel and see if it resolves this bug?

Note that you need to install both the linux-image and linux-image-extra .deb packages.

Lubos Kosco (tarzanek) wrote :

Confirming that your kernel works.
FWIW, did you release it in the meantime?

Lubos Kosco (tarzanek) wrote :

Ah, no release, I see; it just replaced the stock kernel.
Anyway, this patched (reverted) kernel works.

For the testing docs: to reproduce this you need a VirtualBox 5.1 Ubuntu Xenial VM with USB 3.0 (xHCI) enabled
(the boot will "freeze").

thank you
Lubos

martin lantz (martin-lantz) wrote :

Hello Joseph,

your kernel resolves this bug.

I've installed the packages in an Ubuntu 16.04 guest running in VirtualBox 5.1.0 on a Windows 7 host.

The Ubuntu guest, configured to use the USB 3.0 xHCI controller, now boots normally. The guest can access USB memory sticks normally.

Kind regards,
Martin

John Veness (pelago) wrote :

This bug also affected me, and the jsalisbury kernel fixed the problem.

Joseph Salisbury (jsalisbury) wrote :

This bug may be fixed by the following commit, which is already committed to Xenial master-next:
47e41bd usb: xhci-plat: properly handle probe deferral for devm_clk_get()

I built a Xenial master-next kernel which can be downloaded from:

http://kernel.ubuntu.com/~jsalisbury/lp1604058/

Can you test this kernel and see if it resolves this bug?

Note that you need to install both the linux-image and linux-image-extra .deb packages.

Lubos Kosco (tarzanek) wrote :

Joseph, the -32 kernel you built still has the problem.

FWIW, I cannot test more kernels; the time I can spend (as QA) on this bug is used up. Please set up a VirtualBox VM and test your builds (it's probably faster than updating this bug).

Fingers crossed. (FWIW, a link to the code review/diff might spark more interest from me, since I don't know how or where to look up your changes. QA/QE is really not something I am good at, but I can grok code ;) )
L

John Veness (pelago) wrote :

Yes, the 4.4.0-32.51~lp1604058masterNext kernel doesn't boot for me either.

Kamal Mostafa (kamalmostafa) wrote :

As previously noted, mainline commit 27a41a83ec54 ("xhci: Cleanup only when releasing primary hcd") causes this boot hang. I can reproduce the problem with VirtualBox 5.1.x, booting any kernel version which includes that commit (Xenial, Yakkety, or mainline v4.7-rc1).

I think I have identified the specific problem in that commit and have constructed a test patch (attached for reference). This appears to fix the VirtualBox boot hang, but I'd like some confirmation that it results in otherwise normal USB 3.0 functionality. Here's a Xenial test kernel with the test patch applied:

    http://people.canonical.com/~kamal/lp1604058/

Affected users, please advise whether this fixes the boot hang in your VirtualBox environment, and whether your USB3.0 behavior is back to normal.

Changed in linux (Ubuntu Xenial):
assignee: Joseph Salisbury (jsalisbury) → Kamal Mostafa (kamalmostafa)
Changed in linux (Ubuntu):
assignee: Joseph Salisbury (jsalisbury) → Kamal Mostafa (kamalmostafa)
tags: added: patch
Matthias Metzger (macellarius) wrote :

Hi,

I tested your test kernels and everything is working: no boot hang, and USB 3.0 HDDs are working at the expected speed.

Changed in linux (Ubuntu Xenial):
status: In Progress → Fix Committed
Kamal Mostafa (kamalmostafa) wrote :

@Matthias, thanks for the positive test result.

The patch fixing this is now scheduled for inclusion in the next Ubuntu Xenial kernel (a note will be posted here at that time). The patch has also been submitted to mainline Linux: https://lkml.org/lkml/2016/8/1/353

Matthias Metzger (macellarius) wrote :

Somehow, since today I'm experiencing the following error messages while booting, still using your modified kernel:

...
[ 28.957096] xhci_hcd 0000:00:0c.0: Command completion event does not match command
...
[ 49.121191] xhci_hcd 0000:00:0c.0: Error while assigning device slot ID
[ 49.121201] xhci_hcd 0000:00:0c.0: Max number of devices this xHCI host supports is 32.
[ 49.121205] usb usb1-port1: couldn't allocate usb_device
...

I have no idea why this has started so suddenly.
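
For anyone triaging similar boot output, the following is a minimal Python sketch that pulls the xhci_hcd lines out of a saved kernel log. It assumes every line of interest has the shape seen in the excerpt above (`[ seconds] driver device: message`); lines of any other shape, including the `...` elisions, are skipped. The names `KernelMessage`, `parse_messages`, and `xhci_messages` are made up for this illustration and do not come from any existing tool.

```python
import re
from typing import List, NamedTuple


class KernelMessage(NamedTuple):
    seconds: float  # timestamp printed in the square brackets
    driver: str     # e.g. "xhci_hcd" or "usb"
    device: str     # e.g. "0000:00:0c.0" or "usb1-port1"
    text: str       # the rest of the line


# Assumed line shape, taken only from the excerpt in this comment:
#   [ 49.121191] xhci_hcd 0000:00:0c.0: Error while assigning device slot ID
_LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(\S+)\s+(\S+):\s+(.*)$")


def parse_messages(log_text: str) -> List[KernelMessage]:
    """Parse every line that matches the assumed shape; skip the rest."""
    messages = []
    for line in log_text.splitlines():
        match = _LINE.match(line.strip())
        if match is None:
            continue
        seconds, driver, device, text = match.groups()
        messages.append(KernelMessage(float(seconds), driver, device, text))
    return messages


def xhci_messages(log_text: str) -> List[KernelMessage]:
    """Keep only the messages emitted by the xhci_hcd driver."""
    return [m for m in parse_messages(log_text) if m.driver == "xhci_hcd"]
```

Feeding it the text of a saved log returns the xhci_hcd entries in order, which makes it easier to see how far apart the errors are in time.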

Stefan Bader (smb) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-xenial
Matthias Metzger (macellarius) wrote :

Great, the kernel from -proposed is working as expected. All good.

tags: added: verification-done-xenial
removed: verification-needed-xenial
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-36.55

---------------
linux (4.4.0-36.55) xenial; urgency=low

  [ Stefan Bader ]

  * Release Tracking Bug
    - LP: #1612305

  * I2C touchpad does not work on AMD platform (LP: #1612006)
    - SAUCE: pinctrl/amd: Remove the default de-bounce time

  * CVE-2016-5696
    - tcp: make challenge acks less predictable

linux (4.4.0-35.54) xenial; urgency=low

  [ Stefan Bader ]

  * Release Tracking Bug
    - LP: #1611215

  * [i915_bpo] Sync with v4.7 (LP: #1609742)
    - SAUCE: i915_bpo: Sync with v4.7

  * s390/cio: fix reset of channel measurement block (LP: #1609415)
    - s390/cio: allow to reset channel measurement block

  * in Ubuntu16.10: Hit on Call traces and system goes down when transactional
    memory tests are running in 32TB Brazos system (LP: #1606786)
    - powerpc/tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0
    - powerpc/tm: Fix stack pointer corruption in __tm_recheckpoint()

  * Power Menu does not display after press the Power Button (LP: #1609204)
    - intel-vbtn: new driver for Intel Virtual Button
    - [config] enable CONFIG_INTEL_VBTN=m

  * OptiPlex 7450 AIO hangs when rebooting (LP: #1608762)
    - x86/reboot: Add Dell Optiplex 7450 AIO reboot quirk

  * virtualbox+usb 3.0 breaks boot, -28 kernel works (LP: #1604058)
    - SAUCE: xhci: Fix soft lockup in xhci_pci_probe path when XHCI_STATE_HALTED

  * linux-kernel: Freeing IRQ from IRQ context (LP: #1597908)
    - block: defer timeouts to a workqueue

  * Tunnel offload indications not stripped from encapsulated packets, causing
    performance overhead (LP: #1602755)
    - tunnels: Remove encapsulation offloads on decap.

  * lm-sensors is throwing "ERROR: Can't get value of subfeature temp1_input:
    I/O error" for be2net driver (LP: #1607387)
    - be2net: perform temperature query in adapter regardless of its interface
      state

  * Dell dock MAC Address pass through doesn't work in Ubuntu (LP: #1579984)
    - r8152: Add support for setting pass through MAC address on RTL8153-AD

  * vmxnet3 LRO IPv6 performance issues (stalling TCP) (LP: #1605494)
    - Driver: Vmxnet3: set CHECKSUM_UNNECESSARY for IPv6 packets

  * ISST-LTE:pVM:monklp5:Ubuntu16.04.1:system crashed at
    lpfc_sli4_scmd_to_wqidx_distr (LP: #1597974)
    - SAUCE: lpfc: fix oops in lpfc_sli4_scmd_to_wqidx_distr() from
      lpfc_send_taskmgmt()

  * Backport cxlflash shutdown patch to Xenial SRU (LP: #1605405)
    - SAUCE: cxlflash: Verify problem state area is mapped before notifying
      shutdown

  * Xenial update to v4.4.16 stable release (LP: #1607404)
    - mac80211: fix fast_tx header alignment
    - mac80211: mesh: flush mesh paths unconditionally
    - mac80211_hwsim: Add missing check for HWSIM_ATTR_SIGNAL
    - mac80211: Fix mesh estab_plinks counting in STA removal case
    - EDAC, sb_edac: Fix rank lookup on Broadwell
    - IB/cm: Fix a recently introduced locking bug
    - IB/mlx4: Properly initialize GRH TClass and FlowLabel in AHs
    - powerpc/pseries: Fix IBM_ARCH_VEC_NRCORES_OFFSET since POWER8NVL was added
    - powerpc/tm: Always reclaim in start_thread() for exec() class syscalls
    - usb: dwc2: fix reg...

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
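
As a rough aid for checking whether a Xenial guest is already running a kernel at or beyond the release named above, here is a minimal Python sketch. It assumes the kernel release string has the form `4.4.0-<ABI>-<flavour>` (as in `4.4.0-31-generic`) and uses 36 as the threshold because this report states the fix was released in 4.4.0-36.55. The function name `includes_fix` and the constant `FIXED_ABI` are hypothetical. A `False` result does not by itself mean the kernel is affected, since -28 was reported to work.

```python
import re
from typing import Optional

# Threshold taken from "This bug was fixed in the package linux - 4.4.0-36.55".
FIXED_ABI = 36

# Only the Xenial 4.4.0 series is considered; anything else is "unknown".
_RELEASE = re.compile(r"^4\.4\.0-(\d+)")


def includes_fix(release: str) -> Optional[bool]:
    """Return True/False for 4.4.0-<ABI> releases, None for other strings."""
    match = _RELEASE.match(release)
    if match is None:
        return None
    return int(match.group(1)) >= FIXED_ABI


if __name__ == "__main__":
    import platform

    running = platform.release()  # same string that `uname -r` prints
    print(running, includes_fix(running))
```

Run inside the guest, it prints the running release string and the result of the check; custom test builds with a different naming scheme may need the pattern adjusted.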