Hotplug device addition issue - missing patches on Xenial kernel

Bug #1599250 reported by bugproxy
This bug affects 2 people
Affects          Status         Importance   Assigned to   Milestone
linux (Ubuntu)   Fix Released   Undecided    Unassigned
Xenial           Fix Released   Undecided    Tim Gardner
Yakkety          Fix Released   Undecided    Unassigned

Bug Description

== Comment: #0 - Guilherme Guaglianoni Piccoli - 2016-06-29 18:09:04 ==
When hotplugging a device into an LPAR/guest running Ubuntu 16.04 (kernel 4.4.0-28), we observe a kernel oops if the device uses 64-bit DDW DMA. The following stack trace was observed with a QLogic NIC (bnx2x driver):

[ 131.52] --- interrupt: 300 at enable_ddw+0x254/0x7c0
                   LR = enable_ddw+0x238/0x7c0
[ 131.52] [c0000001fbc67480] [c000000000089b88] dma_set_mask_pSeriesLP+0x208/0x290
[ 131.52] [c0000001fbc67510] [c0000000000246b8] dma_set_mask+0x58/0xf0
[ 131.52] [c0000001fbc67540] [d00000000387a654] bnx2x_init_one+0x504/0x10f0 [bnx2x]
[ 131.52] [c0000001fbc67620] [c0000000005e4eac] local_pci_probe+0x6c/0x140
[ 131.52] [c0000001fbc676b0] [c0000000005e5d58] pci_device_probe+0x168/0x200
[ 131.52] [c0000001fbc67710] [c0000000006d2530] driver_probe_device+0x1f0/0x610
[ 131.52] [c0000001fbc677a0] [c0000000006d2a6c] __driver_attach+0x11c/0x120
[ 131.52] [c0000001fbc677e0] [c0000000006ceeac] bus_for_each_dev+0x9c/0x110
[ 131.52] [c0000001fbc67830] [c0000000006d198c] driver_attach+0x3c/0x60
[ 131.52] [c0000001fbc67860] [c0000000006d1278] bus_add_driver+0x2d8/0x390
[ 131.52] [c0000001fbc678f0] [c0000000006d39dc] driver_register+0x9c/0x180
[ 131.52] [c0000001fbc67960] [c0000000005e401c] __pci_register_driver+0x6c/0x90

This issue is fixed upstream by the following 3 patches (SHA-1s from Linus' tree):

c2078d9ef60 ("Revert \"powerpc/eeh: Fix crash in eeh_add_device_early() on Cell\"")
8445a87f709 ("powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism")
8a934efe943 ("powerpc/pseries: Fix PCI config address for DDW")

So we would like to request that these fixes be added to the Ubuntu Xenial kernel.
In addition, an obsolete non-upstream patch related to this issue can be removed (SHA-1 from the Ubuntu Xenial tree):

623aabd5d68 ("UBUNTU: SAUCE: powerpc/eeh: Validate arch in eeh_add_device_early()")

Thanks,

Guilherme

bugproxy (bugproxy)
tags: added: architecture-ppc64le bugnameltc-143243 severity-high targetmilestone-inin16041
Changed in ubuntu:
assignee: nobody → Taco Screen team (taco-screen-team)
affects: ubuntu → linux (Ubuntu)
Tim Gardner (timg-tpi) wrote :
Changed in linux (Ubuntu Yakkety):
assignee: Taco Screen team (taco-screen-team) → nobody
status: New → Fix Released
Changed in linux (Ubuntu Xenial):
assignee: nobody → Tim Gardner (timg-tpi)
status: New → In Progress
Changed in linux (Ubuntu Xenial):
status: In Progress → Fix Committed
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2016-07-14 13:34 EDT-------
(In reply to comment #8)
> https://lists.ubuntu.com/archives/kernel-team/2016-July/078752.html

Tim, I'm not sure which release of the Xenial kernel will include those patches. They are not merged in 4.4.0-28, but will they be in the next release?

Can you help me understand better? Thanks in advance,

Guilherme

Tim Gardner (timg-tpi) wrote :

These patches should appear in Ubuntu-4.4.0-32.51 or higher
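
The "4.4.0-32.51 or higher" threshold above can be checked mechanically. The following is a minimal sketch, not part of the original report: the function names are hypothetical, and it assumes the version string follows the usual Ubuntu layout (`<upstream>-<ABI>[.<upload>]`, optionally followed by a flavour suffix such as `-generic`, as printed by `uname -r`). Because `uname -r` omits the upload number, only the upstream version and ABI number are compared.

```python
import re
from typing import Tuple

# First Xenial kernel expected to carry the fixes, per the comment above.
FIRST_FIXED = "4.4.0-32.51"


def parse_ubuntu_kernel_version(version: str) -> Tuple[int, ...]:
    """Turn '4.4.0-32.51' or '4.4.0-34-generic' into a tuple of ints.

    Hypothetical helper: any flavour suffix such as '-generic' is ignored.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError("unrecognised kernel version: %r" % version)
    return tuple(int(g) for g in match.groups() if g is not None)


def has_ddw_fixes(version: str, first_fixed: str = FIRST_FIXED) -> bool:
    """Return True if `version` is at or above the first fixed release.

    Only the first four fields (upstream version plus ABI number) are
    compared, since `uname -r` does not report the upload number.
    """
    return (parse_ubuntu_kernel_version(version)[:4]
            >= parse_ubuntu_kernel_version(first_fixed)[:4])
```

For example, `has_ddw_fixes("4.4.0-28-generic")` is false for the kernel in the original report, while `has_ddw_fixes("4.4.0-34-generic")` is true. On a running system, the string from `uname -r` (or `platform.release()` in Python) could be passed in directly. The comparison is only meaningful within the Xenial 4.4.0 series discussed here.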

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2016-07-19 14:06 EDT-------
(In reply to comment #10)
> These patches should appear in Ubuntu-4.4.0-32.51 or higher

Thanks very much for the information!

Seth Forshee (sforshee) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-xenial
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2016-07-29 10:36 EDT-------
(In reply to comment #12)
> This bug is awaiting verification that the kernel in -proposed solves the
> problem. Please test the kernel and update this bug with the results. If the
> problem is solved, change the tag 'verification-needed-xenial' to
> 'verification-done-xenial'.
>
> If verification is not done by 5 working days from today, this fix will be
> dropped from the source code, and this bug will be closed.
>
> See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to
> enable and use -proposed. Thank you!

I enabled the -proposed repository and installed kernel 4.4.0-34-generic.

The test passed with this version, so the issue is fixed in 4.4.0-34.
Marking as "verification-done-xenial".

Thanks,

Guilherme

tags: added: verification-done-xenial
removed: verification-needed-xenial
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-34.53

---------------
linux (4.4.0-34.53) xenial; urgency=low

  [ Seth Forshee ]

  * Release Tracking Bug
    - LP: #1606960

  * [APL][SAUCE] Slow system response time due to a monitor bug (LP: #1606147)
    - x86/cpu/intel: Introduce macros for Intel family numbers
    - SAUCE: x86/cpu: Add workaround for MONITOR instruction erratum on Goldmont
      based CPUs

linux (4.4.0-33.52) xenial; urgency=low

  [ Seth Forshee ]

  * Release Tracking Bug
    - LP: #1605709

  * [regression] NFS client: access problems after updating to kernel
    4.4.0-31-generic (LP: #1603719)
    - SAUCE: (namespace) Bypass sget() capability check for nfs

linux (4.4.0-32.51) xenial; urgency=low

  [ Seth Forshee ]

  * Release Tracking Bug
    - LP: #1604443

  * thinkpad yoga 260 wacom touchscreen not working (LP: #1603975)
    - HID: wacom: break out parsing of device and registering of input
    - HID: wacom: Initialize hid_data.inputmode to -1
    - HID: wacom: Support switching from vendor-defined device mode on G9 and G11

  * changelog: add CVEs as first class citizens (LP: #1604344)
    - use CVE numbers in changelog

  * [Xenial] Include Huawei PCIe SSD hio kernel driver (LP: #1603483)
    - SAUCE: import Huawei ES3000_V2 (2.1.0.23)
    - SAUCE: hio: bio_endio() no longer takes errors arg
    - SAUCE: hio: blk_queue make_request_fn now returns a blk_qc_t
    - SAUCE: hio: use alloc_cpumask_var to avoid -Wframe-larger-than
    - SAUCE: hio: fix mask maybe-uninitialized warning
    - [config] enable CONFIG_HIO (Huawei ES3000_V2 PCIe SSD driver)
    - SAUCE: hio: Makefile and Kconfig

  * CVE-2016-5243 (LP: #1589036)
    - tipc: fix an infoleak in tipc_nl_compat_link_dump
    - tipc: fix nl compat regression for link statistics

  * CVE-2016-4470
    - KEYS: potential uninitialized variable

  * integer overflow in xt_alloc_table_info (LP: #1555353)
    - netfilter: x_tables: check for size overflow

  * CVE-2016-3135:
    - Revert "UBUNTU: SAUCE: (noup) netfilter: x_tables: check for size overflow"

  * CVE-2016-4440 (LP: #1584192)
    - kvm:vmx: more complete state update on APICv on/off

  * the system hangs in the dma driver when reboot or shutdown on a baytrail-m
    laptop (LP: #1602579)
    - dmaengine: dw: platform: power on device on shutdown
    - ACPI / LPSS: override power state for LPSS DMA device

  * Add proper palm detection support for MS Precision Touchpad (LP: #1593124)
    - Revert "HID: multitouch: enable palm rejection if device implements
      confidence usage"
    - HID: multitouch: enable palm rejection for Windows Precision Touchpad

  * Add support for Intel 8265 Bluetooth ([8087:0A2B]) (LP: #1599068)
    - Bluetooth: Add support for Intel Bluetooth device 8265 [8087:0a2b]

  * CVE-2016-4794 (LP: #1581871)
    - percpu: fix synchronization between chunk->map_extend_work and chunk
      destruction
    - percpu: fix synchronization between synchronous map extension and chunk
      destruction

  * Xenial update to v4.4.15 stable release (LP: #1601952)
    - net_sched: fix pfifo_head_drop behavior vs backlog
    - net: Don't forget pr_fmt on net_dbg_ratelimited for CONFIG_DYNAMIC...

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released