Hotplug device addition issue - missing patches on Xenial kernel

Bug #1599250 reported by bugproxy on 2016-07-05
This bug affects 2 people
Affects          Importance   Assigned to
linux (Ubuntu)   Undecided    Unassigned
  Xenial         Undecided    Tim Gardner
  Yakkety        Undecided    Unassigned

Bug Description

== Comment: #0 - Guilherme Guaglianoni Piccoli - 2016-06-29 18:09:04 ==
When performing hotplug device addition to an LPAR/guest running Ubuntu 16.04 (kernel 4.4.0-28), we observe a kernel oops if the device makes use of 64-bit DDW DMA. The following stack trace was observed with a QLogic NIC (bnx2x driver):

[ 131.52] --- interrupt: 300 at enable_ddw+0x254/0x7c0
                   LR = enable_ddw+0x238/0x7c0
[ 131.52] [c0000001fbc67480] [c000000000089b88] dma_set_mask_pSeriesLP+0x208/0x290
[ 131.52] [c0000001fbc67510] [c0000000000246b8] dma_set_mask+0x58/0xf0
[ 131.52] [c0000001fbc67540] [d00000000387a654] bnx2x_init_one+0x504/0x10f0 [bnx2x]
[ 131.52] [c0000001fbc67620] [c0000000005e4eac] local_pci_probe+0x6c/0x140
[ 131.52] [c0000001fbc676b0] [c0000000005e5d58] pci_device_probe+0x168/0x200
[ 131.52] [c0000001fbc67710] [c0000000006d2530] driver_probe_device+0x1f0/0x610
[ 131.52] [c0000001fbc677a0] [c0000000006d2a6c] __driver_attach+0x11c/0x120
[ 131.52] [c0000001fbc677e0] [c0000000006ceeac] bus_for_each_dev+0x9c/0x110
[ 131.52] [c0000001fbc67830] [c0000000006d198c] driver_attach+0x3c/0x60
[ 131.52] [c0000001fbc67860] [c0000000006d1278] bus_add_driver+0x2d8/0x390
[ 131.52] [c0000001fbc678f0] [c0000000006d39dc] driver_register+0x9c/0x180
[ 131.52] [c0000001fbc67960] [c0000000005e401c] __pci_register_driver+0x6c/0x90

This issue is solved upstream by the following 3 patches (SHA-1 from Linus' tree):

c2078d9ef60 ("Revert \"powerpc/eeh: Fix crash in eeh_add_device_early() on Cell\"")
8445a87f709 ("powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism")
8a934efe943 ("powerpc/pseries: Fix PCI config address for DDW")

So we would like to request the addition of these fixes to the Ubuntu Xenial kernel.
In addition, an old, obsolete non-upstream patch related to this issue can be removed (SHA-1 from the Ubuntu xenial tree):

623aabd5d68 ("UBUNTU: SAUCE: powerpc/eeh: Validate arch in eeh_add_device_early()")

Thanks,

Guilherme
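
A backported commit usually carries a different SHA-1 than its upstream original, so one way to check whether a tree already contains the requested fixes is to look for their subjects rather than their hashes. The following is a minimal sketch, assuming the output of `git log --oneline` is available as text. The helper names `commit_subjects` and `missing_fixes` are hypothetical, and matching by exact subject is an assumption of this sketch, not something the report prescribes.

```python
# Subjects copied from the three upstream patches listed in the report.
REQUESTED_FIXES = [
    'Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell"',
    "powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism",
    "powerpc/pseries: Fix PCI config address for DDW",
]


def commit_subjects(oneline_log):
    """Extract subjects from `git log --oneline` style text.

    Each non-empty line is expected to be '<abbreviated hash> <subject>'.
    """
    subjects = []
    for line in oneline_log.splitlines():
        _, _, subject = line.strip().partition(" ")
        if subject:
            subjects.append(subject)
    return subjects


def missing_fixes(oneline_log):
    """Return the requested subjects that do not appear in the log text.

    Limitation: a commit that was applied and later reverted still
    counts as present, because only subjects are compared.
    """
    present = set(commit_subjects(oneline_log))
    return [subject for subject in REQUESTED_FIXES if subject not in present]


if __name__ == "__main__":
    # Illustrative log text only; the hashes are the upstream ones from
    # the report and would normally differ in a distribution tree.
    sample = (
        "8445a87f709 powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism\n"
        "8a934efe943 powerpc/pseries: Fix PCI config address for DDW\n"
    )
    for subject in missing_fixes(sample):
        print("missing:", subject)
```

With the sample text above, only the revert is reported as missing.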

bugproxy (bugproxy) on 2016-07-05
tags: added: architecture-ppc64le bugnameltc-143243 severity-high targetmilestone-inin16041
Changed in ubuntu:
assignee: nobody → Taco Screen team (taco-screen-team)
affects: ubuntu → linux (Ubuntu)
Tim Gardner (timg-tpi) wrote :
Changed in linux (Ubuntu Yakkety):
assignee: Taco Screen team (taco-screen-team) → nobody
status: New → Fix Released
Changed in linux (Ubuntu Xenial):
assignee: nobody → Tim Gardner (timg-tpi)
status: New → In Progress
Changed in linux (Ubuntu Xenial):
status: In Progress → Fix Committed

------- Comment From <email address hidden> 2016-07-14 13:34 EDT-------
(In reply to comment #8)
> https://lists.ubuntu.com/archives/kernel-team/2016-July/078752.html

Tim, I'm not sure in which release of Xenial's kernel these patches will be available. They are not merged in 4.4.0-28, but will they be in the next release?

Can you help me understand better? Thanks in advance,

Guilherme

Tim Gardner (timg-tpi) wrote :

These patches should appear in Ubuntu-4.4.0-32.51 or higher
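
As a rough illustration of what "4.4.0-32.51 or higher" means in practice, the sketch below compares an Ubuntu kernel package version string against that threshold. The helper names `parse_ubuntu_kernel_version` and `has_ddw_fixes` are hypothetical. The parsing assumes the plain `X.Y.Z-ABI.UPLOAD` form seen in this report and does not handle epochs, flavour suffixes such as `-generic`, or backport suffixes.

```python
import re

# Threshold taken from the comment above: "Ubuntu-4.4.0-32.51 or higher".
FIXED_IN = "4.4.0-32.51"

_VERSION_RE = re.compile(r"(?:Ubuntu-)?(\d+)\.(\d+)\.(\d+)-(\d+)(?:\.(\d+))?")


def parse_ubuntu_kernel_version(version):
    """Split a version such as '4.4.0-34.53' into a tuple of integers.

    An optional leading 'Ubuntu-' prefix is ignored. If the upload
    number is absent (for example '4.4.0-28'), it is treated as 0,
    which makes the comparison err on the side of "not fixed".
    """
    match = _VERSION_RE.fullmatch(version)
    if match is None:
        raise ValueError(f"unrecognised kernel package version: {version!r}")
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def has_ddw_fixes(version, fixed_in=FIXED_IN):
    """Return True if `version` is at or beyond the release named above."""
    return parse_ubuntu_kernel_version(version) >= parse_ubuntu_kernel_version(fixed_in)


if __name__ == "__main__":
    # Versions mentioned elsewhere in this report.
    for candidate in ("4.4.0-28", "4.4.0-32.51", "4.4.0-34.53"):
        print(candidate, has_ddw_fixes(candidate))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as treating "4.4.0-9" as newer than "4.4.0-32".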

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2016-07-19 14:06 EDT-------
(In reply to comment #10)
> These patches should appear in Ubuntu-4.4.0-32.51 or higher

Thanks very much for the information!

Seth Forshee (sforshee) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-xenial
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2016-07-29 10:36 EDT-------
(In reply to comment #12)
> This bug is awaiting verification that the kernel in -proposed solves the
> problem. Please test the kernel and update this bug with the results. If the
> problem is solved, change the tag 'verification-needed-xenial' to
> 'verification-done-xenial'.
>
> If verification is not done by 5 working days from today, this fix will be
> dropped from the source code, and this bug will be closed.
>
> See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to
> enable and use -proposed. Thank you!

I enabled the -proposed repository and installed kernel 4.4.0-34-generic.

The test went fine with this version, so the issue is solved in 4.4.0-34.
Marking as "verification-done-xenial".

Thanks,

Guilherme

tags: added: verification-done-xenial
removed: verification-needed-xenial
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-34.53

---------------
linux (4.4.0-34.53) xenial; urgency=low

  [ Seth Forshee ]

  * Release Tracking Bug
    - LP: #1606960

  * [APL][SAUCE] Slow system response time due to a monitor bug (LP: #1606147)
    - x86/cpu/intel: Introduce macros for Intel family numbers
    - SAUCE: x86/cpu: Add workaround for MONITOR instruction erratum on Goldmont
      based CPUs

linux (4.4.0-33.52) xenial; urgency=low

  [ Seth Forshee ]

  * Release Tracking Bug
    - LP: #1605709

  * [regression] NFS client: access problems after updating to kernel
    4.4.0-31-generic (LP: #1603719)
    - SAUCE: (namespace) Bypass sget() capability check for nfs

linux (4.4.0-32.51) xenial; urgency=low

  [ Seth Forshee ]

  * Release Tracking Bug
    - LP: #1604443

  * thinkpad yoga 260 wacom touchscreen not working (LP: #1603975)
    - HID: wacom: break out parsing of device and registering of input
    - HID: wacom: Initialize hid_data.inputmode to -1
    - HID: wacom: Support switching from vendor-defined device mode on G9 and G11

  * changelog: add CVEs as first class citizens (LP: #1604344)
    - use CVE numbers in changelog

  * [Xenial] Include Huawei PCIe SSD hio kernel driver (LP: #1603483)
    - SAUCE: import Huawei ES3000_V2 (2.1.0.23)
    - SAUCE: hio: bio_endio() no longer takes errors arg
    - SAUCE: hio: blk_queue make_request_fn now returns a blk_qc_t
    - SAUCE: hio: use alloc_cpumask_var to avoid -Wframe-larger-than
    - SAUCE: hio: fix mask maybe-uninitialized warning
    - [config] enable CONFIG_HIO (Huawei ES3000_V2 PCIe SSD driver)
    - SAUCE: hio: Makefile and Kconfig

  * CVE-2016-5243 (LP: #1589036)
    - tipc: fix an infoleak in tipc_nl_compat_link_dump
    - tipc: fix nl compat regression for link statistics

  * CVE-2016-4470
    - KEYS: potential uninitialized variable

  * integer overflow in xt_alloc_table_info (LP: #1555353)
    - netfilter: x_tables: check for size overflow

  * CVE-2016-3135:
    - Revert "UBUNTU: SAUCE: (noup) netfilter: x_tables: check for size overflow"

  * CVE-2016-4440 (LP: #1584192)
    - kvm:vmx: more complete state update on APICv on/off

  * the system hangs in the dma driver when reboot or shutdown on a baytrail-m
    laptop (LP: #1602579)
    - dmaengine: dw: platform: power on device on shutdown
    - ACPI / LPSS: override power state for LPSS DMA device

  * Add proper palm detection support for MS Precision Touchpad (LP: #1593124)
    - Revert "HID: multitouch: enable palm rejection if device implements
      confidence usage"
    - HID: multitouch: enable palm rejection for Windows Precision Touchpad

  * Add support for Intel 8265 Bluetooth ([8087:0A2B]) (LP: #1599068)
    - Bluetooth: Add support for Intel Bluetooth device 8265 [8087:0a2b]

  * CVE-2016-4794 (LP: #1581871)
    - percpu: fix synchronization between chunk->map_extend_work and chunk
      destruction
    - percpu: fix synchronization between synchronous map extension and chunk
      destruction

  * Xenial update to v4.4.15 stable release (LP: #1601952)
    - net_sched: fix pfifo_head_drop behavior vs backlog
    - net: Don't forget pr_fmt on net_dbg_ratelimited for CONFIG_DYNAMIC...

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released