Hotplug remove and re-add adds PCI adapter to next PCI domain (PCI)

Bug #1603574 reported by bugproxy
12
This bug affects 2 people
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Fix Released
Undecided
Unassigned
Xenial
Fix Released
Undecided
Tim Gardner
Yakkety
Won't Fix
Undecided
Unassigned

Bug Description

== Comment: #26 - Guilherme Guaglianoni Piccoli <email address hidden> - 2016-07-15 16:26:38 ==
When performing hotplug operations in PowerPC currently, if we remove a PHB and re-add it later, its domain address is incremented. An example below (drmgr is the tool we use in PowerPC to perform this operation):

(i) $> lspci
0000:60:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM57810 10 Gigabit Ethernet
0000:60:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM57810 10 Gigabit Ethernet

(ii) $> drmgr -c phb -s "PHB 20" -r # removing a PHB

(iii) $> lspci #empty

(iv) $> drmgr -c phb -s "PHB 20" -a # re-adding the PHB

(v) $> lspci
0001:60:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM57810 10 Gigabit Ethernet
0001:60:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM57810 10 Gigabit Ethernet

This behavior is harmful when kernel is using network predictable naming for interfaces, for example, since after the PHB re-add, the PCI address of NIC functions changes, and so the interfaces' naming also change. Recently, a patch was merged to powerpc-next to avoid this situation, by relating the PCI domain number with device-tree properties. With the patch, the above steps (i) and (v) present the same output for lspci.

We want to ask Canonical to merge the patch in Ubuntu Xenial's kernel, since it will solve multiple bugs we've experienced by performing NIC hotplug - the commit is present on powerpc-next under the commit-id 63a72284b15.

A link to the patch: https://git.kernel.org/cgit/linux/kernel/git/powerpc/linux.git/commit/?id=63a72284b159c569ec52f380c9

Thanks in advance,

Guilherme

bugproxy (bugproxy)
tags: added: architecture-ppc64le bugnameltc-136437 severity-high targetmilestone-inin16041
Changed in ubuntu:
assignee: nobody → Taco Screen team (taco-screen-team)
affects: ubuntu → linux (Ubuntu)
Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2016-07-15 18:14 EDT-------
FYI Canonical: This is an updated and upstream patch to address the issue originally documented in LP Bug 1560514.

Revision history for this message
Tim Gardner (timg-tpi) wrote :
Changed in linux (Ubuntu Xenial):
assignee: nobody → Tim Gardner (timg-tpi)
status: New → In Progress
Changed in linux (Ubuntu Yakkety):
status: New → Fix Committed
Changed in linux (Ubuntu Xenial):
status: In Progress → Fix Committed
Revision history for this message
Stefan Bader (smb) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-xenial
Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2016-08-16 16:09 EDT-------
Checked on proposed kernel 4.4.0-36 and everything is working fine.
Marking as "verification-done-xenial".

Cheers,

Guilherme

tags: added: verification-done-xenial
removed: verification-needed-xenial
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (13.4 KiB)

This bug was fixed in the package linux - 4.4.0-36.55

---------------
linux (4.4.0-36.55) xenial; urgency=low

  [ Stefan Bader ]

  * Release Tracking Bug
    - LP: #1612305

  * I2C touchpad does not work on AMD platform (LP: #1612006)
    - SAUCE: pinctrl/amd: Remove the default de-bounce time

  * CVE-2016-5696
    - tcp: make challenge acks less predictable

linux (4.4.0-35.54) xenial; urgency=low

  [ Stefan Bader ]

  * Release Tracking Bug
    - LP: #1611215

  * [i915_bpo] Sync with v4.7 (LP: #1609742)
    - SAUCE: i915_bpo: Sync with v4.7

  * s390/cio: fix reset of channel measurement block (LP: #1609415)
    - s390/cio: allow to reset channel measurement block

  * in Ubuntu16.10: Hit on Call traces and system goes down when transactional
    memory tests are running in 32TB Brazos system (LP: #1606786)
    - powerpc/tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0
    - powerpc/tm: Fix stack pointer corruption in __tm_recheckpoint()

  * Power Menu does not display after press the Power Button (LP: #1609204)
    - intel-vbtn: new driver for Intel Virtual Button
    - [config] enable CONFIG_INTEL_VBTN=m

  * OptiPlex 7450 AIO hangs when rebooting (LP: #1608762)
    - x86/reboot: Add Dell Optiplex 7450 AIO reboot quirk

  * virtualbox+usb 3.0 breaks boot, -28 kernel works (LP: #1604058)
    - SAUCE: xhci: Fix soft lockup in xhci_pci_probe path when XHCI_STATE_HALTED

  * linux-kernel: Freeing IRQ from IRQ context (LP: #1597908)
    - block: defer timeouts to a workqueue

  * Tunnel offload indications not stripped from encapsulated packets, causing
    performance overhead (LP: #1602755)
    - tunnels: Remove encapsulation offloads on decap.

  * lm-sensors is throwing "ERROR: Can't get value of subfeature temp1_input:
    I/O error" for be2net driver (LP: #1607387)
    - be2net: perform temperature query in adapter regardless of its interface
      state

  * Dell dock MAC Address pass through doesn't work in Ubuntu (LP: #1579984)
    - r8152: Add support for setting pass through MAC address on RTL8153-AD

  * vmxnet3 LRO IPv6 performance issues (stalling TCP) (LP: #1605494)
    - Driver: Vmxnet3: set CHECKSUM_UNNECESSARY for IPv6 packets

  * ISST-LTE:pVM:monklp5:Ubuntu16.04.1:system crashed at
    lpfc_sli4_scmd_to_wqidx_distr (LP: #1597974)
    - SAUCE: lpfc: fix oops in lpfc_sli4_scmd_to_wqidx_distr() from
      lpfc_send_taskmgmt()

  * Backport cxlflash shutdown patch to Xenial SRU (LP: #1605405)
    - SAUCE: cxlflash: Verify problem state area is mapped before notifying
      shutdown

  * Xenial update to v4.4.16 stable release (LP: #1607404)
    - mac80211: fix fast_tx header alignment
    - mac80211: mesh: flush mesh paths unconditionally
    - mac80211_hwsim: Add missing check for HWSIM_ATTR_SIGNAL
    - mac80211: Fix mesh estab_plinks counting in STA removal case
    - EDAC, sb_edac: Fix rank lookup on Broadwell
    - IB/cm: Fix a recently introduced locking bug
    - IB/mlx4: Properly initialize GRH TClass and FlowLabel in AHs
    - powerpc/pseries: Fix IBM_ARCH_VEC_NRCORES_OFFSET since POWER8NVL was added
    - powerpc/tm: Always reclaim in start_thread() for exec() class syscalls
    - usb: dwc2: fix reg...

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
Changed in linux (Ubuntu):
assignee: Taco Screen team (taco-screen-team) → nobody
Changed in linux (Ubuntu Yakkety):
assignee: Taco Screen team (taco-screen-team) → nobody
Revision history for this message
Andy Whitcroft (apw) wrote : Closing unsupported series nomination.

This bug was nominated against a series that is no longer supported, ie yakkety. The bug task representing the yakkety nomination is being closed as Won't Fix.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu Yakkety):
status: Fix Committed → Won't Fix
Po-Hsu Lin (cypressyew)
Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.