powerpc/powernv/pci: Work around races in PCI bridge enabling

Bug #1788549 reported by bugproxy
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
The Ubuntu-power-systems project | Incomplete | Critical | Canonical Kernel Team |
linux (Ubuntu) | In Progress | Critical | Canonical Kernel Team |
Xenial | In Progress | Critical | Canonical Kernel Team |
Bionic | In Progress | Critical | Canonical Kernel Team |
Cosmic | Fix Released | Critical | Canonical Kernel Team |

Bug Description

== Comment: #0 - Michael Ranweiler <email address hidden> - 2018-08-21 13:52:03 ==
+++ This bug was initially created as a clone of Bug #170766 +++

Please apply the following kernel patch:

https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?h=next&id=db2173198b9513f7add8009f225afa1f1c79bcc6

powerpc/powernv/pci: Work around races in PCI bridge enabling

The generic code is racy when multiple children of a PCI bridge try to
enable it simultaneously.

This leads to drivers trying to access a device through a
not-yet-enabled bridge, and thus to EEH errors under various
circumstances when using parallel driver probing.

There is work going on to fix that properly in the PCI core but it
will take some time.

x86 gets away with it because (outside of hotplug), the BIOS enables
all the bridges at boot time.

This patch does the same thing on powernv by enabling all bridges that
have child devices at boot time, thus avoiding subsequent races. It's
suitable for backporting to stable and distros, while the proper PCI
fix will probably be significantly more invasive.

Signed-off-by: Benjamin Herrenschmidt <email address hidden>
Cc: <email address hidden>
Signed-off-by: Michael Ellerman <email address hidden>
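
The approach described in the commit message (enable, at boot, every bridge that has something below it) can be sketched as a walk over the bus tree. This is a user-space toy model in Python, not the kernel patch itself; the Bus class, the enable_bridges function and the example topology are hypothetical names chosen for illustration, and "has something below it" is modeled here as "has endpoint devices or child buses".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Bus:
    """Toy PCI bus: endpoint devices on it, child buses below it."""
    name: str
    has_bridge: bool = True            # a root bus has no upstream bridge
    devices: List[str] = field(default_factory=list)
    children: List["Bus"] = field(default_factory=list)
    bridge_enabled: bool = False


def enable_bridges(bus: Bus) -> List[str]:
    """Enable every bridge that has something below it.

    Returns the names of the buses whose bridge was enabled, in visit order.
    """
    # Nothing sits behind this bridge, so nothing can race to enable it: skip.
    if not bus.devices and not bus.children:
        return []
    enabled = []
    if bus.has_bridge:
        bus.bridge_enabled = True
        enabled.append(bus.name)
    # Repeat for the buses below this one.
    for child in bus.children:
        enabled.extend(enable_bridges(child))
    return enabled


# Hypothetical topology: one populated bus, one empty bus, one nested bus.
root = Bus("root", has_bridge=False, children=[
    Bus("bus1", devices=["nic0", "nic1"]),
    Bus("bus2"),
    Bus("bus3", children=[Bus("bus4", devices=["nvme0"])]),
])
print(enable_bridges(root))
```

Because the walk runs once, before any driver probes in parallel, every populated bridge in this model is already enabled by the time a child device is probed, so no two children ever compete to enable it. The empty bus2 is left untouched.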

== Comment: #2 - Michael Ranweiler <email address hidden> - 2018-08-21 18:23:35 ==
The patch applies with some fuzz, and the error message also has to be moved back from the pci macro to dev_err, so we'll attach the backported patch.

bugproxy (bugproxy)
tags: added: architecture-ppc64le bugnameltc-170793 severity-critical targetmilestone-inin---
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → kernel-package (Ubuntu)
Changed in ubuntu-power-systems:
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
importance: Undecided → Critical
tags: added: triage-g
affects: kernel-package (Ubuntu) → linux (Ubuntu)
Changed in linux (Ubuntu Bionic):
status: New → In Progress
Changed in linux (Ubuntu Cosmic):
status: New → In Progress
Changed in linux (Ubuntu Bionic):
importance: Undecided → Critical
Changed in linux (Ubuntu Cosmic):
importance: Undecided → Critical
Changed in linux (Ubuntu Bionic):
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Cosmic):
assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → Joseph Salisbury (jsalisbury)
Joseph Salisbury (jsalisbury) wrote :

I built a test kernel with commit db2173198b95. The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1788549

Can you test this kernel and see if it resolves this bug?

Note about installing test kernels:
• If the test kernel is prior to 4.15 (Bionic), you need to install the linux-image and linux-image-extra .deb packages.
• If the test kernel is 4.15 (Bionic) or newer, you need to install the linux-modules, linux-modules-extra and linux-image-unsigned .deb packages.
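
The rule in the two bullets above can be written down as a small helper. This is only a sketch in Python: the function name packages_to_install is hypothetical, the version strings in the usage lines are examples, and the returned names are the package name prefixes from the note, not full .deb file names.

```python
import re
from typing import List


def packages_to_install(kernel_version: str) -> List[str]:
    """Return the package name prefixes to install for a test kernel version."""
    match = re.match(r"(\d+)\.(\d+)", kernel_version)
    if match is None:
        raise ValueError(f"unrecognized kernel version: {kernel_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Kernels before 4.15 (Bionic) ship the image and the extra modules.
    if (major, minor) < (4, 15):
        return ["linux-image", "linux-image-extra"]
    # 4.15 (Bionic) and newer split the modules out and use an unsigned image.
    return ["linux-modules", "linux-modules-extra", "linux-image-unsigned"]


print(packages_to_install("4.4.0"))
print(packages_to_install("4.15.0"))
```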

Thanks in advance!

Changed in ubuntu-power-systems:
status: New → In Progress
Changed in linux (Ubuntu Xenial):
status: New → In Progress
importance: Undecided → Critical
assignee: nobody → Joseph Salisbury (jsalisbury)
Joseph Salisbury (jsalisbury) wrote :

I also built a Xenial test kernel with commit db2173198b9513f7add8009f225afa1f1c79bcc6, since it was requested in bug 1788850. That bug is now marked as a duplicate of this bug.

The Xenial test kernel is available from:
http://kernel.ubuntu.com/~jsalisbury/lp1788549/xenial

Manoj Iyer (manjo) wrote :

IBM, could you please verify the PPA test kernels provided by the kernel team, and report back here.

Manoj Iyer (manjo)
Changed in ubuntu-power-systems:
status: In Progress → Incomplete
Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
status: Incomplete → In Progress
Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
status: In Progress → Incomplete
Andrew Cloke (andrew-cloke) wrote :

IBM is currently testing the test kernel from the PPA referred to in comment #2.

Changed in linux (Ubuntu):
status: In Progress → Incomplete
Changed in linux (Ubuntu Xenial):
status: In Progress → Incomplete
Changed in linux (Ubuntu Bionic):
status: In Progress → Incomplete
Changed in linux (Ubuntu Cosmic):
status: In Progress → Incomplete
Mike Ranweiler (mranweil) wrote :

I've tried this on two systems and ran several different tests. I can't recreate the original problem, which is intermittent and can require quite a few devices, but I did not see any regressions.

Changed in ubuntu-power-systems:
status: Incomplete → Triaged
Changed in linux (Ubuntu):
status: Incomplete → In Progress
Changed in linux (Ubuntu Xenial):
status: Incomplete → In Progress
Changed in linux (Ubuntu Bionic):
status: Incomplete → In Progress
Changed in linux (Ubuntu Cosmic):
status: Incomplete → In Progress
Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
status: Triaged → In Progress
Joseph Salisbury (jsalisbury) wrote :

@Mike, do you feel comfortable enough with the testing you've done to submit an SRU request? Or would you like to perform some additional testing?

Changed in linux (Ubuntu Cosmic):
status: In Progress → Fix Released
Andrew Cloke (andrew-cloke) wrote :

Marking as Incomplete while awaiting feedback on testing status.

Changed in ubuntu-power-systems:
status: In Progress → Incomplete
Changed in linux (Ubuntu):
assignee: Joseph Salisbury (jsalisbury) → Canonical Kernel Team (canonical-kernel-team)
Changed in linux (Ubuntu Xenial):
assignee: Joseph Salisbury (jsalisbury) → Canonical Kernel Team (canonical-kernel-team)
Changed in linux (Ubuntu Bionic):
assignee: Joseph Salisbury (jsalisbury) → Canonical Kernel Team (canonical-kernel-team)
Changed in linux (Ubuntu Cosmic):
assignee: Joseph Salisbury (jsalisbury) → Canonical Kernel Team (canonical-kernel-team)
Brad Figg (brad-figg)
tags: added: cscc
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2020-04-24 02:36 EDT-------
The bug is pretty old now.

The link the distro provided in the 2018 time frame, https://kernel.ubuntu.com/~jsalisbury/lp1788549/xenial, is no longer valid.

Rejecting the bug. Please feel free to open a new bug if you see the issue in the latest release.

Frank Heimes (fheimes) wrote :

Just FYI:
this bug is marked as a duplicate of bug 1805245 and is therefore no longer updated itself.
The work was done in bug 1805245, which got Fix Released (for bionic), see:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1805245
