kernel: fix x86 DMI checks for PCI quirks

Bug #225811 reported by Matt Domsch on 2008-05-02
Affects          Status         Importance   Assigned to   Milestone
Linux            Fix Released   Medium
linux (Ubuntu)                  Medium       Tim Gardner
  Hardy                         Medium       Tim Gardner
  Intrepid                      Medium       Tim Gardner

Bug Description

http://lkml.org/lkml/2008/5/2/205

On Fri, May 02, 2008 at 02:19:34AM -0700, Yinghai Lu wrote:
> On Fri, May 2, 2008 at 12:44 AM, Ingo Molnar <email address hidden> wrote:
> > * Matt Domsch <email address hidden> wrote:
> >
> > > fix x86 DMI checks for PCI quirks
> > >
> > > http://bugzilla.kernel.org/show_bug.cgi?id=10583
> > > https://bugzilla.redhat.com/show_bug.cgi?id=444791
> > >
> > > Since git commit 08f1c192c3c32797068bfe97738babb3295bbf42 (between
> > > kernels 2.6.22 and 2.6.23), arch/x86/pci/acpi.c has not called
> > > pcibios_scan_root(), which would have called
> > > arch/x86/pci/common.c:dmi_check_system(). This has prevented the
> > > quirks listed in pciprobe_dmi_table[] from being checked and
> > > appropriate action taken.
> >
> > ugh ...
> >
> >
> > > This manifests itself in several Dell and HP servers not automatically
> > > having the pci=bfsort option applied, and in Samsung X20 and
> > > Compaq EVO N800c systems no longer having pci=assign-all-busses
> > > automatically applied.
> >
> > Jesse Barnes (new PCI maintainer) Cc:-ed.
>
> please check the patch in x86.git, it should do the same thing, but
> put the call in pci_access_init...
>
> commit 9817aa147000086bc11b571620ecc1c73a4a614b
> Author: Yinghai Lu <email address hidden>
> Date: Mon Apr 14 15:40:37 2008 -0700

Indeed it does (boot tested on one of the affected systems), and is a
simpler patch. I'd be quite happy with this. Bonus that it's already
in the x86.git tree. :-)

Ingo, is this ready to go to Linus?

Now to get it backported to -stable...

Adding the folks from HP who have lots of systems listed.

Thanks,
Matt

--
Matt Domsch
Linux Technology Strategist, Dell Office of the CTO
linux.dell.com & www.dell.com/linux

Changed in linux:
status: Unknown → Fix Released
Tim Gardner (timg-tpi) on 2008-05-02
Changed in linux:
assignee: nobody → timg-tpi
importance: Undecided → Medium
milestone: none → ubuntu-8.04.1
status: New → In Progress
Matt Domsch (matt-domsch) wrote :

fix I submitted to <email address hidden>.

Tim Gardner (timg-tpi) wrote :

SRU Justification:

Impact: Several Dell and HP servers do not detect network interfaces in the correct order.

Fix Description: Call dmi_check_pciprobe() earlier in the PCI scan process.

Patch: http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-hardy.git;a=commit;h=9165bd62ac85685eebd4e9e6d10617c0a1adac24

TEST CASE: Upgrade a server from Gutsy to Hardy. The network interfaces are out of order in Hardy from what they were in Gutsy.
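
As a rough illustration of the fix description above, the sketch below models a DMI quirk table in Python rather than kernel C. The table entries, the PciOptions flags, and the apply_dmi_quirks() and pci_init() helpers are hypothetical stand-ins, not the contents of pciprobe_dmi_table[] or the actual patch. The only point it tries to show is that the DMI check runs once during early PCI init, so it no longer depends on which root-bus scan path is taken afterwards.

```python
from dataclasses import dataclass


@dataclass
class PciOptions:
    bfsort: bool = False
    assign_all_busses: bool = False


# Hypothetical quirk table: (DMI fields that must match, option to enable).
# Entries are examples only, not a copy of the kernel's table.
QUIRK_TABLE = [
    ({"sys_vendor": "Dell", "product_name": "PowerEdge 1950"}, "bfsort"),
    ({"sys_vendor": "Samsung", "product_name": "X20"}, "assign_all_busses"),
]


def apply_dmi_quirks(dmi, opts):
    """Enable every option whose match fields all appear in `dmi`.

    Matching is by substring, which is an assumption of this sketch.
    """
    for match, option in QUIRK_TABLE:
        if all(value in dmi.get(key, "") for key, value in match.items()):
            setattr(opts, option, True)
    return opts


def pci_init(dmi, use_acpi_scan):
    """Run the quirk check first, then pick a (simulated) scan path."""
    opts = PciOptions()
    apply_dmi_quirks(dmi, opts)  # happens before any bus is scanned
    scan_path = "acpi" if use_acpi_scan else "legacy"
    return opts, scan_path
```

In this model the quirk is applied whether or not the ACPI scan path is used, which is the behavior the regression had broken.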

Steve Langasek (vorlon) wrote :

Dropping the 'verification-needed' tag, which is for packages that have already been accepted into -proposed and are ready to be verified; from your last comment, I understand that this has only been committed to git, it's not yet available in -proposed.

I'm also not clear on why a change in the ordering of network interface detection should be an SRU justification. We use udev to enforce network device name ordering, so why should changing the order of the kernel scanning warrant an SRU?

Matt Domsch (matt-domsch) wrote :

Steve:

The fundamental problem is that users expect the NICs on the motherboard to be named "eth0" and "eth1" in Linux. Without this SRU, they're named "eth1" and "eth0" (i.e. backwards compared to the silkscreen on the case and the BIOS SETUP pages), which is confusing to system administrators.

http://linux.dell.com/files/whitepapers/nic-enum-whitepaper-v3.pdf is a whitepaper I wrote describing the problem and various solutions. Auto-enabling pci=bfsort is really one workaround, about a 90% solution, to the generic problem. That hints at a 100% solution, which is a new udev helper program called biosdevname (http://linux.dell.com/biosdevname and http://linux.dell.com/git biosdevname).

Yes, it can be fixed up after the fact with udev rules. But it's better if it need not be, which is what this patch allows.

This was a bug introduced into the 2.6.23 kernel that disabled this feature that had been around for several kernel releases. This SRU fixes that bug. It affects 5 Dell servers, and 20+ HP servers, as well as a couple laptops.
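
To make the ordering problem described above more concrete, here is a small conceptual model in Python of depth-first versus breadth-first device discovery. The two-bus topology, the NIC labels, and all function names are invented for illustration, and this is not how the kernel implements pci=bfsort; it only shows how the traversal order can flip which onboard NIC ends up as eth0.

```python
from collections import deque

# Hypothetical topology: bus number -> entries in slot order. An entry is
# either a NIC label (as printed on the case) or a ("bridge", child_bus) tuple.
TOPOLOGY = {
    0: [("bridge", 1), "NIC1"],
    1: ["NIC2"],
}


def depth_first(bus=0):
    """Descend into each bridge as soon as it is encountered."""
    found = []
    for entry in TOPOLOGY[bus]:
        if isinstance(entry, tuple):
            found.extend(depth_first(entry[1]))
        else:
            found.append(entry)
    return found


def breadth_first(root=0):
    """Finish every device on a bus before visiting its child buses."""
    found, pending = [], deque([root])
    while pending:
        bus = pending.popleft()
        for entry in TOPOLOGY[bus]:
            if isinstance(entry, tuple):
                pending.append(entry[1])
            else:
                found.append(entry)
    return found


def name_interfaces(nics):
    """Hand out eth0, eth1, ... in discovery order."""
    return {f"eth{i}": nic for i, nic in enumerate(nics)}
```

With this made-up topology, name_interfaces(depth_first()) maps eth0 to NIC2 and eth1 to NIC1, while name_interfaces(breadth_first()) gives the order an administrator would expect from the labels.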

Changed in dell:
importance: Undecided → High
status: New → Confirmed
Steve Langasek (vorlon) wrote :

A couple of observations here:

- this regression won't affect anyone upgrading from previous Ubuntu releases, because we do have udev rules in place which by default remember the device name mappings once established.
- for the same reason, this fix won't /benefit/ any users who have already installed 8.04, because the "wrong" mapping will already be committed in /etc/udev/rules.d/70-persistent-net.rules.

On balance, I agree that we want this for .1; but we at least need to be aware that users who installed 8.04 will still be affected by the reversed device names.
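
The two observations above can be sketched with a simplified Python model of persistent interface naming. This is not the real udev rule generator: the assign_names() helper, the dictionary standing in for /etc/udev/rules.d/70-persistent-net.rules, and the MAC placeholders are all assumptions made for the example.

```python
def assign_names(kernel_order, persistent_rules):
    """Return MAC -> interface name, preferring names already recorded.

    `kernel_order` lists MAC addresses in kernel discovery order.
    `persistent_rules` maps MAC -> name and is updated in place, loosely
    like entries being appended to the persistent-net rules file.
    """
    for mac in kernel_order:
        if mac not in persistent_rules:
            used = set(persistent_rules.values())
            index = 0
            while f"eth{index}" in used:
                index += 1
            persistent_rules[mac] = f"eth{index}"
    return {mac: persistent_rules[mac] for mac in kernel_order}


# Upgrade case: names recorded by an earlier release survive a reversed
# discovery order.
upgraded = {"mac-nic1": "eth0", "mac-nic2": "eth1"}
after_upgrade = assign_names(["mac-nic2", "mac-nic1"], upgraded)

# Fresh 8.04 install: the reversed order is recorded on first boot, and a
# later fixed kernel does not change it.
fresh = {}
first_boot = assign_names(["mac-nic2", "mac-nic1"], fresh)
after_fix = assign_names(["mac-nic1", "mac-nic2"], fresh)
```

In this model a machine upgraded from an earlier release keeps its recorded names, and a machine freshly installed with the affected kernel keeps the reversed names even after the fix, unless the recorded rules are removed or edited by hand.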

Changed in linux:
assignee: nobody → timg-tpi
importance: Undecided → Medium
milestone: none → ubuntu-8.04.1
status: New → Incomplete
status: Incomplete → In Progress
Steve Langasek (vorlon) on 2008-05-07
Changed in linux:
milestone: ubuntu-8.04.1 → none
Martin Pitt (pitti) wrote :

Accepted into -proposed, please test and give feedback here

Changed in linux:
milestone: ubuntu-8.04.1 → none
status: In Progress → Fix Committed
Steve Langasek (vorlon) on 2008-06-04
Changed in linux:
milestone: none → ubuntu-8.04.1
Martin Pitt (pitti) wrote :

Copied to hardy-updates. The new kernel was tested extensively by many people, who reported back in other bug reports. Due to lack of feedback, this particular bug was not confirmed to be tested, though. Please report back here if the bug still occurs for you with the new kernel packages, then we will reopen this bug.

Changed in linux:
status: Fix Committed → Fix Released
Changed in dell:
status: Confirmed → Fix Released

This is "Fix Released" for Intrepid.

Changed in linux:
status: Fix Committed → Fix Released
Changed in linux:
importance: Unknown → Medium
Changed in somerville:
importance: Undecided → High
status: New → Fix Released
no longer affects: dell
Timothy R. Chavez (timrchavez) wrote :

The bug task for the somerville project has been removed by an automated script. This bug has been cloned on that project and is available here: https://bugs.launchpad.net/bugs/1305499

no longer affects: somerville