Fix MSI mapping quirk on HT-based nVidia platform

Bug #181081 reported by Stas Sușcov
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
Linux | Fix Released | Medium | |
linux (Fedora) | Fix Released | High | |
linux (Ubuntu) | Fix Released | Medium | Tim Gardner |
Hardy | Fix Released | Medium | Tim Gardner |
Intrepid | Fix Released | Medium | Tim Gardner |

Bug Description

Binary package hint: nic-modules-2.6.24-2-generic-di

My network controller cannot get an IP address from DHCP after installing the 7th December build of Ubuntu Hardy.
At first I thought it was caused by this bug https://bugs.edge.launchpad.net/ubuntu/+bug/124086 but after testing the CD-ROM on a different computer (where everything worked) I realized that the problem is with the r8169 module on my machine under the Hardy release kernel 2.6.24.

lspci reports my network controller as:
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 01)

The system I own is an Asus A6M laptop, with an AMD 64-bit CPU on an NForce2 motherboard.

Brian Murray (brian-murray) wrote :

Please include as attachments the following additional information, if you have not already done so (please pay attention to lspci's additional options), as required by the Ubuntu Kernel Team:
1. Please include the output of the command 'uname -a' in your next response. It should be one, long line of text which includes the exact kernel version you're running, as well as the CPU architecture.
2. Please run the command 'dmesg > dmesg.log' and attach the resulting file 'dmesg.log' to this bug report.
3. Please run the command 'sudo lspci -vvnn > lspci-vvnn.log' and attach the resulting file 'lspci-vvnn.log' to this bug report.
For your reference, the full description of procedures for kernel-related bug reports is available at https://wiki.ubuntu.com/KernelTeamBugPolicies . Thanks in advance!

Changed in linux:
status: New → Incomplete
Stas Sușcov (sushkov) wrote : Re: [Bug 181081] Re: r8169 module for Ethernet controller doesn't work well with hardy heron kernel 2.6.24

Thank you for your assistance.
I've attached the specified files.

--
() ASCII Ribbon Campaign
/\ http://stas.nerd.ro/ascii/

Brian Murray (brian-murray) wrote : Re: r8169 module for Ethernet controller doesn't work well with hardy heron kernel 2.6.24

Unfortunately, Launchpad does not accept attachments via e-mail. Could you please upload them by visiting your bug's web page? Thanks in advance.

Stas Sușcov (sushkov) wrote :

dmesg

Stas Sușcov (sushkov) wrote :

lspci -vvnn

Stas Sușcov (sushkov) wrote :

uname -a

Laurent (laurent-goujon) wrote :

I have the same problem, so I'm attaching my own logs.

I noticed some warning messages when running dhclient manually:
[ 1154.691236] NETDEV WATCHDOG: eth0: transmit timed out
[ 1154.699360] r8169: eth0: link up

$ uname -a
Linux gemini 2.6.24-3-generic #1 SMP Thu Jan 3 22:50:33 UTC 2008 x86_64 GNU/Linux

Stas Sușcov (sushkov) wrote :

Sorry, I didn't know that.
Can you please check these files?
Thank you!

Laurent (laurent-goujon) wrote :
Laurent (laurent-goujon) wrote :
description: updated
Changed in linux:
assignee: nobody → ubuntu-kernel-team
importance: Undecided → Medium
status: Incomplete → Triaged
description: updated
Stas Sușcov (sushkov) wrote :

I've backgrounded a tcpdump -i eth0 -vv > tcpdump.log
and ran dhclient eth0.
The log is attached.

In , Nicolò (nicol-redhat-bugs) wrote :

Description of problem:

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:

In , Nicolò (nicol-redhat-bugs) wrote :

Very sorry, I hit Enter by mistake...

Description of problem:
My gigabit Ethernet card used to work with kernels before 2.6.24; now it doesn't.

Additional info:
lspci -v:
01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 01)
 Subsystem: ASUSTeK Computer Inc. Unknown device 11f5
 Flags: bus master, fast devsel, latency 0, IRQ 11
 I/O ports at c800 [size=256]
 Memory at dcfff000 (64-bit, non-prefetchable) [size=4K]
 Expansion ROM at dcfe0000 [disabled] [size=64K]
 Capabilities: [40] Power Management version 2
 Capabilities: [48] Vital Product Data <?>
 Capabilities: [50] Message Signalled Interrupts: Mask- 64bit+ Queue=0/1 Enable-
 Capabilities: [60] Express Endpoint, MSI 00
 Capabilities: [84] Vendor Specific Information <?>
 Kernel driver in use: r8169

relevant dmesg:
...
eth0: RTL8168b/8111b at 0xffffc2000001e000, 00:18:f3:32:78:91, XID 38000000 IRQ 11
...
r8169: eth0: link up

Which is exactly identical to what it said in 2.6.23...
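
An aside for readers comparing their own output: in the lspci -v listing above, the `Enable-` flag at the end of the Message Signalled Interrupts line means MSI was not enabled for the device when the listing was captured. The following is a minimal Python sketch for checking that flag. It assumes the older lspci wording shown in this report (newer lspci versions print the capability differently), and the helper name `msi_enabled` is hypothetical.

```python
import re

# Matches the MSI capability line in the older lspci format quoted above
# and captures the trailing '+' or '-' of the Enable flag.
MSI_LINE = re.compile(r"Message Signalled Interrupts:.*\bEnable([+-])")


def msi_enabled(lspci_text):
    """Return True/False for the first MSI capability line, or None if absent."""
    match = MSI_LINE.search(lspci_text)
    if match is None:
        return None
    return match.group(1) == "+"


# The capability line exactly as pasted in the comment above.
sample = ("Capabilities: [50] Message Signalled Interrupts: "
          "Mask- 64bit+ Queue=0/1 Enable-")
print(msi_enabled(sample))  # False
```

To use it on a real machine, feed it the saved output of `lspci -v` for the device in question.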

In , Vaclav (vaclav-redhat-bugs) wrote :

Can you post the latest kernel version you have tried? Also please send output
from ifconfig, ethtool.
If you have time and can still reproduce the bug with the latest kernel version,
please read http://fedoraproject.org/wiki/BugsAndFeatureRequests and add a more
precise description to this bug.

In , Nicolò (nicol-redhat-bugs) wrote :

Created attachment 296365
ethtool output

Please forgive me for the lack of detail in initial report, I had no time to
deeply test anything.

- Latest kernel I've tried is 2.6.24-2.fc9 (from the livecd) both 32 and 64
bits

attached ethtool

Stas Sușcov (sushkov) wrote :

Tested hardy beta-5 on a r8169 eth card and it worked!
Please close this bug.

Changed in linux:
status: Triaged → Fix Released
Brian Murray (brian-murray) wrote :

Stanislav, by setting the bug to Fix Released you have effectively closed the bug report. It might be useful for other bug reporters if you were to add the specific kernel version, found via 'cat /proc/version_signature', in which you found this fixed. Thanks again!

Laurent (laurent-goujon) wrote :

Still not working for me with the latest kernel version:

$ cat /proc/version_signature
Ubuntu 2.6.24-10.16-generic

However, after diffing 2.6.22 against 2.6.24, I saw that one of the main differences is that MSI is enabled. So I added pci=nomsi at the end of the kernel boot command line, and the wired network worked again.
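
For anyone trying this workaround, it can help to confirm that the option actually reached the running kernel. The following is a minimal Python sketch that inspects a kernel command line such as the contents of /proc/cmdline. The function names and the sample command line are hypothetical; only the `pci=nomsi` option comes from this report.

```python
def pci_options(cmdline):
    """Collect every value passed via pci= on a kernel command line."""
    options = []
    for token in cmdline.split():
        if token.startswith("pci="):
            # pci= accepts a comma-separated list, e.g. pci=nomsi,noacpi
            options.extend(token[len("pci="):].split(","))
    return options


def msi_disabled_at_boot(cmdline):
    """True if the kernel was booted with pci=nomsi."""
    return "nomsi" in pci_options(cmdline)


# Hypothetical command line; on a real system read /proc/cmdline instead.
example = "root=/dev/sda1 ro quiet splash pci=nomsi"
print(msi_disabled_at_boot(example))  # True
```

On a live system the same check would be `msi_disabled_at_boot(open("/proc/cmdline").read())`.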

Stas Sușcov (sushkov) wrote : Re: [Bug 181081] Re: r8169 module for Ethernet controller doesn't work well with hardy heron kernel 2.6.24

Laurent: your network card uses the r8169 module and you tested it
with hardy beta 5?

--
() ASCII Ribbon Campaign
/\ http://stas.nerd.ro/ascii/

Laurent (laurent-goujon) wrote : Re: r8169 module for Ethernet controller doesn't work well with hardy heron kernel 2.6.24

Yes, my network card is using the r8169 module.
I suppose the kernel included in beta 5 is 2.6.24-10.16, but I have had this problem ever since I started using 2.6.24.

Laurent (laurent-goujon) wrote :

~$ cat /proc/version_signature
Ubuntu 2.6.24-8.14-generic

Same thing

In , Vaclav (vaclav-redhat-bugs) wrote :

I suppose you are using DHCP, right? Please try assigning a fixed IP address to the
adapter and try again. Also try the latest kernel, as rawhide is a
moving target and it can be broken in some older versions :-). I have a similar
network card in my notebook and it works correctly with the latest kernels.

In , Nicolò (nicol-redhat-bugs) wrote :

Yes, I was using DHCP. I've also tried a static IP, but the results are the same
(except that pinging the router makes its LED blink, but I still get "Destination
host Unreachable").
The latest Fedora kernel does not boot, so sorry, I cannot try it...
Anyway, a friend of mine who has the same notebook as me (Asus A6T) and uses Gentoo
told me that he is having the same problem, from 2.6.24 to whatever is the
latest version of vanilla-sources (2.6.25.?)...
If it's really the same problem, then it's not Fedora-specific; should I file
a bug at kernel.org?

In , Vaclav (vaclav-redhat-bugs) wrote :

Oh, I nearly forgot. Is there anything interesting in /var/log/messages? I
also noticed a similar bug report on the Debian list with the same notebook type.
Please also try booting with acpi=off; if that works, narrow it down by trying
pci=noacpi instead. You should also try these options with the latest kernel; they
should make it bootable for you again.

In , Nicolò (nicol-redhat-bugs) wrote :

Argh, I remember there are many errors regarding many different, and I hope
unrelated problems, including at least the following (quoting from memory,
I'll send precise output ASAP, away from home now...)
-various kinds of ACPI errors, both with acpi=off and pci=noacpi
-various other APIC errors: I have to use "noapic" to get anything below
2.6.23 to boot (known problem for this laptop... :( ), 2.6.24 on the other
hand, boots even without this... hmmm...
-something being not E820-reserved
-tons of repeated lines of complaints about ata1, which I will do you the
favour of removing, because they make the rest unreadable

In spite of all the above, my machine worked *just fine* until now...

In , Jay (jay-redhat-bugs) wrote :

I can confirm that the kernel upgrade had the same effect on my system; works
correctly under the prior kernel, the latest upgrade broke it. Let me know what
additional steps you'd like me to take; DHCP is broken, haven't attempted a
static assignment yet but can.

Laurent (laurent-goujon) wrote : Re: r8169 module for PCI ID 1043:11f5 doesn't work well with hardy heron kernel 2.6.24

Putting this back in the Confirmed state. I still have this bug, and obviously no fix has been released.

Changed in linux:
status: Fix Released → Confirmed
Shane Guillory (shane-rppl) wrote :

Some more feedback on this -

We are using Xubuntu Feisty and testing Hardy with a Kontron COMexpress CPU module with the RealTek 8111/8168 PCI-E adapter and had similar issues to other bugs reported here.

The manufacturer's r8168 driver (now at 8.005.00 as of 2008-01-29) for the 8111/8168 family(PCI id 10EC:8168) is generally much more robust than the r8169 driver in the kernel. With the mfg's r8168 (8168, that is, not 8169), I don't need to periodically restart the network any more, and it also fixed a major issue for us - the network adapter only worked if the adapter was physically connected at startup, the network would come up but not quite work right if connected after startup.

I don't know the process for this, but based on our experience, we would recommend that the mfg's 8168 driver be incorporated into the kernel, and the [10EC:8168] pci id to be unassociated with the r8169 driver.

For those needing a workaround, the mfg's site is:
http://www.realtek.com.tw/downloads/downloadsView.aspx?Langid=1&PNid=13&PFid=5&Level=5&Conn=4&DownTypeID=3&GetDown=false

Changed in linux:
status: Unknown → Confirmed
Changed in linux:
status: Unknown → Confirmed
Changed in linux:
status: Confirmed → In Progress
In , Nicolò (nicol-redhat-bugs) wrote :

WORKS, at least for me,
F9-Beta, 2.6.25-0.121.rc5.git4.fc9 (why don't you use simpler version
numbers? ;-)

In , Vaclav (vaclav-redhat-bugs) wrote :

Great, I'm closing it with rawhide resolution.

Changed in linux:
status: Confirmed → Fix Released
Laurent (laurent-goujon) wrote :

As stated in the Fedora bug, and as I have tested myself, the current Linus tree doesn't exhibit the problem.
I'll try to find the source of the problem using git and git-bisect.

Shane Guillory (shane-rppl) wrote :

Regarding the inability to reproduce the problem, the problems that we have observed are squirrelly and vary from unit to unit. The only thing that we can consistently reproduce with our Kontron COMExpress-CD boards is that the network does not work with r8169 driver if the cable is not connected at boot time (even if the driver is re-loaded once the network is connected).

RealTek's r8168 driver works gloriously every time; it seems that it does a better job of initializing the device and clearing pathological states that stochastically come up in the boards that we are using.

Given that RealTek has found it necessary to release a separate driver for the r8168 (which they are keeping up to date and releasing under GPL), and multiple users are reporting that it fixes issues for them on their specific hardware compared to the r8169, is there a compelling reason NOT to use this driver in the kernel for R8168/8111 devices instead of the r8169?

Laurent (laurent-goujon) wrote :

After a lot of kernel compilations, it seems the bug has been fixed (for me at
least) since linux-2.6.25-rc3.
Using git bisect, it seems the following patch fixed the bug:

commit 9dc625e72309e1c919ea3e7f51d0ffca96123787
Author: Peer Chen <email address hidden>
Date: Mon Feb 4 23:50:13 2008 -0800

    PCI: quirks: set 'En' bit of MSI Mapping for devices onHT-based nvidia platform

    According to HT spec, to get message interrupt from devices mapped to HT
    interrupt message, the 'En' bit of MSI Mapping capability need to be set.
    The patch do this setting in quirks code for the devices on HT-based nvidia
    platform.

    [<email address hidden>: coding-style fixes]

    Signed-off-by: Andy Currid <email address hidden>
    Signed-off-by: Peer Chen <email address hidden>
    Cc: "Eric W. Biederman" <email address hidden>
    Cc: Ingo Molnar <email address hidden>
    Cc: Thomas Gleixner <email address hidden>
    Cc: Andi Kleen <email address hidden>
    Signed-off-by: Andrew Morton <email address hidden>
    Signed-off-by: Greg Kroah-Hartman <email address hidden>

:040000 040000 479224d3d9b51c6554b70f00224963ec124cb6a7 a0e3e966c5b27a7508cc63423d477285cd52278f M drivers

This patch seems to fix MSI handling for NVidia chipset.

Next step: backport the patch to 2.6.24-16 (should be fairly simple) and check if it fixes the bug definitively
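
To illustrate what "setting the 'En' bit" amounts to, here is a minimal Python sketch that simulates the read-modify-write on a fake PCI configuration space. This is not the kernel patch itself. The flags offset and bit value are what I recall from the kernel's PCI register definitions and should be treated as assumptions, and the capability offset 0x60 is purely hypothetical.

```python
# Assumed layout of the HyperTransport MSI Mapping capability:
HT_MSI_FLAGS = 0x02          # offset of the flags byte within the capability
HT_MSI_FLAGS_ENABLE = 0x01   # the 'En' bit inside that flags byte


def enable_ht_msi_mapping(config, cap_pos):
    """Set the 'En' bit in place; return True if it had been clear."""
    flags = config[cap_pos + HT_MSI_FLAGS]
    if flags & HT_MSI_FLAGS_ENABLE:
        return False  # already enabled, nothing to do
    config[cap_pos + HT_MSI_FLAGS] = flags | HT_MSI_FLAGS_ENABLE
    return True


# Simulated 256-byte configuration space with the capability at a
# hypothetical offset and the 'En' bit initially clear.
config = bytearray(256)
cap_pos = 0x60
print(enable_ht_msi_mapping(config, cap_pos))   # True
print(hex(config[cap_pos + HT_MSI_FLAGS]))      # 0x1
```

The real quirk does the equivalent with PCI config-space reads and writes on the affected nVidia devices; the simulation only shows the bit manipulation.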

Laurent (laurent-goujon) wrote :

Patching was successful (though the offsets were off), and after a reboot I got an IP address as usual.

If possible, please include it in Hardy. Having no network is not a good experience, especially for a long-term support release.

Changed in linux:
status: Confirmed → Triaged
Changed in linux:
milestone: none → ubuntu-8.04.1
Leann Ogasawara (leannogasawara) wrote :

Hi Laurent,

Care to test the Intrepid Ibex 8.10 kernel? It was most recently rebased with the upstream 2.6.25 kernel (which contains the patch you've referenced) and is currently available in the following PPA:

https://edge.launchpad.net/~kernel-ppa/+archive

If you are not familiar with how to install packages from a PPA, do the following:

Create the file /etc/apt/sources.list.d/kernel-ppa.list to include the following two lines:

deb http://ppa.launchpad.net/kernel-ppa/ubuntu hardy main
deb-src http://ppa.launchpad.net/kernel-ppa/ubuntu hardy main

Then run the command:

sudo apt-get update

You should then be able to install the linux-image-2.6.25 kernel package. After you've finished testing you can remove the kernel-ppa.list file and run 'sudo apt-get update' once more to restore your system. Please let us know your results. Thanks.

Tim Gardner (timg-tpi) wrote :

SRU Justification:

Impact: MSI interrupts do not work correctly on some nVidia platforms

Fix Description: According to the HT spec, to get message interrupts from devices mapped to HT interrupt messages, the 'En' bit of the MSI Mapping capability needs to be set. The patch sets this bit in the quirks code for devices on HT-based nVidia platforms. This patch is also in 2.6.25 stable.

Patch: http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-hardy.git;a=commit;h=6878851b6f94bab0f847ff97b58423d893d145c2

TEST CASE: See Bug Description.

Tim Gardner (timg-tpi) wrote :
Changed in linux:
assignee: ubuntu-kernel-team → timg-tpi
status: Triaged → Fix Committed
Laurent (laurent-goujon) wrote :

Hi Leann,

I successfully tested the latest linux-2.6.25-1-generic package. I'm now waiting for the fixed package for Hardy Heron.

Changed in linux:
assignee: nobody → timg-tpi
importance: Undecided → Medium
milestone: none → ubuntu-8.04.1
status: New → Fix Committed
Steve Langasek (vorlon)
Changed in linux:
milestone: ubuntu-8.04.1 → none
Martin Pitt (pitti) wrote :

Accepted into -proposed, please test and give feedback here

Steve Langasek (vorlon) wrote :

Laurent, Shane, can you please test the 2.6.24-19 kernel in hardy-proposed and let us know whether this kernel works for you?

Laurent (laurent-goujon) wrote :

2.6.24-19 just tested and working. Yeepee

Martin Pitt (pitti) wrote :

Copied to hardy-updates.

Changed in linux:
status: Fix Committed → Fix Released
Leann Ogasawara (leannogasawara) wrote :

Marking this "Fix Released" against Intrepid. Thanks.

Changed in linux:
status: Fix Committed → Fix Released
Stas Sușcov (sushkov) wrote :

Request for reopening the bug!
I'm testing the Ubuntu 8.04 and 8.04.1 netboot CD images, which come with the 2.6.24-16-generic kernel, and my laptop's Ethernet card (it uses the r8169 module) can't get an IP address from the DHCP server.

I'm currently on hardy, where everything is ok.

My laptop id in hwdb is ebe3e0e1d79c845c3b0b9a032b1f68dd

Please let me know if there's need of any logs or reports.

Martin Pitt (pitti) wrote :

But the bug wasn't fixed in -16 (as in 8.04); it was fixed in -19 (as in 8.04.1). Can you please check that you actually used the right image?
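
A quick way to tell whether a given Hardy kernel predates the fix is to compare the ABI number after "2.6.24-" with 19. The following is a minimal Python sketch. The function names are hypothetical, and it deliberately returns None for kernels outside the 2.6.24 series rather than guessing.

```python
import re

FIXED_ABI = 19  # per this thread: the fix shipped in 2.6.24-19, not 2.6.24-16


def hardy_abi(version):
    """Extract N from a string containing '2.6.24-N', or None if absent."""
    match = re.search(r"2\.6\.24-(\d+)", version)
    return int(match.group(1)) if match else None


def has_msi_fix(version):
    """True/False for Hardy 2.6.24 kernels, None for anything else."""
    abi = hardy_abi(version)
    if abi is None:
        return None
    return abi >= FIXED_ABI


print(has_msi_fix("Ubuntu 2.6.24-10.16-generic"))  # False
print(has_msi_fix("2.6.24-19-generic"))            # True
```

The input can be the output of `uname -r` or the contents of /proc/version_signature, both of which appear earlier in this report.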

Stas Sușcov (sushkov) wrote : Re: [Bug 181081] Re: Fix MSI mapping quirk on HT-based nVidia platform

I've now checked the dates of the netboot images more carefully, and most
probably they were not updated at all.
Here's the location of the images I downloaded:
http://archive.ubuntu.com/ubuntu/dists/hardy/main/installer-amd64/current/images/
I thought "current" meant "the latest hardy images", which I hoped would be
8.04.1, but I'm afraid I was wrong.
I apologize for the false alarm, but still, where can I find the
updated images? Launchpad?

--
() ASCII Ribbon Campaign
/\ http://stas.nerd.ro/ascii/

Martin Pitt (pitti) wrote : Re: [Bug 181081] Re: Fix MSI mapping quirk on HT-based nVidia platform

Stanislav Sushkov [2008-07-15 22:48 -0000]:
> I've checked now more carefully the netboot images date and most
> probably those we're not updated at all.
> Here's the location of the images I downloaded:
> http://archive.ubuntu.com/ubuntu/dists/hardy/main/installer-amd64/current/images/

Right, that's the hardy final version.

http://archive.ubuntu.com/ubuntu/dists/hardy-updates/main/installer-amd64/current/images/

has the 8.04.1 version. That might be underdocumented :/

Stas Sușcov (sushkov) wrote : Re: [Bug 181081] Re: Fix MSI mapping quirk on HT-based nVidia platform

Thank you Martin,
You're right, those details from the documentation are not at all easy
to observe.
I'll have to be more careful next time.

Have a nice day.

--
() ASCII Ribbon Campaign
/\ http://stas.nerd.ro/ascii/

Changed in linux:
status: In Progress → Fix Released
Alfredas Beinartas (fuxialis) wrote :

I think this bug affects Ubuntu Karmic Alpha. Boot hangs, showing a few
pci 0000.00.00.0 Found Enabled HT MSI Mapping
messages.
After some power-offs the system boots, but the network controller can't get an IP address from DHCP.

uname
2.6.31-11-generic #36-Ubuntu SMP Fri Sep 25 06:37:23 UTC 2009 x86_64 GNU/Linux

The system is a desktop with an MSI motherboard, an nVidia MCP55 chipset, and an nVidia 7900GS graphics adapter.

Changed in linux:
importance: Unknown → Medium
Changed in linux (Fedora):
importance: Unknown → High