sky2: kernel does not power off NIC on shutdown when WOL is disabled

Bug #621343 reported by Michael Cook on 2010-08-20
This bug affects 2 people
Affects: linux (Ubuntu); Importance: Undecided; Assigned to: Unassigned
Affects: network-manager (Ubuntu); Importance: Low; Assigned to: Unassigned

Bug Description

Binary package hint: network-manager

CONTEXT:

Ubuntu 10.04 Desktop x86
lsb_release -rd .. says .. Description: Ubuntu 10.04.1 LTS Release: 10.04
uname -r .. says .. 2.6.32-24-generic
System hardware is an Acer T180 desktop: 3 GB RAM, AMD 3800+, NVIDIA 6100, plus Marvell Yukon NIC
GNOME 2.30.2 Network Manager 0.8-0ubuntu3
sudo ethtool -i eth1 .. says .. driver: sky2 version: 1.25 firmware-version: N/A bus-info: 0000:03:00.0
dmesg .. says ..

[ 0.847886] sky2 driver version 1.25
[ 0.854478] ACPI: PCI Interrupt Link [APC6] enabled at IRQ 16
[ 0.854488] sky2 0000:03:00.0: PCI INT A -> Link[APC6] -> GSI 16 (level, low) -> IRQ 16
[ 0.854503] sky2 0000:03:00.0: setting latency timer to 64
[ 0.854540] sky2 0000:03:00.0: Yukon-2 EC Ultra chip revision 2
[ 0.854602] alloc irq_desc for 27 on node -1
[ 0.854604] alloc kstat_irqs on node -1
[ 0.854617] sky2 0000:03:00.0: irq 27 for MSI/MSI-X

INITIAL PROBLEM

The other machine on the network is a PPC Mac mini running OS X 10.4.11. After "shutdown -h now" on Ubuntu 10.04, with the Ubuntu machine's NIC still powered on (see "ADDITIONAL OBSERVATIONS" below), after several minutes I get a log message on the Mac like this:

"Aug 18 17:28:47 mona kernel[0]: { 851b0 910040} UniNEnet::restartTransmitter - transmitter appeared to be hung"

Thereafter networking becomes unusable. Upon investigation I found these two postings (Apple Support):

http://lists.apple.com/archives/macos-x-server/2004/Nov/msg00088.html says: "This is an kernel error message. There is an issue with the Ethernet port on your system. Most likely the NI has been taken down by the kernel because it can't transmit. This message indicates that the problem was specifically with unicast messages, that is those destined to specific MAC addresses on the ethernet. The kernel sent packets out and didn't hear them echo back and figures the NIC is hung."

http://lists.apple.com/archives/macos-x-server/2004/Nov/msg00199.html says: "This is a error we've seen caused by unicast traffic failure ... We've almost always traced this down to a termination issue. Packets transmitted (unicast) aren't getting echo'd back to the NI properly. When you transmit a packet you're supposed to be able to hear it echo back. This can be caused by excessive traffic, dropped packets, collisions, or a switch that can't forward. It's also a function of the NI being able to handle the traffic. But it's basically a frame problem."

ADDITIONAL OBSERVATIONS:

0) Wake on LAN is turned off in the PC's BIOS
1) Router status lights show the physical connection from the Ubuntu system is still active, and on checking PC system rear lights the NIC is still powered on even tho' the PC has otherwise been shut down. Even after removing all power to the Ubuntu box, results are unpredictable. Putting the Mac through Sleep/Unsleep can resolve the issue, but at other times I have to reboot both the Mac and the router in order to recover the network.
2) On Ubuntu 8.04 "shutdown -h now" powers off the NIC.
3) On Ubuntu 10.04 it does not. If I issue "sudo ifconfig eth1 down" first, this avoids the problem, as the NIC is then powered down as expected.
4) Using Network Manager "Disconnect" still leaves eth1 "UP" in ifconfig (I would expect this to become "DOWN").

CONCLUSION

Because the PC's NIC is not properly quiesced it is still active on the network in some kind of "zombie" state. I have not tried to sniff the network to see what it is putting out, but the point is that the interface should not be left in this state. Therefore Network Manager (or some other software that I do not know about) fails to react properly to the shutdown request and fails to do necessary cleanup, eventually causing the network problem I observed.

OTHER OBSERVATIONS

I had originally looked into Network Manager because I was getting the "Networking disabled" bug reported in https://bugs.launchpad.net/ubuntu/+source/network-manager/+bug/524454

In /etc/NetworkManager/nm-system-settings.conf .. it says "managed=false" in the [ifupdown] section. I would expect this to be "managed=true", but making that change made no difference.

In "/etc/NetworkManager/system-connections/Wired connection 1" I noticed that the leading zero of the MAC address is not present in the [802-3-ethernet] section .. it says "mac-address=0:19:21:ef:a6:22", although the address appears correctly in System > Preferences > Network Connections.
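
For anyone comparing the two spellings of the address, here is a minimal Python sketch showing that "0:19:21:ef:a6:22" and "00:19:21:ef:a6:22" denote the same MAC once each octet is zero-padded. The function names `normalize_mac` and `same_mac` are made up for illustration; this is not how NetworkManager itself parses the file.

```python
import string


def normalize_mac(mac: str) -> str:
    """Return a colon-separated MAC with every octet as two lowercase hex digits."""
    octets = mac.strip().split(":")
    if len(octets) != 6:
        raise ValueError(f"expected 6 octets, got {len(octets)}: {mac!r}")
    padded = []
    for octet in octets:
        # Accept one or two hex digits per octet, nothing else.
        if not 1 <= len(octet) <= 2 or any(c not in string.hexdigits for c in octet):
            raise ValueError(f"bad octet {octet!r} in {mac!r}")
        padded.append(f"{int(octet, 16):02x}")
    return ":".join(padded)


def same_mac(a: str, b: str) -> bool:
    """Compare two MAC strings, ignoring case and missing leading zeros."""
    return normalize_mac(a) == normalize_mac(b)


if __name__ == "__main__":
    print(normalize_mac("0:19:21:ef:a6:22"))
    print(same_mac("0:19:21:ef:a6:22", "00:19:21:EF:A6:22"))
```

Under this reading the missing zero is only a formatting difference, which is consistent with the GUI showing the address correctly.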

Network Manager 8.01 will be available shortly.

Hi! Thanks for filing this bug and helping to make Ubuntu better!

This is expected behaviour, as shutting down the NIC in some cases will cause Wake on LAN to be unusable later (e.g. at some point Windows' Broadcom drivers would shut down a port, making it totally unusable in Ubuntu). Additionally, this does not appear to be an issue with NetworkManager since other systems (e.g. ifupdown) also touch the interface, and ultimately the kernel will be what is dealing with the NIC.

As you were able to link from Apple support forums: http://lists.apple.com/archives/macos-x-server/2004/Nov/msg00199.html:
"We've almost always traced this down to a termination issue. Packets transmitted (unicast) aren't getting echo'd back to the NI properly. When you transmit a packet you're supposed to be able to hear it echo back. This can be caused by excessive traffic, dropped packets, collisions, or a switch that can't forward. It's also a function of the NI being able to handle the traffic. But it's basically a frame problem.

You can confirm this with a sniffer. Just don't run the sniffer on the same host as it's already loosing packets."

This is likely caused by the switch not doing its job right. My understanding is even if the interface stays up (and I have used systems other than Apple/OSX on a network with Ubuntu machines offline successfully), the switch should be able to mark the link at the spanning tree level as being down, the only thing being sent to the offline device being BPDUs (and nothing coming from it).

This really looks to me like an OSX bug or a hardware defect and not an issue with NetworkManager or even Ubuntu.

My suggestion is to follow the advice in the Apple forum linked above and run a sniffer on a third system (probably better if it is neither an OSX device nor the shut-down Ubuntu machine :), and see if there is traffic from the MAC address of the device that should be offline that reaches the other systems (it would have to be broadcast; otherwise the system isn't really offline). Another possibility would be to bring up the bug at Apple (through support or whatever facilities they have), since their systems need to be able to deal properly with "interference", or unsolicited packets.
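
The check suggested above could be sketched roughly as follows, assuming the capture was taken on a third machine with `tcpdump -e -n` (so each line starts with a timestamp followed by "source MAC > destination MAC,") and saved as text. The function name `frames_from` and the sample lines are invented for illustration; they are not a real capture from this network.

```python
import re

# Link-level header as printed by `tcpdump -e`: "<time> <src MAC> > <dst MAC>, ..."
FRAME_RE = re.compile(
    r"^(?P<time>\S+)\s+(?P<src>[0-9a-fA-F:]{17})\s+>\s+(?P<dst>[0-9a-fA-F:]{17}),"
)


def frames_from(lines, mac):
    """Return (timestamp, destination MAC) for every frame whose source is `mac`."""
    wanted = mac.lower()
    hits = []
    for line in lines:
        match = FRAME_RE.match(line)
        if match and match.group("src").lower() == wanted:
            hits.append((match.group("time"), match.group("dst").lower()))
    return hits


# Made-up sample lines in `tcpdump -e -n` style (hypothetical addresses and traffic).
SAMPLE = [
    "17:28:40.000001 00:19:21:ef:a6:22 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), "
    "length 60: Request who-has 192.168.1.1 tell 192.168.1.10, length 46",
    "17:28:41.000002 00:0d:93:aa:bb:cc > 00:19:21:ef:a6:22, ethertype IPv4 (0x0800), "
    "length 98: 192.168.1.20 > 192.168.1.10: ICMP echo request, id 1, seq 1, length 64",
]

if __name__ == "__main__":
    for when, dst in frames_from(SAMPLE, "00:19:21:ef:a6:22"):
        print(when, "->", dst)
```

Any hit timestamped after the Ubuntu machine was shut down would be the evidence asked for here; an empty result would point back at the switch or the Mac.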

For now, I'm marking this bug as Invalid. If you can provide a packet capture that shows there is indeed traffic coming from the offline Ubuntu system (and that this can provably be linked to not shutting down the NIC, which would still much more likely mean it's a bug in NIC firmware), then don't hesitate to re-open this bug or file a new one that refers to this bug number and case.

Thanks again!

Changed in network-manager (Ubuntu):
importance: Undecided → Low
status: New → Invalid

Thanks for your response, Mathieu.

1) My main issue was (still is) Ubuntu 10.04 not turning off the NIC at
shutdown, unlike Ubuntu 8.04 and Windows Vista.

2) The "effect" that brought this to my attention was the "poisoning" (not
really a "zombie state") of the switch after several minutes.

 - Various web postings indicate this may be an old problem with Marvell
Yukon 88E8056 NIC using the sky2 driver.

 - In fact OSX fails to connect prior to getting the "transmitter hung"
message.

 - I found that in order to restore my network I really only needed to power
off the PC altogether, or else just restart Ubuntu. No need to restart OSX
or the router, so the problem is confined to the switch.

3) Resolution is as follows:

 - I can work around "NIC not powered off" by always doing "sudo ifconfig
eth1 down" before issuing shutdown.

 - I got, compiled, installed driver module sk98lin.ko from Acer's web site
and found the problem went away. The NIC now stays powered on still, but
it's inactive: you don't see it "beaconing" as with the original 10.4
sky2.ko driver module, by which I mean the light on the router for that port
remains solid, whereas with sky2 it would keep flashing.

4) Re. "working as designed":

 - I had not appreciated that Network Manager is not intended to power off
the NIC. I can understand that this might be necessary in some usage
scenarios, but these are "green times"...

 - Wake on LAN is disabled in my BIOS and its setting as reported by
ethtool has been "d" (none) throughout.

 - Perhaps you can suggest a different place to put the "bug report"?
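
The workaround from point 3 combined with the Wake-on check from point 4 could be sketched like this in Python. It only parses the text that `ethtool <iface>` prints and returns the command from the workaround; it does not run anything. The names `wake_on_flags` and `pre_shutdown_command` are hypothetical, and the sample output is illustrative, not captured from the reporter's machine.

```python
import re
from typing import List, Optional

# Illustrative, trimmed `ethtool eth1` output; NOT captured from the reporter's machine.
SAMPLE_ETHTOOL_OUTPUT = """Settings for eth1:
\tSupports Wake-on: pg
\tWake-on: d
\tLink detected: yes
"""


def wake_on_flags(ethtool_output: str) -> Optional[str]:
    """Return the value of the "Wake-on:" field, or None if it is absent.

    The pattern is anchored at the start of a line so that the separate
    "Supports Wake-on:" field is not picked up by mistake.
    """
    match = re.search(r"^\s*Wake-on:\s*(\S+)", ethtool_output, re.MULTILINE)
    return match.group(1) if match else None


def pre_shutdown_command(iface: str, ethtool_output: str) -> Optional[List[str]]:
    """Return the workaround command if Wake on LAN is disabled ("d"), else None."""
    if wake_on_flags(ethtool_output) == "d":
        return ["ifconfig", iface, "down"]
    # WOL enabled or state unknown: leave the interface alone.
    return None


if __name__ == "__main__":
    print(pre_shutdown_command("eth1", SAMPLE_ETHTOOL_OUTPUT))
```

A shutdown hook could pass the returned list to `subprocess.run` (with root privileges); leaving the interface untouched whenever WOL is enabled keeps the "expected behaviour" described earlier in this thread.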

5) Re. packet trace:

 - I don't have 3rd system to use to run the packet trace as suggested, but
I ran ettercap on OSX to trace both MAC addresses. The trace ends at the
point when the Ubuntu machine is shut down. I have no way to trace the
switch.

6) Other hardware/software:

 - Along the way I changed the WRT54Gv2 router firmware from stock Linksys
to Tomato v1.28. This made no difference.

 - OS X 10.4.11 is way old and so is my Mac Mini PPC late 2005 (just before
they switched to Intel CPU), so these are not going to change. The Mac's
motherboard and NIC both checked out fine. I've no way to see if newer Macs
handle it differently, but the real "effect" of the problem is at the
switch.

 - I uninstalled Network Manager but I get exactly the same problem
using WICD.

With this information now, would you suggest reopening the bug report or
reassigning it elsewhere? I can easily recreate the problem if more
documentation is required.

Thanks again,

Michael Cook

P.S There are still a couple of things you didn't comment about:

 - In /etc/NetworkManager/nm-system-settings.conf .. it says
"managed=false" in the [ifupdown] section. I would expect this to be
"managed=true", altho' making that change made no difference.

 - In "/etc/NetworkManager/system-connections/Wired connection 1" I
noticed that the leading zero of the MAC address is not present in the
[802-3-ethernet] section .. it says "mac-address=0:19:21:ef:a6:22",
although the address appears correctly in System > Preferences >
Network Connections.

[End].

Michael, no, the things I commented about won't have any effect in your case. You have a point though, managed could be set to true, and this is something I may look into in future releases. As for the leading zero, no need to worry about it. It's customary to leave it out in some cases. I'd still open an upstream bug about it just in case.

Let's do it this way, let's mark this as affecting linux re: the sky2 driver, so that the kernel hopefully shuts down the NIC depending on the state of WOL in BIOS, if that's possible. As I'm not a kernel developer, the best I can do is reassign the bug to linux.

summary: - Network Manager 8.0 does not power off NIC when disconnecting
+ sky2: kernel does not power off NIC on shutdown when WOL is disabled
Jeremy Foshee (jeremyfoshee) wrote :

Hi Michael,

Please be sure to confirm this issue exists with the latest development release of Ubuntu. ISO CD images are available from http://cdimage.ubuntu.com/daily/current/ . If the issue remains, please run the following command from a Terminal (Applications->Accessories->Terminal). It will automatically gather and attach updated debug information to this report.

apport-collect -p linux 621343

Also, if you could test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

    [This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: needs-kernel-logs
tags: added: needs-upstream-testing
tags: added: kj-triage
Changed in linux (Ubuntu):
status: New → Incomplete
Spang (hetkot) wrote :

I'm not sure this has come up yet, but I wanted to note that removing network-manager and network-manager-gnome altogether and switching back to the good old config file "/etc/network/interfaces" is a workaround for this problem as well.

see also: http://ubuntuforums.org/showthread.php?t=511182

I am facing (faced) the same problem and solved it this way.

My case:
Asus P5K Pro (Marvell 88E8056 PCIe Gigabit LAN controller featuring AI NET2)
Ubuntu 10.04.1 amd64

I think I had the problem before without it causing trouble, probably because in those earlier setups that part of the network wasn't in use while the offending machine was powered down.
When that changed, the network became unusable whenever this computer was powered down, apparently because of the traffic flood from the "zombie" NIC.

Hi Jeremy,

Re: your message Wed, Sep 8, 2010 at 1:09 PM ...

I should be able to get to trying those things over the next few days, and
then I'll report back.

Thanks,

Michael Cook (mcook2) wrote :

Hi Jeremy,

I had to skip testing Maverick .iso 'cos I have no free space.

I just completed the upstream test using these files I downloaded Sept 13 (earlier the same day as rc4 was posted):

  - linux-headers-2.6.36-020636rc3_2.6.36-020636rc3.201008300905_all.deb
  - linux-headers-2.6.36-020636rc3-generic_2.6.36-020636rc3.201008300905_i386.deb
  - linux-image-2.6.36-020636rc3-generic_2.6.36-020636rc3.201008300905_i386.deb

sky2 came up fine as Version 1.28, and I'm pleased to report that the NIC is properly powered off by `shutdown -h now` so the issue seems to be fixed at that layer. (Spang's post 9/12 seems to suggest it still might be an issue "higher up the stack").

Anyhoo, if you want, I can redo the upstream test with rc4 and/or wait 'til newer packages are available in a few days time.

My test system has nvidia graphics and as the upstream has no driver I couldn't bring up gdm to confirm further with the extra layers like Network Manager and stuff.

Nvidia seems to be broken in Maverick, but I'll look into that some more ... maybe I can run a useful test off USB or s.t.

I suggest there's no need to rush to do the "fixed upstream" tag thing yet, but it looks good for sky2.

Ta-ta,

Michael Cook (mcook2) wrote :

Now that Maverick got released and I've freed up some space, I should get that going and do the "upstream test".

Hopefully I'll be able to report back shortly.

Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired