Ubuntu: network shows incorrect wired status after S4

Bug #240648 reported by Scott Zhang on 2008-06-17
Affects          Importance   Assigned to
linux (Ubuntu)   Medium       Tim Gardner
  Hardy          Undecided    Unassigned

Bug Description

The network icon shows the wrong status after resuming from S4. Steps to reproduce:
1. Start the OS with a network connection; the network icon shows the wired connection status.
2. Click the GNOME Power Manager icon on the panel and select "Hibernate" (S4).
3. Wake the system up by pressing the power button.
4. Check the network icon; it shows the wrong state (see the capture in the attachment).
 Running "ifconfig" in a terminal shows no eth0 in the output.
 The system cannot connect to the network.
 Even after unplugging and replugging the wired cable, the network connection does not recover.
Reproduction rate: 80%
NIC: Realtek8169
OS: Ubuntu 8.04
Kernel version: 2.6.24.16-generic

The system recovers after a reboot. Related logs and captures are attached.
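
For anyone re-running step 4 without reading ifconfig output by eye, here is a minimal Python sketch of the same check. It assumes the kernel lists its network interfaces under /sys/class/net and that the wired interface is named eth0, as in this report. The helper name interface_present is made up for illustration.

```python
import os


def interface_present(name, sysfs_root="/sys/class/net"):
    """Return True if an interface called `name` is listed under sysfs_root."""
    # Each network interface appears as a directory (usually a symlink)
    # under /sys/class/net; a missing entry means the kernel has no such device.
    return os.path.isdir(os.path.join(sysfs_root, name))


if __name__ == "__main__":
    # eth0 is the interface name used in this report (an assumption elsewhere).
    for iface in ("lo", "eth0"):
        state = "present" if interface_present(iface) else "missing"
        print(f"{iface}: {state}")
```

Running it once before hibernating and once after resume should show whether eth0 disappeared, which is what the ifconfig check in step 4 looks for.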

Scott Zhang (scott-zhang) wrote :
description: updated
Changed in dell:
assignee: nobody → etnieskater900
Steve Zhang (steve-zhang) wrote :

Tim,

Please help to check this issue.

These log lines indicate that r8169.ko fails to load after resuming from hibernate. The NIC is an RTL8102EL.

[ 29.492534] r8169 Gigabit Ethernet driver 2.2LK loaded
[ 29.492804] ACPI: PCI Interrupt 0000:09:00.0[A] -> GSI 17 (level, low) -> IRQ 17
[ 29.492816] PCI: cache line size of 32 is not supported by device 0000:09:00.0
[ 29.492823] ACPI: PCI interrupt for device 0000:09:00.0 disabled
[ 29.493340] r8169: probe of 0000:09:00.0 failed with error -22
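
A minimal Python sketch for pulling probe failures like the one above out of a saved dmesg log. The regular expression is only matched against the message format shown here, and the function name find_probe_failures is made up for illustration. Error 22 corresponds to EINVAL.

```python
import errno
import re

# Matches kernel messages such as:
#   r8169: probe of 0000:09:00.0 failed with error -22
PROBE_FAIL = re.compile(
    r"(?P<driver>[\w-]+): probe of (?P<device>\S+) "
    r"failed with error (?P<code>-?\d+)"
)


def find_probe_failures(log_text):
    """Return (driver, device, errno_name) for every failed probe in log_text."""
    failures = []
    for match in PROBE_FAIL.finditer(log_text):
        code = abs(int(match.group("code")))
        # Translate the numeric code into its symbolic errno name if known.
        name = errno.errorcode.get(code, "unknown")
        failures.append((match.group("driver"), match.group("device"), name))
    return failures


sample = "[ 29.493340] r8169: probe of 0000:09:00.0 failed with error -22"
print(find_probe_failures(sample))
```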

Steve Zhang (steve-zhang) wrote :

More findings:

After the NIC driver fails to load, I ran "rmmod r8169" and "modprobe r8169 debug=16" to print the driver's debug information. The error message shows that the MAC address has turned into 0xffffffff, which is not correct. See the attached bad.log.

bad.log also shows that when the system resumes from hibernate, initrd.img is loaded in the first boot stage, and at that point the NIC driver loads successfully. But after the resume from file, the OS tries to load the NIC driver again, and this time it fails.

It's kind of weird.

Steve Zhang (steve-zhang) wrote :

Good log

Changed in dell:
importance: Undecided → Critical
Steve Zhang (steve-zhang) wrote :

Message log showing that r8169 loads correctly with the correct MAC while initrd.img is loading, but fails to load after the resume from file.

Steve Zhang (steve-zhang) wrote :

This is the good lspci information

Steve Zhang (steve-zhang) wrote :

After the NIC driver fails to load, the lspci information for the NIC is wrong.

Steve Zhang (steve-zhang) wrote :

Ignore the previous log for the bad lspci information; it was the wrong file. Here is the correct log showing the bad lspci information.

Tim Gardner (timg-tpi) wrote :

My guess is that this is related to the platform BIOS or the NIC BIOS extension (if any). Without hardware it will be quite difficult to debug.

Steve Zhang (steve-zhang) wrote :

Tim,

Could you please give me your address? We can ship the system to you for debugging.

Steve Zhang (steve-zhang) wrote :

I found this information in the log. It seems that when PM writes the configuration data back to the NIC, it fails.

[ 28.614580] PM: Writing back config space on device 0000:09:00.0 at offset f (was ffffffff, writing 10a)
[ 28.614586] PM: Writing back config space on device 0000:09:00.0 at offset e (was ffffffff, writing 0)
[ 28.614591] PM: Writing back config space on device 0000:09:00.0 at offset d (was ffffffff, writing 40)
[ 28.614597] PM: Writing back config space on device 0000:09:00.0 at offset c (was ffffffff, writing 0)
[ 28.614603] PM: Writing back config space on device 0000:09:00.0 at offset b (was ffffffff, writing 813610ec)
[ 28.614608] PM: Writing back config space on device 0000:09:00.0 at offset a (was ffffffff, writing 0)
[ 28.614614] PM: Writing back config space on device 0000:09:00.0 at offset 9 (was ffffffff, writing 0)
[ 28.614620] PM: Writing back config space on device 0000:09:00.0 at offset 8 (was ffffffff, writing f000000c)
[ 28.614628] PM: Writing back config space on device 0000:09:00.0 at offset 7 (was ffffffff, writing 0)
[ 28.614633] PM: Writing back config space on device 0000:09:00.0 at offset 6 (was ffffffff, writing f6aff004)
[ 28.614639] PM: Writing back config space on device 0000:09:00.0 at offset 5 (was ffffffff, writing 0)
[ 28.614645] PM: Writing back config space on device 0000:09:00.0 at offset 4 (was ffffffff, writing de01)
[ 28.614650] PM: Writing back config space on device 0000:09:00.0 at offset 3 (was ffffffff, writing 8)
[ 28.614656] PM: Writing back config space on device 0000:09:00.0 at offset 2 (was ffffffff, writing 2000002)
[ 28.614661] PM: Writing back config space on device 0000:09:00.0 at offset 1 (was ffffffff, writing 100103)
[ 28.614667] PM: Writing back config space on device 0000:09:00.0 at offset 0 (was ffffffff, writing 813610ec)
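
A minimal Python sketch for summarizing lines like these from a saved log. It flags every register whose previous value read back as ffffffff. All-ones reads from PCI config space often mean the device was not responding when it was read, though that interpretation is an assumption here and not something the log states. The "offset" field appears to be a dword index, so offset 0 holds the vendor and device IDs. The helper names are made up for illustration.

```python
import re

PM_LINE = re.compile(
    r"PM: Writing back config space on device (?P<device>\S+) "
    r"at offset (?P<offset>[0-9a-f]+) "
    r"\(was (?P<old>[0-9a-f]+), writing (?P<new>[0-9a-f]+)\)"
)


def dead_config_reads(log_text):
    """Map each device to the offsets whose old value read back as ffffffff."""
    result = {}
    for m in PM_LINE.finditer(log_text):
        if int(m.group("old"), 16) == 0xFFFFFFFF:
            offsets = result.setdefault(m.group("device"), [])
            offsets.append(int(m.group("offset"), 16))
    return result


def split_id_dword(value):
    """Split config dword 0 into (vendor_id, device_id)."""
    # The vendor ID occupies the low 16 bits, the device ID the high 16 bits.
    return value & 0xFFFF, value >> 16


sample = """\
[ 28.614580] PM: Writing back config space on device 0000:09:00.0 at offset f (was ffffffff, writing 10a)
[ 28.614667] PM: Writing back config space on device 0000:09:00.0 at offset 0 (was ffffffff, writing 813610ec)
"""
print(dead_config_reads(sample))
vendor, device = split_id_dword(0x813610EC)
print(f"vendor {vendor:04x}, device {device:04x}")
```

For the value written at offset 0 in the log above, this gives vendor 10ec and device 8136.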

Steve Zhang (steve-zhang) wrote :

Tim, this issue can also be reproduced on a cold boot.
While investigating, I found that a failed cold boot produces the following in dmesg. It seems Intel audio is initializing during NIC driver initialization. Maybe that's the root cause. I tried adding a spinlock in the NIC driver, but it doesn't help.

[ 22.370396] r8169 Gigabit Ethernet driver 2.2LK loaded
[ 22.370469] ACPI: PCI Interrupt 0000:09:00.0[A] -> GSI 17 (level, low) -> IRQ 17
[ 22.370517] PCI: Setting latency timer of device 0000:09:00.0 to 64
[ 22.370551] r8169 0000:09:00.0: unknown MAC (27a00000)
[ 22.371049] eth0: RTL8169 at 0xf8abc000, 00:21:70:6e:00:64, XID 24a00000 IRQ 220
[ 22.371051] Lock On //I turn spinlock on before I enter init_phy()
[ 22.440670] ACPI: Video Device [VID2] (multi-head: yes rom: no post: no)
[ 22.460622] ieee80211_crypt: registered algorithm 'NULL'
[ 22.552819] wl: module license 'unspecified' taints kernel.
[ 22.849354] Bluetooth: Core ver 2.11
[ 22.850283] NET: Registered protocol family 31
[ 22.850286] Bluetooth: HCI device and connection manager initialized
[ 22.850290] Bluetooth: HCI socket layer initialized
[ 22.885883] Bluetooth: HCI USB driver ver 2.9
[ 22.887565] usbcore: registered new interface driver hci_usb
[ 23.228843] Lock Off//I turn off spinlock before I exit init_phy()
[ 23.228851] eth0: RTL8169 at 0xf8abc000, 00:21:70:6e:00:64, XID 7cf0f8ff IRQ 220
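
In the log above, the XID printed for eth0 differs between the two messages (24a00000, then 7cf0f8ff). A minimal Python sketch for spotting that kind of change in a saved log follows. The regular expression is only matched against the message format shown here, and the function name xids_by_interface is made up for illustration. A changing XID may suggest that register reads became unreliable during initialization, but that is a guess, not a confirmed diagnosis.

```python
import re

XID_LINE = re.compile(
    r"(?P<iface>\w+): RTL8169 at \S+ "
    r"(?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2}), XID (?P<xid>[0-9a-f]{8})"
)


def xids_by_interface(log_text):
    """Collect the distinct XID values printed for each interface, in log order."""
    seen = {}
    for m in XID_LINE.finditer(log_text):
        values = seen.setdefault(m.group("iface"), [])
        if m.group("xid") not in values:
            values.append(m.group("xid"))
    return seen


sample = """\
[ 22.371049] eth0: RTL8169 at 0xf8abc000, 00:21:70:6e:00:64, XID 24a00000 IRQ 220
[ 23.228851] eth0: RTL8169 at 0xf8abc000, 00:21:70:6e:00:64, XID 7cf0f8ff IRQ 220
"""
for iface, xids in xids_by_interface(sample).items():
    status = "changed" if len(xids) > 1 else "stable"
    print(iface, status, xids)
```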

Steve Zhang (steve-zhang) wrote :

Tim,

How can I prevent other devices from initializing while the NIC driver is setting up the hardware? I want to try that to find out whether it's the root cause.

Tim Gardner (timg-tpi) wrote :

SRU Justification

Impact: Some models of the Realtek r8169 do not initialize correctly.

Patch Description: Add an initialization quirk for 8102EL and 8102E.

Patch: http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-hardy.git;a=commit;h=ff32fd1f99091f2b6a65a524883cc8066c351b1f

Test Case: See Bug Description

Tim Gardner (timg-tpi) wrote :
Changed in linux:
assignee: nobody → timg-tpi
importance: Undecided → Medium
milestone: none → ubuntu-8.04.2
status: New → Fix Committed
Steve Langasek (vorlon) wrote :

Accepted into -proposed, please test and give feedback here. Please see https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

Changed in linux:
status: New → Fix Committed
Alex Kiernan (alex-kiernan) wrote :

This seems to fix the problem for me - I've been running for a week or so with a custom built kernel which included the fix and have just swapped to the -proposed kernel and it all seems to work fine (this is on an Intel D945GCLF motherboard)

Changed in dell:
status: New → Fix Released
infiniti_guy (infiniti-guy) wrote :

Same board (D945GCLF), running 8.04. Kernel 2.6.24-19-41 in the updates repo seems to fix the Ethernet problems, but I still have an ACPI problem where the system does NOT power off and just goes to a black screen. I tested the power consumption with a meter and still get idle power draw. I tried the kernel option acpi-force with no effect. Has anyone got this board to turn off?

From what I understand, ACPI is a problem that the EEPC guys have noted (with the new Atom-based boards). This may be at the heart of the S4/S5 issues noted here. Does anyone have this working?

infiniti_guy (infiniti-guy) wrote :

Have also tried kernel in proposed 2.6.24-21 with same acpi issue.

Martin Pitt (pitti) wrote :

linux 2.6.24-21 copied to hardy-updates.

Changed in linux:
status: Fix Committed → Fix Released
Changed in linux:
status: Fix Committed → Fix Released
infiniti_guy (infiniti-guy) wrote :

ACPI is still a problem with newer kernel 2.6.27-14. The shutdown command does not actually power off the machine. Full power is still running through the board, and video output is still active.

Colin Watson (cjwatson) wrote :

infiniti_guy: That seems like an entirely different bug (even though it may affect the same system) and I'd recommend you file it separately if you haven't already done so.

Changed in somerville:
assignee: nobody → Tim Gardner (etnieskater900)
importance: Undecided → Critical
status: New → Fix Released
no longer affects: dell
Timothy R. Chavez (timrchavez) wrote :

The bug task for the somerville project has been removed by an automated script. This bug has been cloned on that project and is available here: https://bugs.launchpad.net/bugs/1306166

no longer affects: somerville