Realtek RTL8111/8168B wired network interface goes up and down at random

Bug #1007236 reported by Louis Simard on 2012-06-01
This bug affects 3 people
Affects: linux (Ubuntu)
Importance: Medium
Assigned to: Unassigned

Bug Description

-- The context --

I am using Linux 3.2.0-24-generic-pae on Ubuntu Precise Pangolin, 32-bit, installed on 2012-04-29. The computer I am currently using is an Acer Aspire AM3450-ES10P, whose network card is identified in 'lspci' output as follows:

  05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 06)

Until 2012-05-20 or so, the network interface worked fine in Precise (unlike in Lucid, where it would go up and down every 5 to 20 minutes, starting from boot).

-- The problem --

On or around 2012-05-20, the network interface started to go down and up at random times. I initially put this down to my downstream router losing its Internet access, but then I started to see NetworkManager (GNOME) notifications as well:

  [ Wired connection 1 - Disconnected, you are now offline ]

The dmesg log gets filled with "link is now down" and "link is now up" at random times, the period between which is random and can be anything from 5 minutes to 2 hours. 'dhclient' gets killed and restarted, and no crash is reported on dhclient.
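
To put numbers on how often the link drops, the timestamps of the link-down messages in dmesg can be compared. The following is a minimal sketch, not a tested diagnostic: the function name link_down_intervals is made up for illustration, and it assumes dmesg lines that begin with a "[seconds]" timestamp and contain either "link down" or "link is now down" (the exact wording depends on the driver).

```shell
#!/bin/sh
# Print the number of seconds between consecutive link-down events
# found in dmesg-style text on stdin, one interval per line.
# Assumption: each line starts with a "[ 2852.407680]" style timestamp.
link_down_intervals() {
    awk '
        /link (is now )?down/ {
            # Strip everything except the number inside the leading brackets.
            ts = $0
            sub(/^\[ */, "", ts)
            sub(/\].*/, "", ts)
            if (seen) printf "%.0f\n", ts - prev
            prev = ts
            seen = 1
        }
    '
}
```

Running "dmesg | link_down_intervals" would then print one interval in seconds per line, which can be compared against the 5-minute to 2-hour range described above.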

The computer that is affected was bought during the month of 2012-02. From it, the path to the Internet is as follows:

* The NIC, which is on a chip on the motherboard, and supports 100BaseT and 1000BaseT;
* A plug on the computer, with 2 status lights, one on either side of the plug. The status light closest to the top of the computer (i.e. the power supply unit) flashes when the network is accessed, and the status light closest to the bottom is constantly on;
* An Ethernet cable bought during the month of 2011-12, with connectors that rest securely in the Ethernet plug, known good due to it working flawlessly on the previous computer;
* A D-Link router, whose status light for the affected computer does absolutely nothing when this bug occurs, staying on throughout the subsequent reconnection;
* An Ethernet cable between the router and the modem, also bought during the month of 2011-12 and known good;
* A SpeedTouch modem for ADSL service, whose status light for "Internet" does absolutely nothing when this bug occurs, staying on throughout the disconnection and subsequent reconnection;
* A phone wire.

IPv4 addresses are not in conflict on the network. The D-Link router is set up as a DHCP server and hands out fixed addresses by MAC. There are no IPv6 addresses on the network.

NetworkManager is set up via GNOME to seek DHCP addresses but have custom DNS nameservers at 127.0.0.1 (bind9, for the local network) and 192.168.0.1, the D-Link router, which forwards DNS requests to the ISP's nameservers. There is also a custom DNS suffix for the local network, which is not '.local'.

-- The workarounds attempted --

I installed the "connman" package around 2012-05-05 and uninstalled it right away, restoring network-manager (because connman was flaky), and only today (2012-05-31) uninstalled its configuration files. Even after deleting its configuration files and rebooting, the bug persists.

I attempted a workaround based on something I found at [1]: add "ethtool -s eth0 autoneg off" to /etc/network/if-pre-up.d/ethtool. Even after rebooting, the bug persists.
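
One way to check whether the hook actually took effect after a reboot is to read the state back from ethtool. This is a minimal sketch, assuming the usual "Auto-negotiation: on" or "Auto-negotiation: off" line in the output of "ethtool eth0"; the function name autoneg_state is hypothetical.

```shell
#!/bin/sh
# Print the current auto-negotiation state ("on" or "off") taken from
# the output of `ethtool <interface>` supplied on stdin.
# The capital "A" keeps the match away from the "Supports auto-negotiation"
# and "Advertised auto-negotiation" lines, which use a lowercase "a".
autoneg_state() {
    awk -F': *' '/Auto-negotiation:/ { print $2; exit }'
}
```

If "ethtool eth0 | autoneg_state" still prints "on" after a reboot, the hook probably did not run for this interface, in which case the workaround was never really exercised.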

I had seen during the first week of installation (2012-04-29 to 2012-05-04) that avahi-daemon crashed and caused one or two disconnections during that week, after which dhclient ran and reconnected to the LAN. Apport prompted me to send a problem report, but I don't know if it ever reached Launchpad due to the problem being network-related.

I uninstalled avahi-daemon and avahi-autoipd today. Even after rebooting, the bug persists.

-- Other possibly pertinent information --

I installed the zram-config package, which is apparently marked as staging. If I should uninstall it first due to the staging mark, then I shall.

I can attach Apport-collect information to this bug report on request, but the transfer of information may cut off in the middle...

[1] http://forums.fedoraforum.org/showthread.php?t=250807

ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: linux-image-3.2.0-24-generic-pae 3.2.0-24.39
ProcVersionSignature: Ubuntu 3.2.0-24.39-generic-pae 3.2.16
Uname: Linux 3.2.0-24-generic-pae i686
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
ApportVersion: 2.0.1-0ubuntu8
Architecture: i386
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: cynthia 2239 F.... pulseaudio
 /dev/snd/controlC0: cynthia 2239 F.... pulseaudio
 /dev/snd/seq: timidity 1818 F.... timidity
CRDA: Error: command ['iw', 'reg', 'get'] failed with exit code 1: nl80211 not found.
Card0.Amixer.info:
 Card hw:0 'SB'/'HDA ATI SB at 0xfeb00000 irq 16'
   Mixer name : 'Realtek ALC662 rev1'
   Components : 'HDA:10ec0662,10258100,00100101'
   Controls : 36
   Simple ctrls : 20
Card1.Amixer.info:
 Card hw:1 'HDMI'/'HDA ATI HDMI at 0xfe910000 irq 19'
   Mixer name : 'ATI RS690/780 HDMI'
   Components : 'HDA:1002791a,00791a00,00100000'
   Controls : 4
   Simple ctrls : 1
Card1.Amixer.values:
 Simple mixer control 'IEC958',0
   Capabilities: pswitch pswitch-joined penum
   Playback channels: Mono
   Mono: Playback [on]
Date: Thu May 31 21:05:17 2012
HibernationDevice: RESUME=UUID=beb517d6-0687-4ecb-a4a6-0d21aedb9ec2
InstallationMedia: Ubuntu 12.04 LTS "Precise Pangolin" - Release i386 (20120423)
IwConfig:
 lo no wireless extensions.

 eth0 no wireless extensions.
MachineType: Acer Aspire M3450
ProcEnviron:
 LANGUAGE=en_CA:en
 TERM=xterm
 PATH=(custom, no user)
 LANG=en_CA.UTF-8
 SHELL=/bin/bash
ProcFB: 0 radeondrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.2.0-24-generic-pae root=UUID=7716c476-e7b8-43c6-ae32-14e77e12de69 ro vt.handoff=7
PulseList:
 Error: command ['pacmd', 'list'] failed with exit code 1: Home directory /home/cynthia not ours.
 No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
 linux-restricted-modules-3.2.0-24-generic-pae N/A
 linux-backports-modules-3.2.0-24-generic-pae N/A
 linux-firmware 1.79
RfKill:

SourcePackage: linux
StagingDrivers: zram
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 09/30/2011
dmi.bios.vendor: AMI
dmi.bios.version: P01-A2
dmi.board.asset.tag: To be filled by O.E.M.
dmi.board.name: Aspire M3450
dmi.board.vendor: Acer
dmi.board.version: To be filled by O.E.M.
dmi.chassis.type: 3
dmi.chassis.vendor: Acer
dmi.chassis.version: To Be Filled By O.E.M.
dmi.modalias: dmi:bvnAMI:bvrP01-A2:bd09/30/2011:svnAcer:pnAspireM3450:pvrTobefilledbyO.E.M.:rvnAcer:rnAspireM3450:rvrTobefilledbyO.E.M.:cvnAcer:ct3:cvrToBeFilledByO.E.M.:
dmi.product.name: Aspire M3450
dmi.product.version: To be filled by O.E.M.
dmi.sys.vendor: Acer

Brad Figg (brad-figg) on 2012-06-01
Changed in linux (Ubuntu):
status: New → Confirmed
ttbek (ttbek) wrote :

I can confirm this for the 64-bit kernel as well. If I drop back to kernel 3.2.0-17, I do not have the problem. In my case the lights do not blink as described here: the lights on the interface do not light up at all, and the light for the connection on my Microsoft MN-500 router blinks green slowly, roughly a 1-second blink every 3 or 4 seconds. I didn't notice anything in dmesg, but I had only looked right after booting up. I haven't had the time to delve into this in detail, but I can later if anyone needs more information than the original poster provided.

Update: In the few seconds just *before* NetworkManager tells me that my connection is down, the router's status light for the affected computer turns off. If I look at the router when (and after) NetworkManager fires off the notification, it's already too late. Therefore, it seems that whatever is happening is completely disabling the NIC for all purposes during a few seconds.

The router is operating at 100BaseT full duplex for the connection.

Another workaround attempt does not work: I used the following command to disable Wake-On-LAN:

  sudo ethtool -s eth0 wol ""

Additionally, using the following command does not do anything, leaving the type of connection at MII:

  sudo ethtool -s eth0 type tp

The bug is still occurring at the same frequency.

UPDATE 2: The router's event logs contain this gem:

  Jun/01/2012 19:48:15 System started

This is immediately after an occurrence of the bug, and after another user on the network reported a connectivity issue as well. This other user is on a different computer which is using a different NIC and Windows instead of Linux.

Since the router indicates it just started, then it must have rebooted. The main suspect is now power blips on the router's power outlet.

Ordinarily, I would now close this bug as Invalid, but I see in the bug history that:
* Brad Figg marked this as Confirmed;
* 1 person marked him/herself as affected;
* ttbek confirmed this bug independently.

I will try various things on the D-Link router, having already started by moving the power plug to a different outlet. I will report back on this bug in 7 days if the problem has subsided, or before if the problem recurs.

Anyone else? :)

... By which I mean that I'll post on this bug if the problem has recurred *without* the router rebooting. Obviously I can't post if it's not the NIC's fault. :s

ttbek (ttbek) wrote :

Hmm, the problem seems to have been transient for me. It persisted through several reboots. I also rebooted into Windows and into the older 3.2.0-17 kernel (as opposed to the current 3.2.0-24), and the connection worked in both. This morning I booted back into the 3.2.0-24 kernel, and it seems to have magically started working again. The trouble I was having may also have been different: I have no signs of my router restarting and no suspect messages in dmesg. I did also have other wired connections on the router at the same time these problems were happening, so I'm still convinced there was some sort of problem, but I'm stumped as to what it might have been if not a recent update. Anyway, it fixed itself. I'll post again if the problem recurs. I've marked myself as unaffected by the bug for now.

Cristina Franzolini (forkirara) wrote :

Me too.
I installed the latest Realtek driver (r8168-8.031.00) and blacklisted r8169, but I still cannot connect to the Internet. lspci says "Kernel driver in use: r8168", but it doesn't work.
I also tried adding r8168 to /etc/modules, but it still doesn't work.
I have kernel 3.2.0-25-generic on Ubuntu 12.04.
Before the upgrade (from 11.10), the network card worked.
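
A quick way to double-check that a blacklist entry is really in place is to search the modprobe configuration for it. This is a sketch only: the helper name is_blacklisted is made up, it looks only at *.conf files in one directory (default /etc/modprobe.d), and it does not treat "-" and "_" in module names as equivalent the way modprobe does.

```shell
#!/bin/sh
# Succeed (exit status 0) if module $1 has a "blacklist" line in any
# *.conf file under directory $2 (default: /etc/modprobe.d).
is_blacklisted() {
    module=$1
    dir=${2:-/etc/modprobe.d}
    # -q: no output, -s: stay silent about missing or unreadable files.
    grep -Eqs "^[[:space:]]*blacklist[[:space:]]+${module}[[:space:]]*\$" "$dir"/*.conf
}
```

For example, "is_blacklisted r8169 && echo blacklisted" would print "blacklisted" only when such a line exists.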

Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.5 kernel [0] (not a kernel in the daily directory) and install both the linux-image and linux-image-extra .deb packages.

Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag (only that one tag; please leave the other tags). This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text.

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.5-rc1-quantal/

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
tags: added: needs-upstream-testing
description: updated
Cristina Franzolini (forkirara) wrote :

I tried kernel 3.5.0-030500rc1-generic (with the extra and headers packages). The problem is the same.
This is what I did:
- I de-blacklisted r8169
- I installed the new kernel
- I rebooted
- The problem was the same

So I installed the latest Realtek driver as follows:
$ sudo -s
# rmmod r8169
# mv '/lib/modules/3.5.0-030500rc1-generic/kernel/drivers/net/ethernet/realtek/r8169.ko' ~/r8169.ko.backup
# cd /varie/down/linux/r8168-8.031.00
# make clean modules
# make install
# depmod -a
# insmod ./src/r8168.ko
# mv /initrd.img ~/initrd.img.backup
# mkinitramfs -o /boot/initrd.img-`uname -r` `uname -r`
# echo "r8168" >> /etc/modules
# reboot

The problem is the same.

lspci -v

02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 03)
 Subsystem: Hewlett-Packard Company Device 2a9c
 Flags: bus master, fast devsel, latency 0, IRQ 47
 I/O ports at d800 [size=256]
 Memory at fbdff000 (64-bit, non-prefetchable) [size=4K]
 Memory at faffc000 (64-bit, prefetchable) [size=16K]
 Expansion ROM at fbdc0000 [disabled] [size=128K]
 Capabilities: <access denied>
 Kernel driver in use: r8168
 Kernel modules: r8168

lshw -C network

 *-network
       description: Ethernet interface
       product: RTL8111/8168B PCI Express Gigabit Ethernet controller
       vendor: Realtek Semiconductor Co., Ltd.
       physical id: 0
       bus info: pci@0000:02:00.0
       logical name: eth0
       version: 03
       serial: 40:61:86:c3:e1:34
       size: 100Mbit/s
       capacity: 1Gbit/s
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress msix vpd bus_master cap_list rom ethernet physical tp 10bt 10bt-fd 100bt 100bt-fd 1000bt-fd autonegotiation
       configuration: autonegotiation=on broadcast=yes driver=r8168 driverversion=8.031.00-NAPI duplex=full ip=192.168.1.129 latency=0 link=yes multicast=yes port=twisted pair speed=100Mbit/s
       resources: irq:47 ioport:d800(size=256) memory:fbdff000-fbdfffff memory:faffc000-faffffff memory:fbdc0000-fbddffff
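
To confirm which driver is bound to the NIC without reading the whole listing, the relevant line can be pulled out of the lspci output. This is a minimal sketch, assuming the "Kernel driver in use:" line shown above and that the first Ethernet controller listed is the one of interest; the function name ethernet_driver_in_use is hypothetical.

```shell
#!/bin/sh
# Print the driver named on the "Kernel driver in use:" line of the first
# Ethernet controller found in `lspci -k` (or `lspci -v`) output on stdin.
ethernet_driver_in_use() {
    awk '
        # Device header lines start in column 0 with a hex bus address;
        # the detail lines that follow are indented.
        /^[0-9a-f]/ { in_eth = ($0 ~ /Ethernet controller/) }
        in_eth && /Kernel driver in use:/ {
            sub(/.*: */, "")
            print
            exit
        }
    '
}
```

With the listing above, "lspci -k | ethernet_driver_in_use" would be expected to print r8168, which only shows that the vendor driver is bound, not that the link itself is healthy.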

Maybe a problem with NetworkManager?

tags: added: kernel-bug-exists-upstream
removed: needs-upstream-testing

After today's update (apt-get update and apt-get upgrade), the network works (DNS packages and others were updated).
I now have a problem with Firestarter, which blocks the network (both wireless and cable). Firestarter's configuration is the same as before, when it worked.
So with kernel 3.5.0-030500rc1-generic (with the extra and headers packages), the network works without Firestarter.

tags: added: kernel-fixed-upstream
removed: kernel-bug-exists-upstream

Update 3: I made sure the router did not reboot. The bug stopped manifesting itself, using the same NIC.

Marking myself unaffected now.

Kostas Lagogiannis (costaslag) wrote :

Hi, good to see the issue has been opened.
Using the Realtek r8168 driver r8168-8.023.00 with kernel 3.0.0-19-generic used to work fine (after replacing the default r8169).
With later kernel updates, the r8168-8.023.00 driver no longer compiles. Using the latest Realtek drivers available at the time, with kernels up to version 3.0.0-26, gives very poor network quality: frequent stalls in downloads, and streaming media is completely messed up.
Alas, the stats do not show disconnections. I tried these Realtek drivers:
r8168-8.025.00
r8168-8.030.00
r8168-8.031.00

I have since reverted to kernel 3.0.0-19 so I can get a decent network connection.

lspci gives:
06:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 06)

Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired
Steven L (steven-0) wrote :

Same problem here, both with the official r8168 and r8169.

Linux tux 3.2.0-39-generic #62-Ubuntu SMP Thu Feb 28 00:28:53 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

[ 746.381180] r8168 Gigabit Ethernet driver 8.035.00-NAPI loaded
[ 746.381233] r8168 0000:02:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[ 746.381277] r8168 0000:02:00.0: setting latency timer to 64
[ 746.381417] r8168 0000:02:00.0: irq 52 for MSI/MSI-X
[ 746.381757] eth%d: RTL8168E-VL/8111E-VL at 0xffffc900057fe000, b8:70:f4:63:80:f8, IRQ 52
[ 746.560076] r8168: This product is covered by one or more of the following patents: US5,307,459, US5,434,872, US5,732,094, US6,570,884, US6,115,776, and US6,327,625.
[ 746.560089] eth0: Identified chip type is 'RTL8168E-VL/8111E-VL'.
[ 746.560096] r8168 Copyright (C) 2012 Realtek NIC software team <email address hidden>
[ 746.560100] This program comes with ABSOLUTELY NO WARRANTY; for details, please see <http://www.gnu.org/licenses/>.
[ 746.560105] This is free software, and you are welcome to redistribute it under certain conditions; see <http://www.gnu.org/licenses/>.
[ 746.629297] r8168: eth0: link down
[ 746.631067] ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 746.633350] ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 748.117719] r8168: eth0: link up
[ 748.120062] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 748.633050] r8168: eth0: link up
[ 758.576057] eth0: no IPv6 routers present
[ 2852.370696] ADDRCONF(NETDEV_UP): wlan0: link is not ready
[ 2852.407680] r8168: eth0: link down
[ 2852.409275] ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 2853.404075] r8168: eth0: link down
[ 2854.064643] r8168: eth0: link up
[ 2854.066118] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 2854.405038] r8168: eth0: link up
[ 2864.200025] eth0: no IPv6 routers present
[ 3085.589070] ADDRCONF(NETDEV_UP): wlan0: link is not ready
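
The length of each outage can be read off a log like the one above by pairing every "link down" with the next "link up". This is a minimal sketch, assuming dmesg lines with a leading "[seconds]" timestamp and the r8168-style "link down" / "link up" wording (it also accepts "link is now down" / "link is now up"); the function name link_outages is made up for illustration.

```shell
#!/bin/sh
# For dmesg-style text on stdin, print one line per outage:
#   <time the link went down> <outage length in seconds>
# Repeated "down" or "up" messages within the same outage are ignored.
link_outages() {
    awk '
        # Return the number inside the leading "[ ... ]" timestamp.
        function stamp(line) {
            sub(/^\[ */, "", line)
            sub(/\].*/, "", line)
            return line + 0
        }
        /link (is now )?down/ && !down { down = 1; start = stamp($0) }
        /link (is now )?up/ && down {
            down = 0
            printf "%.3f %.3f\n", start, stamp($0) - start
        }
    '
}
```

For the log above, this gives two outages of roughly 1.5 and 1.7 seconds. The first one happens right as the driver loads, so it may be ordinary initialization rather than a drop.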
