Unreliable network connection with B44 driver

Bug #279102 reported by Henry Gomersall
This bug affects 21 people
Affects         Status     Importance  Assigned to  Milestone
Linux           Invalid    High
linux (Ubuntu)  Won't Fix  Medium      Unassigned

Bug Description

The network connection drops regularly under heavy network load when using this network controller:

  Broadcom Corporation BCM4401-B0 100Base-TX (rev 02)

original description :
--------------------------

Binary package hint: network-manager

Network manager drops wired connections repeatedly. It seems to be correlated with accessing servers. Pages with lots of images (e.g. www.dilbert.com) seem to trigger the bug more readily than simple text only pages.

The connection is immediately picked up again using DHCP.

This is using Intrepid beta.

from lspci:
Ethernet controller: Broadcom Corporation BCM4401-B0 100Base-TX (rev 02)

snippet from /var/log/daemon.log (private information has been replaced with X or hostname or domainname):

Oct 6 15:25:01 whg21-laptop NetworkManager: <info> (eth0): carrier now OFF (device state 8)
Oct 6 15:25:01 whg21-laptop NetworkManager: <info> (eth0): device state change: 8 -> 2
Oct 6 15:25:01 whg21-laptop NetworkManager: <info> (eth0): deactivating device.
Oct 6 15:25:01 whg21-laptop NetworkManager: <info> eth0: canceled DHCP transaction, dhcp client pid 6923
Oct 6 15:25:01 whg21-laptop NetworkManager: <WARN> check_one_route(): (eth0) error -34 returned from rtnl_route_del(): Sucess
Oct 6 15:25:01 whg21-laptop avahi-daemon[5367]: Withdrawing address record for XXX.XXX.XXX.137 on eth0.
Oct 6 15:25:01 whg21-laptop avahi-daemon[5367]: Leaving mDNS multicast group on interface eth0.IPv4 with address XXX.XXX.XXX.137.
Oct 6 15:25:01 whg21-laptop avahi-daemon[5367]: Interface eth0.IPv4 no longer relevant for mDNS.
Oct 6 15:25:01 whg21-laptop NetworkManager: <info> Setting system hostname to 'whg21-laptop' (no default device)
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> (eth0): carrier now ON (device state 2)
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> (eth0): device state change: 2 -> 3
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> Setting system hostname to 'whg21-laptop' (no default device)
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> Activation (eth0) starting connection 'Auto eth0'
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> (eth0): device state change: 3 -> 4
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> Activation (eth0) Stage 1 of 5 (Device Prepare) scheduled...
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> Activation (eth0) Stage 1 of 5 (Device Prepare) started...
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> Activation (eth0) Stage 2 of 5 (Device Configure) scheduled...
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> Activation (eth0) Stage 1 of 5 (Device Prepare) complete.
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> Activation (eth0) Stage 2 of 5 (Device Configure) starting...
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> (eth0): device state change: 4 -> 5
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> Activation (eth0) Stage 2 of 5 (Device Configure) successful.
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> Activation (eth0) Stage 3 of 5 (IP Configure Start) scheduled.
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> Activation (eth0) Stage 2 of 5 (Device Configure) complete.
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> Activation (eth0) Stage 3 of 5 (IP Configure Start) started...
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> (eth0): device state change: 5 -> 7
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> Activation (eth0) Beginning DHCP transaction.
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> dhclient started with pid 8164
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> Activation (eth0) Stage 3 of 5 (IP Configure Start) complete.
Oct 6 15:25:04 whg21-laptop dhclient: Internet Systems Consortium DHCP Client V3.1.1
Oct 6 15:25:04 whg21-laptop dhclient: Copyright 2004-2008 Internet Systems Consortium.
Oct 6 15:25:04 whg21-laptop dhclient: All rights reserved.
Oct 6 15:25:04 whg21-laptop dhclient: For info, please visit http://www.isc.org/sw/dhcp/
Oct 6 15:25:04 whg21-laptop dhclient:
Oct 6 15:25:04 whg21-laptop dhclient: wmaster0: unknown hardware address type 801
Oct 6 15:25:04 whg21-laptop dhclient: wmaster0: unknown hardware address type 801
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> DHCP: device eth0 state changed normal exit -> preinit
Oct 6 15:25:04 whg21-laptop dhclient: Listening on LPF/eth0/xx:xx:xx:xx:xx:xx
Oct 6 15:25:04 whg21-laptop dhclient: Sending on LPF/eth0/xx:xx:xx:xx:xx:xx
Oct 6 15:25:04 whg21-laptop dhclient: Sending on Socket/fallback
Oct 6 15:25:04 whg21-laptop dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 7
Oct 6 15:25:04 whg21-laptop dhclient: DHCPOFFER of XXX.XXX.XXX.137 from XXX.XXX.XXX.251
Oct 6 15:25:04 whg21-laptop dhclient: DHCPREQUEST of XXX.XXX.XXX.137 on eth0 to 255.255.255.255 port 67
Oct 6 15:25:04 whg21-laptop dhclient: DHCPACK of XXX.XXX.XXX.137 from XXX.XXX.XXX.251
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> DHCP: device eth0 state changed preinit -> bound
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> Activation (eth0) Stage 4 of 5 (IP Configure Get) scheduled...
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> Activation (eth0) Stage 4 of 5 (IP Configure Get) started...
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> address XXX.XXX.XXX.137
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> prefix 22 (255.255.252.0)
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> gateway XXX.XXX.XXX.250
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> hostname 'hostname'
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> nameserver 'XXX.XXX.XXX.10'
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> nameserver 'XXX.XXX.XXX.11'
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> domain name 'domainname'
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> Activation (eth0) Stage 5 of 5 (IP Configure Commit) scheduled...
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> Activation (eth0) Stage 4 of 5 (IP Configure Get) complete.
Oct 6 15:25:04 whg21-laptop NetworkManager: <info> Activation (eth0) Stage 5 of 5 (IP Configure Commit) started...
Oct 6 15:25:04 whg21-laptop avahi-daemon[5367]: Joining mDNS multicast group on interface eth0.IPv4 with address XXX.XXX.XXX.137.
Oct 6 15:25:04 whg21-laptop dhclient: bound to XXX.XXX.XXX.137 -- renewal in 21028 seconds.
Oct 6 15:25:04 whg21-laptop avahi-daemon[5367]: New relevant interface eth0.IPv4 for mDNS.
Oct 6 15:25:04 whg21-laptop avahi-daemon[5367]: Registering new address record for XXX.XXX.XXX.137 on eth0.IPv4.
Oct 6 15:25:05 whg21-laptop NetworkManager: <info> Setting system hostname to 'hostname' (no default device)
Oct 6 15:25:05 whg21-laptop NetworkManager: <info> (eth0): device state change: 7 -> 8
Oct 6 15:25:05 whg21-laptop NetworkManager: <info> Policy set (eth0) as default device for routing and DNS.
Oct 6 15:25:05 whg21-laptop NetworkManager: <info> Activation (eth0) successful, device activated.
Oct 6 15:25:05 whg21-laptop NetworkManager: <info> Activation (eth0) Stage 5 of 5 (IP Configure Commit) complete.

Alexander Sack (asac) wrote :

I guess your log snippet starts with a disconnect event? Anyway, if you get carrier OFF regularly, this is either a driver issue or maybe even a physical one (e.g. a bad cable).

Reassigning to linux since network manager appears to be working as well as possible (e.g. reconnecting after carrier OFF).
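
One way to check how regularly carrier is lost is to count the "carrier now OFF" lines that NetworkManager logs for the interface. This is a minimal sketch, assuming the log format shown in the snippet in the bug description; the helper name count_carrier_off is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count NetworkManager "carrier now OFF" events for one
# interface in a log file (line format as in the daemon.log snippet).
count_carrier_off() {
    logfile=$1
    iface=${2:-eth0}
    # -F treats the pattern as a fixed string; -c prints the number of matching lines.
    grep -cF -- "(${iface}): carrier now OFF" "$logfile"
}

# Example: count_carrier_off /var/log/daemon.log eth0
```

A count that grows with every large download, while the cable and switch port stay the same, points towards the driver rather than the physical link.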

Henry Gomersall (hgomersall) wrote :

Although it is possible that the cable has suddenly broken, it only started to occur after the upgrade to Intrepid. I have confirmed my local cable is fine by replacing the immediate link.

Henry Gomersall (hgomersall) wrote :

In addition, Ubuntu updates reliably trigger the network drop. I guess this is because there are many files being downloaded in parallel.

Chris Coulson (chrisccoulson) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. Unfortunately we can't fix it without more information.

Please include the following additional information, if you have not already done so (pay attention to lspci's additional options), as required by the Ubuntu Kernel Team:
1. Please include the output of the command "uname -a" in your next response. It should be one, long line of text which includes the exact kernel version you're running, as well as the CPU architecture.
2. Please run the command "dmesg > dmesg.log" after a fresh boot and attach the resulting file "dmesg.log" to this bug report.
3. Please run the command "sudo lspci -vvnn > lspci-vvnn.log" and attach the resulting file "lspci-vvnn.log" to this bug report.
4. Please attach your /var/log/kern.log and /var/log/syslog after these network problems occur (preferably annotated so it is easy to find in the logs)

For your reference, the full description of procedures for kernel-related bug reports is available at https://wiki.ubuntu.com/KernelTeamBugPolicies Thanks in advance!

Changed in linux:
assignee: nobody → chrisccoulson
importance: Undecided → Medium
status: New → Incomplete
Henry Gomersall (hgomersall) wrote :

Additional info: The network issue only occurs when Compiz is turned on. This is using the open source ati driver.

I attach the output from dmesg. Compiz is turned on around [50] and a few lines at the bottom show the network driver going down (commented).

I will also attach output from lspci, syslog and kern.log. The relevant portions of syslog and kern.log are at the bottom of the files.

Koen (koen-beek) wrote :

Hi, I confirm this issue with Intrepid as well.

It happens more often when there are heavy downloads going on.

kern.log :
Oct 30 22:38:13 koen-desktop kernel: [14923.954560] b44: eth0: powering down PHY
Oct 30 22:38:14 koen-desktop kernel: [14923.989101] b44: eth0: Link is down.
Oct 30 22:38:17 koen-desktop kernel: [14926.988638] b44: eth0: Link is up at 100 Mbps, full duplex.
Oct 30 22:38:17 koen-desktop kernel: [14926.988651] b44: eth0: Flow control is off for TX and off for RX.

syslog :
Oct 30 22:38:13 koen-desktop kernel: [14923.954560] b44: eth0: powering down PHY
Oct 30 22:38:14 koen-desktop kernel: [14923.989101] b44: eth0: Link is down.
Oct 30 22:38:14 koen-desktop NetworkManager: <info> (eth0): carrier now OFF (device state 8)
Oct 30 22:38:14 koen-desktop NetworkManager: <info> (eth0): device state change: 8 -> 2
Oct 30 22:38:14 koen-desktop NetworkManager: <info> (eth0): deactivating device (reason: 0).
Oct 30 22:38:14 koen-desktop NetworkManager: <info> eth0: canceled DHCP transaction, dhcp client pid 24607
Oct 30 22:38:14 koen-desktop NetworkManager: <WARN> check_one_route(): (eth0) error -34 returned from rtnl_route_del(): Sucess
Oct 30 22:38:14 koen-desktop avahi-daemon[5025]: Withdrawing address record for 192.168.1.101 on eth0.
Oct 30 22:38:14 koen-desktop avahi-daemon[5025]: Leaving mDNS multicast group on interface eth0.IPv4 with address 192.168.1.101.
Oct 30 22:38:14 koen-desktop avahi-daemon[5025]: Interface eth0.IPv4 no longer relevant for mDNS.
Oct 30 22:38:17 koen-desktop NetworkManager: <info> (eth0): carrier now ON (device state 2)
Oct 30 22:38:17 koen-desktop NetworkManager: <info> (eth0): device state change: 2 -> 3
Oct 30 22:38:17 koen-desktop NetworkManager: <info> Activation (eth0) starting connection 'Auto eth0'
Oct 30 22:38:17 koen-desktop NetworkManager: <info> (eth0): device state change: 3 -> 4
Oct 30 22:38:17 koen-desktop NetworkManager: <info> Activation (eth0) Stage 1 of 5 (Device Prepare) scheduled...
Oct 30 22:38:17 koen-desktop NetworkManager: <info> Activation (eth0) Stage 1 of 5 (Device Prepare) started...
Oct 30 22:38:17 koen-desktop NetworkManager: <info> Activation (eth0) Stage 2 of 5 (Device Configure) scheduled...
Oct 30 22:38:17 koen-desktop NetworkManager: <info> Activation (eth0) Stage 1 of 5 (Device Prepare) complete.
Oct 30 22:38:17 koen-desktop NetworkManager: <info> Activation (eth0) Stage 2 of 5 (Device Configure) starting...
Oct 30 22:38:17 koen-desktop NetworkManager: <info> (eth0): device state change: 4 -> 5
Oct 30 22:38:17 koen-desktop NetworkManager: <info> Activation (eth0) Stage 2 of 5 (Device Configure) successful.
Oct 30 22:38:17 koen-desktop NetworkManager: <info> Activation (eth0) Stage 3 of 5 (IP Configure Start) scheduled.
Oct 30 22:38:17 koen-desktop NetworkManager: <info> Activation (eth0) Stage 2 of 5 (Device Configure) complete.
Oct 30 22:38:17 koen-desktop NetworkManager: <info> Activation (eth0) Stage 3 of 5 (IP Configure Start) started...
Oct 30 22:38:17 koen-desktop NetworkManager: <info> (eth0): device state change: 5 -> 7
Oct 30 22:38:17 koen-desktop NetworkManager: <info> Activation (eth0) Beginning DHCP transaction.
Oct 3...
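
The b44 kernel lines quoted in this comment carry bracketed kernel timestamps, so the length of each outage can be read off mechanically. A minimal sketch, assuming the message texts "Link is down" and "Link is up" exactly as they appear in the log excerpts; the function name b44_outage_seconds is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print how long (in seconds) each b44 link outage lasted,
# using the bracketed kernel timestamps in kern.log, syslog or dmesg output.
b44_outage_seconds() {
    awk '
        # Return the bracketed kernel timestamp of a line as a number, or -1.
        function kstamp(line) {
            if (match(line, /\[ *[0-9]+\.[0-9]+\]/))
                return substr(line, RSTART + 1, RLENGTH - 2) + 0
            return -1
        }
        BEGIN { down = -1 }
        # Remember when the link went down.
        /b44: .*Link is down/ { down = kstamp($0); next }
        # On the next "Link is up", print the elapsed time and reset.
        /b44: .*Link is up/ && down >= 0 && kstamp($0) >= 0 {
            printf "%.3f\n", kstamp($0) - down
            down = -1
        }
    ' "$1"
}

# Example: b44_outage_seconds /var/log/kern.log
```

For the kern.log excerpt at the top of this comment (down at 14923.989101, up at 14926.988638) this prints 3.000, i.e. the link is gone for about three seconds per drop.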


Koen (koen-beek)
Changed in linux:
status: Incomplete → Confirmed
status: Confirmed → Incomplete
Koen (koen-beek) wrote :

Hi,

  I'm setting this bug as confirmed as there are several similar reports

Changed in linux:
status: Incomplete → Confirmed
Koen (koen-beek) wrote :

All three reports seem to concern the same network controller: Ethernet controller: Broadcom Corporation BCM4401-B0 100Base-TX (rev 02)

Changed in linux:
assignee: chrisccoulson → ubuntu-kernel-team
status: Confirmed → Triaged
Guillermo Pérez (bisho) wrote :

I confirm this problem. I'm using the same network controller.

This Broadcom driver is used in almost all Dells and also by other vendors, so I presume it will affect many people.

Henry Gomersall (hgomersall) wrote :

Do other people notice the same interaction with Compiz that I reported? Notably, the problem stops when Compiz is turned off.

Guillermo Pérez (bisho) wrote :

I confirm the compiz interaction... Stopping compiz stops the problem.

really strange!

Guillermo Pérez (bisho) wrote :

As it could have interactions with the graphics driver, mine is:

00:05.0 VGA compatible controller [0300]: nVidia Corporation C51 [GeForce 6150 LE] [10de:0241] (rev a2)
 Subsystem: Dell Device [1028:01ed]
 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
 Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
 Latency: 0
 Interrupt: pin A routed to IRQ 16
 Region 0: Memory at fc000000 (32-bit, non-prefetchable) [size=16M]
 Region 1: Memory at e0000000 (64-bit, prefetchable) [size=256M]
 Region 3: Memory at fb000000 (64-bit, non-prefetchable) [size=16M]
 [virtual] Expansion ROM at 88000000 [disabled] [size=128K]
 Capabilities: [48] Power Management version 2
  Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
  Status: D0 PME-Enable- DSel=0 DScale=0 PME-
 Capabilities: [50] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
  Address: 0000000000000000 Data: 0000
 Kernel driver in use: nvidia
 Kernel modules: nvidiafb, nvidia

Are all others with this problem using nvidia too?

Henry Gomersall (hgomersall) wrote :

Guillermo, are you using the open source "radeon" driver?

What video hardware are you using?

I am running the radeon driver with x1400 Radeon Mobility graphics chip.

Guillermo Pérez (bisho) wrote :

I'm using the proprietary nvidia driver, version 177, for a GeForce 6150 card. I pasted details of it in a previous post. It's very strange that it also happens with the open source radeon driver, on a very different card.

Eitan Bonderover (redyaky) wrote :

FWIW, I am also using the nVidia GeForce 6150 LE with the proprietary 177 driver. Also, I had the same problem with 64 bit Intrepid.

Guillermo Pérez (bisho) wrote :

The new kernel 2.6.27-7.16 seems to solve this problem on my system!!!

Update and check if it also solves your problem.

Guillermo Pérez (bisho) wrote :

The changes between 15 and 16 seem unrelated to this problem, but it's certainly resolving the problem on my system. Any clue?

Guillermo Pérez (bisho) wrote :

Sorry for spamming again.

I'm still suffering from this problem with the new kernel. With compiz on, the network is OK as long as I am not manipulating windows. If I move windows, scroll in Evolution or do any other window manipulation, the network card gets stopped again. I hope this helps to solve this bug.

Koen (koen-beek) wrote :

I tried with the newest kernel in Intrepid, 2.6.27-7.16.

I still have the problem when compiz is on (visual effects - extra)
I haven't had the problem yet since I set compiz off

So it seems there is some interaction between the Broadcom Corporation BCM4401-B0 100Base-TX (rev 02) network controller and compiz

Koen (koen-beek)
description: updated
Koen (koen-beek) wrote :

After some time the disconnect happened to me even when compiz was off, so compiz may simply make disconnects more frequent but does not cause them.

description: updated
Guillermo Pérez (bisho) wrote :

Especially if you are moving windows. If X is static (try scp from another machine) the network remains stable. As soon as I start clicking on windows, moving them or doing other interactions, the connection goes down.

MikkoD (mikko-dittmer) wrote :

With a vanilla 2.6.28-rc5 kernel on an amilo l1310g laptop with
ATI ixp SB400 chipset and Broadcom Corporation BCM4401-B0 100Base-TX (rev 02)
I have the same problem.

The nic would just stop working completely under heavy load, but after adding the kernel option "irqpoll" it now just
resets the connection every once in a while as reported here by others.

.
.
b44: eth0: powering down PHY
b44: eth0: Link is down.
b44: eth0: Link is up at 100 Mbps, full duplex.
b44: eth0: Flow control is off for TX and off for RX.
b44: eth0: powering down PHY
.
.

On a standard intrepid 2.6.27 kernel network seems fine, but the audio sometimes breaks up (stutters) after a while until a reboot. So there seems to be some interrupt conflict?

Guillermo Pérez (bisho) wrote :

And is this reported to kernel developers?

I think it is an important issue to resolve for Ubuntu.

Changed in linux:
status: Unknown → Confirmed
Daniel Katz-Braunschweig (dkatz-ubuntu) wrote :

Confirmed with Broadcom 4401 network card and ATI X1400 mobility card. Also confirmed that disabling compiz (system -> preferences -> appearance -> Visual Effects -> none) is a workaround.

I thought I was losing my mind when every time I'd visit CNN or Break.com in firefox, it would cause my network connection to drop.

Luciano Barcellos (lcbarcellos) wrote :

I'm using a Dell Vostro 1000 and the system doesn't even establish a connection via wired interface. It tries to get an address from dhcp and at the end it disconnects. It's not a hardware issue as it worked when I tested using a live CD from another distribution.
The VGA is from ATI (VGA compatible controller: ATI Technologies Inc RS482 [Radeon Xpress 200]) and turning off compiz doesn't work as a workaround for me. I use the free driver.

Henry Gomersall (hgomersall) wrote :

I have installed Intrepid from scratch, removing all the cruft from previous installations in the process.

My machine now seems to work flawlessly with Compiz running. No dropped connections.

I should note that I changed the graphics acceleration from XAA to EXA to solve another problem with hangs in Compiz. This is the major change. I also added a few extra lines which I don't understand (below the EXA change) to xorg.conf:

 Option "AccelMethod" "EXA"
 Option "AccelDFS" "off"
 Option "AGPMode" "1"
 Option "AGPFastWrite" "1"

It may be that all this has nothing to do with the network drops, but there did appear to be a relationship between compiz and the dropped connections.

Kris Simpson (vortex1) wrote :

I can confirm the same problems using the same Broadcom BCM4401-B0 controller. I have recently installed Intrepid fresh, and attempted the workaround of disabling compiz. In all cases, I still experience the intermittent connection drops while under heavy network load. I'm currently using the most recent released kernel 2.6.27-9.19 and an nvidia graphics card GeForce FX 5200.

Henry Gomersall (hgomersall) wrote :

Further to my last post (2008-12-01), I should note that I accidentally installed 64-bit Intrepid. Accidentally because I borrowed a disk without thinking what version I was installing. It's a testament to just how far things have come that I didn't notice for 3 weeks! Anyhoo, it is with this 64-bit version that I have a nicely functioning system - this would be a pretty fundamental change from what I had before the re-installation.

Launchpad Janitor (janitor) wrote : Kernel team bugs

Per a decision made by the Ubuntu Kernel Team, bugs will no longer be assigned to the ubuntu-kernel-team in Launchpad as part of the bug triage process. The ubuntu-kernel-team is being unassigned from this bug report. Refer to https://wiki.ubuntu.com/KernelTeamBugPolicies for more information. Thanks.

BeSt (steph-belardi) wrote :

I confirm this bug on my laptop.
Disconnect under heavy load. I don't use "radeon" driver, nor Compiz.
I didn't notice that bug under 8.04.1

lspci | grep BCM
02:00.0 Ethernet controller: Broadcom Corporation BCM4401-B0 100Base-TX (rev 02)

lspci | grep Graphic
00:02.0 VGA compatible controller: Intel Corporation Mobile 915GM/GMS/910GML Express Graphics Controller (rev 03)
00:02.1 Display controller: Intel Corporation Mobile 915GM/GMS/910GML Express Graphics Controller (rev 03)

I hope this will help.

smswehla (smswehla) wrote :

I see a similar symptom, but I only see it when receiving large files. Sending large files works flawlessly at around 10MB/s, as does receiving files at around 1.2MB/s. I can replicate with 100% reliability when copying a file to the affected machine via the local network.

eco (edoardo-costa) wrote :

I have the same problem. It was working fine on kubuntu 8.04 w/ kernel 2.6.24-23
I just did a fresh install of Kubuntu 8.10 and I'm getting the same errors:

I have a few partitions mounted over NFS. It happens when I do large transfers.

dmesg:
[ 6713.988320] b44: eth0: Link is up at 100 Mbps, full duplex.
[ 6713.988331] b44: eth0: Flow control is off for TX and off for RX.
[ 6736.366678] b44: eth0: powering down PHY
[ 6736.988227] b44: eth0: Link is down.
[ 6739.989318] b44: eth0: Link is up at 100 Mbps, full duplex.
[ 6739.989329] b44: eth0: Flow control is off for TX and off for RX.

lspci
01:00.0 VGA compatible controller: ATI Technologies Inc Radeon Mobility X1400
        Subsystem: Dell Device 2002
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 16
        Region 0: Memory at d0000000 (32-bit, prefetchable) [size=256M]
        Region 1: I/O ports at ee00 [size=256]
        Region 2: Memory at efdf0000 (32-bit, non-prefetchable) [size=64K]
        Expansion ROM at efe00000 [disabled] [size=128K]
        Capabilities: <access denied>

03:00.0 Ethernet controller: Broadcom Corporation BCM4401-B0 100Base-TX (rev 02)
        Subsystem: Dell Device 01cd
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 64
        Interrupt: pin A routed to IRQ 17
        Region 0: Memory at ef9fe000 (32-bit, non-prefetchable) [size=8K]
        Capabilities: <access denied>
        Kernel driver in use: b44
        Kernel modules: b44

dpkg -l
ii radeontool 1.5-5build1 utility to control ATI Radeon backlight func
ii xserver-xorg-video-radeon 1:6.9.0+git20081003.f9826a56-0ubuntu2.1 X.Org X server -- ATI Radeon display driver

eco (edoardo-costa) wrote :

For anyone having this problem, I can confirm that disabling the desktop effects in KDE 4.1 (Compiz) does solve the problem.

Small price to pay to get access to my network.

BeSt (steph-belardi) wrote :

As I said before, I'm running Xubuntu with no Compiz effects / desktop effects, and I'm not using the radeon/nvidia driver (because I have an Intel card), but I have the problem.

Kris Simpson (vortex1) wrote :

I've tried disabling Compiz/Desktop effects in 8.10 using Gnome and still experience the same problem.

eco (edoardo-costa) wrote :

BeSt, I'm not saying it's a solution but a work around for some of us. I'm sorry it doesn't solve your problem but it might help someone who has the same settings as mine until the problem is solved.

Have any of you tried to rename compiz, drop to command line to avoid all graphics and then try to generate large amounts of traffic?

Henry Gomersall (hgomersall) wrote :

I realise this isn't always an option, but has anybody else tried using the 64-bit installation?

I have not had any issues since I (accidentally) installed 64-bit Intrepid.

BeSt (steph-belardi) wrote :

Has anyone tried to compile the b44 module?
I tried, but I didn't succeed :-(

Noiano (noiano) wrote :

I also suffer from this bug. I have never had any problems with 8.04LTS.
With version 8.10 (fully patched) I get connected and disconnected only when I receive files using samba (over LAN).

I never get disconnected when receiving (large) files from the internet. Only samba makes my nic power on and off all the time.

I don't even have compiz installed so I guess it has nothing to do with the b44 driver problem...

This is the full detailed description of my nic

02:05.0 Ethernet controller: Broadcom Corporation BCM4401 100Base-T (rev 01)
 Subsystem: ASUSTeK Computer Inc. Device 80a8
 Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
 Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
 Latency: 32
 Interrupt: pin A routed to IRQ 20
 Region 0: Memory at dd800000 (32-bit, non-prefetchable) [size=8K]
 Capabilities: [40] Power Management version 2
  Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
  Status: D0 PME-Enable- DSel=0 DScale=2 PME-
 Kernel driver in use: b44
 Kernel modules: b44

PS: my graphics card is an ATI Radeon 9500 Pro (chipset r300)

Noiano (noiano) wrote :

For the time being: isn't there any workaround? It's quite annoying not being able to use any samba share :(

Martin Benes (martinbenesh) wrote :

The workaround would be to install the kernel from here: http://www.fi.muni.cz/~xbenes6/b44/
I merged the b44.c files from 2.6.24 and 2.6.27, basically just removing a couple of lines of code from 2.6.27 which are probably supposed to do some power management.
This modified kernel seems to work nicely though.

(patch for 2.6.27 included, but this is just a hot-fix, not a solution)
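
To see which lines differ between the two driver versions, the b44.c files of two unpacked kernel source trees can be compared directly. This is only a sketch: it does not reproduce the patch linked above, the tree locations are placeholders, and the function name b44_driver_diff is made up for illustration. In both 2.6.24 and 2.6.27 the driver source is drivers/net/b44.c.

```shell
#!/bin/sh
# Hypothetical helper: show what changed in the b44 driver between two
# unpacked kernel source trees (for example 2.6.24 and 2.6.27).
b44_driver_diff() {
    old_tree=$1
    new_tree=$2
    # diff exits with status 1 when the files differ, which is the expected case here.
    diff -u "$old_tree/drivers/net/b44.c" "$new_tree/drivers/net/b44.c"
}

# Example (placeholder directory names):
#   b44_driver_diff linux-2.6.24 linux-2.6.27 > b44-changes.diff
```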

Marsux (jr-komite) wrote :

I also have a laptop with the Broadcom Corporation BCM4401-B0 100Base-TX (rev 02) chipset. I am using the b44 kernel driver. If I load it without doing anything, the link goes up then down, and this cycles with a delay from one down to the next up. Playing with the option of the kernel module I found out that this problem disappears when setting the speed limit at 10 Mbps.

# ethtool -s eth0 speed 10 autoneg off

(replace eth0 by what applies to your situation). I have no problems when I use this option.
I realize it is not a perfect solution, since it limits the speed of the network connection, but slower network is better than no network at all.
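
The workaround above can be wrapped so that the resulting link settings are printed straight away. A minimal sketch: the ethtool -s invocation is the one from this comment, running ethtool with only the interface name prints the current settings, and the function name force_10mbit is made up for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper around the workaround: pin the interface to 10 Mbps
# with autonegotiation off, then print the resulting settings.
# Needs root; the setting does not survive a reboot or a driver reload.
force_10mbit() {
    iface=${1:-eth0}
    ethtool -s "$iface" speed 10 autoneg off || return 1
    # With no options, ethtool prints the current link settings.
    ethtool "$iface"
}

# Example (as root): force_10mbit eth0
# To return to autonegotiation: ethtool -s eth0 autoneg on
```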

I have been looking for solutions to this issue from time to time on forums, but no explanation has come up yet. I have never seen any mention of the fact that setting the speed limit to 10 Mbps makes the problem disappear. Maybe it can provide a hint to people trying to get rid of this behaviour, if it is confirmed that it helps with the faulty behaviour at 100 Mbps.

Hope this helps.

PS: my graphic chipset is an Intel Corporation Mobile 915GM/GMS/910GML Express Graphics Controller (rev 03), I am not sure graphic chipsets have anything to do with the issue.

Brad Figg (brad-figg) wrote :

Unfortunately it seems this bug is still an issue. Can you confirm this issue exists with the most recent Jaunty Jackalope 9.04 release - http://www.ubuntu.com/news/ubuntu-9.04-desktop ? Please let us know your results. Thanks.

Changed in linux (Ubuntu):
assignee: nobody → brad-figg
HubertB (hubertb) wrote :

Yeah, I can confirm this bug is still there, even in Ubuntu 9.04 (AMD64):

$ uname -a
Linux luna 2.6.28-11-generic #42-Ubuntu SMP Fri Apr 17 01:58:03 UTC 2009 x86_64 GNU/Linux

Guys, could you please check this out, this is my workaround for this:
1. Boot up your computer
2. Login to the desktop
3. Suspend your computer
4. Wake your computer up again (e. g. by pressing the powerbutton)
5. The b44-card is now working without any problems (even with compiz turned on!!)

Henry Gomersall (hgomersall) wrote :

I haven't experienced this problem yet on AMD64 9.04 (although I do run from suspend most of the time). I didn't experience it on 8.10 either after I reinstalled my machine (and upgraded to AMD64).

My machine is a dell Inspiron 9400.

Jean-Louis Dupond (dupondje) wrote :

Same happening here:

03:00.0 Ethernet controller: Broadcom Corporation BCM4401-B0 100Base-TX (rev 02)
 Subsystem: Dell Device 01cd
 Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
 Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
 Latency: 64
 Interrupt: pin A routed to IRQ 17
 Region 0: Memory at ecbfe000 (32-bit, non-prefetchable) [size=8K]
 Capabilities: [40] Power Management version 2
  Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
  Status: D0 PME-Enable- DSel=0 DScale=2 PME-
 Kernel driver in use: b44
 Kernel modules: b44

HubertB (hubertb) wrote :

Hey guys, please check whether this works for you:
1. Boot up your computer
2. Login to the desktop
3. Suspend your computer
4. Wake your computer up again (e. g. by pressing the powerbutton)
5. The b44-card is now working without any problems (even with compiz turned on!!)

This way I can use my b44-card without problems. If this also works for you, the problem might be located somewhere in the power management system (maybe the card isn't initialised correctly after booting up, but it's initialised correctly when the system is waking up from suspend).

Please post your results here!

Gerg (gergely-salamon) wrote :

Hello,

Same problem here since upgrading from Ubuntu 8.10 (and previously 8.04) to 9.04. Realtek r8168 NIC (RTL8111/8168B), which worked fine before.
It is a server (Linux 2.6.28-11-server #42-Ubuntu SMP Fri Apr 17 02:48:10 UTC 2009 i686 GNU/Linux) with no ATI or Nvidia driver, just the default Intel graphics on the motherboard, which is a D945GCLF2 (Dual Intel Atom).
It only drops off under heavy traffic; otherwise it is 'stable' (SSH and HTTP access are OK).
The loaded module is r8169. I am planning to change it to the latest r8168 provided by Realtek to see what effect that might have.

How can I check this on a server, i.e. send it to sleep and then wake it up again?

HubertB (hubertb) wrote :

You could try the following:

$ sudo -s
# echo "mem" > /sys/power/state

This should bring your server into the standby-state. You'll have to wake it up using the space bar or power button or so. Please post your results here :-)
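
Before writing to /sys/power/state it may be worth checking that the kernel actually lists the "mem" state, since not every board supports suspend-to-RAM. A minimal sketch, assuming a POSIX shell; the function name `supports_sleep_state` and its optional file argument are made up for illustration:

```shell
# Succeeds if the given sleep state (e.g. "mem") is listed in the
# power-state file. The second argument exists only so the check can be
# tried against a copy of the file; it defaults to /sys/power/state.
supports_sleep_state() {
    state=$1
    file=${2:-/sys/power/state}
    [ -r "$file" ] || return 1
    for s in $(cat "$file"); do
        if [ "$s" = "$state" ]; then
            return 0
        fi
    done
    return 1
}
```

As root, something like `supports_sleep_state mem && echo mem > /sys/power/state` would then only attempt the suspend when the state is advertised.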

Gerg (gergely-salamon) wrote :

Ok, I did, but negative. It went to sleep mode and came back (without the VGA, it remained in sleep mode!).

I started to copy some 60 GB of data (mainly MP3s) from the server to my Windows 7 RC PC (Gigabit network, D-Link DIR-655 router) and it seemed good for some 15 minutes, but then the problem just started again (see attached "messages" from /var/log). I have been listening to some of the same MP3s for hours now over the network on my Popcorn Hour with no problem.

It is a fresh install on an OCZ 30 GB SSD (SATA); data are stored on a separate 1 TB Samsung SATA HDD. Just for the record, previously I used a 16 GB OCZ Turbo USB drive for the system. It was upgraded from 8.10 to 9.04 and the situation was much worse: the Ethernet hung after a few seconds of heavy traffic and took minutes to come back. With the current setup it comes back rather quickly (sometimes in seconds).

I will go for the other driver (r8168) to see what happens.

Jean-Louis Dupond (dupondje) wrote :

It's not fixed in 2.6.30 either :( (Karmic)

Gerg (gergely-salamon) wrote :

Following the instructions and using the script from http://www.jamesonwilliams.com/hardy-r8168, I managed to change the driver to r8168, and it seems OK now. Moreover, it doubled the speed: with the r8169 driver it was around 9,000-10,000 kbytes/sec at most; now it is continuously above 20,000 and can even reach 25,000 kbytes/sec.
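
To confirm which driver actually ended up bound to the interface after such a switch, the `driver:` line printed by `ethtool -i` can be checked. A small sketch; the function name `bound_driver` is made up here, and it only parses text on standard input so it can also be tried on saved output:

```shell
# Print the value of the "driver:" line from `ethtool -i <iface>` output
# supplied on stdin.
bound_driver() {
    awk '$1 == "driver:" { print $2; exit }'
}
```

For example, `ethtool -i eth0 | bound_driver` would be expected to print r8168 rather than r8169 once the replacement module is in use.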

Jean-Louis Dupond (dupondje) wrote :

The error lies in the b44_poll function. It seems that the read suddenly reports an error, and the driver then restarts the network card.

846 static int b44_poll(struct napi_struct *napi, int budget)
847 {
848         struct b44 *bp = container_of(napi, struct b44, napi);
849         int work_done;
850
851         spin_lock_irq(&bp->lock);
852
853         if (bp->istat & (ISTAT_TX | ISTAT_TO)) {
854                 /* spin_lock(&bp->tx_lock); */
855                 b44_tx(bp);
856                 /* spin_unlock(&bp->tx_lock); */
857         }
858         spin_unlock_irq(&bp->lock);
859
860         work_done = 0;
861         if (bp->istat & ISTAT_RX)
862                 work_done += b44_rx(bp, budget);
863
864         if (bp->istat & ISTAT_ERRORS) {
865                 unsigned long flags;
866
867                 spin_lock_irqsave(&bp->lock, flags);
868                 b44_halt(bp);
869                 b44_init_rings(bp);
870                 b44_init_hw(bp, B44_FULL_RESET_SKIP_PHY);
871                 netif_wake_queue(bp->dev);
872                 spin_unlock_irqrestore(&bp->lock, flags);
873                 work_done = 0;
874         }
875
876         if (work_done < budget) {
877                 napi_complete(napi);
878                 b44_enable_ints(bp);
879         }
880
881         return work_done;
882 }

Execution ends up in the ISTAT_ERRORS branch at line 864.
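
To see which condition inside ISTAT_ERRORS triggers the reset, a raw interrupt-status value can be decoded bit by bit. The sketch below does this in shell. The bit masks are quoted from memory of the ISTAT_* definitions in b44.h and should be treated as assumptions to verify against your kernel tree; the function name `decode_istat` is made up for illustration:

```shell
# Print the name of every error bit set in an interrupt-status value.
# Masks are assumed to match the ISTAT_* definitions in b44.h.
decode_istat() {
    istat=$1
    for pair in DSCE:0x0400 DATAE:0x0800 DPE:0x1000 \
                RDU:0x2000 RFO:0x4000 TFU:0x8000; do
        name=${pair%%:*}
        mask=${pair##*:}
        if [ $(( $istat & $mask )) -ne 0 ]; then
            echo "ISTAT_$name"
        fi
    done
}
```

Under those assumed masks, `decode_istat 0x4000` prints ISTAT_RFO, the receive FIFO overflow bit.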

HubertB (hubertb) wrote :

@Jean-Louis Dupond: Great - How did you retrieve this information?

As for me, my card is working when suspending and waking up the system again - but I don't know how to find out the difference between a normal boot and a normal boot followed by a suspend + wakeup. Maybe it's just a flag in some kind of configuration register of the NIC-Chip or so, dunno.

Jean-Louis Dupond (dupondje) wrote :

I modified the driver to:

static int b44_poll(struct napi_struct *napi, int budget)
{
        struct b44 *bp = container_of(napi, struct b44, napi);
        int work_done;

        spin_lock_irq(&bp->lock);

        if (bp->istat & (ISTAT_TX | ISTAT_TO)) {
                /* spin_lock(&bp->tx_lock); */
                b44_tx(bp);
                /* spin_unlock(&bp->tx_lock); */
        }
        spin_unlock_irq(&bp->lock);

        work_done = 0;
        if (bp->istat & ISTAT_RX)
                work_done += b44_rx(bp, budget);

        if (bp->istat & ISTAT_ERRORS) {
                unsigned long flags;

                spin_lock_irqsave(&bp->lock, flags);

                printk(KERN_INFO PFX "DUP: b44_poll\n");

                if (bp->istat & ISTAT_DSCE)
                {
                        printk(KERN_INFO PFX "DUP: ISTAT_DSCE\n");
                }
                if (bp->istat & ISTAT_DATAE)
                {
                        printk(KERN_INFO PFX "DUP: ISTAT_DATAE\n");
                }
                if (bp->istat & ISTAT_DPE)
                {
                        printk(KERN_INFO PFX "DUP: ISTAT_DPE\n");
                }
                if (bp->istat & ISTAT_RDU)
                {
                        printk(KERN_INFO PFX "DUP: ISTAT_RDU\n");
                }
                if (bp->istat & ISTAT_RFO)
                {
                        printk(KERN_INFO PFX "DUP: ISTAT_RFO\n");
                }
                if (bp->istat & ISTAT_TFU)
                {
                        printk(KERN_INFO PFX "DUP: ISTAT_TFU\n");
                }

                b44_halt(bp);
                b44_init_rings(bp);
                b44_init_hw(bp, B44_FULL_RESET_SKIP_PHY);
                netif_wake_queue(bp->dev);
                spin_unlock_irqrestore(&bp->lock, flags);
                work_done = 0;
        }

        if (work_done < budget) {
                napi_complete(napi);
                b44_enable_ints(bp);
        }

        return work_done;
}

And when it goes down I see:

[25656.573416] b44: DUP: b44_poll
[25656.573424] b44: DUP: ISTAT_RFO
[25656.573627] b44: eth0: powering down PHY
[25656.816096] b44: eth0: Link is down.
[25659.816225] b44: eth0: Link is up at 100 Mbps, full duplex.
[25659.816231] b44: eth0: Flow control is off for TX and off for RX.
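
With the extra printk calls in place, the kernel log can be summarised to show which status bit precedes each reset. A rough sketch, assuming the `DUP: ISTAT_*` message format used in the modified driver above; the function name `tally_reset_causes` is made up:

```shell
# Count how often each "DUP: ISTAT_*" debug message appears in kernel log
# text read from stdin, most frequent first.
tally_reset_causes() {
    grep -o 'DUP: ISTAT_[A-Z]*' | sed 's/^DUP: //' | sort | uniq -c | sort -rn
}
```

Typical use would be `dmesg | tally_reset_causes`, which prints one line per flag with its count.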

JC (jc-boursiquot) wrote :

Jean-Louis Dupond, can you please give us step-by-step instructions on how to fix this problem? I am a newbie and desperately need help. What is the file name? I am assuming it is b44.ko. How do we modify the file if it is a compiled .ko file?

Thanks in advance for your assistance.

Jean-Claude

Jean-Louis Dupond (dupondje) wrote :

@JC: I didn't fix the bug, I just debugged it with that code to check WHY it's resetting.

David Miller (the driver dev) answered me on the mailing list with:

The problem is that we need to know if a receive FIFO overflow
puts the chip into a state where it must be reset.

And until we know that, we cannot simply exclude that condition
from the reset test.

Jeff (grapnell-gmail) wrote :

I can confirm this is still happening on Jaunty. Seems to happen most often under heavy network usage, such as VNC or connecting to Windows Terminal Server.

~$ uname -a
Linux jeff-desktop 2.6.28-11-generic #42-Ubuntu SMP Fri Apr 17 01:57:59 UTC 2009 i686 GNU/Linux

manoova (spn-gardner) wrote :

I have this problem on the following hardware and config:

Dell Dimension E521
mythbuntu 9.04

# uname -a
Linux mythtv 2.6.28-14-generic #47-Ubuntu SMP Sat Jul 25 00:28:35 UTC 2009 i686 GNU/Linux

# lspci | grep -i ether
04:07.0 Ethernet controller: Broadcom Corporation BCM4401-B0 100Base-TX (rev 02)

I saw this posting http://ubuntuforums.org/showthread.php?p=7794789#post7794789 and I disabled "Cool and Quiet" in my BIOS. Since disabling this feature 24 hours ago I have had no problems with Samba. I have transferred over 3Gb of data to the server.

Hopefully other people can test this on similar motherboards and it may go some way to resolving this problem.

HubertB (hubertb) wrote :

Well guys, I installed 9.10 Alpha 5 today. Seems that this issue is now gone :-) I have full 100 MBit/s speed and don't need to do my suspend-and-resume-trick mentioned above any more (and even then I haven't had full 100 MBit speed). Can anyone confirm that?

Jean-Louis Dupond (dupondje) wrote :

Still not fixed
Linux laptopjl 2.6.31-10-generic #32-Ubuntu SMP Thu Sep 10 23:29:56 UTC 2009 x86_64 GNU/Linux

[ 4182.816140] b44: eth0: Link is down.
[ 4193.816214] b44: eth0: Link is up at 100 Mbps, full duplex.
[ 4193.816221] b44: eth0: Flow control is off for TX and off for RX.
[ 4308.816131] b44: eth0: Link is down.
[ 4395.817263] b44: eth0: Link is up at 100 Mbps, full duplex.
[ 4395.817269] b44: eth0: Flow control is off for TX and off for RX.
[ 4515.816130] b44: eth0: Link is down.

:'(
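
The length of each outage can be read off such dmesg output by pairing every "Link is down" line with the next "Link is up" line. A small sketch, assuming the bracketed seconds-since-boot timestamps shown in this thread; the function name `link_outages` is made up:

```shell
# For kernel log text on stdin, print the duration in seconds of every
# b44 link outage ("Link is down" followed by the next "Link is up").
link_outages() {
    awk '
        # Extract the seconds-since-boot value from "[ 4182.816140]".
        function stamp(line,    s) {
            if (!match(line, /\[ *[0-9]+\.[0-9]+\]/))
                return -1
            s = substr(line, RSTART + 1, RLENGTH - 2)
            return s + 0
        }
        /b44.*Link is down/ { down = stamp($0); have_down = (down >= 0) }
        /b44.*Link is up/ && have_down {
            up = stamp($0)
            if (up >= 0)
                printf "%.3f\n", up - down
            have_down = 0
        }
    '
}
```

Fed the log above, this reports outages of roughly 11 and 87 seconds; the final "Link is down" has no matching "up" line and is ignored.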

m1fcj (hakan-koseoglu) wrote :

Problem appearing on Dell Inspiron 9400 with 9.04 while running two rsyncs over ssh to my local file server.

03:00.0 Ethernet controller: Broadcom Corporation BCM4401-B0 100Base-TX (rev 02)

hakan@photon:~$ cat /proc/version
Linux version 2.6.28-15-generic (buildd@yellow) (gcc version 4.3.3 (Ubuntu 4.3.3-5ubuntu4) ) #49-Ubuntu SMP Tue Aug 18 19:25:34 UTC 2009

Sep 14 20:59:55 photon kernel: [44383.598306] b44: eth0: powering down PHY
Sep 14 20:59:55 photon kernel: [44383.805069] b44: eth0: Link is down.
Sep 14 20:59:58 photon kernel: [44386.804617] b44: eth0: Link is up at 100 Mbps, full duplex.
Sep 14 20:59:58 photon kernel: [44386.804623] b44: eth0: Flow control is off for TX and off for RX.
Sep 14 21:00:03 photon kernel: [44391.230109] b44: eth0: powering down PHY
Sep 14 21:00:03 photon kernel: [44391.805137] b44: eth0: Link is down.
Sep 14 21:00:06 photon kernel: [44394.805225] b44: eth0: Link is up at 100 Mbps, full duplex.
Sep 14 21:00:06 photon kernel: [44394.805230] b44: eth0: Flow control is off for TX and off for RX.

ramesh (ramesh-info) wrote :

Facing similar problem: Dell Dimension C521, Ubuntu 9.04 64bit, Fresh Install.

Linux DELLC521 2.6.28-15-generic #49-Ubuntu SMP Tue Aug 18 19:25:34 UTC 2009 x86_64 GNU/Linux
04:07.0 Ethernet controller: Broadcom Corporation BCM4401-B0 100Base-TX (rev 02)
00:05.0 VGA compatible controller: nVidia Corporation C51 [GeForce 6150 LE] (rev a2)

Happens only during heavy file transfer && (( moving windows ) || ( scrolling in Firefox ))

Transferring large ISO files across using scp. scp reports "stalled" but continues. No md5sum errors detected in the transferred files!

"Cool and Quiet" (suggested above) did not help.

"Sleep and Wakeup" (suggested above) did not help. The computer never woke-up.

"Turning off Compiz" (suggested above) seems to help. Unable to reproduce behavior after this.

Hope this helps.

HubertB (hubertb) wrote :

@ramesh: Please try a live CD of the upcoming Ubuntu release 9.10 "Karmic Koala" (you can download them here: http://cdimage.ubuntu.com/releases/karmic/alpha-6/ ).

As I reported, I had the same issues as you, but from Karmic Alpha 5 onwards the issue is gone (at least for me, and I hope for everyone else out there too). Kernel 2.6.31-9-generic and 2.6.31-10-generic from Ubuntu 9.10 are working just fine :-)

Please report your results.

ramesh (ramesh-info) wrote :

@HubertB: Will check out karmic alpha-6. Thanks. Ramesh.

Fixion (fixion) wrote :

The problem persists after a clean install of Karmic (non-beta), kernel 2.6.31-14-generic still includes the bug. After reverting to kernel 2.5.27-14 (if I remember correctly) in Jaunty, the problem disappeared. I am not sure in which kernel version the problem first appeared.

Jeff (grapnell-gmail) wrote :

I can confirm this is still happening in a fresh install of Karmic...

$ uname -a
Linux jeff-desktop 2.6.31-14-generic #48-Ubuntu SMP Fri Oct 16 14:04:26 UTC 2009 i686 GNU/Linux

$ lspci | grep 4401
04:07.0 Ethernet controller: Broadcom Corporation BCM4401-B0 100Base-TX (rev 02)

$ dmesg
[ 491.052203] b44: eth0: powering down PHY
[ 491.989107] b44: eth0: Link is down.
[ 493.989190] b44: eth0: Link is up at 100 Mbps, full duplex.
[ 493.989198] b44: eth0: Flow control is off for TX and off for RX.
[ 495.192737] b44: eth0: powering down PHY
[ 495.989111] b44: eth0: Link is down.
[ 497.988185] b44: eth0: Link is up at 100 Mbps, full duplex.
[ 497.988193] b44: eth0: Flow control is off for TX and off for RX.
[ 500.341831] b44: eth0: powering down PHY
[ 500.988105] b44: eth0: Link is down.
[ 502.988184] b44: eth0: Link is up at 100 Mbps, full duplex.
[ 502.988192] b44: eth0: Flow control is off for TX and off for RX.
[ 505.819760] b44: eth0: powering down PHY
[ 505.988104] b44: eth0: Link is down.
[ 508.988184] b44: eth0: Link is up at 100 Mbps, full duplex.
[ 508.988193] b44: eth0: Flow control is off for TX and off for RX.
[ 509.713016] b44: eth0: powering down PHY
[ 509.989108] b44: eth0: Link is down.
[ 512.988625] b44: eth0: Link is up at 100 Mbps, full duplex.
[ 512.988634] b44: eth0: Flow control is off for TX and off for RX.
[ 514.420793] b44: eth0: powering down PHY
[ 514.989113] b44: eth0: Link is down.
[ 517.988179] b44: eth0: Link is up at 100 Mbps, full duplex.
[ 517.988187] b44: eth0: Flow control is off for TX and off for RX.
[ 698.432662] b44: eth0: powering down PHY
[ 698.989100] b44: eth0: Link is down.
[ 701.988182] b44: eth0: Link is up at 100 Mbps, full duplex.
[ 701.988190] b44: eth0: Flow control is off for TX and off for RX.
[ 914.477360] b44: eth0: powering down PHY
[ 914.988105] b44: eth0: Link is down.
[ 917.989178] b44: eth0: Link is up at 100 Mbps, full duplex.
[ 917.989187] b44: eth0: Flow control is off for TX and off for RX.
[ 1161.566117] b44: eth0: powering down PHY
[ 1161.988106] b44: eth0: Link is down.
[ 1164.988180] b44: eth0: Link is up at 100 Mbps, full duplex.
[ 1164.988188] b44: eth0: Flow control is off for TX and off for RX.
[ 1460.516631] b44: eth0: powering down PHY
[ 1460.988096] b44: eth0: Link is down.
[ 1463.988181] b44: eth0: Link is up at 100 Mbps, full duplex.
[ 1463.988189] b44: eth0: Flow control is off for TX and off for RX.
[ 1465.121118] b44: eth0: powering down PHY
[ 1465.988098] b44: eth0: Link is down.
[ 1467.988177] b44: eth0: Link is up at 100 Mbps, full duplex.
[ 1467.988185] b44: eth0: Flow control is off for TX and off for RX.
[ 1700.813187] b44: eth0: powering down PHY

Jeff (grapnell-gmail) wrote :

Oh, and this has absolutely 0 (zero) to do with Compiz... All effects are turned off and it still happens even if I switch to the generic vesa driver.

XChesser (xchesser) wrote :

I have the same problem in openSUSE. I also have 'Ethernet controller: Broadcom Corporation BCM4401-B0 100Base-TX (rev 02)', I have no Compiz installed, and my graphics is Intel. So the problem is in the b44 module itself.

makkonen (makkonan) wrote :

Thought I'd chime in here and point out that none of the simple fixes mentioned in this thread were of any use. I was running XBMC Live 9.11 (Ubuntu 9.10 based). No Compiz to disable. Disabling CnQ in the bios did nothing. Putting the computer to sleep and waking it did nothing.

However, the less quick fix seems to have done the trick. Martin Benes's patch from post #44 (http://www.fi.muni.cz/~xbenes6/b44/b44.patch) still applies (mostly cleanly) to the b44 sources included with kernel-2.6.31-19. I ended up building a whole new kernel because I didn't really know what I was doing, but it works now. No drops even when streaming HD video across the LAN. Cheers.

AttilaN (attila123456) wrote :

Dell E1505, BCM4401-B0. It worked fine in Karmic, but after upgrading to Lucid RC I can't stream HD movies from a network drive anymore. Copying works (to make it more confusing).

Etienne (richelle) wrote :

I have a Dell Inspiron 6400 running Kubuntu 10.04LTS with wicd to manage networks.

lspci gives
03:00.0 Ethernet controller: Broadcom Corporation BCM4401-B0 100Base-TX (rev 02)

The network was stable with karmic, and keeps on disconnecting with lucid.

May 7 20:12:40 etienne-laptop kernel: [ 6070.982114] b44: eth0: powering down PHY
May 7 20:12:40 etienne-laptop kernel: [ 6071.000114] b44: eth0: Link is down.
May 7 20:12:42 etienne-laptop kernel: [ 6072.745265] b44: eth0: powering down PHY
May 7 20:12:45 etienne-laptop kernel: [ 6075.989196] b44: eth0: Link is up at 100 Mbps, full duplex.
May 7 20:12:45 etienne-laptop kernel: [ 6075.989200] b44: eth0: Flow control is off for TX and off for RX.
May 7 20:12:45 etienne-laptop kernel: [ 6076.192229] b44: eth0: powering down PHY
May 7 20:12:48 etienne-laptop kernel: [ 6078.989190] b44: eth0: Link is up at 100 Mbps, full duplex.
May 7 20:12:48 etienne-laptop kernel: [ 6078.989196] b44: eth0: Flow control is off for TX and off for RX.
May 7 20:12:49 etienne-laptop kernel: [ 6079.368242] b44: eth0: powering down PHY
May 7 20:12:52 etienne-laptop kernel: [ 6083.000232] b44: eth0: Link is up at 100 Mbps, full duplex.
May 7 20:12:52 etienne-laptop kernel: [ 6083.000238] b44: eth0: Flow control is off for TX and off for RX.

I switched back to kernel 2.6.32-21-generic from 2.6.32-22-generic and it is stable again...

So there must be a bug in the b44 module in 2.6.32-22-generic that is not present in 2.6.32-21-generic.

AttilaN (attila123456) wrote :

installed 2.6.34 rc6 kernel from the mainline PPA - the network problem seems to be gone.

Etienne (richelle) wrote :

@AttilaN
This solves it for me too.
Thanks

jcdutton (james-superbug) wrote :

Problem still present here.
Tried 2.6.32-22 and 2.6.34 (no rc) from PPA
Linux ally 2.6.34-020634-generic #020634 SMP Mon May 17 19:27:49 UTC 2010 x86_64 GNU/Linux

It is most definitely network load related.
I use mythtv. It plays the live tv, but as soon as I go to view the program guide, the ethernet interface goes up and down.
May 18 13:53:18 ally kernel: [ 113.012750] b44 ssb0:0: eth0: Flow control is off for TX and off for RX
May 18 13:53:21 ally kernel: [ 115.983994] b44 ssb0:0: eth0: powering down PHY
May 18 13:53:21 ally kernel: [ 116.012651] b44 ssb0:0: eth0: Link is down
May 18 13:53:24 ally kernel: [ 119.011423] b44 ssb0:0: eth0: Link is up at 100 Mbps, full duplex
May 18 13:53:24 ally kernel: [ 119.011432] b44 ssb0:0: eth0: Flow control is off for TX and off for RX
May 18 13:53:26 ally kernel: [ 120.395970] b44 ssb0:0: eth0: powering down PHY
May 18 13:53:26 ally kernel: [ 121.010316] b44 ssb0:0: eth0: Link is down
May 18 13:53:29 ally kernel: [ 124.010237] b44 ssb0:0: eth0: Link is up at 100 Mbps, full duplex
May 18 13:53:29 ally kernel: [ 124.010247] b44 ssb0:0: eth0: Flow control is off for TX and off for RX

Some way needs to be found to recover from overload more gracefully.

jcdutton (james-superbug) wrote :

In addition to the above, I tried using the wireless interface instead of the b44 wired interface.
Exactly the same happens to the wireless card. I.e. Playing the media stream works fine, but selecting the program guide causes the network interface to fail.
So, I would conclude that it is a more general fault, and not confined to the b44 card.

May 18 14:04:55 ally kernel: [ 810.172577] iwl3945 0000:0b:00.0: Error sending REPLY_RXON: time out after 500ms.
May 18 14:04:55 ally kernel: [ 810.172589] iwl3945 0000:0b:00.0: Error setting new configuration (-110).
May 18 14:04:56 ally kernel: [ 810.670057] iwl3945 0000:0b:00.0: Error sending REPLY_SCAN_CMD: time out after 500ms.
May 18 14:04:56 ally kernel: [ 811.170069] iwl3945 0000:0b:00.0: Error sending REPLY_TX_PWR_TABLE_CMD: time out after 500ms.
May 18 14:04:57 ally kernel: [ 811.672601] iwl3945 0000:0b:00.0: Error sending REPLY_SCAN_CMD: time out after 500ms.
May 18 14:04:57 ally kernel: [ 812.170440] iwl3945 0000:0b:00.0: Error sending REPLY_RXON: time out after 500ms.
May 18 14:04:57 ally kernel: [ 812.170452] iwl3945 0000:0b:00.0: Error setting new configuration (-110).
May 18 14:04:58 ally kernel: [ 812.670651] iwl3945 0000:0b:00.0: Error sending REPLY_RXON: time out after 500ms.
May 18 14:04:58 ally kernel: [ 812.670663] iwl3945 0000:0b:00.0: Error setting new configuration (-110).
May 18 14:04:58 ally kernel: [ 813.172574] iwl3945 0000:0b:00.0: Error sending REPLY_TX_PWR_TABLE_CMD: time out after 500ms.
May 18 14:05:02 ally kernel: [ 817.170072] iwl3945 0000:0b:00.0: Error sending REPLY_RXON: time out after 500ms.
May 18 14:05:02 ally kernel: [ 817.170084] iwl3945 0000:0b:00.0: Error setting new configuration (-110).
May 18 14:05:03 ally kernel: [ 817.671518] iwl3945 0000:0b:00.0: Error sending REPLY_SCAN_CMD: time out after 500ms.
May 18 14:05:03 ally kernel: [ 818.172595] iwl3945 0000:0b:00.0: Error sending REPLY_TX_PWR_TABLE_CMD: time out after 500ms.
May 18 14:05:04 ally kernel: [ 818.672420] iwl3945 0000:0b:00.0: Error sending REPLY_SCAN_CMD: time out after 500ms.
May 18 14:05:04 ally kernel: [ 819.170332] iwl3945 0000:0b:00.0: Error sending REPLY_RXON: time out after 500ms.
May 18 14:05:04 ally kernel: [ 819.170344] iwl3945 0000:0b:00.0: Error setting new configuration (-110).
May 18 14:05:05 ally kernel: [ 819.672603] iwl3945 0000:0b:00.0: Error sending REPLY_RXON: time out after 500ms.
May 18 14:05:05 ally kernel: [ 819.672615] iwl3945 0000:0b:00.0: Error setting new configuration (-110).

pintubigfoot (pintubigfoot) wrote :

Hey guys,

I am using a Fujitsu Lifebook S2110 with Ubuntu 10.04 AMD64 and was consistently getting this issue.

The issue went away when I turned on Ethernet flow control: "ethtool -A eth0 rx on tx on autoneg off"

Can some of you verify whether "ethtool -A eth0 rx on tx on autoneg off" also resolves the problem for you?

pintubigfoot (pintubigfoot) wrote :

Sorry, the command is incomplete.

Full commands are:
# ethtool -A eth0 rx on tx on autoneg off
# ethtool -s eth0 autoneg off

So far I have not encountered any disconnections under heavy network load.
I put the above two commands into /etc/rc.local.
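
Whether the pause settings actually took effect can be checked with `ethtool -a`, which prints the current Autonegotiate, RX and TX pause state. A minimal sketch that parses that output from standard input; the function name `pause_enabled` is made up for illustration:

```shell
# Succeeds only if both "RX:" and "TX:" are reported as "on" in
# `ethtool -a <iface>` output supplied on stdin.
pause_enabled() {
    awk '
        $1 == "RX:" { rx = $2 }
        $1 == "TX:" { tx = $2 }
        END { exit !(rx == "on" && tx == "on") }
    '
}
```

For example, `ethtool -a eth0 | pause_enabled && echo "flow control is on"` could be run after boot to confirm that the rc.local commands were applied.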

Dave Espionage (daveespionage) wrote :

I only had disconnect issues with my wireless until I switched over to connman so that my network connection would show up in the indicator applet. Prior to that, wireless would randomly get booted at the kernel level. Now with connman I'm getting the same symptoms as this issue:

Jun 4 20:53:20 localhost kernel: [14329.844166] b44: eth0: powering down PHY
Jun 4 20:53:20 localhost kernel: [14330.000546] b44: eth0: Link is down.
Jun 4 20:53:23 localhost kernel: [14333.006329] b44: eth0: Link is up at 100 Mbps, full duplex.
Jun 4 20:53:23 localhost kernel: [14333.006334] b44: eth0: Flow control is on for TX and on for RX.
Jun 4 20:56:27 localhost kernel: [14516.865782] b44: eth0: powering down PHY
Jun 4 20:56:27 localhost kernel: [14517.000127] b44: eth0: Link is down.
Jun 4 20:56:30 localhost kernel: [14520.004122] b44: eth0: Link is up at 100 Mbps, full duplex.
Jun 4 20:56:30 localhost kernel: [14520.004131] b44: eth0: Flow control is on for TX and on for RX.

@pintubigfoot I tried putting those in rc.local, and now it says "Flow control is on" instead of "Flow control is off", but the link still drops whenever I'm loading any web content.

AttilaN (attila123456) wrote :

I spoke too soon in comment #76 - the issue persists, Google Earth is badly affected, even with the latest 2.6.35 kernel.

Brad Figg (brad-figg)
Changed in linux (Ubuntu):
assignee: Brad Figg (brad-figg) → nobody
jcdutton (james-superbug) wrote :

Please can some other people test this bug fix.
I would like to see if it is just a "works for me" or a more general fix for everybody.

The fix just resets the FIFO, clears the RFO condition and then continues, whereas previously the driver reset the entire chip, which caused the link to go down.

Jean-Louis Dupond (dupondje) wrote :

Is this a patch that got applied upstream or self-made? :)

AttilaN (attila123456) wrote :

jcdutton, thanks for the patch. I just tested it by playing a 720p video over the LAN. There are lots of "b44_poll: ISTAT_RFO" messages in the log, and the video is still choppy. The network seems to be unstable.

martyscholes (martyscholes) wrote :

I can confirm this is still an issue and appears to be network load related and not Compiz-specific.

As I type this I am attempting an iSCSI install of Ubuntu 10.10 using Ubuntu 10.04 netboot binaries on a Dell Inspiron 600m and the link is constantly being reset, presumably by the iSCSI traffic.

This is a netboot install, so the graphics are text-only and there is no sound or wireless.

The only way to get past the mkfs portion of the install was to force the network to 10Mb at the switch.

The laptop is currently installing packages, causing the link to go up and down. I just reset the switch port to 10Mb and the NIC is very slow, but working flawlessly.
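
The same 10 Mb/s limit could in principle be applied from the host side with `ethtool -s` rather than at the switch. This is an untested assumption, not something reported in this thread, and both ends of the link should be configured consistently, since forcing only one side commonly ends in a duplex mismatch. A sketch, with the function name `force_10mbit` and the `DRY_RUN` variable made up for illustration:

```shell
# Build (and, unless DRY_RUN=1, execute) an ethtool command that pins the
# interface to 10 Mb/s full duplex with autonegotiation off.
force_10mbit() {
    iface=${1:-eth0}
    set -- ethtool -s "$iface" speed 10 duplex full autoneg off
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Running `DRY_RUN=1 force_10mbit eth0` only prints the command, which makes it easy to review before running it as root.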

jcdutton (james-superbug) wrote :

Jean-Louis: I wrote the patch myself.
AttilaN: Does my patch improve the situation, i.e. is it less choppy?

The problem is that RFO ("Receive FIFO Overflow") is happening. So data is lost, but my patch should make it recover more quickly.
I think the real problem might be elsewhere in the kernel: something holding onto the CPU for too long, so that the CPU cannot process the received data in time. My guess is that some function call to the graphics card is taking too long to return.

AttilaN (attila123456) wrote :

I've had good results using the stock maverick kernel with b44 recompiled with a version of your patch from http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=32737e934a952c1b0c744f2a78d80089d15c7ee3.

Andy Whitcroft (apw) wrote :

OK, the fix mentioned in comment #89 was in the Natty kernels. Could someone affected give a 2.6.37-based Natty kernel a spin and report back here whether that fixes the issue?

Changed in linux (Ubuntu):
status: Triaged → Incomplete
Changed in linux:
importance: Unknown → High
Brad Figg (brad-figg) wrote : Unsupported series, setting status to "Won't Fix".

This bug was filed against a series that is no longer supported and so is being marked as Won't Fix. If this issue still exists in a supported series, please file a new bug.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: Incomplete → Won't Fix
Changed in linux:
status: Confirmed → Invalid