802.3ad bond interface shows high RX dropped packets

Bug #1041070 reported by Jean-Daniel Bussy
This bug affects 5 people
Affects                 Status   Importance  Assigned to  Milestone
bridge-utils (Ubuntu)   Invalid  Medium      Unassigned
linux (Ubuntu)          Invalid  Undecided   Unassigned

Bug Description

On Ubuntu 12.04.1 LTS

With the following network settings:

/etc/network/interfaces
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet dhcp

# Network: instances
# Bonding
auto eth2
iface eth2 inet manual
    bond-master bond0
auto eth4
iface eth4 inet manual
    bond-master bond0

auto bond0
iface bond0 inet manual
    bond-mode 802.3ad
    bond-miimon 100
    bond-lacp-rate 1
    bond-slaves none

The RX dropped-packet counter increases very quickly:

bond0 Link encap:Ethernet HWaddr 00:10:18:e0:5e:a4
          inet6 addr: fe80::210:18ff:fee0:5ea4/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
          RX packets:5912 errors:0 dropped:3223 overruns:0 frame:0
          TX packets:110 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:585020 (585.0 KB) TX bytes:13804 (13.8 KB)
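
To put the counters above in proportion, the dropped count can be expressed as a percentage of the RX packets count. This is only a rough sketch: the helper name drop_pct is made up for illustration, and whether dropped frames are also included in "RX packets" depends on the driver, so treat the result as an indication rather than an exact loss rate.

```shell
# drop_pct DROPPED TOTAL
# Prints DROPPED as a percentage of TOTAL with one decimal place,
# or "n/a" when TOTAL is zero. (Illustrative helper, not a standard tool.)
drop_pct() {
    awk -v d="$1" -v t="$2" 'BEGIN {
        if (t == 0) print "n/a"
        else printf "%.1f\n", 100 * d / t
    }'
}
```

With the bond0 figures above, drop_pct 3223 5912 prints 54.5.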

All packages are up to date:

uname -a
Linux std006 3.2.0-29-generic #46-Ubuntu SMP Fri Jul 27 17:03:23 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

apt-cache policy ifenslave-2.6
ifenslave-2.6:
  Installed: 1.1.0-19ubuntu5
  Candidate: 1.1.0-19ubuntu5
  Version table:
 *** 1.1.0-19ubuntu5 0
        500 http://10.5.0.1/ubuntu/ precise/main amd64 Packages
        100 /var/lib/dpkg/status
---
ApportVersion: 2.0.1-0ubuntu12
Architecture: amd64
DistroRelease: Ubuntu 12.04
Package: bridge-utils 1.5-2ubuntu6
PackageArchitecture: amd64
ProcEnviron:
 LANGUAGE=en_US:en
 TERM=xterm-256color
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 3.2.0-29.46-generic 3.2.24
Tags: precise
Uname: Linux 3.2.0-29-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

Jean-Daniel Bussy (silversurfer972) wrote : Dependencies.txt

apport information

tags: added: apport-collected precise
description: updated
description: updated
James Page (james-page) wrote :

Thanks for taking the time to report this bug in Ubuntu.

Because you are using "bond-mode 802.3ad", the switch your server is connected to must also be configured for link aggregation (LACP). Can you confirm that this has been done? Switch configuration depends on the vendor, so unfortunately I can't give you an easy way to check.

Marking 'Incomplete' for the time being; please set back to 'New' if you believe the switch is correctly configured.

Changed in bridge-utils (Ubuntu):
status: New → Incomplete
James Page (james-page) wrote :

Also, could you please check whether RX packets are being dropped on both slave interfaces or just one of them? That might give us some pointers.
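
One way to gather the per-interface numbers is to read the kernel's counters from sysfs instead of parsing ifconfig output. The following is a minimal sketch, not an official tool: the function name bond_rx_dropped and its optional second argument (an alternate sysfs root, added only so the function can be tried against a test directory) are illustrative assumptions. The files bonding/slaves and statistics/rx_dropped are the standard sysfs entries.

```shell
# bond_rx_dropped [BOND] [SYSFS_NET_ROOT]
# Prints "<interface> <rx_dropped>" for the bond and then for each slave.
# BOND defaults to bond0; SYSFS_NET_ROOT defaults to /sys/class/net.
bond_rx_dropped() {
    bond="${1:-bond0}"
    root="${2:-/sys/class/net}"
    # bonding/slaves holds a space-separated list of enslaved interfaces.
    for ifc in "$bond" $(cat "$root/$bond/bonding/slaves"); do
        printf '%s %s\n' "$ifc" "$(cat "$root/$ifc/statistics/rx_dropped")"
    done
}
```

Running bond_rx_dropped bond0 prints one line per interface, the bond first, so the drop counts can be compared side by side.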

Tim Stewart (tim-j-stewart) wrote :

$ uname -a
Linux horton06c 3.2.0-29-generic #46-Ubuntu SMP Fri Jul 27 17:03:23 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 12.04 LTS
Release: 12.04
Codename: precise

I'm seeing this problem too. 60% of packets on the bonded NIC are being dropped.

Tim Stewart (tim-j-stewart) wrote :

Update: I'm running the bond in bond-mode 1 (active-backup).

Steve Boyle (h-sb-f) wrote :

I'm seeing the same thing. Yes, my switch is configured for 802.3ad. The bond comes up but consistently drops about 2 packets per second. The vast majority of the drops are on the bond0 interface; the underlying eth0 and eth1 interfaces each show far fewer drops than bond0.

bond0 Link encap:Ethernet HWaddr d4:ae:52:9b:7d:ef
          inet addr:10.10.2.152 Bcast:10.10.2.255 Mask:255.255.255.0
          UP BROADCAST RUNNING MASTER MULTICAST MTU:9000 Metric:1
          RX packets:1830498255 errors:9389 dropped:683402 overruns:0 frame:9371
          TX packets:1503988811 errors:0 dropped:89 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:12421268594512 (12.4 TB) TX bytes:6420158358964 (6.4 TB)

eth0 Link encap:Ethernet HWaddr d4:ae:52:9b:7d:ef
          UP BROADCAST RUNNING SLAVE MULTICAST MTU:9000 Metric:1
          RX packets:911558907 errors:4627 dropped:153 overruns:0 frame:4589
          TX packets:755225077 errors:0 dropped:59 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:6200006841270 (6.2 TB) TX bytes:3423624108536 (3.4 TB)
          Interrupt:35

eth1 Link encap:Ethernet HWaddr d4:ae:52:9b:7d:ef
          UP BROADCAST RUNNING SLAVE MULTICAST MTU:9000 Metric:1
          RX packets:918939348 errors:4762 dropped:159 overruns:0 frame:4782
          TX packets:748763734 errors:0 dropped:30 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:6221261753242 (6.2 TB) TX bytes:2996534250428 (2.9 TB)
          Interrupt:38

lo Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0
          UP LOOPBACK RUNNING MTU:16436 Metric:1
          RX packets:685523919 errors:0 dropped:0 overruns:0 frame:0
          TX packets:685523919 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:9961540573942 (9.9 TB) TX bytes:9961540573942 (9.9 TB)

Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: fast
Min links: 0
Aggregator selection policy (ad_select): stable
Active Aggregator Info:
        Aggregator ID: 1
        Number of ports: 2
        Actor Key: 17
        Partner Key: 17
        Partner Mac Address: 00:01:e8:d6:a9:4e

Slave Interface: eth1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: d4:ae:52:9b:7d:ef
Aggregator ID: 1
Slave queue ID: 0

Slave Interface: eth0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: d4:ae:52:9b:7d:ee
Aggregator ID: 1
Slave queue ID: 0

Jean-Daniel Bussy (silversurfer972) wrote :

The attached switch is correctly configured for 802.3ad.
Just like Steve Boyle, I consistently see around 2 dropped packets per second.
I have no drops on any of the underlying interfaces.
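
A figure such as "2 pps" can be measured by sampling the rx_dropped counter twice and dividing the difference by the interval. The sketch below assumes the standard sysfs counter file; the function name rx_drop_rate and its optional third argument (an alternate sysfs root for testing) are hypothetical.

```shell
# rx_drop_rate IFACE [SECONDS] [SYSFS_NET_ROOT]
# Prints the average number of RX drops per second on IFACE over SECONDS
# (default 10; must be greater than zero).
rx_drop_rate() {
    ifc="$1"
    secs="${2:-10}"
    root="${3:-/sys/class/net}"
    f="$root/$ifc/statistics/rx_dropped"
    before=$(cat "$f")
    sleep "$secs"
    after=$(cat "$f")
    awk -v a="$before" -v b="$after" -v s="$secs" \
        'BEGIN { printf "%.2f\n", (b - a) / s }'
}
```

For example, rx_drop_rate bond0 30 averages over 30 seconds; running it for bond0 and for each slave shows where the drops are being counted.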

Changed in bridge-utils (Ubuntu):
status: Incomplete → New
Philipp Wollermann (philwo) wrote :

This is just a display error, initially caused by this commit:
https://github.com/torvalds/linux/commit/3aba891d

It is fixed by this patch in Linux 3.4-rc7:
https://github.com/torvalds/linux/commit/b99215cdc6e191f5649687536d4fb0faa3d7f56e
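
As a rough first check, the running kernel version can be compared against 3.4. This is only an approximation and an assumption on my part about how to apply the information above: distribution kernels often backport individual fixes, so a 3.2-based Ubuntu kernel may or may not already carry the patch, and the changelog is the authoritative source. The function name kernel_at_least is made up for this sketch, and it relies on "sort -V" from GNU coreutils.

```shell
# kernel_at_least WANTED [VERSION]
# Succeeds if VERSION (default: output of `uname -r`) is at least WANTED.
kernel_at_least() {
    want="$1"
    have="${2:-$(uname -r)}"
    have="${have%%-*}"    # strip the distro suffix, e.g. "-29-generic"
    # With version sort, WANTED comes first only when it is <= VERSION.
    [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}
```

For the kernel in this report, kernel_at_least 3.4 3.2.0-29-generic fails, which matches the drop counter still being inflated there.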

James Page (james-page) wrote :

Adding task for linux as this appears to be kernel related.

Changed in bridge-utils (Ubuntu):
status: New → Confirmed
importance: Undecided → Medium
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1041070

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Stéphane Graber (stgraber) wrote :

Closing the bridge-utils task, as this bug is about bonding, not bridging, and it looks more like a kernel bug than a userspace bug.

Changed in bridge-utils (Ubuntu):
status: Confirmed → Invalid
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired
Kaizoku (neoark) wrote :

I am also seeing a lot of RX packet drops for some odd reason in Ubuntu 12.10.

Chris J Arges (arges) wrote :

This is related to the bonding mode and is _not_ a bug. The bonding module drops duplicate frames received on inactive ports, which is normal behavior. [0] Overall, packets should still reach the machine without problems because they are received on the active slave. To confirm this, do the following:

1) Check dropped packets on all interfaces. If eth0 and eth1 are enslaved to bond0, we may see dropped packets for bond0 and eth0 but not for eth1, depending on which interface is the active one. The active interface can be checked with:
cat /sys/class/net/bond0/bonding/active_slave

If the active slave isn't dropping packets and the inactive slave is, this is normal in 'active-backup' mode (or any mode where there is an inactive slave).

2) If we want neither interface to drop packets, we can use the 'all_slaves_active' bonding module parameter [0].
Check the current value:
cat /sys/class/net/bond0/bonding/all_slaves_active
It should default to 0, which means frames are dropped on the inactive slave.

If we set this to 1, we will no longer drop frames:
echo 1 | sudo tee /sys/class/net/bond0/bonding/all_slaves_active
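
The checks above can be gathered into one report. This is a sketch under assumptions rather than a supported tool: the function name bond_drop_report and the optional sysfs-root argument are hypothetical, and it assumes that active_slave reads as empty in modes without a single active slave (such as 802.3ad), in which case the role is shown as "n/a".

```shell
# bond_drop_report [BOND] [SYSFS_NET_ROOT]
# Prints the bonding mode, all_slaves_active, and each slave's role
# together with its rx_dropped counter.
bond_drop_report() {
    bond="${1:-bond0}"
    root="${2:-/sys/class/net}"
    b="$root/$bond/bonding"
    active=$(cat "$b/active_slave")
    echo "mode: $(cat "$b/mode")"
    echo "all_slaves_active: $(cat "$b/all_slaves_active")"
    for ifc in $(cat "$b/slaves"); do
        if [ -z "$active" ]; then
            role="n/a"          # no single active slave reported
        elif [ "$ifc" = "$active" ]; then
            role="active"
        else
            role="inactive"
        fi
        echo "$ifc ($role) rx_dropped: $(cat "$root/$ifc/statistics/rx_dropped")"
    done
}
```

If the drops show up only on the slave marked inactive and all_slaves_active is 0, that matches the expected behavior described above.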

Changed in linux (Ubuntu):
status: Expired → Invalid