bond slave interface sometimes does not come up on boot

Bug #996369 reported by Tom van Leeuwen
This bug report is a duplicate of: Bug #1160490: race condition updating statefile.
This bug affects 8 people
Affects          Status      Importance   Assigned to   Milestone
linux (Ubuntu)   Incomplete  High         khuanglim
Precise          Confirmed   High         Unassigned
Quantal          Won't Fix   High         Unassigned
Raring           Won't Fix   High         Unassigned
Saucy            Won't Fix   High         Unassigned
Trusty           Confirmed   High         Unassigned

Bug Description

bug report:
Hi guys,

I'm running Ubuntu 12.04 server on an HP DL380 G7.
server01 ~ # lsb_release -rd
Description: Ubuntu 12.04 LTS
Release: 12.04

I've got 2 Ethernet cards providing 4x 10G interfaces in total.
All 4 interfaces are in one bond, but only eth4 and eth6 are patched.

server01 ~ # ethtool -i eth4
driver: be2net
version: 4.0.100u
firmware-version: 4.0.360.15
bus-info: 0000:0e:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: no

server01 ~ # ethtool -i eth6
driver: be2net
version: 4.0.100u
firmware-version: 4.0.360.15
bus-info: 0000:15:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: no

After a reboot I expect to see a bond0 interface with 2 RUNNING SLAVE interfaces.

However, after some reboots only 1 interface comes up in the bond. When I manually bring the other interface up (ifconfig eth6 up), it works:

# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: slow
Min links: 0
Aggregator selection policy (ad_select): stable
Active Aggregator Info:
 Aggregator ID: 2
 Number of ports: 1
 Actor Key: 33
 Partner Key: 32773
 Partner Mac Address: 00:23:04:ee:be:01

Slave Interface: eth6
MII Status: down <<<<<<<<<<<< SHOULD BE UP
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: 00:9c:02:3c:c9:70
Aggregator ID: 1
Slave queue ID: 0

Slave Interface: eth4
MII Status: up <<<<<<<<<<<< ONLY INTERFACE THAT IS UP
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:9c:02:3c:99:98
Aggregator ID: 2
Slave queue ID: 0

Slave Interface: eth7
MII Status: down
Speed: Unknown
Duplex: Unknown
Link Failure Count: 0
Permanent HW addr: 00:9c:02:3c:c9:74
Aggregator ID: 3
Slave queue ID: 0

Slave Interface: eth5
MII Status: down
Speed: Unknown
Duplex: Unknown
Link Failure Count: 0
Permanent HW addr: 00:9c:02:3c:99:9c
Aggregator ID: 4
Slave queue ID: 0

server01 ~ # ifconfig
bond0 Link encap:Ethernet HWaddr 00:9c:02:3c:c9:70
          inet6 addr: fe80::29c:2ff:fe3c:c970/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
          RX packets:169071 errors:0 dropped:54 overruns:0 frame:0
          TX packets:1236 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:18435358 (18.4 MB) TX bytes:174727 (174.7 KB)

eth4 Link encap:Ethernet HWaddr 00:9c:02:3c:c9:70
          UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
          RX packets:169071 errors:0 dropped:45 overruns:0 frame:0
          TX packets:1235 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:18435358 (18.4 MB) TX bytes:174637 (174.6 KB)

eth5 Link encap:Ethernet HWaddr 00:9c:02:3c:c9:70
          UP BROADCAST SLAVE MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

eth7 Link encap:Ethernet HWaddr 00:9c:02:3c:c9:70
          UP BROADCAST SLAVE MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

lo Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING MTU:16436 Metric:1
          RX packets:1333 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1333 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:102784 (102.7 KB) TX bytes:102784 (102.7 KB)

vlan888 Link encap:Ethernet HWaddr 00:9c:02:3c:c9:70
          inet addr:1.1.0.50 Bcast:1.1.0.63 Mask:255.255.255.240
          inet6 addr: 2222:2222:ffff::11/124 Scope:Global
          inet6 addr: fe80::29c:2ff:fe3c:c970/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:168250 errors:0 dropped:0 overruns:0 frame:0
          TX packets:661 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:15995674 (15.9 MB) TX bytes:41938 (41.9 KB)

vlan889 Link encap:Ethernet HWaddr 00:9c:02:3c:c9:70
          inet addr:1.1.0.5 Bcast:1.1.0.15 Mask:255.255.255.240
          inet6 addr: 2222:2222:ffff::105/120 Scope:Global
          inet6 addr: fe80::29c:2ff:fe3c:c970/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:708 errors:0 dropped:0 overruns:0 frame:0
          TX packets:571 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:59088 (59.0 KB) TX bytes:126565 (126.5 KB)

server01 ~ # ifconfig eth6 up

Log entries in /var/log/syslog:
May 8 07:35:14 server01 kernel: [ 201.620795] 8021q: adding VLAN 0 to HW filter on device eth6
May 8 07:35:14 server01 kernel: [ 201.627183] bonding: bond0: link status definitely up for interface eth6, 10000 Mbps full duplex.

# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: slow
Min links: 0
Aggregator selection policy (ad_select): stable
Active Aggregator Info:
 Aggregator ID: 2
 Number of ports: 2
 Actor Key: 33
 Partner Key: 32773
 Partner Mac Address: 00:23:04:ee:be:01

Slave Interface: eth6
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: 00:9c:02:3c:c9:70
Aggregator ID: 2
Slave queue ID: 0

Slave Interface: eth4
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:9c:02:3c:99:98
Aggregator ID: 2
Slave queue ID: 0

Slave Interface: eth7
MII Status: down
Speed: Unknown
Duplex: Unknown
Link Failure Count: 0
Permanent HW addr: 00:9c:02:3c:c9:74
Aggregator ID: 3
Slave queue ID: 0

Slave Interface: eth5
MII Status: down
Speed: Unknown
Duplex: Unknown
Link Failure Count: 0
Permanent HW addr: 00:9c:02:3c:99:9c
Aggregator ID: 4
Slave queue ID: 0
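
The two /proc/net/bonding/bond0 dumps above differ mainly in the per-slave "MII Status" lines, which are easy to miss by eye. A minimal sketch (not part of the original report) that pulls them out with awk. The function names bond_slave_status and bond_up_count are made up for illustration, and the parsing assumes the v3.7.1 layout shown above:

```shell
# Print "<slave> <mii-status>" for each slave in bonding status text on stdin.
# The bond's own "MII Status" line is skipped because it appears before the
# first "Slave Interface" line.
bond_slave_status() {
    awk '
        /^Slave Interface:/ { slave = $3 }
        /^MII Status:/ && slave != "" { print slave, $3; slave = "" }
    '
}

# Count the slaves whose MII status is "up" (bonding status text on stdin).
bond_up_count() {
    bond_slave_status | awk '$2 == "up" { n++ } END { print n + 0 }'
}
```

Usage would be something like "bond_up_count < /proc/net/bonding/bond0", which should print 2 on this machine after a good boot, since only eth4 and eth6 are patched.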

Network configuration:
# cat /etc/network/interfaces
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth4 eth5 eth6 eth7
iface eth4 inet manual
  bond-master bond0
iface eth5 inet manual
  bond-master bond0
iface eth6 inet manual
  bond-master bond0
iface eth7 inet manual
  bond-master bond0

auto bond0
iface bond0 inet static
  bond-slaves none
  bond-mode 802.3ad
  bond-miimon 100
  address 0.0.0.0
  netmask 0.0.0.0
  ip-proxy-arp 0

auto vlan888
iface vlan888 inet static
  vlan_raw_device bond0
  address 1.1.0.50
  netmask 255.255.255.240
  ip-proxy-arp 0
  post-up sysctl -w net.ipv4.conf.${IFACE}.forwarding=1
  post-up ip route add 192.168.1.0/24 via 1.1.0.53
  post-up ip route add 1.1.0.32/28 via 1.1.0.53
  post-up ip route add 0.0.0.0/0 via 1.1.0.53

iface vlan888 inet6 static
  vlan_raw_device bond0
  address 2222:2222:ffff::11
  netmask 124
  post-up sysctl -w net.ipv6.conf.${IFACE}.forwarding=1
  post-up ip -6 route add 2222:2222:ffff::/124 via 2222:2222:ffff::14
  post-up ip -6 route add ::/0 via 2222:2222:ffff::14

auto vlan889
iface vlan889 inet static
  vlan_raw_device bond0
  address 1.1.0.5
  netmask 255.255.255.240
  ip-proxy-arp 0
  post-up sysctl -w net.ipv4.conf.${IFACE}.forwarding=1
  post-up ip route add 10.0.0.0/8 via 1.1.0.1
  post-up ip route add 192.168.0.0/12 via 1.1.0.1
  post-up ip route add 1.1.0.0/20 via 1.1.0.1
  post-up ip route add 2.2.0.0/24 via 1.1.0.1
  post-up ip route add 3.3.0.0/24 via 1.1.0.1

iface vlan889 inet6 static
  vlan_raw_device bond0
  address 2222:2222:ffff::105
  netmask 120
  post-up sysctl -w net.ipv6.conf.${IFACE}.forwarding=1
  post-up ip -6 route add 2222:2222::/32 via 2222:2222:ffff::101
---
AlsaDevices:
 total 0
 crw-rw---T 1 root audio 116, 1 May 8 07:31 seq
 crw-rw---T 1 root audio 116, 33 May 8 07:31 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.0.1-0ubuntu7
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory
DistroRelease: Ubuntu 12.04
HibernationDevice: RESUME=UUID=3a88e785-42be-4d09-a3d2-6509e148b49a
InstallationMedia: Ubuntu-Server 12.04 LTS "Precise Pangolin" - Beta amd64 (20120327)
MachineType: HP ProLiant DL380 G7
Package: linux (not installed)
PciMultimedia:

ProcFB: 0 radeondrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-24-generic root=UUID=db7e17c0-5861-4074-9467-cffce02483c0 ro
ProcVersionSignature: Ubuntu 3.2.0-24.37-generic 3.2.14
RelatedPackageVersions:
 linux-restricted-modules-3.2.0-24-generic N/A
 linux-backports-modules-3.2.0-24-generic N/A
 linux-firmware 1.79
RfKill: Error: [Errno 2] No such file or directory
Tags: precise
Uname: Linux 3.2.0-24-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

dmi.bios.date: 05/05/2011
dmi.bios.vendor: HP
dmi.bios.version: P67
dmi.chassis.type: 23
dmi.chassis.vendor: HP
dmi.modalias: dmi:bvnHP:bvrP67:bd05/05/2011:svnHP:pnProLiantDL380G7:pvr:cvnHP:ct23:cvr:
dmi.product.name: ProLiant DL380 G7
dmi.sys.vendor: HP

Ubuntu Foundations Team Bug Bot (crichton) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Freenode.

To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/996369/+editstatus and add the package name in the text box next to the word Package.

[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]

tags: added: bot-comment
affects: ubuntu → linux (Ubuntu)
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 996369

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: precise
Tom van Leeuwen (tom-vleeuwen) wrote : AcpiTables.txt

apport information

tags: added: apport-collected
description: updated
Tom van Leeuwen (tom-vleeuwen) wrote : BootDmesg.txt

apport information

Tom van Leeuwen (tom-vleeuwen) wrote : CurrentDmesg.txt

apport information

Tom van Leeuwen (tom-vleeuwen) wrote : IwConfig.txt

apport information

Tom van Leeuwen (tom-vleeuwen) wrote : Lspci.txt

apport information

Tom van Leeuwen (tom-vleeuwen) wrote : Lsusb.txt

apport information

Tom van Leeuwen (tom-vleeuwen) wrote : ProcCpuinfo.txt

apport information

Tom van Leeuwen (tom-vleeuwen) wrote : ProcEnviron.txt

apport information

Tom van Leeuwen (tom-vleeuwen) wrote : ProcInterrupts.txt

apport information

Tom van Leeuwen (tom-vleeuwen) wrote : ProcModules.txt

apport information

Tom van Leeuwen (tom-vleeuwen) wrote : UdevDb.txt

apport information

Tom van Leeuwen (tom-vleeuwen) wrote : UdevLog.txt

apport information

Tom van Leeuwen (tom-vleeuwen) wrote : WifiSyslog.txt

apport information

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Do you know if this issue happened in a previous version of Ubuntu, or is this a new issue?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.4 kernel [1] (not a kernel in the daily directory). Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag (only that one tag; please leave the other tags). This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text.

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[1] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.4-rc7-precise/

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
tags: added: needs-upstream-testing
annunaki2k2 (russell-knighton) wrote :

Just thought I would add a comment that I am also seeing this behaviour. It might be related to having a "post-up" command attached to the Ethernet configuration. Here is my interfaces file:
xfers ~ # cat /etc/network/interfaces
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto bond0
iface bond0 inet static
 address 172.16.1.10
 netmask 255.255.255.0
 broadcast 172.16.1.255
 network 172.16.1.0
 gateway 172.16.1.1
 dns-nameservers 10.0.0.120 10.0.1.120
 dns-search mps.lan wilts.mps.lan
 dns-domain mps.lan
 bond-slaves none
 bond_mode 802.3ad
 bond_miimon 40
 bond_lacp_rate 1
 bond_use_carrier 1
 post-up /usr/local/sbin/check-bond.sh $IFACE
 pre-down /usr/local/sbin/check-bond.sh stop $IFACE
 ## ftp.mps.lan - for internal access
 up ip addr add 172.16.1.20/24 dev $IFACE

 ## ftp-sohonet.mps.lan - for FTP/Aspera Connect over Sohonet
 up ip addr add 172.16.1.21/24 dev $IFACE
 ## ftp-abovenet-7a.mps.lan - for FTP/Aspera Connect over the Abovenet 7A link
 up ip addr add 172.16.1.22/24 dev $IFACE
 ## ftp-abovenet-7a.mps.lan - for FTP/Aspera Connect over the Abovenet 7B link
 up ip addr add 172.16.1.23/24 dev $IFACE

 ## faspex-sohonet.mps.lan - for FASPEX over Sohonet
 up ip addr add 172.16.1.24/24 dev $IFACE
 ## faspex-sohonet.mps.lan - for FASPEX over the Abovenet 7A link
 up ip addr add 172.16.1.25/24 dev $IFACE
 ## faspex-sohonet.mps.lan - for FASPEX over the Abovenet 7B link
 up ip addr add 172.16.1.26/24 dev $IFACE

# Slave Definition for bond0
auto eth2
iface eth2 inet manual
 bond-master bond0
 bond-primary eth2 eth3
auto eth3
iface eth3 inet manual
 bond-master bond0
 bond-primary eth2 eth3

If I comment out my post-up line, I appear to get a reliable network interface brought up at boot time, however, with the post-up line enabled, I often see one of the slaves fail on boot.

Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired
Daniel Lynes (dblynes) wrote :

I just thought I'd point out that this problem occurs in 10.04.4 as well.

What happens is not that the other interfaces don't come up, but that they take an extended period of time to come up (sometimes as much as 5 or 6 minutes for the :0 and :1 interfaces).

Daniel Lynes (dblynes) wrote :

I will be installing the latest mainline kernel on one of the machines where we're having this problem this afternoon to see if it fixes the problem. (We're having this problem on at least 7 machines, currently.)

If the latest mainline kernel fixes the problem, I will try the latest stable branch mainline kernel to see if that also fixes the problem.

I'd like to avoid upgrading udev if at all possible just because it complicates things a bit more.

Daniel Lynes (dblynes) wrote :

FWIW, all of the machines where we're running into this problem are running the driver code as downloaded from Intel.com for both the Intel 1GbE drivers and the 10GbE drivers. The bonded interfaces are on a dual port 10GbE Intel adapter.

The machines are not connected to the Internet, so I cannot run apport-collect on them; as I understand it, the tool requires an Internet connection.

If there's any additional information anyone needs, please feel free to ask and I will try to accommodate your request.

Daniel Lynes (dblynes) wrote :

Btw, this also affects the package ifenslave-2.6

Daniel Lynes (dblynes)
Changed in linux (Ubuntu):
status: Expired → New
Brad Figg (brad-figg)
Changed in linux (Ubuntu):
status: New → Confirmed
Daniel Lynes (dblynes) wrote :

Upgrading to kernel 3.7.0-999.201210220405_amd64 doesn't fix anything. The extreme startup delay of the bonded interfaces is still there.

Daniel Lynes (dblynes) wrote :

Fwiw, our interfaces file is as follows:

auto lo
iface lo inet loopback

auto bond0 bond0:0 bond0:1
auto gb0

# Set up BONDed ports (link aggregation)
iface bond0 inet static
  address 10.0.0.49
  gateway 10.0.0.5
  netmask 255.255.0.0
  bond-slaves c10gb0 c10gb1
# LACP configuration
  bond_mode 802.3ad
  bond_miimon 100
  bond_lacp_rate 1

iface bond0:0 inet static
  address 10.0.0.100
  netmask 255.255.0.0

iface bond0:1 inet static
  address 10.0.0.128
  netmask 255.255.0.0

iface gb0 inet static
  address 10.1.0.128
  netmask 255.255.0.0

Daniel Lynes (dblynes) wrote :

Also, when it's in this state, the module 'bonding' is loaded, and when I issue the command 'sudo ifenslave bond0 c10gb0 c10gb1', it reports that c10gb0 and c10gb1 are already slaves.

Daniel Lynes (dblynes) wrote :

If I wait long enough (2 minutes to 6 minutes), the interfaces will eventually come up. If I issue a sudo /etc/init.d/networking restart command, they'll come up as soon as the command has finished executing.

So, it seems maybe there's something getting locked up during startup, because of a race condition or something. However, I'm not seeing any evidence of any of the networking scripts hanging in the output of 'sudo ps auxffww'.
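
To put a number on the delay described above, one option is to poll until the alias address shows up and report how long that took. This is only a sketch: wait_until is a hypothetical helper, not something from ifupdown or ifenslave, and the one-second polling interval is an arbitrary choice.

```shell
# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND once per second until it succeeds.
# On success: prints the number of seconds waited and returns 0.
# On timeout: prints nothing and returns 1.
wait_until() {
    local timeout=$1 waited=0
    shift
    until "$@"; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "$waited"
}
```

For the configuration in this thread it could be called as: wait_until 600 sh -c 'ip -4 -o addr show dev bond0 | grep -q "inet 10.0.0.100/"'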

Daniel Lynes (dblynes) wrote :

Also, fwiw, the mac addresses of c10gb0 and c10gb1 have been changed to reflect the mac address of bond0. So, there's definitely some bonding of some sort happening. It's just not being reflected in bond0:0 and bond0:1 showing up.

Daniel Lynes (dblynes) wrote :

Is there anything else I need to advance the status of this ticket?

Daniel Lynes (dblynes) wrote :

Problem solved.

It appears the way in which ifenslave is invoked has changed, and most of the documentation on the net on how to configure it doesn't work anymore.

Step by step instructions for getting it working are at the URL, http://www.ubuntugeek.com/how-to-setup-bond-or-team-network-cards-in-ubuntu-10-1010-04.html

My network interfaces are called c10gb0 and c10gb1 respectively. My bonding interface is bond0.

My new /etc/network/interfaces file is as follows:
auto lo
iface lo inet loopback

auto bond0 bond0:0 bond0:1
auto gb0

# Set up BONDed ports (link aggregation)
iface bond0 inet static
  address 10.0.0.49
  gateway 10.0.0.5
  netmask 255.255.0.0
  bond_mode 802.3ad
  bond_miimon 100
  bond_lacp_rate 1
  up /usr/local/bin/enslave bond0 c10gb0 c10gb1
  down /usr/local/bin/enslave -d bond0 c10gb0 c10gb1

iface bond0:0 inet static
  address 10.0.0.100
  netmask 255.255.0.0

iface bond0:1 inet static
  address 10.0.0.128
  netmask 255.255.0.0

My /etc/modprobe.d/aliases.conf is the same as the example at ubuntugeek.com:
alias bond0 bonding
options bonding mode=0 miimon=100 downdelay=200 updelay=200

And my /usr/local/bin/enslave script is as follows:
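#!/bin/bash
# Wrapper around ifenslave that also manages the bond0:0 and bond0:1 aliases.
if [[ "$1" = "-d" ]]; then
  # Detaching: take the alias interfaces down first.
  /sbin/ifconfig bond0:1 down
  /sbin/ifconfig bond0:0 down
else
  # Attaching: assign the alias addresses.
  /sbin/ifconfig bond0:0 10.0.0.100 netmask 255.255.0.0
  /sbin/ifconfig bond0:1 10.0.0.128 netmask 255.255.0.0
fi
# Pass all arguments through unchanged (quoted to preserve word boundaries).
/sbin/ifenslave "$@"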

After everything is up, I have bond0 (10.0.0.49), bond0:0 (10.0.0.100) and bond0:1 (10.0.0.128)

Daniel Lynes (dblynes) wrote :

Have had to make a few more adjustments. I realized the above still wasn't putting it into aggregated mode (cat /proc/net/bonding/bond0), and one of the links that was down was showing as being up.

Please find the new updates attached, and ignore all my other previous files.

Daniel Lynes (dblynes) wrote :

Updated interfaces file

Daniel Lynes (dblynes) wrote :

Updated /etc/modules file

This file is required so that the bonding driver gets preloaded, before the networking subsystem starts up, and to make sure that the bonding strategy chosen is actually applied.

Daniel Lynes (dblynes) wrote :

And, finally if you're using jumbo frames, you'll want to apply jumbo frames for ipv6 as well (not sure if this is the recommended method, or not), but here's my /etc/sysctl.conf as well.

Kaizoku (neoark) wrote :

I am also experiencing issues on reboot. Secondary IPs won't come up even though I can see them in ifconfig. I have to run "service networking restart" after a reboot to get things working.

Chris J Arges (arges)
tags: added: kernel-da-key
Chris J Arges (arges) wrote :

I've been able to reproduce a similar issue.
To test:
1) Create a VM with the latest image, and add 12 network interfaces.
2) Use the attached interfaces.lp996369
3) Add something like this to rc.local to reproduce the issue:
sleep 60
# Count the ifconfig lines that mention "eth"; 12 means every interface came up.
if [ "$(ifconfig | grep -c eth)" = 12 ]; then
 echo "Everything is A-ok!"
 reboot
else
 echo "We have a problem."
fi
4) Once the machine stops rebooting, you will find that one of the interfaces did not come up.
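
The rc.local loop above stops at the first bad boot but does not record how many good boots came before it. A minimal sketch of a counter that could sit in the same script. check_and_count and the counter path are hypothetical names, not part of the original test:

```shell
# check_and_count EXPECTED ACTUAL COUNTER_FILE
# Returns 0 and increments the number stored in COUNTER_FILE when ACTUAL
# equals EXPECTED; returns 1 and leaves the file untouched otherwise.
check_and_count() {
    local expected=$1 actual=$2 counter_file=$3 count=0
    [ "$actual" = "$expected" ] || return 1
    if [ -r "$counter_file" ]; then
        count=$(cat "$counter_file")
    fi
    # An empty or missing counter is treated as 0.
    echo $((count + 1)) > "$counter_file"
}

# Possible use in rc.local (assumed path, adjust as needed):
#   sleep 60
#   if check_and_count 12 "$(ifconfig | grep -c eth)" /var/tmp/good-boots; then
#       reboot
#   else
#       echo "We have a problem."
#   fi
```

After the loop stops, the counter file holds the number of consecutive good boots before the failure.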

Chris J Arges (arges)
tags: added: kernel-key
removed: kernel-da-key
Changed in linux (Ubuntu):
importance: Medium → High
Changed in linux (Ubuntu Raring):
status: New → Confirmed
Changed in linux (Ubuntu Quantal):
status: New → Confirmed
Changed in linux (Ubuntu Precise):
status: New → Confirmed
Changed in linux (Ubuntu Raring):
importance: Undecided → High
Changed in linux (Ubuntu Quantal):
importance: Undecided → High
Changed in linux (Ubuntu Precise):
importance: Undecided → High
tags: added: quantal raring saucy
tags: removed: kernel-key
Chris J Arges (arges) wrote :

I believe this issue is related to bug 1160490.
I have posted test ifupdown packages there, based on an upstream patch, that may eliminate a race condition. Please give them a test and provide feedback. This solved my issue in #35.

penalvch (penalvch)
tags: added: bios-outdated-2013.07.02
penalvch (penalvch)
tags: added: regression-potential
Joseph Salisbury (jsalisbury) wrote : Please test latest development kernel (3.11.0-7.14)

Given the number of bugs that the Kernel Team receives during any development cycle it is impossible for us to review them all. Therefore, we occasionally resort to using automated bots to request further testing. This is such a request.

We are approaching release and would like to confirm if this bug is still present. Please test again with the latest development kernel and indicate in the bug if this issue still exists or not.

You can update to the latest development kernel by simply running the following commands in a terminal window:

    sudo apt-get update
    sudo apt-get dist-upgrade

If the bug still exists, change the bug status from Incomplete to Confirmed. If the bug no longer exists, change the bug status from Incomplete to Fix Released.

Thank you for your help, we really do appreciate it.

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
tags: added: kernel-request-3.11.0-7.14
Daniel Lynes (dblynes) wrote :

Unfortunately I cannot test for this bug any longer. I no longer have access to a switch with aggregation support.

Metin de Vreugd (mdevreugd) wrote :

We can confirm similar issues with Precise. However, testing a default Precise 12.04.3 installation with any generic Quantal or Raring LTS kernel seems to work fine.

Tests performed:
Continuous reboot loop using a script similar to the one mentioned by Chris J Arges. Counted the number of successful boots (working bond0 with both slaves up) with different kernels.

Symptoms after boot:
- Bond0 down with no slaves
- Bond0 comes up with only 1 slave
- Bond0 comes up with 2 slaves with 1 interface marked down
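
The three symptoms listed above can be told apart mechanically from the bonding status text. A rough sketch follows; classify_bond and its output labels are invented for illustration, and it assumes the /proc/net/bonding layout shown earlier in this report:

```shell
# classify_bond EXPECTED_SLAVES   (bonding status text on stdin)
# Prints one of:
#   no-slaves      no "Slave Interface" section at all (or empty input)
#   missing-slave  fewer slaves enslaved than expected
#   slave-down     all slaves enslaved, but at least one has MII status != up
#   ok             all expected slaves present and up
classify_bond() {
    awk -v expected="$1" '
        /^Slave Interface:/ { total++; in_slave = 1 }
        /^MII Status:/ && in_slave { if ($3 == "up") up++; in_slave = 0 }
        END {
            if (total + 0 == 0)                print "no-slaves"
            else if (total + 0 < expected + 0) print "missing-slave"
            else if (up + 0 < total + 0)       print "slave-down"
            else                               print "ok"
        }'
}
```

One way to call it from a boot test is: cat /proc/net/bonding/bond0 2>/dev/null | classify_bond 2. If bond0 does not exist, the input is empty and the result is "no-slaves".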

Hardware:
Dell M610-II
01:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709S Gigabit Ethernet (rev 20)
01:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709S Gigabit Ethernet (rev 20)

Network configuration:
root@test:~# cat /etc/network/interfaces
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet manual
  bond-master bond0
  bond-primary eth0 eth1

auto eth1
iface eth1 inet manual
  bond-master bond0
  bond-primary eth0 eth1

auto bond0
iface bond0 inet static
  address 10.1.1.2
  netmask 255.255.255.0
  broadcast 10.1.1.255
  gateway 10.1.1.1
  bond_arp_validate 3
  bond_mode active-backup
  bond_arp_interval 200
  bond_arp_ip_target 10.1.1.1
  bond_slaves none

root@test:~# cat /etc/modprobe.d/bonding.conf
alias bond0 bonding
options bonding mode=1 arp_interval=200 arp_ip_target=10.1.1.1

Test results:
Ubuntu 12.04.3 LTS with linux-image-3.2.0-54-generic (3.2.0-54.82) - Failed several boots with the listed symptoms
Ubuntu 12.04.3 LTS with linux-image-3.5.0-41-generic (3.5.0-41.64~precise1) - 300+ boots
Ubuntu 12.04.3 LTS with linux-image-3.8.0-31-generic (3.8.0-31.46~precise1) - 300+ boots

Alexander List (alexlist) wrote :

I can reproduce the delay problem on Trusty.

The machine is pingable after around 8s, which means that bonding, VLANs and bridges all work.

However, startup is delayed by ~120s for unknown reasons. After that, startup resumes and all services come up.

Alexander List (alexlist) wrote :
Changed in linux (Ubuntu Trusty):
status: Incomplete → New
Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Joseph Salisbury (jsalisbury) wrote : Closing unsupported series nomination.

This bug was nominated against a series that is no longer supported, ie saucy. The bug task representing the saucy nomination is being closed as Won't Fix.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu Saucy):
status: Incomplete → Won't Fix
Joseph Salisbury (jsalisbury) wrote :

This bug was nominated against a series that is no longer supported, ie raring. The bug task representing the raring nomination is being closed as Won't Fix.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu Raring):
status: Confirmed → Won't Fix
Joseph Salisbury (jsalisbury) wrote :

This bug was nominated against a series that is no longer supported, ie quantal. The bug task representing the quantal nomination is being closed as Won't Fix.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu Quantal):
status: Confirmed → Won't Fix
khuanglim (khuang0411)
Changed in linux (Ubuntu):
assignee: nobody → khuanglim (khuang0411)
status: Confirmed → Incomplete
Bryan Quigley (bryanquigley) wrote :

@alexlist You can check the upstart logs for more on the cause of that delay. The key issue described in this bug (interfaces not coming up at all) does appear to be fixed.
