ifup does not work as documented with bonding interfaces

Bug #1015199 reported by Tyanko Aleksiev on 2012-06-19
This bug affects 16 people
Affects: ifenslave-2.6 (Ubuntu)
Importance: Undecided
Assigned to: Unassigned

Bug Description

The file /usr/share/doc/ifenslave-2.6/README.Debian.gz states:

"Using ifup on a master interface will call ifup for all slaves that are
flagged with allow-bondX. (bondX being replaced by the master interface
name). This will allow for extra setup for special slave interfaces."

However I'm using this /etc/network/interfaces file:

# The bound interface
auto eth3
allow-bond0 eth3
iface eth3 inet manual
  bond-master bond0
  bond-primary eth3 eth4 eth5

auto eth4
allow-bond0 eth4
iface eth4 inet manual
  bond-master bond0
  bond-primary eth3 eth4 eth5

auto eth5
allow-bond0 eth5
iface eth5 inet manual
  bond-master bond0
  bond-primary eth3 eth4 eth5

auto bond0
iface bond0 inet static
  address 10.1.1.125
  network 10.1.0.0
  netmask 255.255.0.0
  broadcast 10.1.255.255
  bond-mode balance-rr
  bond-slaves eth3 eth4 eth5
  bond-miimon 100
  bond-downdelay 200
  bond-updelay 200

and issuing the 'ifup bond0' command I get:

root@:~# ifup bond0
Waiting for a slave to join bond0 (will timeout after 60s)
No slave joined bond0, continuing anyway
ssh stop/waiting
ssh start/running, process 9610

As you can see no slave interface has been started.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in ifenslave-2.6 (Ubuntu):
status: New → Confirmed
Stéphane Graber (stgraber) wrote :

set bond-slaves to none and drop bond-primary, everything will just work then. Your config currently contains circular dependencies that will make ifup hang.

Changed in ifenslave-2.6 (Ubuntu):
status: Confirmed → Invalid
Hendrik Volkmer (hvolkmer) wrote :

I got this working using Stéphane's suggestion and the config mentioned above. However, when I try to set up the interfaces manually, it only works if I bring up the slave interfaces using "ifup eth3; ifup eth4; ifup eth5". Using "ifup bond0" prints "Waiting for a slave to join bond0 (will timeout after 60s)" and then it times out.

Either way, I think the documentation could use some clarification. Several different approaches to configuring bonding in Ubuntu are documented. I tried them and none of them worked out of the box. For example, compare https://help.ubuntu.com/community/UbuntuBonding with the information in /usr/share/doc/ifenslave-2.6/README.Debian.gz

I have the same problem as the original submitter. Setting bond-slaves to none and removing bond-primary still doesn't allow "ifup bond0" to bring up the slave interfaces; it just hangs. I have to run "ifup -a" for it to work. While it all works fine at boot time, reconfiguring the bonding interface with "ifdown bond0; ifup bond0" just leaves it in an unusable state.

Changed in ifenslave-2.6 (Ubuntu):
status: Invalid → Confirmed

Also, removing the "auto" stanzas from the ethX interfaces prevents bonding from running at boot, so it appears that bonding works at boot only because of an "ifup -a" that is done at the end to "fix" the failures of the event-based network configuration.

Stéphane Graber (stgraber) wrote :

It's not a bug. "ifup bond0" will configure the bond and wait until the first slave joins to return, so it'll indeed "hang" but that's the wanted behaviour.

If you need to reconfigure your bond, you need to call "ifdown" on all bond members, then on the bond interface, change your config and then do "ifup" on all the bond members and then on the bond interface.
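
The reconfiguration order described above can be sketched as a pair of shell functions. This is a minimal sketch, not something shipped by ifenslave: the names bond_down and bond_up and the RUN variable are hypothetical, and only the ifup and ifdown commands come from this thread. Setting RUN=echo prints the commands instead of running them.

```shell
#!/bin/sh
# Hypothetical helpers; RUN is an assumed dry-run hook (RUN=echo prints only).
RUN=${RUN:-}

# bond_down MASTER SLAVE...: ifdown every bond member, then the bond itself.
bond_down() {
    master=$1
    shift
    for slave in "$@"; do
        $RUN ifdown "$slave"
    done
    $RUN ifdown "$master"
}

# bond_up MASTER SLAVE...: ifup every bond member, then the bond itself.
bond_up() {
    master=$1
    shift
    for slave in "$@"; do
        $RUN ifup "$slave"
    done
    $RUN ifup "$master"
}
```

With the configuration from this report, that would mean running "bond_down bond0 eth3 eth4 eth5" as root, editing /etc/network/interfaces, and then running "bond_up bond0 eth3 eth4 eth5". Whether this order avoids the 60 second wait on a given release is not confirmed here.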

Removing the auto statement from all the bond members will also make ifup "hang" or rather wait for a minute and give up as the bond can't be considered as up until it has at least one member (otherwise, it doesn't have a MAC address which confuses a lot of other software).
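
To see whether any member has actually joined the bond, the kernel bonding driver lists the current slaves in sysfs. The following is a small sketch assuming the standard /sys/class/net/<bond>/bonding/slaves file; the function names and the SYSFS_NET override are hypothetical and exist only to make the example easy to try against a test directory.

```shell
#!/bin/sh
# SYSFS_NET is an assumed override for testing; the default is the real sysfs path.
SYSFS_NET=${SYSFS_NET:-/sys/class/net}

# bond_slaves BOND: print the space-separated list of enslaved interfaces.
bond_slaves() {
    f="$SYSFS_NET/$1/bonding/slaves"
    [ -r "$f" ] || return 1
    cat "$f"
}

# bond_has_slave BOND: succeed only if at least one interface has joined.
bond_has_slave() {
    slaves=$(bond_slaves "$1") || return 1
    [ -n "$slaves" ]
}
```

For example, "bond_has_slave bond0 && bond_slaves bond0" prints the members only once at least one has joined.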

Changed in ifenslave-2.6 (Ubuntu):
status: Confirmed → Invalid

On Tue, Nov 27, 2012 at 3:41 PM, Stéphane Graber <email address hidden> wrote:
> It's not a bug. "ifup bond0" will configure the bond and wait until the
> first slave joins to return, so it'll indeed "hang" but that's the
> wanted behaviour.

Errr.... this is not what the documentation says
(/usr/share/doc/ifenslave-2.6/README.Debian.gz):

"""
Using ifup on a master interface will call ifup for all slaves that are
flagged with allow-bondX. (bondX being replaced by the master interface
name).
"""

Instead, as you say the master interface just "hangs" and the slaves
are never ifup'ed. So, either the documentation is incorrect, or ifup
does not work as documented.

> If you need to reconfigure your bond, you need to call "ifdown" on all
> bond members, then on the bond interface, change your config and then do
> "ifup" on all the bond members and then on the bond interface.

Again, this is not what the documentation says
(/usr/share/doc/ifenslave-2.6/README.Debian.gz):

"""
Using ifdown on a master interface will cause all slaves to be freed and
disabled. The master is also cleaned-up to ensure reliable results if the
master is brought up later.
"""

I'm not asking for `ifup` to be changed, but the docs should be consistent :-)

Thanks for the information. It appears that the allow-bondX stanza is next to useless then?

Stéphane Graber (stgraber) wrote :

Hmm, yeah, that README appears to be pretty out of date, I'll update it next time I upload ifenslave-2.6.

seph (seph) wrote :

I'm on 12.01, and I'm confused. It sounds like I should be following https://help.ubuntu.com/community/UbuntuBonding so I currently have:

> auto bond0
> iface bond0 inet static
>   address 10.2.6.16
>   bond-miimon 100
>   bond-slaves none
>   broadcast 10.2.6.31
>   gateway 10.2.6.1
>   netmask 255.255.255.224
>   network 10.2.6.0
> auto eth0
> iface eth0 inet manual
>   bond-master bond0
> auto eth1
> iface eth1 inet manual
>   bond-master bond0

But that results in a 60s hang. I think because bond0 is started before eth0 or eth1. Is there a way to not have that hang?

seph (seph) wrote :

Playing around some more, it looks like ordering matters: if I put the eth blocks above the bond0 block, then I don't get that hang. But this feels kind of fragile.

Rubia Ramos (donabuba) wrote :

In my case, I'm getting this:

root@salmonete:/etc/network# ifup bond0
Missing required variable: address
Missing required configuration variables for interface bond0/inet.
Failed to bring up bond0.

My network/interfaces:

root@salmonete:/etc/network# cat interfaces
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet manual
  bond-master bond0
  bond-primary eth0 eth1

auto eth1
iface eth1 inet manual
  bond-master bond0
  bond-primary eth0 eth1

auto bond0
iface bond0 inet static
  address 150.165.85.242
  network 150.165.85.0
  netmask 255.255.255.0
  gateway 150.165.85.100
  broadcast 150.165.85.255
  bond-mode balance-xor
  bond-slaves none
  bond-miimon 100
  bond-downdelay 200
  bond-updelay 200

This bug is confirmed and not invalid. I'm marking it as a duplicate of the other public bug I'm working on, which covers several ifupdown race conditions.

Changed in ifenslave-2.6 (Ubuntu):
status: Invalid → Confirmed
assignee: nobody → Rafael David Tinoco (inaddy)
Changed in ifenslave-2.6 (Ubuntu):
status: Confirmed → Fix Released
Ewen McNeill (ewen) wrote :

Even with the apparently fixed version, this "does not work after ifdown bond0/ifup bond0 cycle" appears to persist:

-=- cut here -=-
ewen@nas06:~$ cat /etc/issue
Ubuntu 14.04.4 LTS \n \l

ewen@nas06:~$ dpkg -l | grep ifenslave
ii ifenslave 2.4ubuntu1.2 all configure network interfaces for parallel routing (bonding)
ewen@nas06:~$
-=- cut here -=-

-=- cut here -=-
root@nas06:~# ifdown bond0
em1=em1
em2=em2
root@nas06:~# ifup bond0
Waiting for a slave to join bond0 (will timeout after 60s)
No slave joined bond0, continuing anyway
root@nas06:~# ifup em1
root@nas06:~# ifup em2
root@nas06:~#
-=- cut here -=-

After "ifdown bond0" the link does not work properly again until "ifup bond0" is done, followed by "ifup" on the individual interfaces. This appears to be because "ifdown bond0" disables much more than "ifup bond0" is willing to enable. In particular, it appears that "ifup bond0" does not make any attempt to start the slave interfaces; it seems to rely solely on them being auto-started, which happens only when they are discovered at boot.

The only way I seem to be able to get semi-sane behaviour is to add:

-=- cut here -=-
        pre-up (sleep 2 && ifup em1) &
        pre-up (sleep 2 && ifup em2) &
-=- cut here -=-

to the bond0 interface stanza. It has to:
(a) be pre-up, because post-up is called only after the 60 second up delay; and
(b) be delayed, because bond0 won't exist without a modprobe.d alias until "ifup bond0" has mostly completed.

Which feels very fragile.
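
One way to make the delay less arbitrary than a fixed "sleep 2" might be to poll until the bond device exists. The following is only a sketch under that assumption: the function wait_for_iface and the SYSFS_NET override are hypothetical, and it has not been tested inside a pre-up hook.

```shell
#!/bin/sh
# SYSFS_NET is an assumed override for testing; the default is the real sysfs path.
SYSFS_NET=${SYSFS_NET:-/sys/class/net}

# wait_for_iface NAME [SECONDS]: poll once per second until the interface
# appears under sysfs; fail if it has not appeared when the budget runs out.
wait_for_iface() {
    name=$1
    remaining=${2:-10}
    while [ ! -e "$SYSFS_NET/$name" ]; do
        [ "$remaining" -gt 0 ] || return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}
```

A backgrounded hook could then call something like "wait_for_iface bond0 10 && ifup em1" in place of the fixed sleep, provided the function is available to the shell that runs the hook.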

Surely the network scripts for ifenslave could iterate over bond-slaves and do the equivalent of "ifup" on those slave interfaces (or just enslave them directly)? Or have some other way to indicate the interfaces to auto-start. Given that the scripts are already auto-stopping them.
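
As an illustration of that idea, the slaves tagged with an allow-bondX line can be collected from the interfaces file and brought up one by one. This is a sketch, not the behaviour of the ifenslave scripts: the function names and the RUN dry-run variable are hypothetical, and the parser ignores any "source" directives in the file.

```shell
#!/bin/sh
# allow_class_ifaces CLASS [FILE]: print every interface named on an
# "allow-CLASS" line, one per line.
allow_class_ifaces() {
    class=$1
    file=${2:-/etc/network/interfaces}
    awk -v key="allow-$class" '$1 == key { for (i = 2; i <= NF; i++) print $i }' "$file"
}

# up_bond_slaves BOND [FILE]: run ifup on each interface tagged allow-BOND.
# RUN is an assumed dry-run hook (RUN=echo prints the commands only).
up_bond_slaves() {
    for iface in $(allow_class_ifaces "$1" "$2"); do
        ${RUN:-} ifup "$iface"
    done
}
```

Running "up_bond_slaves bond0" after "ifup bond0" has created the device would then issue "ifup" for each tagged slave. ifupdown also has an --allow=CLASS option that may offer a similar effect, but that has not been verified for this setup.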

BTW, the documentation in README.Debian.gz is clearly wrong:

-=- cut here -=-
A bonding master is defined like this:

iface bond0 inet static
        address 208.77.188.166
        ...
        bond-primary eth0 eth1

The bonding slaves should then be defined like this:
[...]
-=- cut here -=-

as the whole bonding configuration in bond0 will be ignored if it does not contain "bond-slaves" (as noted later in the file).

My current configuration (works with the "pre-up" lines; fails "ifdown bond0; sleep 5; ifup bond0" cycle without):

-=- cut here -=-
auto em1
allow-bond0 em1
iface em1 inet manual
 bond-master bond0

auto em2
allow-bond0 em2
iface em2 inet manual
 bond-master bond0

auto bond0
iface bond0 inet static
 address [...]
        bond-mode 802.3ad
        bond-primary em1 em2
        bond-slaves em1 em2
        bond-downdelay 200
        bond-updelay 200
        bond-miimon 100
        bond-lacp-rate 1
        pre-up (sleep 2 && ifup em1) &
        pre-up (sleep 2 && ifup em2) &
-=- cut here -=-

(I've not yet rebooted to see what happens then, but I'm hopeful worst case I get some warnings about interfaces already being up.)

Ewen

PS: Ironically the referenced bug (https://bugs.launchpad.net/ubuntu/+source/ifupdown/+bug/1337873) says in a comment:

-=- cut here -=-
In your case ifupdown will be responsible for bringing eth2 a...


Ewen McNeill (ewen) wrote :

FTR, system did boot to working network interfaces with the above configuration (including pre-up/sleep lines). I'm not sure if there were warnings issued as the default Ubuntu last action on boot is to clear all boot messages in favour of displaying the login prompt at the top of the screen :-( (/var/log/boot.log contains no messages other than "starting ..." lines.)

FWIW, it seems to me that "ifup bond0" should immediately enslave any bond-slave interfaces that already exist, leaving any others that don't currently exist (and maybe waiting for at least one of them) to appear and auto-configure later on. That should work both with discovery races on first boot and through an ifdown/ifup bond0 cycle later on (when the slave interfaces most likely still exist, even though they've been disabled by the ifdown bond0).

Ewen

Changed in ifenslave-2.6 (Ubuntu):
assignee: Rafael David Tinoco (inaddy) → nobody
Changed in ifenslave-2.6 (Ubuntu):
assignee: nobody → Dariusz Gadomski (dgadomski)
Changed in ifenslave-2.6 (Ubuntu):
assignee: Dariusz Gadomski (dgadomski) → nobody