Comment 12 for bug 823366

Deric Sullivan (deric-sullivan) wrote :

Actually, the original post without "auto bond0" is from mar_rio; I'm the second person reporting the bug. I assumed that you left out "auto bond0" because we've been quoting the README as authoritative, and it has no "auto" line for the bond. To reproduce the original bug, I simplified my interfaces file to the following (only the IPs have been changed):

----------------------------------------------
auto lo
iface lo inet loopback

auto bond0
iface bond0 inet static
   bond_slaves eth0 eth1
   bond_mode active-backup
   bond_primary eth1
   bond_miimon 100
   bond_updelay 2000
   address 192.168.1.100
   netmask 255.255.255.0
   broadcast 192.168.1.255
----------------------------------------------

So, if I keep "auto bond0" and change my interfaces file to:
----------------------------------------------
auto lo
iface lo inet loopback

auto bond0
iface bond0 inet static
   bond_slaves none
   bond_mode active-backup
   bond_primary eth0
   bond_miimon 100
   bond_updelay 2000
   address 192.168.1.100
   netmask 255.255.255.0
   broadcast 192.168.1.255

auto eth0
iface eth0 inet manual
   bond_master bond0
   bond_mode active-backup
   bond_primary eth0
   bond_miimon 100
   bond_updelay 2000

auto eth1
iface eth1 inet manual
   bond_master bond0
   bond_mode active-backup
   bond_primary eth0
   bond_miimon 100
   bond_updelay 2000
----------------------------------------------

Then I still have no primary after a reboot. What I think is happening is the following:

If the interfaces come up in the order bond0, eth0, eth1, then the primary will be set:
        ifup bond0:
                creates bond0 (from add_master());
                no primary is set (from setup_master() with no slaves);
                no slaves are added (from enslave_slaves() with bond_slaves = none);
                bond0 comes UP (but probably not LOWER_UP yet);
        ifup eth0:
                sees that bond0 is already created (from add_master());
                should fail to set mode but continue anyway (from setup_master() with bond0 already up);
                no primary is set (from setup_master() with no slaves);
                eth0 is added as slave (from enslave_slaves() with bond_slaves = eth0);
                eth0 comes UP and LOWER_UP which should cause bond0 to come LOWER_UP;
        ifup eth1:
                sees that bond0 is already created (from add_master());
                should fail to set mode but continue anyway (from setup_master() with bond0 already up);
                primary is set to eth0 (from setup_master() with eth0 as slave);
                eth1 is added as slave (from enslave_slaves() with bond_slaves = eth1);
                eth1 comes UP and LOWER_UP;

If the interfaces come up in the order bond0, eth1, eth0, then the primary will not be set:
        ifup bond0:
                creates bond0 (from add_master());
                no primary is set (from setup_master() with no slaves);
                no slaves are added (from enslave_slaves() with bond_slaves = none);
                bond0 comes UP (but probably not LOWER_UP yet);
        ifup eth1:
                sees that bond0 is already created (from add_master());
                should fail to set mode but continue anyway (from setup_master() with bond0 already up);
                no primary is set (from setup_master() with no slaves);
                eth1 is added as slave (from enslave_slaves() with bond_slaves = eth1);
                eth1 comes UP and LOWER_UP which should cause bond0 to come LOWER_UP;
        ifup eth0:
                sees that bond0 is already created (from add_master());
                should fail to set mode but continue anyway (from setup_master() with bond0 already up);
                no primary is set (from setup_master() with only eth1 as slave);
                eth0 is added as slave (from enslave_slaves() with bond_slaves = eth0);
                eth0 comes UP and LOWER_UP;
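
The two orderings above can be replayed without touching a real bond. The following is only a rough simulation of my reading of the script: a temp file stands in for /sys/class/net/bond0/bonding/slaves, the PRIMARY variable stands in for the bonding "primary" attribute, and simulate_order is a made-up helper name. The real setup_master() and enslave_slaves() do much more than this.

```shell
#!/bin/sh
# Simulation only. Nothing here touches /sys or a real bond device.
# Assumption: for each ifup, primary selection (setup_master) runs
# BEFORE the interface is enslaved (enslave_slaves).

IF_BOND_PRIMARY="eth0"

simulate_order() {
    slaves_file=$(mktemp)   # stand-in for .../bonding/slaves
    PRIMARY=""              # stand-in for .../bonding/primary
    for iface in "$@" ; do
        # setup_master() step: pick the primary from the slaves enslaved so far
        for slave in $IF_BOND_PRIMARY ; do
            if grep -sqw "$slave" "$slaves_file" ; then
                PRIMARY="$slave"
                break
            fi
        done
        # enslave_slaves() step: only now does this interface join the bond
        # (bond0 itself has "bond_slaves none", so it adds nothing)
        if [ "$iface" != "bond0" ] ; then
            printf '%s ' "$iface" >> "$slaves_file"
        fi
    done
    rm -f "$slaves_file"
    echo "${PRIMARY:-none}"
}

simulate_order bond0 eth0 eth1   # prints "eth0"
simulate_order bond0 eth1 eth0   # prints "none"
```

In the second call eth0 is never in the slaves list at the moment the primary is chosen, which matches what I see after a reboot.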

I would guess that moving the following code from setup_master() over to the end of enslave_slaves() will fix the problem:
--------------------------------------------
        # The first slave in bond-primary found in current slaves becomes the primary.
        # If no slave in bond-primary is found, then primary does not change.
        for slave in $IF_BOND_PRIMARY ; do
                if grep -sq "\\<$slave\\>" "/sys/class/net/$BOND_MASTER/bonding/slaves" ; then
                        sysfs "$slave" primary
                        break
                fi
        done
--------------------------------------------

Note: If that does fix the problem, then moving the "IF_BOND_ACTIVE_SLAVE" section is probably also a good idea, though it is not specifically related to this "bond_primary" bug report.
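
To illustrate the proposed move of the primary-selection loop, here is an untested sketch of what it would look like at the end of enslave_slaves(). Everything in it is a stand-in: BOND_DIR is a temp directory in place of /sys/class/net/bond0/bonding, sysfs() is a toy replacement that just writes a file (argument order copied from the snippet quoted above), and the enslaving step is reduced to appending a name to a file. It is not the real script.

```shell
#!/bin/sh
# Sketch only: simulates the bond's sysfs directory in a temp dir.
BOND_DIR=$(mktemp -d)
: > "$BOND_DIR/slaves"
IF_BOND_PRIMARY="eth0"

# Hypothetical stand-in for the script's sysfs helper:
# $1 = value to write, $2 = attribute name.
sysfs() {
    echo "$1" > "$BOND_DIR/$2"
}

enslave_slaves() {
    # Simplified enslaving: record each requested slave.
    for slave in $IF_BOND_SLAVES ; do
        printf '%s ' "$slave" >> "$BOND_DIR/slaves"
    done
    # Proposed change: choose the primary AFTER enslaving, so the
    # interface being brought up right now is already in the list.
    for slave in $IF_BOND_PRIMARY ; do
        if grep -sqw "$slave" "$BOND_DIR/slaves" ; then
            sysfs "$slave" primary
            break
        fi
    done
}

# Problem order from the analysis: eth1 comes up first, then eth0.
IF_BOND_SLAVES="eth1" ; enslave_slaves
IF_BOND_SLAVES="eth0" ; enslave_slaves

primary=$(cat "$BOND_DIR/primary")
echo "$primary"   # prints "eth0"
rm -rf "$BOND_DIR"
```

With the loop at the end, the order eth1 then eth0 still ends with eth0 as primary in this simulation, because eth0 is enslaved before the check runs.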