Comment 26 for bug 1003656

Stéphane Graber (stgraber) wrote :

I just spent a few minutes trying to figure out the ordering based on the scripts for Andrew's system. It's basically:
- eth0 appears
  - triggers udev
    - triggers upstart
      - triggers ifup eth0
        - triggers bonding
          - bond0 appears
            - triggers udev
              - triggers bridge-network-interface
                - triggers ifup br0
                  - br0 appears
                    - triggers udev
                      - triggers upstart
                        - triggers ifup br0 => fails, already configured
                  - dhclient br0 => fails as it's blocking and no interface in bond
              - triggers upstart
                - triggers ifup bond0
          - eth0 is joined in the bond
- eth1 appears
  - triggers udev
    - triggers upstart
      - triggers ifup eth1
        - triggers bonding
          - eth1 is joined in the bond

As you can see, it's relatively complex. The main problem, as seen above, is that because udev is sequential, "ifup br0" is called before the bond interface is fully set up, so it fails to acquire an IP because nothing is in the bond at that point.
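
To make the ordering easier to follow, here is a toy Python model of the chain above. It is only a sketch and not the real scripts: the function names are made up for illustration, and it assumes each hook runs to completion before the step that triggered it continues.

```python
bond_members = []      # interfaces joined to bond0 so far
bond_exists = False
log = []               # (step, outcome) pairs in execution order


def dhclient_br0():
    # DHCP can only work once bond0 has at least one member.
    outcome = "ok" if bond_members else "fails: nothing in bond0"
    log.append(("dhclient br0", outcome))


def bond0_appears():
    # Hypothetical stand-in for the bridge hook, which calls "ifup br0" right away.
    log.append(("ifup br0", "started from the bond0 hook"))
    dhclient_br0()


def ifup_eth(name):
    global bond_exists
    if not bond_exists:
        bond_exists = True
        bond0_appears()          # runs to completion before the join below
    bond_members.append(name)
    log.append((f"{name} joined bond0", "ok"))


for nic in ("eth0", "eth1"):
    ifup_eth(nic)

for step, outcome in log:
    print(f"{step}: {outcome}")
```

In this model the dhclient step is always logged before "eth0 joined bond0", which is the ordering problem described above.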

A possible way around the problem would be to only create the bridge from the udev hook without actually calling ifup, letting the upstart job take care of that. This would make the code similar to what vlan and ifenslave currently do, where, as far as I know, we're not getting a similar deadlock.
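
As a rough illustration of that idea, the toy Python model below has the hook create the bridge and queue the ifup instead of running it. This is a sketch under a big assumption: that the deferred job (standing in for the upstart job) only runs after the hook chain has returned and eth0 has joined the bond. The names are hypothetical and none of this is taken from the actual scripts.

```python
bond_members = []      # interfaces joined to bond0 so far
bridge_created = False
pending_jobs = []      # work deferred to a later job (assumed ordering)
log = []               # (step, outcome) pairs in execution order


def ifup_br0():
    # DHCP can only work once bond0 has at least one member.
    outcome = "ok" if bond_members else "fails: nothing in bond0"
    log.append(("dhclient br0", outcome))


def bond0_hook():
    # Workaround: create the bridge only and leave "ifup br0" to a later job.
    global bridge_created
    bridge_created = True
    log.append(("br0 created", "no ifup from the hook"))
    pending_jobs.append(ifup_br0)


def ifup_eth0():
    bond0_hook()                    # still runs before the join
    bond_members.append("eth0")
    log.append(("eth0 joined bond0", "ok"))


ifup_eth0()
while pending_jobs:                 # deferred jobs run after the hook chain returns
    pending_jobs.pop(0)()

for step, outcome in log:
    print(f"{step}: {outcome}")
```

If that assumption about when the deferred job runs does not hold on a real system, the model says nothing about the outcome, which is why this still needs testing.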

I have a system reproducing this bug, so I'll now try my workaround and re-read all the scripts once more to see if I missed something.