juju deploy brings up interfaces in wrong order

Bug #1590052 reported by Hrvoje
Affects        Status  Importance  Assigned to  Milestone
juju (Ubuntu)  New     Undecided   Unassigned

Bug Description

Hi.

When using "juju bootstrap" in a MAAS environment, deploying the first machine gets stuck if the machine has only a bond interface.

It would seem that the problem is with the add-juju-bridge.py script, which is responsible for rewriting the interfaces file and bringing the interfaces up.

In this script, you have:

[snip]
    ifquery = "$(ifquery --interfaces={} --exclude=lo --list)".format(args.filename)

    print("**** Original configuration")
    print_shell_cmd("cat {}".format(args.filename))
    print_shell_cmd("ifconfig -a")
    print_shell_cmd("ifdown --exclude=lo --interfaces={} {}".format(args.filename, ifquery))

    print("**** Activating new configuration")

    with open(args.filename, 'w') as f:
        print_stanzas(stanzas, f)
        f.close()

    print_shell_cmd("cat {}".format(args.filename))
    print_shell_cmd("ifup --exclude=lo --interfaces={} {}".format(args.filename, ifquery))
    print_shell_cmd("ip link show up")
    print_shell_cmd("ifconfig -a")
    print_shell_cmd("ip route show")
    print_shell_cmd("brctl show")
[snip]

The problem here is the order in which "ifquery --interfaces={} --exclude=lo --list" returns the list of interfaces. If we assume these interfaces:

bond0
eth0
eth1

sometimes ifquery will return "bond0 eth0 eth1" and feed this to ifup. Ifup will first try to bring up bond0 and wait for it! Since nothing brings up eth0 or eth1, it waits forever.

Possible solutions:
1. change add-juju-bridge.py to first bring up ordinary interfaces, then other ones
2. change ifenslave to internally call "ifup <slave>" before entering endless loop
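
To illustrate option 1, here is a minimal sketch of reordering the interface list before it is handed to ifup. The function name order_for_ifup and the "name starts with bond" test are assumptions made for this example; they are not taken from add-juju-bridge.py. A later comment in this report has ifup getting stuck even with eth0 and eth1 listed before bond0, so treat this as an illustration of the proposal rather than a confirmed fix.

```python
def order_for_ifup(names, is_bond=lambda name: name.startswith("bond")):
    """Return names with non-bond interfaces first, keeping relative order.

    The default is_bond check (a "bond" name prefix) is an assumption for
    this sketch; a real script would more likely inspect the stanzas.
    """
    plain = [n for n in names if not is_bond(n)]
    bonds = [n for n in names if is_bond(n)]
    return plain + bonds


# The ordering from the report, "bond0 eth0 eth1", becomes "eth0 eth1 bond0".
print(" ".join(order_for_ifup(["bond0", "eth0", "eth1"])))
```

The resulting string could then replace the raw ifquery output in the ifup call.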

I'm on trusty (14.04.4 LTS):

juju-core 1.25.5-0ubuntu3~14.04.1
ifenslave 2.4ubuntu1.2

Regards,

H.

Hrvoje (hrvoje-habjanic)
tags: added: add-juju-bridge.py bootstrap ifenslave juju stuck
Martin Pitt (pitti) wrote :

I'm not sure how ifquery works, whether it lists in the order of /etc/network/interfaces or the order they get brought up with "ifup -a" (which is supposedly the same, though). So please reassign to ifupdown if appropriate.

affects: systemd (Ubuntu) → juju (Ubuntu)
Hrvoje (hrvoje-habjanic) wrote :

Actually, the title should be "juju bootstrap fails with only bonding interface".

Anyhow, steps to reproduce are:
- stop bond0
- stop udev
- try ifup .. eth0 eth1 bond0

Ifup ends up stuck.

udev _should_ trigger bringing up the other devices, but when ifup is called from add-juju-bridge, it does not work as advertised.

Could it be that, because the interfaces file is moved and a new one is created with different content, udev just gets confused?

H.

Hrvoje (hrvoje-habjanic) wrote :

Those files were collected while using the latest image from http://images.maas.io/ephemeral-v2/daily/ and the latest juju & maas packages for Trusty from ppa:maas/stable and ppa:juju/stable.

Also, manually running "ifup eth1" makes this work; that is, add-juju-bridge.py gets "unstuck".

H.

Andrew McDermott (frobware) wrote :

If this only happens with a bond then it's possible you are running into:

  https://bugs.launchpad.net/juju-core/+bug/1594855

Hrvoje (hrvoje-habjanic) wrote :

Hi.

Yes, I can confirm this: adding a delay between ifdown and ifup does fix this issue.

I needed to use a hex editor to change the juju binary itself to test this. I replaced "--exclude" with "-X" and "--interface" with "-i", which freed up enough space to add:

import time
time.sleep(5)

in the script itself.
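
For readers following along, a rough sketch of where such a delay sits in the ifdown/ifup sequence is shown here. The function reactivate, its parameters, and the injected run and write_config callables are hypothetical names for this example only; the ifdown and ifup command lines are copied from the script excerpt in the bug description. This is not the actual upstream fix.

```python
import subprocess
import time


def reactivate(filename, interfaces, write_config, delay=5, run=subprocess.call):
    """Take interfaces down, write the new config, pause, bring them up.

    All names here are hypothetical; only the command lines mirror the
    add-juju-bridge.py excerpt quoted in the bug description.
    """
    names = " ".join(interfaces)
    run("ifdown --exclude=lo --interfaces={} {}".format(filename, names),
        shell=True)
    write_config()
    # The workaround tested in this comment: wait between ifdown and ifup.
    time.sleep(delay)
    run("ifup --exclude=lo --interfaces={} {}".format(filename, names),
        shell=True)
```

Passing run and write_config as arguments is only a convenience so the sequence can be exercised without touching real interfaces.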

H.

Andrew McDermott (frobware) wrote :

The fix for:

  https://bugs.launchpad.net/juju-core/+bug/1594855

will be in juju-2.0-beta10.

Andrew McDermott (frobware) wrote :

Closing this as it is bond related and fixed by:

 https://bugs.launchpad.net/juju-core/+bug/1594855
