Comment 0 for bug 1802322

Dmitrii Shcherbakov (dmitriis) wrote :

After an update for https://bugs.launchpad.net/netplan/+bug/1770082 was released for bionic and our systems started getting the new packages, *clean* MAAS + Juju + Bionic + LXD container deployments started to fail on bridge activation.

The trace below was captured with netplan logging raised to TRACE in the Juju model:

juju model-config logging-config='<root>=WARNING;unit=DEBUG;juju.network.netplan=TRACE'

2018-11-08 13:44:10 DEBUG juju.network.netplan activate.go:99 Netplan activation result "Traceback (most recent call last):
  File \"/usr/sbin/netplan\", line 23, in <module>
    netplan.main()
  File \"/usr/share/netplan/netplan/cli/core.py\", line 50, in main
    self.run_command()
  File \"/usr/share/netplan/netplan/cli/utils.py\", line 130, in run_command
    self.func()
  File \"/usr/share/netplan/netplan/cli/commands/apply.py\", line 43, in run
    self.run_command()
  File \"/usr/share/netplan/netplan/cli/utils.py\", line 130, in run_command
    self.func()
  File \"/usr/share/netplan/netplan/cli/commands/apply.py\", line 102, in command_apply
    stderr=subprocess.DEVNULL)
  File \"/usr/lib/python3.6/subprocess.py\", line 291, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['ip', 'link', 'set', 'dev', 'enp5s0f0', 'name', 'enp4s0f0']' returned non-zero exit status 2.
" "" 1
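For readers unfamiliar with the last frames of the traceback: subprocess.check_call raises CalledProcessError whenever the child process exits with a non-zero status, and netplan does not catch it here, so the whole `netplan apply` run aborts. A minimal sketch of that behaviour follows. The child command is a stand-in that exits with status 2 (an assumption for illustration); it is not the real `ip link` call.

```python
import subprocess
import sys

# Stand-in for the failing `ip link set dev ... name ...` call: a child
# process that simply exits with status 2 (assumption for illustration).
cmd = [sys.executable, "-c", "import sys; sys.exit(2)"]

try:
    # Same call shape as in the traceback: stderr is discarded.
    subprocess.check_call(cmd, stderr=subprocess.DEVNULL)
    exit_status = 0
except subprocess.CalledProcessError as err:
    # check_call raises on any non-zero exit status.
    exit_status = err.returncode
    print("command failed with exit status", exit_status)
```

Because stderr goes to DEVNULL, the actual error message from `ip` is lost, and only the exit status survives in the log.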

From the Juju machine agent code:

command := fmt.Sprintf("%snetplan generate && netplan apply && sleep 10", params.RunPrefix)
// ...
logger.Debugf("Netplan activation result %q %q %d", result.Stderr, result.Stdout, result.Code)

Nothing Juju does appears to justify the rename operation shown in the traceback.

On closer inspection, 00:0a:f7:72:a7:28 is the MAC address of enp4s0f0. It is also the MAC address of the bond, and it gets applied to all members of the bond (enp5s0f0 is of specific interest) after the first netplan run following the deployment.

It looks like a subsequent `netplan generate && netplan apply` invocation by Juju causes netplan to try to apply the name "enp4s0f0" to the "enp5s0f0" interface, because that interface now has "00:0a:f7:72:a7:28" as its MAC address as a result of becoming a bond member.
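The suspected mechanism can be sketched as follows. This is a hypothetical simplification, not netplan's actual code: it assumes that interfaces are matched purely on their current MAC address and that a rename is attempted whenever the matched `set-name` differs from the current name. The function name `planned_renames` is invented for illustration. The MAC values come from the netplan config and the `ip addr` output in this report.

```python
# Hypothetical simplification: NOT netplan's actual implementation.
# Match rules (set-name -> macaddress) from the cloud-init generated config.
match_rules = {
    "enp4s0f0": "00:0a:f7:72:a7:28",
    "enp5s0f0": "00:0e:1e:ac:67:00",
}

# Current MACs as reported by `ip addr` once bond0 is up:
# both members carry the bond's MAC address.
current_macs = {
    "enp4s0f0": "00:0a:f7:72:a7:28",
    "enp5s0f0": "00:0a:f7:72:a7:28",
}


def planned_renames(rules, macs):
    """Return (current_name, new_name) pairs a MAC-only matcher would produce."""
    renames = []
    for current_name, mac in sorted(macs.items()):
        for set_name, wanted_mac in rules.items():
            if mac.lower() == wanted_mac.lower() and current_name != set_name:
                renames.append((current_name, set_name))
    return renames


renames = planned_renames(match_rules, current_macs)
for old, new in renames:
    print("ip link set dev", old, "name", new)
```

Under these assumptions the only planned rename is enp5s0f0 to enp4s0f0, which is the command that fails in the traceback, presumably because an interface named enp4s0f0 already exists.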

netplan generated by cloud-init:
http://paste.ubuntu.com/p/QfR4f5yMYP/

        bond0:
            interfaces:
            - enp4s0f0
            - enp5s0f0

        enp4s0f0:
            match:
                macaddress: 00:0a:f7:72:a7:28
            mtu: 9000
            set-name: enp4s0f0

        enp5s0f0:
            match:
                macaddress: 00:0e:1e:ac:67:00
            mtu: 9000
            set-name: enp5s0f0

curtin config:
http://paste.ubuntu.com/p/NkvZKqZYjr/

# ip addr show enp5s0f0
8: enp5s0f0: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 9000 qdisc mq master bond0 state DOWN group default qlen 1000
    link/ether 00:0a:f7:72:a7:28 brd ff:ff:ff:ff:ff:ff

# ip addr show enp4s0f0
6: enp4s0f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond0 state UP group default qlen 1000
    link/ether 00:0a:f7:72:a7:28 brd ff:ff:ff:ff:ff:ff
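To check other machines for the same condition, a small script along these lines could flag interfaces that share a MAC address. It is only a sketch: the helper names are made up, the input is the `ip addr show` output pasted above, and the parser handles just the two line types seen there.

```python
import re
from collections import defaultdict

# `ip addr show` output copied from this report.
IP_OUTPUT = """\
8: enp5s0f0: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 9000 qdisc mq master bond0 state DOWN group default qlen 1000
    link/ether 00:0a:f7:72:a7:28 brd ff:ff:ff:ff:ff:ff
6: enp4s0f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond0 state UP group default qlen 1000
    link/ether 00:0a:f7:72:a7:28 brd ff:ff:ff:ff:ff:ff
"""


def macs_by_interface(text):
    """Map interface name -> current MAC address from `ip addr` text."""
    result = {}
    name = None
    for line in text.splitlines():
        # Header line, e.g. "8: enp5s0f0: <...>"
        header = re.match(r"^\d+:\s+([^:@\s]+)", line)
        if header:
            name = header.group(1)
            continue
        # Indented link-layer line, e.g. "    link/ether 00:0a:..."
        link = re.match(r"^\s+link/ether\s+([0-9a-fA-F:]{17})", line)
        if link and name:
            result[name] = link.group(1).lower()
    return result


def shared_macs(macs):
    """Return only the MAC addresses used by more than one interface."""
    groups = defaultdict(list)
    for name, mac in macs.items():
        groups[mac].append(name)
    return {mac: sorted(names) for mac, names in groups.items() if len(names) > 1}


duplicates = shared_macs(macs_by_interface(IP_OUTPUT))
print(duplicates)
```

On a live system the text could be taken from `ip addr show` instead of the embedded sample. Bond members sharing a MAC is expected for most bonding modes, so a hit only shows that MAC-based matching is ambiguous on that host.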

This is currently blocking all of our bionic deployments as all of them have bonds.