Comment 4 for bug 1795296

Trent Lloyd (lathiat) wrote :

Additionally, on juju 2.3.8 at least, I suspect that updates to the interfaces file are racing: juju rewrites it as it sets up multiple container bridges over time, and the charm updates the same file. juju 2.4 may be better, but I did not conclusively test that.

In some cases (unreliably) the bridge setup failed, leaving no bridge in /etc/network/interfaces. One deploy worked, the next did not.
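To tell a good deploy from a bad one quickly, something like the following rough sketch could be run against the contents of /etc/network/interfaces. It assumes the bridge stanzas use the ifupdown `bridge_ports` option; the function name and the sample text are made up for illustration and are not taken from juju or the charm.

```python
def bridged_interfaces(interfaces_text):
    """Return the names of iface stanzas that contain a bridge_ports option.

    Assumption: a bridge shows up as an 'iface <name> ...' stanza with a
    'bridge_ports' line inside it.
    """
    bridges = []
    current = None
    for line in interfaces_text.splitlines():
        words = line.strip().split()
        if not words or words[0].startswith("#"):
            continue
        if words[0] == "iface" and len(words) >= 2:
            # A new iface stanza starts here.
            current = words[1]
        elif words[0] in ("auto", "source", "source-directory", "mapping") \
                or words[0].startswith("allow-"):
            # These keywords end the previous iface stanza.
            current = None
        elif words[0] == "bridge_ports" and current and current not in bridges:
            bridges.append(current)
    return bridges


# Hypothetical sample content, only for illustration.
SAMPLE = """\
auto eth0
iface eth0 inet manual

auto br-eth0
iface br-eth0 inet dhcp
    bridge_ports eth0
"""

if __name__ == "__main__":
    print(bridged_interfaces(SAMPLE))
```

An empty result on a machine that is supposed to host containers would correspond to the failed case described above.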

As an extra note, when juju first converts the data from /etc/network/interfaces.d/50-cloud-init.cfg into /etc/network/interfaces, it actually includes all of /etc/network/interfaces.d/*.cfg, i.e. including the veth file. This is lost in a later update; I'm not 100% sure offhand why. The file is rewritten multiple times as different interfaces get bridges added, and the charm may also update it multiple times.

It's clear this method isn't going to be reliable on Xenial. However, we have existing deployments using this option, so we need to find a way to fix it. This has broken networking in multiple production environments.

The best hack fix I can think of at the moment is to move 50-cloud-init.cfg out of the way. The charm can probably do this. However, we'd need to make sure that nothing relies on that file later, i.e. that juju doesn't try to re-read it and use it again (based on the races above, I wonder whether that does actually happen).
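A minimal sketch of what "moving it out of the way" could look like from the charm side, assuming a plain rename is enough so that the file no longer matches the *.cfg glob. The function name and the ".disabled" suffix are hypothetical, and this does not address whether juju re-reads the file later.

```python
import os

CLOUD_INIT_CFG = "/etc/network/interfaces.d/50-cloud-init.cfg"


def move_aside(path=CLOUD_INIT_CFG, suffix=".disabled"):
    """Rename path so it no longer matches a '*.cfg' glob.

    Returns the new path, or None if there was nothing to move.
    The suffix is an arbitrary choice for this sketch.
    """
    if not os.path.exists(path):
        return None
    target = path + suffix
    # os.replace overwrites any stale copy left by an earlier run and is
    # atomic when source and target are on the same filesystem.
    os.replace(path, target)
    return target
```

Calling it a second time returns None, so it should be safe to run from a hook that fires repeatedly.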

Whether the network is broken on reboot also seems to be racy, depending on the order in which the routes get installed.

I would suggest this be triaged as High because it breaks networking on production deployments.