Comment 5 for bug 1636708

Dimitri John Ledkov (xnox) wrote :

Not quite.

On boot, there are multiple ways that ifup is called, and effectively it races with itself.

In my case I have two VLANs on top of a bond of two NICs. By the time networking.service is called, the two NICs are present. networking.service is essentially `ifup -a`: it looks at the eni file and realises that it should bring up bond0. It looks for bond0 in its internal state and creates it.

This is where the race starts.

ifupdown ships /lib/udev/rules.d/80-ifupdown.rules which calls /lib/udev/ifupdown-hotplug which effectively does
$ exec systemctl --no-block start $(systemd-escape --template ifup@.service $INTERFACE)
(very strange to do this on systemd systems, because one could have just used SYSTEMD_WANTS, but anyway)

At this point bond0 is being brought up by both the networking.service unit (ifup -a) and <email address hidden> (ifup bond0). Sometimes one can see an "already configured" message from either of the two units in the logs.

But also, at this point in time, <email address hidden> and <email address hidden> may have been started as well.

In my case my machine manages to hit this race quite often. I am attaching a journal log of what is happening.

The log is produced using:
journalctl -u ifup@*.service -u networking -o verbose | grep -e UTC -e UNIT -e MESSAGE

You can see messages that things are waiting on bond0 to be up, and that one or the other vlan is waiting on the bond0 lock. To beat the locks and to prevent ifup@.service from interfering with networking.service, or executing in parallel and creating deadlocks, I had to encode the dependencies between these units in systemd's brain by doing this:

# cat /<email address hidden>/order.conf
[Unit]
<email address hidden>
<email address hidden>
# cat /<email address hidden>/order.conf
[Unit]
<email address hidden>
<email address hidden>

This way the ordering is enforced for the ifup@.service hotplug. IMHO ifupdown should ship a generator that would create these dependencies and orderings between interfaces. And possibly ifup -a should be reduced to starting ifup@%I.service for every interface it is meant to start for a given command.
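
To make the generator idea a bit more concrete, here is a rough sketch of what it could do. This is not something ifupdown ships. The function names (eni_vlan_deps, write_dropins) are made up for illustration, it only understands the vlan-raw-device option, and the Wants=/After= pair it writes is my assumption about a reasonable drop-in, not necessarily the exact contents of the order.conf files above. It also skips unit name escaping, so interface names containing "-" would need to go through systemd-escape first.

```shell
#!/bin/sh
# Hypothetical sketch: derive ordering drop-ins for ifup@.service from an
# eni file. Only "vlan-raw-device" is handled; bonds, bridges and other
# ways of stacking interfaces are deliberately left out.

# Print "child parent" pairs, one per line, for every iface stanza that
# names a vlan-raw-device.
eni_vlan_deps() {
    awk '
        $1 == "iface" { current = $2 }
        $1 == "vlan-raw-device" && current != "" { print current, $2 }
    ' "$1"
}

# Write one order.conf drop-in per pair under the given output directory.
# Assumption: interface names need no systemd escaping.
write_dropins() {
    eni=$1
    outdir=$2
    eni_vlan_deps "$eni" | while read -r child parent; do
        dir="$outdir/ifup@$child.service.d"
        mkdir -p "$dir"
        printf '[Unit]\nWants=ifup@%s.service\nAfter=ifup@%s.service\n' \
            "$parent" "$parent" > "$dir/order.conf"
    done
}
```

A real generator is invoked by systemd with three output directories as arguments and would write into the first one, e.g. write_dropins /etc/network/interfaces "$1". Files pulled in via source or source-directory lines in eni are ignored by this sketch.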

I'm not sure if we can cheat and state that ifup@.service should be Wants=networking.service After=networking.service. Because I think then we may get ourselves into the situation that ifupdown fails to resolve cycles in the eni, when eni is specified out of order.

For cloud-init, this is more complicated, as on boot the generators will fire before eni is populated. Therefore cloud-init should probably re-run this magical ifupdown generator (just like it does for netplan), or cloud-init should create these symlinks directly, and reload systemd before moving on to networking.service.

Does the above make sense at all?