Comment 4 for bug 1224007

Tomasz Głuch (tomekg) wrote:

This is definitely not a duplicate of bug #1065077. It is amazing that a LACP+VLAN+jumbo frames configuration has not worked since 2013, across two consecutive LTS server releases, so I investigated the problem.
It was probably introduced when Upstart started to manage interfaces.

A race condition occurs during the ifup phase, because configuring a descendant interface (especially setting its MTU) depends on the parent interface having been configured successfully.

In this particular case, the MTU of bond0.X cannot be set to a value greater than the MTU of bond0, so ifup for bond0.X fails with
"RTNETLINK answers: Numerical result out of range" and an exit status of 2. As a result, the interface is left in a half-configured state.

The direct cause of the problem is a lack of synchronization between the networking tasks started for the main interface and those started for its subinterfaces.
This is especially an issue for bonding with LACP, because LACP negotiation takes some time.
The MTU is set by the ifupdown binary itself, with some delay on bond0 (1-2s in my case).
Unfortunately, before that has finished, the task for bond0.X is also started and fails immediately.
I checked the timings: I needed to wait about 0.6s in bond0.X's pre-up for the MTU on bond0 to be set properly.

I attached a poor man's synchronization script, which solves the problem by sleeping until the parent interface has the correct MTU.
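
For readers without access to the attachment, the idea can be sketched roughly as follows. This is not the attached script, only a minimal illustration in Python. It assumes it runs as a pre-up hook for the VLAN interface, that ifupdown exports the interface name as IFACE and the stanza's mtu option as IF_MTU, and that the parent's current MTU can be read from /sys/class/net/<parent>/mtu. The function names, the 10s timeout and the 0.1s polling interval are hypothetical choices.

```python
#!/usr/bin/env python3
"""Wait until a parent interface reports the MTU that its VLAN child needs."""
import os
import time


def read_mtu(iface, sysfs_root="/sys/class/net"):
    """Return the current MTU of iface, or None if it cannot be read yet."""
    try:
        with open(os.path.join(sysfs_root, iface, "mtu")) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def wait_for_mtu(parent, wanted, timeout=10.0, interval=0.1,
                 sysfs_root="/sys/class/net"):
    """Poll until parent's MTU is at least wanted; return False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        mtu = read_mtu(parent, sysfs_root)
        if mtu is not None and mtu >= wanted:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def main():
    """Hook entry point; returns an exit status for ifupdown."""
    iface = os.environ.get("IFACE", "")   # assumed: exported by ifupdown
    wanted = os.environ.get("IF_MTU")     # assumed: the stanza's mtu option
    if "." not in iface or not wanted:
        return 0                          # not a VLAN with an explicit MTU
    parent = iface.split(".", 1)[0]       # bond0.X -> bond0
    return 0 if wait_for_mtu(parent, int(wanted)) else 1

# When used as a hook, the script would end with: raise SystemExit(main())
```

Polling the parent's actual state, rather than sleeping for a fixed 0.6s, avoids depending on how long LACP negotiation happens to take on a given machine.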

I'm unsure whether it is possible to enforce the correct order in Upstart. Maybe an Upstart expert is here and could confirm this?

It is very likely that a similar problem occurs in VLAN+Bridge+MTU configurations.