vlan configuration/unconfigured interfaces creates slow boot time

Bug #1565711 reported by James Page
This bug affects 1 person
Affects     Status     Importance  Assigned to  Milestone
MAAS        Won't Fix  High        Unassigned
cloud-init  Expired    Undecided   Unassigned
curtin      Triaged    Medium      Unassigned
ifupdown    New        Undecided   Unassigned

Bug Description

maas: 1.9.1+bzr4543-0ubuntu1~trusty1 (from proposed PPA)

Deploying juju bootstrap node on Ubuntu 14.04 with the following network configuration:

eth0
    static assigned IP address, default VLAN (no trunking)

eth1
    static assigned IP address, secondary VLAN

    eth1.2667
        static assigned IP address, VLAN 2667

    eth1.2668
        static assigned IP address, VLAN 2668

    eth1.2669
        static assigned IP address, VLAN 2669

    eth1.2670
        static assigned IP address, VLAN 2670

eth2
    unconfigured

eth3
    unconfigured

MAAS generates an /etc/network/interfaces (/e/n/i) with auto stanzas for the VLAN devices and the unconfigured network interfaces. The upstart process that checks whether network configuration is complete waits for /var/run/ifup.XXXX to exist for all auto interfaces; these files will never appear for either the VLAN interfaces or the unconfigured network interfaces.

As a result, boot time is very long, as cloud-init and networking both take 2 minutes to time out waiting for network interfaces that will never appear to be configured.

Tags: networking
James Page (james-page)
summary: - bridges/vlan configuration creates slow boot time
+ vlan configuration/unconfigured interfaces creates slow boot time
James Page (james-page) wrote :

Correction; auto is required on the VLAN interfaces.

James Page (james-page) wrote :

That said, the upstart if-up.d script still waits for an ifup.XXXX file that will never exist for a VLAN interface.

James Page (james-page) wrote :

/etc/network/if-up.d/upstart is the script that waits for all interfaces to be configured, and then emits the static-network-up event.
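
The check described above can be sketched roughly as follows. This is not the actual /etc/network/if-up.d/upstart script (which is a shell script and is not quoted in this thread); it is a minimal Python illustration, assuming the script only looks at `auto` lines and at per-interface `ifup.<name>` marker files. The function names, the sample configuration, and the temporary directory standing in for /run/network are all hypothetical.

```python
import os
import tempfile


def auto_interfaces(eni_text):
    """Return the interface names listed on 'auto' lines, in order."""
    names = []
    for line in eni_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comment lines
        words = stripped.split()
        if words[0] == "auto":
            names.extend(words[1:])
    return names


def pending_interfaces(eni_text, run_dir):
    """Return auto interfaces that have no ifup.<name> marker in run_dir."""
    return [
        name
        for name in auto_interfaces(eni_text)
        if not os.path.exists(os.path.join(run_dir, "ifup." + name))
    ]


# Hypothetical configuration, for illustration only (not the one attached
# to this bug).
SAMPLE_ENI = """\
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static

auto eth1.2667
iface eth1.2667 inet static

auto eth2
iface eth2 inet manual
"""

# A temporary directory stands in for /run/network.
with tempfile.TemporaryDirectory() as run_dir:
    for name in ("lo", "eth0"):
        open(os.path.join(run_dir, "ifup." + name), "w").close()
    # Interfaces still pending would keep static-network-up from being emitted.
    print(pending_interfaces(SAMPLE_ENI, run_dir))
```

Under these assumptions, any `auto` interface whose marker file is never written keeps the list non-empty, which matches the timeout behaviour reported in the bug description.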

Blake Rouse (blake-rouse) wrote :

Can you provide the created '/etc/network/interfaces' along with the output of "maas [session] node get-curtin-config [system-id]" and "maas [session] interfaces read [system-id]"?

Blake Rouse (blake-rouse) wrote :

This might also be related to curtin and cloud-init. Targeting them as well.

Changed in maas:
status: New → Incomplete
importance: Undecided → High
milestone: none → 1.9.2
tags: added: networking
Changed in cloud-init:
status: New → Incomplete
Changed in curtin:
status: New → Incomplete
James Page (james-page) wrote :
Changed in maas:
status: Incomplete → New
Changed in cloud-init:
status: Incomplete → New
Changed in curtin:
status: Incomplete → New
James Page (james-page) wrote :

Blake

requested information added to bug - setting back to New.

Blake Rouse (blake-rouse) wrote :

So the configuration that MAAS emits and curtin generates looks correct. Cloud-init just waits for the signal, so I actually think the issue is with ifupdown. I have targeted that package as well and will leave the others for now just to track.

Ryan Harper (raharper) wrote :

I can recreate this issue with multiple vlan ifaces over eth2 in a guest.

Ryan Harper (raharper) wrote :

I concur with the ifupdown issue.

I copied out /run/network from the VM that had very slow networking start (timeout on base vlan device, eth2, and both unconfigured "manual" ifaces).

drwxr-xr-x 3 rharper rharper 4096 Apr 5 13:17 ../
-rw-r--r-- 1 rharper rharper 88 Apr 5 13:13 dynamic-interfaces
-rw-r--r-- 1 rharper rharper 96 Apr 5 13:15 ifstate
-rw-r--r-- 1 rharper rharper 5 Apr 5 13:13 ifstate.eth0
-rw-r--r-- 1 rharper rharper 1 Apr 5 13:15 ifstate.eth1
-rw-r--r-- 1 rharper rharper 0 Apr 5 13:13 ifstate.eth2
-rw-r--r-- 1 rharper rharper 10 Apr 5 13:13 ifstate.eth2.2667
-rw-r--r-- 1 rharper rharper 10 Apr 5 13:13 ifstate.eth2.2668
-rw-r--r-- 1 rharper rharper 10 Apr 5 13:13 ifstate.eth2.2669
-rw-r--r-- 1 rharper rharper 10 Apr 5 13:13 ifstate.eth2.2670
-rw-r--r-- 1 rharper rharper 0 Apr 5 13:13 ifstate.eth3
-rw-r--r-- 1 rharper rharper 0 Apr 5 13:13 ifstate.eth4
-rw-r--r-- 1 rharper rharper 3 Apr 5 13:13 ifstate.lo
-rw-r--r-- 1 rharper rharper 0 Apr 5 13:13 .ifstate.lock
-rw-r--r-- 1 rharper rharper 0 Apr 5 13:13 ifup.eth0
-rw-r--r-- 1 rharper rharper 0 Apr 5 13:13 ifup.eth2.2667
-rw-r--r-- 1 rharper rharper 0 Apr 5 13:13 ifup.eth2.2668
-rw-r--r-- 1 rharper rharper 0 Apr 5 13:13 ifup.eth2.2669
-rw-r--r-- 1 rharper rharper 0 Apr 5 13:13 ifup.eth2.2670
-rw-r--r-- 1 rharper rharper 0 Apr 5 13:13 ifup.lo
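
To make the gaps in the listing above easier to see, here is a small Python sketch that compares the `ifstate.<name>` and `ifup.<name>` file names copied from it. It only compares names; whether each interface is actually marked `auto` in that VM's configuration is not shown in this thread, so the result is a hint rather than a diagnosis. The helper names are hypothetical.

```python
# File names copied from the /run/network listing above.
FILES = [
    "dynamic-interfaces", "ifstate",
    "ifstate.eth0", "ifstate.eth1", "ifstate.eth2",
    "ifstate.eth2.2667", "ifstate.eth2.2668",
    "ifstate.eth2.2669", "ifstate.eth2.2670",
    "ifstate.eth3", "ifstate.eth4", "ifstate.lo", ".ifstate.lock",
    "ifup.eth0",
    "ifup.eth2.2667", "ifup.eth2.2668", "ifup.eth2.2669", "ifup.eth2.2670",
    "ifup.lo",
]


def names_with_prefix(files, prefix):
    """Return the interface names of files that start with the given prefix."""
    return {f[len(prefix):] for f in files if f.startswith(prefix)}


def missing_ifup(files):
    """Return interfaces that have an ifstate.<name> file but no ifup.<name>."""
    tracked = names_with_prefix(files, "ifstate.")
    marked_up = names_with_prefix(files, "ifup.")
    return sorted(tracked - marked_up)


print(missing_ifup(FILES))
```

For this listing the sketch reports eth1, eth2, eth3 and eth4 as having no `ifup.<name>` file, while all four VLAN interfaces on eth2 do have one.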

Note, this same configuration (eni with vlans) passes under Xenial.

Ryan Harper (raharper) wrote :

It also passes on Vivid and Wily, so I am looking at the changes between ifupdown-0.7.47.2ubuntu4.4 (trusty-updates) and ifupdown-0.7.48.1ubuntu10 (vivid). These look particularly relevant:

+ifupdown (0.7.48.1ubuntu1) utopic; urgency=medium
+
+ [ Stéphane Graber ]
+ * Merge from Debian. Remaining changes:

+ - Allow setting the MTU and HWADDR on manual interfaces. (LP: #1294807)
+ - The above change also means that manual interfaces will now be
+ brought up and down (as many users expected until now).
+ - Disable link.defn as it's not covering all the cases supported by
+ the vlan and bridge hooks and so causes more harm than good at this
+ point. (LP: #1295304)

Ryan Harper (raharper) wrote :

Just as a test, using 0.7.48.1ubuntu10 from vivid works fine. Now to narrow down the changes.

Ryan Harper (raharper) wrote :

Unfortunately it's racy; I've had the exact config attached succeed and fail on trusty.

Ryan Harper (raharper) wrote :

And now, I can't get it to trigger at all. 300 runs last night without triggering the fault.

James, is it reproducible for you on physical hardware?

Ryan Harper (raharper) wrote :

I don't think there is a curtin issue here, but we may find a way to mitigate this if we can't establish whether there is a clear bug in ifupdown.

Changed in curtin:
importance: Undecided → Medium
status: New → Triaged
Changed in maas:
milestone: 1.9.2 → 1.9.3
Gavin Panella (allenap)
Changed in maas:
status: New → Triaged
Changed in maas:
milestone: 1.9.3 → 1.9.4
Changed in maas:
milestone: 1.9.4 → 1.9.5
Andres Rodriguez (andreserl) wrote :

We believe that this is no longer an issue in the latest releases of MAAS. If you believe this is still an issue, please re-open this bug report and target it accordingly.

Changed in maas:
status: Triaged → Won't Fix
Dan Watkins (oddbloke) wrote :

Similarly, please re-open the cloud-init task if you think anything is required from us.

Changed in cloud-init:
status: New → Incomplete
James Falcon (falcojr) wrote :
Changed in cloud-init:
status: Incomplete → Expired