Deploying from a multi-homed cluster controller with external DHCP/DNS fails by giving the machine the wrong IP for the cluster controller

Bug #1463555 reported by Christian Reis on 2015-06-09
This bug affects 2 people
Affects: MAAS
Importance: High
Assigned to: Unassigned

Bug Description

When a MAAS cluster controller is configured with multiple interfaces and none of them are set to manage DHCP and DNS, deploying a node can fail with something like this appearing in the install log:

--2015-06-09 19:15:13-- http://10.0.3.1/MAAS/static/images/ubuntu/amd64/generic/trusty/daily/root-tgz
Connecting to 10.0.3.1:80... failed: No route to host.
[...]

This specific cluster controller has interfaces like this:

   https://www.dropbox.com/s/8re057l9pgw8guj/Screenshot%202015-06-09%2012.38.40.png?dl=0

The node in question was given an IP address on the 10.16.0.0 network, so the cluster controller interface that should have been returned was eth0. Instead, the code in pick_cluster_controller_address() guessed lxcbr0, which was not reachable from the 10.16.0.0 network. In fact, that interface existed only in MAAS: the underlying lxcbr0 interface had been added in the past and had since been removed at the OS level. Keep in mind that in this situation the cluster controller isn't the router. See http://bazaar.launchpad.net/~maas-committers/maas/trunk/view/head:/src/maasserver/preseed.py#L588 for the code in question.

The first solution to this problem is probably to find the right network for the IP address allocated to the node; the code scans for the MAC address within DHCPLease, but that won't work if DHCP is managed externally. Instead it should be looking at the IP address handed to the node and matching it with the cluster controller interface.
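
To illustrate the matching suggested above, here is a minimal sketch using only Python's standard ipaddress module. It is not the MAAS implementation: the function name pick_address_for_node_ip and the (name, "address/prefix") input shape are hypothetical, and the prefix lengths in the usage note are assumptions, since the report does not state the netmasks.

```python
import ipaddress
from typing import Iterable, Optional, Tuple


def pick_address_for_node_ip(
    node_ip: str,
    interfaces: Iterable[Tuple[str, str]],
) -> Optional[Tuple[str, str]]:
    """Return (interface name, interface IP) for the interface whose
    subnet contains node_ip, or None if no interface matches.

    `interfaces` is an iterable of (name, "address/prefix") pairs.
    This is a hypothetical input shape, not MAAS's own data model.
    """
    node = ipaddress.ip_address(node_ip)
    best = None
    for name, cidr in interfaces:
        iface = ipaddress.ip_interface(cidr)
        # Never compare an IPv4 address against an IPv6 network or vice versa.
        if iface.version != node.version:
            continue
        if node in iface.network:
            # If several subnets contain the node, prefer the most specific one.
            if best is None or iface.network.prefixlen > best[1].network.prefixlen:
                best = (name, iface)
    if best is None:
        return None
    return best[0], str(best[1].ip)
```

With the assumed interfaces [("lxcbr0", "10.0.3.1/24"), ("eth0", "10.16.0.1/16")], a node at 10.16.4.20 matches eth0 rather than lxcbr0, and a node on a network that no interface covers yields None instead of a guess.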

If that proves impossible to make work reliably, then perhaps there should be a way to specify exactly which cluster interface a node boots from (which would override what pick_cluster_controller_address() does).
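
A sketch of what such an override could look like follows. Everything here is hypothetical: the name choose_cluster_address, the boot_interface parameter, and the interfaces mapping are made up for illustration, and the guess callable merely stands in for whatever pick_cluster_controller_address() currently does.

```python
from typing import Callable, Mapping, Optional


def choose_cluster_address(
    interfaces: Mapping[str, str],
    guess: Callable[[], str],
    boot_interface: Optional[str] = None,
) -> str:
    """Return the cluster controller address a node should use.

    `interfaces` maps interface name to IP address (hypothetical shape).
    `guess` stands in for the existing automatic selection.
    `boot_interface`, when set, overrides the automatic selection.
    """
    if boot_interface is None:
        # No explicit choice was made: fall back to the automatic guess.
        return guess()
    try:
        return interfaces[boot_interface]
    except KeyError:
        # Fail loudly instead of silently guessing when the override is stale.
        raise ValueError(
            f"unknown cluster interface: {boot_interface!r}"
        ) from None
```

Raising on an unknown interface name is a deliberate choice in this sketch: a stale override, such as one naming an interface that no longer exists at the OS level, surfaces as an error instead of producing an unreachable address.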

Changed in maas:
status: New → Triaged
importance: Undecided → High
Christian Reis (kiko) on 2015-06-09
description: updated
Blake Rouse (blake-rouse) wrote :

The cluster controller knows the local IP address on which the node's request arrives; that local IP address needs to be passed to the region so it can generate the preseed.

Changed in maas:
milestone: none → 1.8.1
milestone: 1.8.1 → 1.9.0
Andres Rodriguez (andreserl) wrote :

We believe that this is no longer an issue in the latest releases of MAAS. If you believe this is still an issue, please re-open this bug report and target it accordingly.

Changed in maas:
status: Triaged → Won't Fix