Activity log for bug #1463555

Date Who What changed Old value New value Message
2015-06-09 20:32:32 Christian Reis bug added bug
2015-06-09 20:34:10 Blake Rouse maas: status New Triaged
2015-06-09 20:34:14 Blake Rouse maas: importance Undecided High
2015-06-09 20:34:16 Christian Reis description

Old value:
When a MAAS cluster controller is configured with multiple interfaces and none of them are set to manage DHCP and DNS, deploying a node can fail with something like this appearing in the install log:

--2015-06-09 19:15:13-- http://10.0.3.1/MAAS/static/images/ubuntu/amd64/generic/trusty/daily/root-tgz
Connecting to 10.0.3.1:80... failed: No route to host.
[...]

This specific cluster controller has interfaces like this:
https://www.dropbox.com/s/8re057l9pgw8guj/Screenshot%202015-06-09%2012.38.40.png?dl=0

The node in question was given an IP address on the 10.16.0.0 network, so the correct cluster controller interface that should have been returned was eth0. Instead the code in pick_cluster_controller_address() guessed lxcbr0, which was not reachable from the 10.16.0.0 network -- keep in mind in this situation the cluster controller isn't the router. See http://bazaar.launchpad.net/~maas-committers/maas/trunk/view/head:/src/maasserver/preseed.py#L588 for the code in question.

The first solution to this problem is probably to find the right network for the IP address allocated to the node; the code scans for the MAC address within DHCPLease, but that won't work if DHCP is managed externally. Instead it should be looking at the IP address handed to the node and matching it with the cluster controller interface. If that proves impossible to make work reliably, then perhaps there should be a way to specify exactly which cluster interface a node boots from (which would override what pick_cluster_controller_address() does).

New value:
When a MAAS cluster controller is configured with multiple interfaces and none of them are set to manage DHCP and DNS, deploying a node can fail with something like this appearing in the install log:

--2015-06-09 19:15:13-- http://10.0.3.1/MAAS/static/images/ubuntu/amd64/generic/trusty/daily/root-tgz
Connecting to 10.0.3.1:80... failed: No route to host.
[...]

This specific cluster controller has interfaces like this:
https://www.dropbox.com/s/8re057l9pgw8guj/Screenshot%202015-06-09%2012.38.40.png?dl=0

The node in question was given an IP address on the 10.16.0.0 network, so the correct cluster controller interface that should have been returned was eth0. Instead the code in pick_cluster_controller_address() guessed lxcbr0, which was not reachable from the 10.16.0.0 network, and in fact was only present in MAAS -- the underlying lxcbr0 interface had been added in the past and was gone at the OS level. Keep in mind in this situation the cluster controller isn't the router. See http://bazaar.launchpad.net/~maas-committers/maas/trunk/view/head:/src/maasserver/preseed.py#L588 for the code in question.

The first solution to this problem is probably to find the right network for the IP address allocated to the node; the code scans for the MAC address within DHCPLease, but that won't work if DHCP is managed externally. Instead it should be looking at the IP address handed to the node and matching it with the cluster controller interface. If that proves impossible to make work reliably, then perhaps there should be a way to specify exactly which cluster interface a node boots from (which would override what pick_cluster_controller_address() does).
2015-06-09 20:34:24 Blake Rouse maas: milestone 1.8.1
2015-06-09 20:34:37 Blake Rouse maas: milestone 1.8.1 1.9.0
2018-03-16 17:14:51 Andres Rodriguez maas: status Triaged Won't Fix
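
The fix proposed in the bug description (match the IP address handed to the node against the cluster controller's interfaces, rather than looking up the MAC address in DHCPLease) could look roughly like the sketch below. This is not MAAS code: ClusterInterface, pick_interface_for_ip(), the eth0 address, the node address, and both prefix lengths are assumptions made for illustration. Only the interface names, the 10.16.0.0 network, and the 10.0.3.1 address come from the report.

```python
import ipaddress
from typing import NamedTuple, Optional, Sequence


class ClusterInterface(NamedTuple):
    """Hypothetical stand-in for a cluster controller interface record."""
    name: str
    cidr: str  # interface address with prefix length, e.g. "10.16.0.1/16"


def pick_interface_for_ip(
    node_ip: str, interfaces: Sequence[ClusterInterface]
) -> Optional[ClusterInterface]:
    """Return the interface whose network contains node_ip, or None.

    If several networks contain the address, the most specific one
    (longest prefix) wins.
    """
    addr = ipaddress.ip_address(node_ip)
    best: Optional[ClusterInterface] = None
    best_prefix = -1
    for iface in interfaces:
        net = ipaddress.ip_interface(iface.cidr).network
        # Skip interfaces of the other IP version before testing membership.
        if net.version != addr.version:
            continue
        if addr in net and net.prefixlen > best_prefix:
            best, best_prefix = iface, net.prefixlen
    return best


# Assumed example data: prefix lengths and the eth0 address are guesses.
INTERFACES = [
    ClusterInterface("lxcbr0", "10.0.3.1/24"),
    ClusterInterface("eth0", "10.16.0.1/16"),
]

if __name__ == "__main__":
    chosen = pick_interface_for_ip("10.16.0.50", INTERFACES)
    print(chosen.name if chosen else "no matching interface")
```

Returning None when no network contains the address leaves room for the fallback the report also suggests, an explicit per-node choice of cluster interface, instead of guessing an unreachable one.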