MAAS should allow active DHCP probing to be disabled
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MAAS | Invalid | Medium | Unassigned |
maas (Ubuntu) | Confirmed | Undecided | Unassigned |
Bug Description
Release: Xenial/16.04
MaaS version: 2.0.0+bzr5189-
Local setup:
Internet -> 192.168.2.1/24 (GW) -> 192.168.2.19/24 (MaaS em0)
In the end, all interfaces on the MaaS host were static, but before that the untagged interface had used DHCP. For that reason, the DHCP server on the GW host had a host section that assigned a fixed IP address to it.
With MaaS 2.0, the rack service issues DHCP discover requests on all detected interfaces every 10 minutes. The GW host tries to reply with an address, but doing so disrupts connectivity between GW and MaaS. SSH connections to MaaS-managed nodes in the 192.168.123.0/24 subnet even time out (that could be because the DHCP server for that subnet, which is managed by MaaS, is configured with a fixed IP reply in the same way as the one on the GW host).
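For reference, here is a minimal sketch of what a DHCP discover probe payload looks like on the wire, following the message layout in RFC 2131. This is not MAAS's actual probing code; the function name `build_dhcp_discover`, the example MAC address, and the transaction ID are assumptions made for illustration only.

```python
import struct

# Magic cookie that precedes the options field (RFC 2131, section 3).
DHCP_MAGIC_COOKIE = b"\x63\x82\x53\x63"


def build_dhcp_discover(mac: bytes, xid: int) -> bytes:
    """Build a minimal DHCPDISCOVER payload (hypothetical helper, not MAAS code)."""
    if len(mac) != 6:
        raise ValueError("expected a 6-byte Ethernet MAC address")
    zero_ip = b"\x00" * 4
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1,        # op: BOOTREQUEST
        1,        # htype: Ethernet
        6,        # hlen: length of the MAC address
        0,        # hops
        xid,      # transaction ID chosen by the client
        0,        # secs
        0x8000,   # flags: ask the server to broadcast its reply
        zero_ip,  # ciaddr
        zero_ip,  # yiaddr
        zero_ip,  # siaddr
        zero_ip,  # giaddr
        mac,      # chaddr (struct pads it to 16 bytes with zeros)
        b"",      # sname (64 zero bytes)
        b"",      # file (128 zero bytes)
    )
    # Option 53 (DHCP message type), length 1, value 1 = DHCPDISCOVER,
    # followed by the end option (255).
    options = b"\x35\x01\x01" + b"\xff"
    return header + DHCP_MAGIC_COOKIE + options


# Example values are placeholders.
packet = build_dhcp_discover(bytes.fromhex("001122334455"), 0x12345678)
```

A server with a matching host section answers such a request with an offer for the fixed address, which is the reply the GW host attempts in the setup described above. Actually sending the payload would additionally require a broadcast UDP socket on the client port (68) and usually root privileges, which this sketch leaves out.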
For now, I was able to work around this by adding "ignore bootp; ignore booting;" statements to the DHCP host section on the GW host. That way I lose the protection against misconfiguration, but at least I get a stable connection between GW and MaaS.
It's strange that connectivity is interrupted when the DHCP discovery is attempted. It's not clear to me why connectivity should be interrupted at all. (For example, if an ARP entry is being cleared out, it should be added back relatively quickly.) Right now I don't know what could be changed in MAAS to make this better; I think to make progress on this issue, we need to determine exactly why connectivity is disrupted in this case.
It would be helpful to get a packet capture during the probe (and subsequent connectivity loss) to get a better idea about what's going on.
Note that the behavior changed drastically in MAAS 2.1, so it might be different if you re-test.