MAAS sets manage_etc_hosts=True always and wipes out /etc/hosts on every reboot

Bug #2088179 reported by Nobuto Murata
This bug affects 3 people
Affects                          Status   Importance  Assigned to  Milestone
MAAS                             Triaged  High        Unassigned
OpenStack RabbitMQ Server Charm  New      Undecided   Unassigned

Bug Description

1:3.4.5-14373-g.ab434d5402-0ubuntu1~22.04.1

Consider a situation where admins, users, or automation external to MAAS (e.g. Juju/Charms) modify /etc/hosts after the initial provisioning by MAAS. Those changes are lost on the next reboot.

Looking at the details, the deployed hosts have the following:

[/etc/cloud/cloud.cfg.d/90_dpkg_local_cloud_config.cfg]
# written by cloud-init debian package per preseed entry
# cloud-init/local-cloud-config
manage_etc_hosts: true
manual_cache_clean: true
reporting:
  maas:
    consumer_key: Kmc8Wz45YnfQ2nT6EG
    endpoint: http://192.168.151.1:5248/MAAS/metadata/status/fhq6qa
    token_key: LDpwjgW5DVMPCdZRrH
    token_secret: 9neUuyUPdBwamfCw4JBLjvH86TAnQvDf
    type: webhook

The explicit manage_etc_hosts=true (not the cloud-init default) comes from the MAAS-generated curtin config.
https://cloudinit.readthedocs.io/en/latest/reference/yaml_examples/update_etc_hosts.html
> /etc/hosts will be re-written on every boot
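To confirm which file sets the key on a deployed machine, something like the following could be used. This is only a sketch: the function names are made up for illustration, and it deliberately uses a regex for a top-level key instead of a real YAML parser so that it needs nothing beyond the standard library.

```python
import re
from pathlib import Path
from typing import Dict, Optional

# Matches an unindented "manage_etc_hosts: <value>" line, optionally
# followed by a trailing comment. Indented occurrences are ignored.
_KEY_RE = re.compile(
    r"^manage_etc_hosts:[ \t]*(\S+)[ \t]*(?:#.*)?$", re.MULTILINE
)


def find_manage_etc_hosts(text: str) -> Optional[str]:
    """Return the raw value of a top-level manage_etc_hosts key, or None."""
    match = _KEY_RE.search(text)
    return match.group(1).strip("'\"") if match else None


def scan_cloud_cfg(cfg_dir: str = "/etc/cloud/cloud.cfg.d") -> Dict[str, str]:
    """Map each *.cfg file that sets manage_etc_hosts to the value it sets."""
    directory = Path(cfg_dir)
    found: Dict[str, str] = {}
    if not directory.is_dir():
        return found
    for path in sorted(directory.glob("*.cfg")):
        value = find_manage_etc_hosts(path.read_text(errors="replace"))
        if value is not None:
            found[path.name] = value
    return found
```

Calling scan_cloud_cfg() on a machine like the one above would be expected to list 90_dpkg_local_cloud_config.cfg. It does not look at /etc/cloud/cloud.cfg itself or at user-data, so it is not the full effective configuration.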

[/root/curtin-install-cfg.yaml]
debconf_selections:
  grub2: grub2 grub2/update_nvram boolean false
  maas: 'cloud-init cloud-init/datasources multiselect MAAS

    cloud-init cloud-init/maas-metadata-url string http://192.168.151.1:5248/MAAS/metadata/

    cloud-init cloud-init/maas-metadata-credentials string oauth_consumer_key=Kmc8Wz45YnfQ2nT6EG&oauth_token_key=LDpwjgW5DVMPCdZRrH&oauth_token_secret=9neUuyUPdBwamfCw4JBLjvH86TAnQvDf

    cloud-init cloud-init/local-cloud-config string manage_etc_hosts: true\nmanual_cache_clean:
    true\nreporting:\n maas:\n consumer_key: Kmc8Wz45YnfQ2nT6EG\n endpoint:
    http://192.168.151.1:5248/MAAS/metadata/status/fhq6qa\n token_key: LDpwjgW5DVMPCdZRrH\n token_secret:
    9neUuyUPdBwamfCw4JBLjvH86TAnQvDf\n type: webhook\n

    '

Tags: sts
Nobuto Murata (nobuto)
description: updated
Nobuto Murata (nobuto) wrote:

If I'm not mistaken, the explicit manage_etc_hosts is coming from here:
https://git.launchpad.net/maas/tree/src/maasserver/compose_preseed.py#n403
> def get_base_preseed(node=None):
>     """Return the base preseed config used by all ephemeral environments."""
>     cloud_config = {
>         # The ephemeral environment doesn't have a domain search path set which
>         # causes sudo to fail to resolve itself and print out a warning
>         # message. These messages are caught when logging during commissioning
>         # and testing. Allow /etc/hosts to be managed by cloud-init so the
>         # lookup works. This may cause LP:1087183 to come back if anyone tries
>         # to JuJu deploy in an ephemeral environment.
>         "manage_etc_hosts": True
>     }

It wasn't like that when manage_etc_hosts=True was first introduced: at that point it applied purely to the ephemeral environment, not to deployed machines.
https://git.launchpad.net/maas/commit/?id=976cd6ca5273da289c6b769d2870614a28de87e5
https://bugs.launchpad.net/maas/+bug/1670444

Then it looks like this commit dropped the condition that set manage_etc_hosts=False, without much explanation.
https://git.launchpad.net/maas/commit/?id=22641cffccea44ded1d20a72d84ba23856730d05
> -    config = {
> -        # Do not let cloud-init override /etc/hosts/: use the default
> -        # behavior which means running `dns_resolve(hostname)` on a node
> -        # will query the DNS server (and not return 127.0.0.1).
> -        # See bug 1087183 for details.
> -        "manage_etc_hosts": False,
manage_etc_hosts=False is obviously the least surprising for MAAS users, but if there is a good reason to stick with manage_etc_hosts=True, we should document that explicitly and guide users on how to deal with /etc/hosts properly, IMO.

Nobuto Murata (nobuto) wrote:

In case MAAS sticks to manage_etc_hosts=True, we need to "fix" the rabbitmq-server charm.

It has logic that manages /etc/hosts to map each peer's short hostname to an IP address, but it edits /etc/hosts directly, so those changes are lost on reboot, i.e. the behavior differs before and after the reboot.
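One way to see which entries are at risk is to compare the live /etc/hosts with the cloud-init template it gets regenerated from. A rough sketch follows; the function names and the own_names parameter are made up for illustration, and the template is treated as plain text, so any Jinja logic in it beyond the usual {{fqdn}} / {{hostname}} placeholders is not evaluated.

```python
from typing import Dict, Iterable


def parse_hosts(text: str) -> Dict[str, str]:
    """Return {name: address} for every 'address name [alias...]' line.

    Comments are stripped; the first address seen for a name wins.
    """
    mapping: Dict[str, str] = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2:
            for name in fields[1:]:
                mapping.setdefault(name, fields[0])
    return mapping


def names_missing_from_template(
    hosts_text: str, template_text: str, own_names: Iterable[str] = ()
) -> Dict[str, str]:
    """Return entries present in /etc/hosts but absent from the template.

    own_names are the machine's own hostname and FQDN, which the template
    covers through placeholders rather than literal names.
    """
    live = parse_hosts(hosts_text)
    templated = parse_hosts(template_text)
    skip = set(own_names)
    return {
        name: addr
        for name, addr in live.items()
        if name not in templated and name not in skip
    }
```

Feeding it the contents of /etc/hosts and /etc/cloud/templates/hosts.debian.tmpl, plus the machine's own names, should return the peer entries written by the charm, which are the ones expected to disappear after a reboot.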

Looks like the logic was for IPv6 initially, but we are writing IPv4 addresses to /etc/hosts too.
https://github.com/openstack/charm-rabbitmq-server/commit/c2341c5c54e463df672bea9b8d602459e8165e36
https://github.com/openstack/charm-rabbitmq-server/commit/8272f214ab2190c17cf8c84f7cd9f227dcabdf47
Losing those explicit IPv4 entries in /etc/hosts shouldn't be an issue when a proper search domain is set and functional, i.e. machine-1 -> machine-1.maas, for which the MAAS DNS server returns the expected IP address. But the explicit entries in /etc/hosts mask a non-functional resolution chain during the initial deployment, so the issue only becomes visible after the second boot.

The charm logic was written before MAAS changed its behavior to set manage_etc_hosts=True everywhere. If MAAS doesn't change the behavior back, the charm should update both /etc/hosts and /etc/cloud/templates/hosts.debian.tmpl so that the content of /etc/hosts stays identical across reboots.
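A minimal sketch of what "update both files" could look like is below. This is not the charm's actual code: ensure_hosts_entry and sync_entry are hypothetical names, and it assumes that a literal "address name" line appended to the template is rendered into /etc/hosts unchanged.

```python
from pathlib import Path
from typing import Iterable

HOSTS = Path("/etc/hosts")
TEMPLATE = Path("/etc/cloud/templates/hosts.debian.tmpl")


def ensure_hosts_entry(text: str, ip: str, hostname: str) -> str:
    """Return text containing exactly one mapping for hostname, to ip.

    Existing mappings for hostname are removed (other aliases on the same
    line are kept); comments and template placeholders are left untouched.
    """
    kept = []
    for line in text.splitlines():
        body, sep, comment = line.partition("#")
        fields = body.split()
        if len(fields) >= 2 and hostname in fields[1:]:
            others = [name for name in fields[1:] if name != hostname]
            if others:
                tail = f" {sep}{comment}" if sep else ""
                kept.append(" ".join([fields[0], *others]) + tail)
            continue  # the stale mapping for hostname is dropped
        kept.append(line)
    kept.append(f"{ip} {hostname}")
    return "\n".join(kept) + "\n"


def sync_entry(ip: str, hostname: str,
               paths: Iterable[Path] = (HOSTS, TEMPLATE)) -> None:
    """Apply the same entry to /etc/hosts and to the cloud-init template."""
    for path in paths:
        path.write_text(ensure_hosts_entry(path.read_text(), ip, hostname))
```

The update is idempotent, so running it from a hook more than once leaves both files unchanged after the first run. It needs root, and the write is not atomic; a real implementation would probably write to a temporary file and rename it.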

To be clear, a rabbitmq-server charm deployment with the MAAS provider but using LXD units is not affected: manage_etc_hosts=True is set for MAAS machines (whether KVM or physical), but Juju-created LXD units on top of MAAS-deployed machines do not have manage_etc_hosts=True. That is, rabbitmq-server in our OpenStack deployment is not affected (we are using LXD), but rabbitmq-server in a Landscape deployment can be affected.

Chuan Li (lccn)
tags: added: sts
Changed in maas:
status: New → Triaged
importance: Undecided → High
milestone: none → 3.6.x