neutron-based instances should not use the nova-network 'dhcp_domain' option

Bug #1698010 reported by Ben Nemec on 2017-06-14
Affects: OpenStack Compute (nova)
Importance: High
Assigned to: Stephen Finucane

Bug Description

There seems to be an issue with how domains get assigned when booting instances. My understanding is that, with neutron, neutron's dns_domain option should determine the domain name of the instances. However, when creating instances with the following configuration:

(undercloud) [centos@undercloud-test ~]$ sudo grep dns_domain /etc/neutron/neutron.conf
#dns_domain = openstacklocal
dns_domain=nemebean.com
(undercloud) [centos@undercloud-test ~]$ sudo grep dhcp_domain /etc/nova/nova.conf
#dhcp_domain=novalocal
dhcp_domain=

I get the following in the instance:

[heat-admin@overcloud-controller-0 ~]$ sudo hostnamectl
   Static hostname: overcloud-controller-0.localdomain

It looks like this is being done by cloud-init:

Jun 14 21:07:34 host-9-1-1-12 cloud-init[1405]: [CLOUDINIT] cc_set_hostname.py[DEBUG]: Setting the hostname to overcloud-controller-0.localdomain (overcloud-controller-0)
Jun 14 21:07:34 host-9-1-1-12 cloud-init[1405]: [CLOUDINIT] util.py[DEBUG]: Running command ['hostnamectl', 'set-hostname', 'overcloud-controller-0.localdomain'] with allowed return codes [0] (shell=False, capture=True)

So cloud-init is likely getting the host and domain name from Nova metadata, even though Neutron is being used to manage networking.
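
One way to check what cloud-init is being handed is to look at the "hostname" key in the instance's meta_data.json, which is served by the metadata service and written to the config drive. The following is a minimal Python sketch, assuming the JSON text has already been fetched; the helper name split_metadata_hostname and the sample document are made up for illustration.

```python
import json


def split_metadata_hostname(meta_data_json):
    """Return (short_name, domain) from a meta_data.json document.

    domain is '' when the hostname carries no domain part.
    """
    hostname = json.loads(meta_data_json)["hostname"]
    short_name, _, domain = hostname.partition(".")
    return short_name, domain


# Hypothetical sample; a real document has many more keys.
sample = '{"hostname": "overcloud-controller-0.nemebean.com"}'
print(split_metadata_hostname(sample))
```

If the domain part returned here matches nova's dhcp_domain rather than neutron's dns_domain, the name is coming from Nova metadata.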

If I also set dhcp_domain as follows:

(undercloud) [centos@undercloud-test ~]$ sudo grep dhcp_domain /etc/nova/nova.conf
#dhcp_domain=novalocal
dhcp_domain=nemebean.com

Then I get the expected results:

[heat-admin@overcloud-controller-0 ~]$ sudo hostnamectl
   Static hostname: overcloud-controller-0.nemebean.com

These are TripleO overcloud instances deployed via Ironic. I'm using recent RDO packages:

$ sudo rpm -qa | grep nova
openstack-nova-conductor-16.0.0-0.20170521033533.99bd334.el7.centos.noarch
python-nova-16.0.0-0.20170521033533.99bd334.el7.centos.noarch
puppet-nova-11.1.0-0.20170605232112.27baec7.el7.centos.noarch
openstack-nova-common-16.0.0-0.20170521033533.99bd334.el7.centos.noarch
python2-novaclient-8.0.0-0.20170517113627.e1b9e76.el7.centos.noarch
openstack-nova-placement-api-16.0.0-0.20170521033533.99bd334.el7.centos.noarch
openstack-nova-api-16.0.0-0.20170521033533.99bd334.el7.centos.noarch
openstack-nova-scheduler-16.0.0-0.20170521033533.99bd334.el7.centos.noarch
openstack-nova-compute-16.0.0-0.20170521033533.99bd334.el7.centos.noarch

99bd334 is the short SHA of the commit the packages were built against.

$ sudo rpm -qa | grep neutron
python-neutron-11.0.0-0.20170521040619.3f2e22a.el7.centos.noarch
openstack-neutron-ml2-11.0.0-0.20170521040619.3f2e22a.el7.centos.noarch
python2-neutronclient-6.2.0-0.20170418195232.06d3dfd.el7.centos.noarch
openstack-neutron-11.0.0-0.20170521040619.3f2e22a.el7.centos.noarch
openstack-neutron-openvswitch-11.0.0-0.20170521040619.3f2e22a.el7.centos.noarch
puppet-neutron-11.1.0-0.20170601210926.888c480.el7.centos.noarch
python-neutron-lib-1.6.0-0.20170503061451.449f079.el7.centos.noarch
openstack-neutron-common-11.0.0-0.20170521040619.3f2e22a.el7.centos.noarch

This is not ideal in any case, but it's particularly concerning because, according to the option documentation, dhcp_domain is deprecated.

Changed in nova:
assignee: nobody → Stephen Finucane (stephenfinucane)
Sean Dague (sdague) wrote :

There are no currently open reviews on this bug, changing the status back to the previous state and unassigning. If there are active reviews related to this bug, please include links in comments.

Changed in nova:
assignee: Stephen Finucane (stephenfinucane) → nobody
Sean Dague (sdague) on 2017-06-28
Changed in nova:
status: New → Confirmed
importance: Undecided → High
summary: - dhcp-domain is deprecated, but required for correct FQDN behavior
+ neutron-based instances should not use the nova-network 'dhcp_domain'
+ option

Stephen Finucane (stephenfinucane) wrote :

Thanks for the detailed report. It looks like we're populating the config drive with the value of nova-network's 'dhcp_domain' option, rather than using neutron's 'dns_domain' as one would expect. I've updated the title of the bug report accordingly.

I'm going to take a look at this, but in the interim it's the config drive and/or metadata service that we should be looking at. I would guess this chunk of code [1] is at least somewhat related.

[1] https://github.com/openstack/nova/blob/0b9bacb/nova/api/metadata/base.py#L546-L549
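
For readers who don't want to follow the link, the behaviour described in this report is consistent with the metadata hostname being built by appending the configured 'dhcp_domain' to the instance's short name. The following is a paraphrased Python sketch of that logic, not a copy of nova's code; build_metadata_hostname and its parameters are illustrative names.

```python
def build_metadata_hostname(instance_hostname, dhcp_domain):
    """Append dhcp_domain to the short hostname when one is configured."""
    if dhcp_domain:
        return "%s.%s" % (instance_hostname, dhcp_domain)
    return instance_hostname


# Values taken from the bug description above.
print(build_metadata_hostname("overcloud-controller-0", "nemebean.com"))
print(build_metadata_hostname("overcloud-controller-0", ""))
```

Note that neutron's 'dns_domain' never enters this calculation, which is the mismatch the report describes.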

Matt Riedemann (mriedem) wrote :

sfinucan is correct that the issue lies in this code:

https://github.com/openstack/nova/blob/0b9bacb/nova/api/metadata/base.py#L546-L549

That goes into the config drive.

This is a duplicate of another bug that came up about this same thing recently but I'm failing to find it.

Fix proposed to branch: master
Review: https://review.openstack.org/480616

Changed in nova:
assignee: nobody → Stephen Finucane (stephenfinucane)
status: Confirmed → In Progress

Fix proposed to branch: master
Review: https://review.openstack.org/480676

You can see a couple of patches open above. One item that came up in discussion [1] is whether we should actually be using fully qualified domain names for hostnames. I'm not sure myself tbh and would like more input on this. Keep an eye on openstack-dev.

[1] https://review.openstack.org/#/c/480616/

Reviewed: https://review.openstack.org/500010
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=cc8da506cd5a1450224e763b7e71242adff86870
Submitter: Jenkins
Branch: master

commit cc8da506cd5a1450224e763b7e71242adff86870
Author: Stephen Finucane <email address hidden>
Date: Fri Sep 1 10:19:33 2017 +0100

    Add '_has_qos_queue_extension' function

    This is how we've standardized on checking for extensions. Use it for
    QoS too.

    Change-Id: I48c3e41df6133e04be3e25905ff4e168a44534c7
    Related-Bug: #1698010

Matt Riedemann (mriedem) wrote :

Was there ever a mailing list discussion about this? I can't seem to find anything.

Change abandoned by Stephen Finucane (<email address hidden>) on branch: master
Review: https://review.openstack.org/480676

Changed in nova:
assignee: Stephen Finucane (stephenfinucane) → Matt Riedemann (mriedem)
Matt Riedemann (mriedem) on 2019-04-15
Changed in nova:
assignee: Matt Riedemann (mriedem) → Stephen Finucane (stephenfinucane)

Reviewed: https://review.openstack.org/480616
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=886b0a5d748ae1deda3a039734f831d7c0cf0476
Submitter: Zuul
Branch: master

commit 886b0a5d748ae1deda3a039734f831d7c0cf0476
Author: Stephen Finucane <email address hidden>
Date: Wed Jul 5 15:29:17 2017 +0100

    conf: Undeprecate and move the 'dhcp_domain' option

    The metadata service makes use of the deprecated '[DEFAULT] dhcp_domain'
    option when providing a hostname to the instance. This is used by
    cloud-init to configure the hostname in the instance. This use was not
    captured when the option was initially deprecated. This option is now
    undeprecated and moved to the '[api]' group to ensure it won't be
    removed alongside the other nova-network options.

    Change-Id: I3940ebd1888d8019716e7d4eb6d4a413a37b9b78
    Closes-Bug: #1698010
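
After this change the option lives in the '[api]' group of nova.conf. The following Python sketch shows a lookup against that group. It uses the standard library's configparser purely for illustration (nova itself uses oslo.config); read_dhcp_domain and the sample file are hypothetical, and the 'novalocal' default is taken from the commented-out default shown in the bug description.

```python
import configparser

# Hypothetical nova.conf fragment with the option in its new group.
NOVA_CONF = """
[DEFAULT]
debug = false

[api]
dhcp_domain = nemebean.com
"""


def read_dhcp_domain(conf_text, default="novalocal"):
    """Look up dhcp_domain in the [api] group of a nova.conf-style file."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # fallback covers both a missing [api] section and a missing option.
    return parser.get("api", "dhcp_domain", fallback=default)


print(read_dhcp_domain(NOVA_CONF))
```

How a value left in the old '[DEFAULT]' location is treated is not covered by this sketch; check the option's documentation for the release in use.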

Changed in nova:
status: In Progress → Fix Released