quantum DNS name does not match VM hostname

Bug #1175211 reported by Jack McCann
This bug affects 35 people
Affects: neutron
Status: Fix Released
Importance: Low
Assigned to: Miguel Lavalle

Bug Description

Using the Networking (Quantum) DHCP agent, cloud-init and the Nova metadata service, we're seeing a situation where the VM hostname does not match the DNS name. This causes things like 'sudo' to complain and 'hostname -f' to fail on the VM, and represents a regression from existing nova-network behavior.

An example of the problem and a potential solution are outlined below for comment.

For example:

$ nova list
+--------------------------------------+-------+--------+-------------------+
| ID                                   | Name  | Status | Networks          |
+--------------------------------------+-------+--------+-------------------+
| 92a4075e-31a4-45e1-909d-f71ee4c55de4 | jvm44 | ACTIVE | jnet41=10.10.20.5 |
+--------------------------------------+-------+--------+-------------------+

On the VM:

ubuntu@jvm44:~$ hostname
jvm44

ubuntu@jvm44:~$ hostname -f
hostname: Temporary failure in name resolution

ubuntu@jvm44:~$ sudo bash
sudo: unable to resolve host jvm44
root@jvm44:~#

Changing the hostname to match DNS name fixes the problem:

ubuntu@10-10-20-5:~$ hostname
10-10-20-5

ubuntu@10-10-20-5:~$ hostname -f
10-10-20-5.openstacklocal

ubuntu@10-10-20-5:~$ sudo bash
root@10-10-20-5:~#

The VM's name in DNS is 10-10-20-5.openstacklocal, as seen here:

root@jvm44:~# nslookup 10.10.20.5
Server: 10.10.20.2
Address: 10.10.20.2#53
5.20.10.10.in-addr.arpa name = 10-10-20-5.openstacklocal.

While DNSaaS may be the eventual answer to this problem, there may be a simpler fix for the case where DNSaaS is not available.

A potential, minimally invasive fix for this might be:

1) modify quantum/agent/linux/dhcp.py _output_hosts_file to set the DNS name from the port name, if the port name is set and is a valid DNS name; otherwise set it from the fixed IP as is currently done (see the sketch after this list)

2) modify nova to set the port name to match the VM name
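
For illustration only, a minimal sketch of the hostname selection in step 1. This is a standalone helper, not the actual _output_hosts_file code; the 63-character limit matches what dnsmasq accepts for a label:

import re

# Hypothetical helper illustrating step 1: prefer the port name as the DHCP
# hostname when it is a valid DNS label, otherwise fall back to the
# dash-separated fixed IP used today (e.g. 10.10.20.5 -> 10-10-20-5).
_DNS_LABEL_RE = re.compile(r'^[a-zA-Z0-9]([a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?$')


def choose_dhcp_hostname(port_name, fixed_ip):
    """Return the hostname to write into dnsmasq's hosts file."""
    if port_name and _DNS_LABEL_RE.match(port_name):
        return port_name
    return fixed_ip.replace('.', '-')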

tags: added: l3-ipam-dhcp
description: updated
Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :

Hi Jack, have you seen this behavior with other linux distros?

Changed in nova:
assignee: nobody → Harshavardhan Reddy M (hvreddy1110)
assignee: Harshavardhan Reddy M (hvreddy1110) → nobody
assignee: nobody → Harshavardhan Reddy M (hvreddy1110)
Changed in nova:
assignee: Harshavardhan Reddy M (hvreddy1110) → nobody
Changed in nova:
assignee: nobody → jagan kumar kotipatruni (jagankumar-k)
Revision history for this message
Nick Wilson (nickwilson) wrote :

This issue impacts all distributions, although in slightly different ways.

Those which acquire their hostname from the metadata server (e.g. Ubuntu) end up with a mismatch and throw FQDN warnings (for instance, when using sudo).

Those which get it from DHCP (e.g. Debian wheezy) can resolve their FQDN, but the hostname is inconsistent with Nova's and is in the n-n-n-n format. Additionally, this is what's stored in dnsmasq's hosts file, which makes out-of-the-box DNS less usable than it is with nova-network.

Nick Wilson (nickwilson)
Changed in quantum:
status: New → Confirmed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/36133

Changed in nova:
assignee: jagan kumar kotipatruni (jagankumar-k) → Zack Feldstein (zack-feldstein)
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/36138

Revision history for this message
Zack Feldstein (zack-feldstein) wrote :

Can't submit changes due to grenade bug:

https://bugs.launchpad.net/grenade/+bug/1199250

Revision history for this message
Zack Feldstein (zack-feldstein) wrote :

Submitted Fix in nova, awaiting review:

https://review.openstack.org/36138

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/37098

Changed in neutron:
assignee: nobody → Zack Feldstein (zack-feldstein)
status: Confirmed → In Progress
Changed in nova:
status: In Progress → Fix Committed
Changed in neutron:
status: In Progress → Fix Committed
Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :

I see all the changes mentioned in this bug report still in progress. Marking this as committed may be confusing in the release process. Am I missing something?

Changed in nova:
status: Fix Committed → In Progress
Changed in neutron:
status: Fix Committed → In Progress
Revision history for this message
lalithsuresh (suresh-lalith) wrote :

Is there any known workaround for this bug? It's a showstopper for me.

Revision history for this message
Max Köhler (yatc18ks0g8zofezrpk3xa7828d3ooa6g-me) wrote :

Any updates about the status/patch...?
It's currently also a showstopper for me.

Revision history for this message
Nick Wilson (nickwilson) wrote :

+1 to this being a showstopper. If Zack isn't going to resume work on it in the next month or two, I'll probably pick up where he left off.

Revision history for this message
Dave Johnston (dave-johnston) wrote :

Is there any progress/update on this?

Revision history for this message
Simon Pasquier (simon-pasquier) wrote :

When fixing this bug, please consider bug #1240910.

dnsmasq will treat as invalid any host entry whose hostname is longer than 63 characters, and will therefore allocate no IP address to the instance.

Revision history for this message
sodre (psodre) wrote :

+1 to being a showstopper.

Revision history for this message
Jack McCann (jack-mccann) wrote :

I'm wondering if we can restore some forward momentum to this. This is a nova parity issue, and a pain point for users.

Looking at Armando's comment on https://review.openstack.org/#/c/37098/5//COMMIT_MSG, he says "there should be a way to preserve the current behavior without using the port name as deciding factor". I think that's valid, and we could address it with a config option e.g. use_port_name_for_host_name=True/False.

Another option I had considered is adding a hostname attribute to the port object. I was trying to avoid this path as it brings in all the overhead of a new API extension.

A third option might be the extra_dhcp_opts extension. We could explicitly specify the hostname using this API, e.g. {"opt_value": "myvmname", "opt_name": "hostname"}. I think we'd need to tweak _output_hosts_file a bit to use this, but at least it is an existing API mechanism that could be used.
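
For illustration only (this is not the proposed patch), option three could be driven from a client roughly as follows, assuming the extra_dhcp_opts extension is enabled and the agent were taught to honour a "hostname" opt_name; credentials come from the usual OS_* environment variables:

import os

from neutronclient.v2_0 import client

# Hedged sketch: "hostname" is a hypothetical opt_name that _output_hosts_file
# would need to be taught to honour; extra_dhcp_opts itself is an existing API.
neutron = client.Client(
    username=os.environ['OS_USERNAME'],
    password=os.environ['OS_PASSWORD'],
    tenant_name=os.environ['OS_TENANT_NAME'],
    auth_url=os.environ['OS_AUTH_URL'],
)

port_id = 'PORT_UUID'  # placeholder: UUID of the instance's Neutron port
neutron.update_port(port_id, {
    'port': {
        'extra_dhcp_opts': [
            {'opt_name': 'hostname', 'opt_value': 'myvmname'},
        ],
    },
})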

Thoughts?

Revision history for this message
Jack McCann (jack-mccann) wrote :

On further reflection, a global config option such as use_port_name_for_host_name is not desirable as it does not preserve behavior for existing deployments, e.g. if I've set my port name to something like "port5" while having come to depend on a DNS hostname such as 10-10-20-5.openstacklocal. (This is probably what Armando was driving at in his code review comment.)

Revision history for this message
Carl Baldwin (carl-baldwin) wrote :

Jack, that problem could be solved by including multiple DNS names that map to the same internal IP. Only one name will be sent as the hostname but both names can be resolved in DNS. Essentially, the 10-10-20-5.openstacklocal name will always map to the IP regardless.

Could the global default for use_port_name_for_host_name be overridden on a network-by-network basis?

What happens if the Nova display name doesn't qualify as an allowable DNS name because of some mismatch in constraints? Should we define a mapping that will map any valid Nova display name to a valid DNS name?
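
Purely as an illustration of the kind of mapping that question is asking about (not something taken from the reviews above), assuming disallowed characters collapse to dashes and labels are capped at dnsmasq's 63-character limit:

import re

def display_name_to_dns_name(display_name):
    """Illustrative mapping from a Nova display name to a DNS label."""
    # lower-case, collapse runs of disallowed characters into single dashes
    label = re.sub(r'[^a-z0-9-]+', '-', display_name.lower())
    # strip leading/trailing dashes and honour the 63-character label limit
    return label.strip('-')[:63].rstrip('-')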

Second, what is this about duplicates? Is the Nova display name not constrained to be unique? If someone decides to turn on use_port_name_for_host_name and has duplicate display names on the same network, what should be done?

In my opinion, we need to resolve these questions. Some of you may not like the idea of using nova display names as DNS names but end users want it. They don't want to have to name their instances twice and they want to be able to get to their instances by name using dns.

This should be designed in a blueprint.

Revision history for this message
TorstenSchlabach (tschlabach) wrote :

Coming back to Jack's comment #16 ("can we restore some forward momentum to this?"), I indeed wonder if there is a small, short-term solution possible which suits at least most people. I am actually not sure that just waiting for the availability of Moniker in the next release is the proper answer, especially as I have so far failed to understand whether it won't leave us with the same issue, just on a different technical platform.

Basically, I understand that the discussion is all about how to find a name that can be assigned to a node. One solution discussed here but not actually accepted, for reasons of backward compatibility IIUC, was to use the port name. IMHO this would indeed only be the second-best option anyway, as it would only buy us something if one sets up the port name properly *before* launching an instance.

Basically, the subject of DNS names and IP addresses comes down to two possible scenarios:

1. Either I do advance planning, i.e. I know I will be launching five instances which are supposed to work together (for example two load balancers, two web servers and one admin instance). So I could make up names for them in advance (lb-1, lb-2, web-1, web-2, admin) and assign an IP address to each of them in the DHCP server, so that once an instance launches it gets assigned the proper name and IP address and all is fine. Except, how would I know the MAC addresses that the new instances will be using in order to configure them in the DHCP server prior to having launched the instances? Sounds a bit like a chicken-and-egg problem to me. (I understand MAC addresses are generated with some randomness during the instance launch, right?)

2. In the other scenario I just hit the "Launch me an instance" button in Horizon five times in a row and type in my names, i.e. again lb-1, lb-2, you get it. So each instance will receive an arbitrary IP address from the DHCP pool, and there is no way for the five instances to find each other on the network unless I manually set up /etc/hosts entries for all of them in each of them, is there? If they were able to refer to each other by their given names, I could have a prepared configuration, for example for the load balancers, in which I write that the two backends are web-1 and web-2, without ever having to care about the IP addresses used. I guess this is what DNS was made for, right? So I guess I don't expect too much if I expect this to work.

A trivial solution IMO would be:

Add the --addn-hosts and the --dhcp-script options to the dnsmasq launched by Neutron.

The DHCP request of each instance which launches contains the node name, which is derived from the instance name by making it DNS-compatible, i.e. replacing spaces with dashes etc. This is happening today, at least with images which use cloud-init (tested on Ubuntu). The DHCP script could just write the assigned IP address together with the name into the additional hosts file and you're done.
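
Purely as a sketch of that idea (the file path is an assumption, not where Neutron keeps its DHCP state; dnsmasq invokes the script as "<script> add|old|del <mac> <ip> [<hostname>]"):

#!/usr/bin/env python
# Illustrative dnsmasq --dhcp-script hook: on each lease "add" event, append
# the hostname/IP pair to the file handed to --addn-hosts.
import sys

ADDN_HOSTS = '/var/lib/neutron/dhcp/addn_hosts'  # assumed location

def main():
    action, _mac, ip = sys.argv[1:4]
    hostname = sys.argv[4] if len(sys.argv) > 4 else None
    if action == 'add' and hostname:
        with open(ADDN_HOSTS, 'a') as hosts_file:
            hosts_file.write('%s %s\n' % (ip, hostname))

if __name__ == '__main__':
    main()

Handling the "del" action the same way would also help with the stale-entry concern mentioned below.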

To be honest, there would be some shortcomings, such as:

If an instance gets "terminated" (which is OpenStack's way of saying it's gone forever), the entry will stay there unless manually removed. I am not sure if there would be any hook which would allow making sure ...


Revision history for this message
Matt Riedemann (mriedem) wrote :

Took this out of in-progress status since the nova and neutron changes were abandoned.

tags: added: network neutron
Changed in nova:
status: In Progress → New
Changed in neutron:
assignee: Zack Feldstein (zack-feldstein) → nobody
status: In Progress → New
Changed in nova:
assignee: Zack Feldstein (zack-feldstein) → nobody
tags: removed: neutron
Changed in neutron:
importance: Undecided → Low
Brent Eagles (beagles)
tags: added: neutron-core
tags: added: neutron
removed: neutron-core
Sean Dague (sdague)
Changed in nova:
status: New → Confirmed
Changed in neutron:
status: New → Confirmed
Sean Dague (sdague)
Changed in nova:
importance: Undecided → Low
Tom Fifield (fifieldt)
tags: added: ops
Changed in nova:
assignee: nobody → Hiroyasu OHYAMA (user-localhost2000)
Changed in nova:
assignee: Hiroyasu OHYAMA (user-localhost2000) → nobody
Revision history for this message
Zeeshan Ali Shah (zashah-o) wrote :

Is this still undecided, or has the patch been released? I am running Juno and have the same issue.

Revision history for this message
Barrow Kwan (barrowkwan) wrote :

Same here. I am wondering if there is a patch somewhere we can apply on Juno. I am having the same problem after upgrading from Havana to Juno. I was using flat networking on Havana and changed to Neutron (VXLAN/GRE) after upgrading to Juno. A lot of things are broken on the VMs (since, e.g., 'hostname -f' returns unknown host). A temporary workaround is to update the VM's /etc/hosts, but with a few hundred VMs to manage this is not very practical. Since flat networking probably won't be supported in the future, if we have to use Neutron while this bug still exists, I think many people will be affected.

Li Ma (nick-ma-z)
Changed in neutron:
assignee: nobody → Li Ma (nick-ma-z)
Li Ma (nick-ma-z)
Changed in neutron:
assignee: Li Ma (nick-ma-z) → nobody
Changed in neutron:
assignee: nobody → Dariusz Smigiel (smigiel-dariusz)
Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :

I believe that this is taken care of in the context of

https://blueprints.launchpad.net/neutron/+spec/external-dns-resolution

I am afraid we won't be able to patch Juno, as it receives security fixes only.

Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :
Changed in neutron:
status: Confirmed → Fix Released
assignee: Dariusz Smigiel (smigiel-dariusz) → Miguel Lavalle (minsel)
no longer affects: nova
Revision history for this message
unmesh desale (unmeshdesale) wrote :

Is this available in Liberty or Mitaka ? I am facing this problem.

Revision history for this message
Tardis Xu (xiaoxubeii) wrote :

I cannot find the patch either.
