libvirt's dnsmasq setup will read /etc/hosts on the host, resulting in odd resolution behaviour on the VM

Bug #1326536 reported by Jason Harvey on 2014-06-04
This bug affects 5 people
Affects            Importance   Assigned to
libvirt (Ubuntu)   Medium       Unassigned
lxc (Ubuntu)       Medium       Stéphane Graber

Bug Description

When libvirt configures / starts up dnsmasq on the host, it does not pass --no-hosts, resulting in it reading in the /etc/hosts file from the host.

The default Ubuntu setup has the host's hostname in /etc/hosts under 127.0.1.1. Since libvirt's dnsmasq reads this file, anything querying that dnsmasq instance will resolve the host's hostname from /etc/hosts.

The result of this is any VM running on the host will resolve the host's hostname as 127.0.1.1. For example, if the host's hostname is BoxA, any VM running on the host will resolve BoxA to 127.0.1.1, which is not BoxA's actual address.
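
To make the failure mode concrete, here is a minimal Python sketch of a hosts-file lookup. It is not dnsmasq's code: `parse_hosts` is a made-up helper, the sample file only assumes the default Ubuntu layout described above, and BoxA is the example hostname from this report.

```python
def parse_hosts(text):
    """Map each name in hosts-file text to the first address listed for it."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            # Hostnames are case-insensitive; the first entry for a name wins.
            table.setdefault(name.lower(), address)
    return table


# Illustrative /etc/hosts on a host named BoxA (assumed default layout).
HOST_ETC_HOSTS = """\
127.0.0.1 localhost
127.0.1.1 BoxA
"""

table = parse_hosts(HOST_ETC_HOSTS)
print(table["boxa"])  # the address a VM would be given for BoxA
```

Any resolver that answers from this file hands out the loopback address, which inside the VM points back at the VM itself rather than at the host.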

I would recommend passing --no-hosts to dnsmasq when libvirt starts it up. If a user wants hardcoded hosts for their libvirt network, they can add them to /var/lib/libvirt/dnsmasq/default.addnhosts. If this is an acceptable solution, I'd be happy to write the patch up.

summary: libvirt's dnsmasq setup will read /etc/hosts on the host, resulting in
- odd behaviour for the domain
+ odd resolution behaviour on the VM
Serge Hallyn (serge-hallyn) wrote :

Thanks for submitting this bug. Tested and reproduced the same, and agreed this should be fixed. Looks like this should be pretty simple to fix at src/network/bridge_driver.c:networkBuildDhcpDaemonCommandLine(). The patch however should also go upstream for feedback there.

Thanks for offering to write the patch. Please go ahead, and let us know if you end up not having time, e.g. due to reprioritization.

Changed in libvirt (Ubuntu):
importance: Undecided → Medium
status: New → Triaged
Jason Harvey (jason-alioth) wrote :

I created a small patch to place the no-hosts option in the config file which libvirt creates. (I figured that was more appropriate than setting the flag, since all other options are set in those config files by libvirt).
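
As a rough illustration of the config-file approach (this is not the attached patch, and not libvirt's actual generator), the Python sketch below builds the relevant lines. `build_dnsmasq_conf` and its parameters are hypothetical; `no-hosts` and `addn-hosts` are the config-file spellings of dnsmasq's `--no-hosts` and `--addn-hosts` options.

```python
def build_dnsmasq_conf(addn_hosts_path, read_host_etc_hosts=False):
    """Return simplified dnsmasq config-file text for one virtual network."""
    lines = []
    if not read_host_etc_hosts:
        # Config-file form of the --no-hosts command-line flag.
        lines.append("no-hosts")
    # Per-network hosts file that users can still populate by hand.
    lines.append(f"addn-hosts={addn_hosts_path}")
    return "\n".join(lines) + "\n"


print(build_dnsmasq_conf("/var/lib/libvirt/dnsmasq/default.addnhosts"), end="")
```

Making the behaviour a parameter, rather than hardcoding it, reflects the concern that some users rely on the host's /etc/hosts being served to their VMs.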

Unfortunately, I'm unfamiliar with libvirt's test suite and couldn't figure out how to edit the tests appropriately for these changes. It might be best if someone more familiar with the codebase took it from there.

I've also realized that this may be an expected behaviour for some (many users may have placed stuff in /etc/hosts and expect that to work for their VMs). As such, users will need to be warned if this file is excluded by default.

Patch attached.

The attachment "Applies to a7b0040 on git://libvirt.org/libvirt.git" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch
Serge Hallyn (serge-hallyn) wrote :

Hi Stéphane,

assigned this to you just to get your input, as I imagine you have a
perfect solution right offhand.

On my servers, /etc/hosts lists the hostname with its public IP address,
and there is no problem. On my laptops with network-manager, when I start
a container and 'ping <laptop-name>', it pings 127.0.1.1. Adding
--no-hosts to the dnsmasq line is imo wrong, but is there anything else
we can do to handle the 127.0.1.1 case?

I suppose we could use -E with a one-line hosts file which lists the
hostname as 10.0.3.1. Or use -S.

Changed in lxc (Ubuntu):
status: New → Triaged
importance: Undecided → Medium
assignee: nobody → Stéphane Graber (stgraber)
Darragh Bailey (dbailey-k) wrote :

Perhaps a change could be added to ignore 127.0.0.1 and 127.0.1.1 addresses returned by default?

Dnsmasq supports such an option, though it would need to be added to libvirt, as libvirt regenerates the dnsmasq conf file each time a network is started.

Add the following by default to the dnsmasq.conf generated by libvirt:
---------------
ignore-address=127.0.0.1
ignore-address=127.0.1.1
---------------

Darragh Bailey (dbailey-k) wrote :

Never mind, I took some time to patch libvirt to see if it would have any effect. The option appears to apply only to the records dnsmasq receives in reply to the DNS queries it makes; it doesn't ignore any addresses read in from /etc/hosts.

Darragh Bailey (dbailey-k) wrote :

Did some more testing (found out how to adjust libvirt's dnsmasq.conf and restart dnsmasq to pick up the conf changes):

To test, get the dnsmasq pid using the vagrant-libvirt.conf config and check the environment set for the process with:
sudo cat /proc/2586/environ
VIR_BRIDGE_NAME=virbr0
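
The same check can be scripted. Below is a minimal Python sketch; `parse_environ` and `read_process_environ` are illustrative helper names. It relies only on the Linux convention that /proc/<pid>/environ holds NUL-separated KEY=VALUE entries; reading another user's process normally needs root.

```python
def parse_environ(raw):
    """Decode the NUL-separated KEY=VALUE blob from /proc/<pid>/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            key, value = entry.split(b"=", 1)
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def read_process_environ(pid):
    """Read the environment a running process was started with."""
    with open(f"/proc/{pid}/environ", "rb") as handle:
        return parse_environ(handle.read())


# Sample blob matching the output shown above.
print(parse_environ(b"VIR_BRIDGE_NAME=virbr0\0"))
```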

After killing the relevant dnsmasq you can manually restart using
sudo VIR_BRIDGE_NAME=virbr0 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/vagrant-libvirt.conf --leasefile-ro --dhcp-script=/usr/lib/libvirt/libvirt_leaseshelper

Just make sure to check the environment for the dnsmasq process before killing it.

Adding entries with the following format to the libvirt dnsmasq.conf:
host-record=<short>,<fqdn>,<ip>
interface-name=<short>,br0

And have entries with the following format in /etc/hosts
127.0.0.1 localhost
127.0.1.1 <fqdn> <short>

With this in place, nslookup and dig return the configured <ip> for both the short name and the fqdn instead of 127.0.1.1 as they used to.
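
For anyone scripting this, the two dnsmasq.conf entries above can be generated per host. This is only a formatting sketch in Python: `host_override_lines` is a hypothetical helper, and the sample names and address are made up.

```python
def host_override_lines(short, fqdn, ip, bridge="br0"):
    """Format the two dnsmasq.conf entries described above for one host."""
    return [
        f"host-record={short},{fqdn},{ip}",
        f"interface-name={short},{bridge}",
    ]


# Made-up example values; substitute the host's real names and address.
for line in host_override_lines("boxa", "boxa.example.com", "192.168.122.1"):
    print(line)
```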

It appears that host-record overrides entries read from hosts files because record options are read before the hosts files, and only the first entry for a name results in a PTR record being created. A name appearing in a host-record therefore inhibits PTR-record creation from the entries in /etc/hosts.

I also tried using:
host-record=<fqdn>,<ip>
host-record=<short>,<ip>

That is, without interface-name being specified, dig/nslookup would start returning two records. So 'interface-name=<short>,br0' appears to be required to prevent the short name from returning both records.

It would seem that the alternative would be to create a local copy of /etc/hosts pruned of all loop back address entries and provide that as the hosts file to read instead.
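
A minimal Python sketch of that pruning step, assuming the standard hosts-file layout of an address followed by names; `prune_loopback_entries` is an illustrative name and the sample entries are made up:

```python
import ipaddress


def prune_loopback_entries(hosts_text):
    """Return hosts-file text without entries whose address is a loopback."""
    kept = []
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()
        if fields:
            try:
                if ipaddress.ip_address(fields[0]).is_loopback:
                    continue  # drops 127.0.0.1, 127.0.1.1, ::1, ...
            except ValueError:
                pass  # first field is not a plain address; keep the line
        kept.append(line)
    return "\n".join(kept) + "\n"


SAMPLE = """\
127.0.0.1 localhost
127.0.1.1 boxa.example.com boxa
192.168.1.20 fileserver
::1 ip6-localhost ip6-loopback
"""

print(prune_loopback_entries(SAMPLE), end="")
```

The pruned copy could then be handed to dnsmasq as an additional hosts file in place of /etc/hosts, which would keep the other entries users rely on while dropping the loopback ones.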

Nathan Dorfman (ndorf) wrote :

I had the same problem (on 16.04.02) and came up with a rather dirty, but quick and effective workaround.

Simply remove the libvirt-dnsmasq user's ability to read /etc/hosts:

    sudo setfacl -m user:libvirt-dnsmasq:--- /etc/hosts

The libvirt dnsmasq instances will syslog a complaint when they start or get HUP'd. Otherwise, it seems to work perfectly.

Thanks, Nathan, for sharing your workaround as well, but IMHO it is "just as good" as the --no-hosts flag. By that I mean that while it prevents the 127.0.1.1 reply for the host, it also stops any other entry in /etc/hosts from being used (which users might want or even already rely on).

Darragh, Jason and Serge tried to find a solution that keeps the wanted behaviour (everything from /etc/hosts except the host itself) but fixes the issue (the host itself resolving to 127.0.1.1), to make something suitable for upstream submission to libvirt.

Despite the time that has passed since then, I haven't seen this or a similar approach implemented upstream yet.

Nathan Dorfman (ndorf) wrote :

Right, my hack is almost the same as --no-hosts, it just doesn't require patching libvirt.

Do you need that entry in your /etc/hosts? If you have a real DNS name, you might not need it at all. If not, but you have a static IP address, you could use that in the hosts file instead of 127.0.1.1.
