Temporary failure in name resolution

Bug #599342 reported by Robert Sander
This bug affects 4 people
Affects            Status         Importance   Assigned to   Milestone
libvirt (Ubuntu)   Fix Released   Medium       Unassigned
Maverick           Invalid        Medium       Unassigned

Bug Description

libvirt-bin is not started correctly.

It tries to resolve the hostname when starting, but at that time there is no network and no DNS available. It exits with status 6 and is respawned immediately. After several attempts init disables the job completely:

Jun 28 14:33:05 natta libvirtd: 14:33:05.885: error : virGetHostname:1838 : internal error getaddrinfo failed for 'natta': Temporary failure in name resolution
Jun 28 14:33:05 natta init: libvirt-bin main process (1332) terminated with status 6
Jun 28 14:33:05 natta init: libvirt-bin main process ended, respawning
Jun 28 14:33:05 natta libvirtd: 14:33:05.892: error : virGetHostname:1838 : internal error getaddrinfo failed for 'natta': Temporary failure in name resolution
Jun 28 14:33:05 natta init: libvirt-bin main process (1338) terminated with status 6
Jun 28 14:33:05 natta init: libvirt-bin respawning too fast, stopped

We have removed the "bogus" hostname entry pointing to a 127.0.0.2 address from /etc/hosts. Our network DNS resolves hostnames to their "real" IP addresses.

If libvirt-bin needs to resolve a hostname to an IP address it should wait for the network when starting.

The libvirt-bin upstart script may need a dependency.

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: libvirt-bin 0.7.5-5ubuntu27
ProcVersionSignature: Ubuntu 2.6.32-22.33-generic 2.6.32.11+drm33.2
Uname: Linux 2.6.32.11-epi+drm33.2 x86_64
NonfreeKernelModules: fglrx
Architecture: amd64
Date: Mon Jun 28 14:34:41 2010
ProcEnviron:
 SHELL=/bin/bash
 PATH=(custom, user)
 LANG=en_US.UTF-8
 LANGUAGE=
SourcePackage: libvirt

Revision history for this message
Robert Sander (gurubert) wrote :
Revision history for this message
Robert Sander (gurubert) wrote :

After reading bug #595388 I have added the lines:

        hostname=$(hostname)
        until host $hostname
        do
                sleep 2s
        done

to the pre-start section of libvirt-bin's upstart script. It works for me.

Chuck Short (zulcss)
Changed in libvirt (Ubuntu):
importance: Undecided → High
status: New → Triaged
Revision history for this message
Thierry Carrez (ttx) wrote :

Nominated for maverick -- tentatively assigned to Serge, feel free to reassign to Canonical Server team if needed.

Changed in libvirt (Ubuntu):
assignee: nobody → Serge Hallyn (serge-hallyn)
importance: High → Medium
milestone: none → ubuntu-10.10-beta
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

This scares me because on my laptop, where I do still have

127.0.1.1 vostro2

in my /etc/hosts, 'host vostro2' returns 1. So most stock ubuntu
installations would apparently wait forever to start libvirt-bin.
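
`host` is a DNS client and does not read /etc/hosts, which would explain the result above. The call that fails in the libvirtd log is getaddrinfo, which goes through the system resolver (NSS) and normally consults /etc/hosts before DNS. A minimal sketch of a check that follows the same path, assuming glibc's `getent` is available; the function names are hypothetical:

```shell
# Succeeds if NAME resolves through the system resolver (NSS).
# `getent ahosts` uses getaddrinfo, the call that fails in the libvirtd log.
resolves_via_nss() {
    [ -n "$1" ] || return 2          # refuse an empty name
    getent ahosts "$1" >/dev/null 2>&1
}

# Print a one-line summary of how NAME resolves.
report_resolution() {
    if resolves_via_nss "$1"; then
        echo "$1: resolvable via NSS (what getaddrinfo sees)"
    else
        echo "$1: not resolvable via NSS"
    fi
}
```

Calling `report_resolution "$(hostname)"` on a stock installation with a 127.0.1.1 entry should report the name as resolvable even when `host` returns 1.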

Does the following /etc/init/libvirt-bin.conf work for you?

==========================================================

description "libvirt daemon"
author "Dustin Kirkland <email address hidden>"

start on runlevel [2345] and (started network-interface
          or started network-manager
          or started networking)
stop on runlevel [!2345]

expect daemon
respawn

pre-start script
 mkdir -p /var/run/libvirt
 # Clean up a pidfile that might be left around
 rm -f /var/run/libvirtd.pid
end script

# If you used to set $libvirtd_opts in /etc/default/libvirt-bin,
# change the 'exec' line here instead.

Changed in libvirt (Ubuntu Maverick):
status: Triaged → Incomplete
Revision history for this message
Chuck Short (zulcss) wrote :

Is this actually happening because of dnsmasq, or is there another issue going on here? If it's because of dnsmasq, it might have to be converted as well.

chuck

Revision history for this message
Robert Sander (gurubert) wrote :

We have removed the local hostname from /etc/hosts pointing to 127.0.1.1 or any other address in 127.0.0.0/8, we only keep 127.0.0.1 for localhost in there. Every other name is resolved by our internal DNS.

The loop above just waits until the DNS answers and the hostname can be resolved. Otherwise libvirtd refuses to start.

We have reasons not to resolve the hostname to a 127.0.0.0/8 address.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Robert:

Naturally you can use the recipe in comment #2 on your own systems. It's
just not safe to put in the package.

Jamie, since you've been packaging libvirt-bin (and since I'm looking for
a sanity check), do you see any problems with changing the upstart script
to the one I put in comment #4?

(I guess I'll create a tree to propose in the next few minutes)

Revision history for this message
Jamie Strandboge (jdstrand) wrote :

Serge, I am not an upstart expert and highly recommend you get Scott to check this. While it seems sane, I think we'd need to test 'airplane mode' for people trying to work on their laptops when the network isn't up (though I would think the 'network-manager' bit should cover it).

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

debdiff for a proposed modification of the upstart script, to make libvirt wait
until networking is up before attempting to start. The resulting package is up on
ppa:serge-hallyn/virt.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Hi Scott, would you mind taking a look at the proposed tweak
to /etc/init/libvirt-bin.conf? There is probably a better way...

Thanks!

Revision history for this message
Dustin Kirkland (kirkland) wrote :

Reassigning to Scott, as you're asking for his input.

Scott, once you're done with this, please assign it back to Serge. Thanks!

Changed in libvirt (Ubuntu Maverick):
assignee: Serge Hallyn (serge-hallyn) → Scott James Remnant (scott)
Revision history for this message
Scott James Remnant (Canonical) (canonical-scott) wrote :

I'm pretty sure mixing and/or like that will have disastrous consequences (bug #447654) - please don't

Changed in libvirt (Ubuntu Maverick):
assignee: Scott James Remnant (scott) → Serge Hallyn (serge-hallyn)
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Thanks, Scott.

Robert, what do you think about short-circuiting your additional upstart
script if
 grep "\<$hostname\>" /etc/hosts
returns 0? That should be safe for both of us.
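
A minimal sketch of how the two ideas in this thread could fit together in a pre-start script: skip the wait when the hostname is listed in /etc/hosts, otherwise poll DNS with `host`. The function names, the retry cap, and the awk-based match (used here instead of grep so that commented-out lines are ignored) are assumptions for illustration, not the fix that was packaged.

```shell
# Return 0 if NAME appears as a hostname or alias on a
# non-comment line of HOSTS_FILE (default /etc/hosts).
name_in_hosts() {
    name=$1
    hosts_file=${2:-/etc/hosts}
    awk -v n="$name" '
        { sub(/#.*/, "") }                                  # drop comments
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 } # field 1 is the address
        END { exit(found ? 0 : 1) }
    ' "$hosts_file"
}

# Return 0 once NAME is resolvable, 1 after MAX_TRIES failed DNS lookups.
# The cap is an assumption; the loop in the original recipe waits forever.
wait_for_hostname() {
    name=$1
    hosts_file=${2:-/etc/hosts}
    max_tries=${3:-30}

    # Shortcut: a local hosts entry means no DNS is needed.
    name_in_hosts "$name" "$hosts_file" && return 0

    tries=0
    until host "$name" >/dev/null 2>&1; do
        tries=$((tries + 1))
        [ "$tries" -ge "$max_tries" ] && return 1
        sleep 2
    done
    return 0
}
```

In a pre-start section this might be called as `wait_for_hostname "$(hostname)"`. Capping the retries is a design choice: it avoids blocking boot indefinitely on machines whose hostname is in neither /etc/hosts nor DNS.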

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Debdiff of proposed package fix.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

I've posted this for testing to ppa:serge-hallyn/libvirt-host-upstart-fix. Once it is
built I'll test it on my own VMs. Assuming it works for both Robert and me, I'll
propose it for merge into maverick. IIUC it'll need to be fixed in maverick before
we do an equivalent fix to SRU for lucid.

tags: added: patch
papukaija (papukaija)
tags: added: maverick
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Note - it started fine in my own VMs. Waiting for confirmation from Robert that this
works for him.

Thierry Carrez (ttx)
Changed in libvirt (Ubuntu Maverick):
milestone: ubuntu-10.10-beta → none
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Robert,

can you confirm whether this fix works for you? We'll need confirmation of
that before the fix can be pushed to the package.

Thierry Carrez (ttx)
tags: added: server-mrs
Revision history for this message
Kate Stewart (kate.stewart) wrote :

Robert, Serge - was this ever confirmed? Do we have a chance of getting it in for 10.10, or is it likely to be a candidate for update?

Revision history for this message
Serge Hallyn (serge-hallyn) wrote : Re: [Bug 599342] Re: Temporary failure in name resolution

Quoting Kate Stewart (<email address hidden>):
> Robert, Serge - was this ever confirmed? Do we have a chance of getting
> it in for 10.10, or is it likely to be a candidate for update?

I'd like confirmation from Robert that the patch suffices for him.
The resulting package works on my system, but I don't want to risk
baggage in the upstart script unless we're sure it actually fixes a
problem, even if it is (or so far appears) innocuous.

Revision history for this message
Dave Walker (davewalker) wrote :

Having discussed this issue with cjwatson, he suggested the proposed stanza be reconsidered and done slightly differently - and postponed for an SRU into Maverick.

Thanks

Thierry Carrez (ttx)
Changed in libvirt (Ubuntu Maverick):
milestone: none → maverick-updates
tags: removed: server-mrs
Revision history for this message
Mark - Syminet (mark-syminet) wrote :

But what if your DNS servers are also your guests?

Revision history for this message
Robert Sander (gurubert) wrote :

I am sorry for the delay.

It looks like the grep from comment #13 will do the trick on standalone systems.

Unfortunately we have moved away from KVM on our systems, so I am not able to test it right now.

Revision history for this message
Mark - Syminet (mark-syminet) wrote :

Might stand to rephrase:

What if your DNS servers are your guests?

And btw? Thanks for all you do.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

@Mark

Please give a more complete scenario so we can reason through it. For whom is one of the guests the DNS server? If the *host* is a client of a DNS server running on one of its own KVM guests, then I think the answer is that libvirt doesn't support that.

---
Ubuntu Bug Squad volunteer triager
http://wiki.ubuntu.com/BugSquad

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

@DaveWalker,

Regarding comment #20, could you give a little more detail? Which proposed stanza should be reconsidered? I'll add it to the oneiric package (for starters) but don't know which you mean :)

Revision history for this message
Mark - Syminet (mark-syminet) wrote :

@Serge correct - for security (and portability) reasons, we've decided to run ssh only and nothing else on the hosts - all other services run on guests - so the host is using one of its own guests as its DNS server. Its backup nameservers are on different hosts which are usually up, so we never noticed this issue - but this time, they were not available at the moment we rebooted. The result was that all guests failed to start.

For now, we've just updated /etc/hosts on each machine to make sure they automatically resolve themselves even without nameservice - this seems to be a decent workaround.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

(Assigned to Dave to get his response)

Changed in libvirt (Ubuntu):
assignee: Serge Hallyn (serge-hallyn) → Dave Walker (davewalker)
Changed in libvirt (Ubuntu Maverick):
assignee: Serge Hallyn (serge-hallyn) → nobody
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Hi,

Recent changes should prevent libvirt from coming up if the network is not ready. Can anyone reproduce this on updated natty, oneiric or precise?

Changed in libvirt (Ubuntu):
assignee: Dave Walker (davewalker) → nobody
status: Incomplete → New
status: New → Incomplete
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

(or maverick for that matter)

Changed in libvirt (Ubuntu):
status: Incomplete → Fix Released
Changed in libvirt (Ubuntu Maverick):
status: Incomplete → Invalid