DNSSEC-enabled named sometimes does not die on "service bind9 stop"

Bug #807153 reported by Matthias Andree
This bug affects 8 people
Affects          Status         Importance   Assigned to   Milestone
bind9 (Debian)   Fix Released   Unknown
bind9 (Ubuntu)   Fix Released   Medium       Unassigned

Bug Description

I am running the standard natty BIND9 package in an IPv6- and DNSSEC-enabled configuration as a resolver. The zones are unaltered.

After "service bind9 stop" (or equivalently "service bind9 restart"), named receives and logs the control command, but remains running. I have to use SIGKILL to get rid of it. I already had to do that during the latest security upgrade. Tagging regression-update.
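The manual workaround described above (wait, then fall back to SIGKILL) can be sketched as a small shell helper. This is only an illustration, not part of the bind9 package: the function names and the PID file path in the usage comment are assumptions.

```shell
#!/bin/sh
# wait_for_exit PID TIMEOUT
# Returns 0 once PID no longer exists, 1 if it is still alive after
# roughly TIMEOUT seconds. (Hypothetical helper, not shipped with bind9.)
wait_for_exit() {
    wfe_pid=$1
    wfe_left=$2
    while kill -0 "$wfe_pid" 2>/dev/null; do
        [ "$wfe_left" -le 0 ] && return 1
        sleep 1
        wfe_left=$((wfe_left - 1))
    done
    return 0
}

# ensure_stopped PID TIMEOUT
# Gives the process TIMEOUT seconds to exit on its own, then sends SIGKILL.
# Prints "exited" or "killed" so the caller can log what happened.
ensure_stopped() {
    if wait_for_exit "$1" "$2"; then
        echo "exited"
    else
        kill -KILL "$1" 2>/dev/null
        echo "killed"
    fi
}

# Possible usage (the PID file location is an assumption; check your setup):
#   named_pid=$(cat /var/run/named/named.pid)
#   service bind9 stop
#   ensure_stopped "$named_pid" 30
```

SIGKILL gives named no chance to clean up, so this only papers over the hang; it does not address the underlying bug.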

lsof reports:

COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
named 25405 bind cwd DIR 252,1 4096 2388192 /var/cache/bind
named 25405 bind rtd DIR 252,1 4096 2 /
named 25405 bind txt REG 252,1 522128 2037772 /usr/sbin/named
named 25405 bind mem REG 252,1 51728 2019878 /lib/x86_64-linux-gnu/libnss_files-2.13.so
named 25405 bind mem REG 252,1 47680 2019885 /lib/x86_64-linux-gnu/libnss_nis-2.13.so
named 25405 bind mem REG 252,1 97248 2019875 /lib/x86_64-linux-gnu/libnsl-2.13.so
named 25405 bind mem REG 252,1 35712 2019876 /lib/x86_64-linux-gnu/libnss_compat-2.13.so
named 25405 bind mem REG 252,1 14280 2019505 /lib/x86_64-linux-gnu/libgpg-error.so.0.8.0
named 25405 bind mem REG 252,1 67856 2152536 /usr/lib/x86_64-linux-gnu/libtasn1.so.3.1.9
named 25405 bind mem REG 252,1 543104 2019851 /lib/x86_64-linux-gnu/libm-2.13.so
named 25405 bind mem REG 252,1 499320 2019792 /lib/x86_64-linux-gnu/libgcrypt.so.11.6.0
named 25405 bind mem REG 252,1 659656 2152547 /usr/lib/x86_64-linux-gnu/libgnutls.so.26.14.12
named 25405 bind mem REG 252,1 105168 2037664 /usr/lib/libsasl2.so.2.0.23
named 25405 bind mem REG 252,1 96816 2019516 /lib/x86_64-linux-gnu/libz.so.1.2.3.4
named 25405 bind mem REG 252,1 101192 2019889 /lib/x86_64-linux-gnu/libresolv-2.13.so
named 25405 bind mem REG 252,1 10160 2019843 /lib/x86_64-linux-gnu/libkeyutils.so.1.3
named 25405 bind mem REG 252,1 14696 2019849 /lib/x86_64-linux-gnu/libdl-2.13.so
named 25405 bind mem REG 252,1 31104 2152677 /usr/lib/x86_64-linux-gnu/libkrb5support.so.0.1
named 25405 bind mem REG 252,1 14544 2019670 /lib/x86_64-linux-gnu/libcom_err.so.2.1
named 25405 bind mem REG 252,1 158080 2150312 /usr/lib/x86_64-linux-gnu/libk5crypto.so.3.1
named 25405 bind mem REG 252,1 803128 2152585 /usr/lib/x86_64-linux-gnu/libkrb5.so.3.3
named 25405 bind mem REG 252,1 229688 2036171 /usr/lib/libGeoIP.so.1.4.7
named 25405 bind mem REG 252,1 1638120 2019728 /lib/x86_64-linux-gnu/libc-2.13.so
named 25405 bind mem REG 252,1 1388664 2039673 /usr/lib/libxml2.so.2.7.8
named 25405 bind mem REG 252,1 140254 2019888 /lib/x86_64-linux-gnu/libpthread-2.13.so
named 25405 bind mem REG 252,1 18832 2020006 /lib/libcap.so.2.20
named 25405 bind mem REG 252,1 55504 490901 /usr/lib/liblber-2.4.so.2.5.6
named 25405 bind mem REG 252,1 298304 490678 /usr/lib/libldap_r-2.4.so.2.5.6
named 25405 bind mem REG 252,1 1494392 2151244 /usr/lib/x86_64-linux-gnu/libdb-4.8.so
named 25405 bind mem REG 252,1 343392 2037182 /usr/lib/libisc.so.62.1.1
named 25405 bind mem REG 252,1 30672 2038203 /usr/lib/libisccc.so.60.0.0
named 25405 bind mem REG 252,1 125272 2038571 /usr/lib/libisccfg.so.62.0.0
named 25405 bind mem REG 252,1 46880 2038897 /usr/lib/libbind9.so.60.0.4
named 25405 bind mem REG 252,1 1620736 2019544 /lib/libcrypto.so.0.9.8
named 25405 bind mem REG 252,1 217816 2151205 /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2
named 25405 bind mem REG 252,1 1542624 2038650 /usr/lib/libdns.so.69.1.2
named 25405 bind mem REG 252,1 71584 2038578 /usr/lib/liblwres.so.60.0.1
named 25405 bind mem REG 252,1 141088 2019706 /lib/x86_64-linux-gnu/ld-2.13.so
named 25405 bind 0u CHR 1,3 0t0 6437 /dev/null
named 25405 bind 1u CHR 1,3 0t0 6437 /dev/null
named 25405 bind 2u CHR 1,3 0t0 6437 /dev/null
named 25405 bind 3w FIFO 0,8 0t0 6115524 pipe
named 25405 bind 4u unix 0xffff880056944340 0t0 6115539 socket
named 25405 bind 5u CHR 1,3 0t0 6437 /dev/null
named 25405 bind 6r FIFO 0,8 0t0 6115035 pipe
named 25405 bind 8w FIFO 0,8 0t0 6115035 pipe
named 25405 bind 9u 0000 0,9 0 6428 anon_inode
named 25405 bind 10r CHR 1,8 0t0 6441 /dev/random

ProblemType: Bug
DistroRelease: Ubuntu 11.04
Package: bind9 1:9.7.3.dfsg-1ubuntu2.2
ProcVersionSignature: Ubuntu 2.6.38-8.42-generic 2.6.38.2
Uname: Linux 2.6.38-8-generic x86_64
Architecture: amd64
Date: Thu Jul 7 19:23:36 2011
ProcEnviron:
 LANGUAGE=de:en
 PATH=(custom, no user)
 LANG=de_DE.UTF-8
 SHELL=/bin/bash
RelatedPackageVersions:
 bind9utils 1:9.7.3.dfsg-1ubuntu2.2
 apparmor 2.6.1-0ubuntu3
SourcePackage: bind9
UpgradeStatus: Upgraded to natty on 2011-05-02 (66 days ago)
mtime.conffile..etc.apparmor.d.usr.sbin.named: 2011-05-04T21:25:56.095231
mtime.conffile..etc.bind.named.conf.default.zones: 2011-06-20T18:47:12.474753
mtime.conffile..etc.bind.named.conf.options: 2011-06-29T23:32:32.963462

Jürgen Kreileder (jk) wrote :

Same problem on several machines here.

Matthias Andree (matthias-andree) wrote :

Marked "Confirmed"

Changed in bind9 (Ubuntu):
status: New → Confirmed
Brian Vaughan (bgvaughan) wrote :

This looks like the bug I want to report. I find that shutting down bind9 is erratic: sometimes it succeeds, sometimes not. This is a common result of shutting it down at the command line:
<code>brian@$ sudo service bind9 stop
 * Stopping domain name service... bind9
rndc: connect failed: 127.0.0.1#953: connection refused</code>
I need to send a break (usually with CTRL-C) to get past this.

I often have difficulty shutting down or rebooting my system, as it often hangs early in the process; bind9 seems to be among the first services to be shut down, so I suspect it is the root of that problem as well.

I notice that Matthias pointed out that he is using IPv6 and dnssec. I am using an IPv6 tunnel and have configured bind9 for dnssec validation; I had purged and reinstalled bind9 in trying to troubleshoot this problem, and it didn't hang on restarts before I re-enabled dnssec validation. That was only a few restarts; with an intermittent problem, it's hard to be sure. I will try disabling dnssec validation for a few days and see if that helps.

Brian Vaughan (bgvaughan) wrote :

I disabled dnssec validation, and after a few days with a number of reboots and restarts of bind9, I haven't had any further trouble with stopping the bind9 service or with shutting down my system. So I'm pretty sure at this point that there's some bug with dnssec validation that intermittently prevents bind9 from shutting down properly.
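For anyone who wants to check whether a machine has validation turned on before comparing notes, here is a rough sketch. It assumes the setting in question is the `dnssec-validation` option and that it lives in /etc/bind/named.conf.options; the function name is made up, and the parsing ignores /* ... */ comments and included files.

```shell
#!/bin/sh
# dnssec_validation_setting FILE
# Prints the value of the last "dnssec-validation" statement in FILE,
# or nothing if the option is not set there (named then uses its default).
# Only // and # comments are stripped; this is a sketch, not a full parser.
dnssec_validation_setting() {
    sed -e 's://.*$::' -e 's:#.*$::' "$1" |
        sed -n 's/.*dnssec-validation[[:space:]]\{1,\}\([a-z]\{1,\}\)[[:space:]]*;.*/\1/p' |
        tail -n 1
}

# Possible usage:
#   dnssec_validation_setting /etc/bind/named.conf.options
```

`named-checkconf -p` prints the parsed configuration and is the more reliable way to see what named will actually use.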

Thomas Hood (jdthood)
summary: - named does not shut down after "service bind9 stop"
+ DNSSEC-enabled named sometimes does not die on "service bind9 stop"
Ben Coleman (b-coleman) wrote :

I'm also seeing this problem on 2 of the machines under my control. Two others running bind9 do not have this problem. The two that have this problem do have DNSSEC enabled (though only one of them is actually serving a DNSSEC-using zone).

This seems to be related to Debian bug #570852 (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=570852), which is fixed by the upstream 9.8.1 release, which is included in Precise. I'll try to upgrade one of the affected servers this weekend and see if that fixes it.

Ben Bird (bbird) wrote :

I have the same scenario of IPv6 + bind9 + dnssec.

b-coleman's suspicion that this is related to Debian bug #570852 seems plausible. I upgraded my affected machine to Precise (12.04), and the problem has stopped.

Matthias Andree (matthias-andree) wrote : Re: [Bug 807153] Re: DNSSEC-enabled named sometimes does not die on "service bind9 stop"

On 02.05.2012 18:46, Ben Bird wrote:
> I have the same scenario of IPv6 + bind9 + dnssec.
>
> b-coleman's suspicion that this is related to debian bug# 570852 seems
> plausible. I upgraded my affected machine to Precise (12.04), and the
> problem has stopped.
>

The problem isn't with the init script as in the Debian bug, but instead
with bind9 itself, which won't die.

Ben Coleman (b-coleman) wrote :

The problem with the init script in the Debian bug was that bind9 itself wasn't dying, and the init script was set up to wait until bind9 died. It was made moot by the upstream release of BIND 9.8.1, which fixed the problem of bind9 not dying.

James Page (james-page)
Changed in bind9 (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Triaged
Brian Murray (brian-murray) wrote :

I'm setting this to Fix Released, as it is fixed in 12.04, which contains bind9 version 1:9.8.1.dfsg.P1-4, and Natty (11.04) is no longer supported.
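To check whether a given package version is at or past the 9.8.1 upstream release mentioned in this report, something like the following sketch could be used. The function names are invented for illustration, and it only compares the upstream x.y.z part of the version string; on a Debian or Ubuntu system `dpkg --compare-versions` is the proper tool for full version comparison.

```shell
#!/bin/sh
# upstream_version DEBVERSION
# Strips the epoch and everything after the leading x.y.z,
# e.g. "1:9.7.3.dfsg-1ubuntu2.2" becomes "9.7.3".
upstream_version() {
    printf '%s\n' "$1" |
        sed -e 's/^[0-9]*://' -e 's/^\([0-9]*\.[0-9]*\.[0-9]*\).*/\1/'
}

# version_ge A B
# Succeeds if x.y.z version A is greater than or equal to B.
version_ge() {
    printf '%s %s\n' "$1" "$2" | awk '{
        split($1, a, "."); split($2, b, ".")
        for (i = 1; i <= 3; i++) {
            if (a[i] + 0 > b[i] + 0) exit 0
            if (a[i] + 0 < b[i] + 0) exit 1
        }
        exit 0
    }'
}

# has_shutdown_fix DEBVERSION
# Succeeds if the upstream part is at least 9.8.1, the release this
# report says fixed the hang. (Hypothetical helper name.)
has_shutdown_fix() {
    version_ge "$(upstream_version "$1")" 9.8.1
}

# Possible usage:
#   has_shutdown_fix "$(dpkg-query -W -f='${Version}' bind9)" && echo "9.8.1 or later"
```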

Changed in bind9 (Ubuntu):
status: Triaged → Fix Released
Changed in bind9 (Debian):
status: Unknown → Fix Released