Bind/named does not initialize on boot due to missing IPv6 address

Bug #510587 reported by Vihai
This bug affects 6 people
Affects: bind9 (Ubuntu)
Status: Expired
Importance: Low
Assigned to: Unassigned
Milestone: (none)

Bug Description

Binary package hint: bind9

Hello,

After a reboot I noticed that named did not start. Starting it manually worked fine.

In the log I found:

Jan 21 11:27:40 sid named[1315]: starting BIND 9.6.1-P2 -u bind
Jan 21 11:27:40 sid named[1315]: built with '--prefix=/usr' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--sysconfdir=/etc/bind' '--localstatedir=/var' '--enable-threads' '--enable-largefile' '--with-libtool' '--enable-shared' '--enable-static' '--with-openssl=/usr' '--with-gssapi=/usr' '--with-gnu-ld' '--with-dlz-postgres=no' '--with-dlz-mysql=no' '--with-dlz-bdb=yes' '--with-dlz-filesystem=yes' '--with-dlz-ldap=yes' '--with-dlz-stub=yes' '--with-geoip=/usr' '--enable-ipv6' 'CFLAGS=-fno-strict-aliasing -DDIG_SIGCHASE -O2' 'LDFLAGS=-Wl,-Bsymbolic-functions' 'CPPFLAGS=' 'CXXFLAGS=-g -O2' 'FFLAGS=-g -O2'
Jan 21 11:27:40 sid named[1315]: adjusted limit on open files from 1024 to 1048576
Jan 21 11:27:40 sid named[1315]: found 4 CPUs, using 4 worker threads
Jan 21 11:27:40 sid named[1315]: using up to 4096 sockets
Jan 21 11:27:40 sid named[1315]: loading configuration from '/etc/bind/named.conf'
Jan 21 11:27:41 sid named[1315]: using default UDP/IPv4 port range: [1024, 65535]
Jan 21 11:27:41 sid named[1315]: using default UDP/IPv6 port range: [1024, 65535]
Jan 21 11:27:41 sid named[1315]: listening on IPv4 interface eth0:auth2, 62.212.1.11#53
Jan 21 11:27:41 sid named[1315]: listening on IPv6 interface eth0, 2a02:20:0:101::20#53
Jan 21 11:27:41 sid named[1315]: could not listen on UDP socket: address not available
Jan 21 11:27:41 sid named[1315]: creating IPv6 interface eth0 failed; interface ignored
Jan 21 11:27:41 sid named[1315]: could not get query source dispatcher (2a02:20:0:101::20#0)
Jan 21 11:27:41 sid named[1315]: additionally listening on IPv6 interface eth0, 2a02:20:0:101::20#53
Jan 21 11:27:41 sid named[1315]: could not listen on UDP socket: address not available
Jan 21 11:27:41 sid named[1315]: creating IPv6 interface eth0 failed; interface ignored
Jan 21 11:27:41 sid named[1315]: loading configuration: address not available
Jan 21 11:27:41 sid named[1315]: exiting (due to fatal error)

It looks like the IPv6 address was not yet assigned to the interface, thus named wasn't able to listen on it.
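
For illustration, the failure in the log can be reproduced outside of named: binding a socket to an address that is not (yet) assigned to any local interface fails with EADDRNOTAVAIL, which is presumably what named reports as "address not available". The following is a minimal Python sketch, not anything taken from the bind9 package. The helper name udp_bind_error is made up, and 192.0.2.1 is a documentation-only address standing in for an address that is missing at boot.

```python
import errno
import ipaddress
import socket


def udp_bind_error(address, port=0):
    """Try to bind a UDP socket to `address`.

    Returns None if the bind succeeds, otherwise the errno of the failure.
    `udp_bind_error` is a hypothetical helper used only for this illustration.
    """
    version = ipaddress.ip_address(address).version
    family = socket.AF_INET6 if version == 6 else socket.AF_INET
    try:
        with socket.socket(family, socket.SOCK_DGRAM) as sock:
            # Port 0 lets the kernel pick a free port; only the address matters here.
            sock.bind((address, port))
    except OSError as exc:
        return exc.errno
    return None


# Usage (assumption: 192.0.2.1 is not configured on this machine):
#   udp_bind_error("0.0.0.0")    -> None, the wildcard address can always be bound
#   udp_bind_error("192.0.2.1")  -> errno.EADDRNOTAVAIL on a typical Linux host
```

The same call with an IPv6 address that is not currently usable on the host should fail in the same way, which would match named starting before the address is ready.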

Chuck Short (zulcss) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. Unfortunately, we can't fix it because your description didn't include enough information. You may find it helpful to read "How to report bugs effectively" http://www.chiark.greenend.org.uk/~sgtatham/bugs.html. We'd be grateful if you would then provide a more complete description of the problem. We have instructions on debugging some types of problems at http://wiki.ubuntu.com/DebuggingProcedures.
At a minimum, we need:
1. the specific steps or actions you took that caused you to encounter the problem,
2. the behavior you expected, and
3. the behavior you actually encountered (in as much detail as possible).
Thanks!

When reporting bugs in the future please use apport, either via the appropriate application's "Help -> Report a Problem" menu or using 'ubuntu-bug' and the name of the package affected. You can learn more about this functionality at https://wiki.ubuntu.com/ReportingBugs.

Changed in bind9 (Ubuntu):
importance: Undecided → Low
status: New → Incomplete
Mark Schouten (mark-prevented) wrote :

This is odd. You would expect the address to be available when you configured it. Can you include your network config in /etc/network/interfaces for this interface?

Vihai (daniele-orlandi) wrote :

Mark,
Attached is a copy of /etc/network/interfaces.

Chuck,
> 1. the specific steps or actions you took that caused you to encounter the problem,

rebooted the box

> 2. the behavior you expected, and

bind not failing

> 3. the behavior you actually encountered (in as much detail as possible).

bind failed to listen

Okay, irony aside, what other information do you need?

> When reporting bugs in the future please use apport, either via the appropriate application's "Help -> Report a Problem"
> menu or using 'ubuntu-bug' and the name of the package affected. You can learn more about this functionality at
> https://wiki.ubuntu.com/ReportingBugs.

Yup, thanks.

Mark Schouten (mark-prevented) wrote :

Hmm, the interfaces file looks ok.

Funny thing is, if you didn't change anything between booting and starting bind, then the interfaces are being created too slowly, and the bug should be filed against something other than bind. Failing is the right thing for bind to do if the interface isn't there.

Vihai (daniele-orlandi) wrote :

Yes, it could well be an issue in upstart or in some related script. Unfortunately I'm not familiar with upstart, so I'd have to spend some time figuring out where the problem is (and at the moment I don't have the time...).

Vihai (daniele-orlandi) wrote :

This bug is still present in the lucid release. I just upgraded 4 boxes and BIND didn't start at boot on them, although it started fine when launched manually.

May 3 13:10:08 sid named[893]: starting BIND 9.7.0-P1 -u bind
May 3 13:10:08 sid named[893]: built with '--prefix=/usr' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--sysconfdir=/etc/bind' '--localstatedir=/var' '--enable-threads' '--enable-largefile' '--with-libtool' '--enable-shared' '--enable-static' '--with-openssl=/usr' '--with-gssapi=/usr' '--with-gnu-ld' '--with-dlz-postgres=no' '--with-dlz-mysql=no' '--with-dlz-bdb=yes' '--with-dlz-filesystem=yes' '--with-dlz-ldap=yes' '--with-dlz-stub=yes' '--with-geoip=/usr' '--enable-ipv6' 'CFLAGS=-fno-strict-aliasing -DDIG_SIGCHASE -O2' 'LDFLAGS=-Wl,-Bsymbolic-functions' 'CPPFLAGS='
May 3 13:10:08 sid named[893]: adjusted limit on open files from 1024 to 1048576
May 3 13:10:08 sid named[893]: found 4 CPUs, using 4 worker threads
May 3 13:10:08 sid named[893]: using up to 4096 sockets
May 3 13:10:08 sid named[893]: loading configuration from '/etc/bind/named.conf'
May 3 13:10:08 sid named[893]: reading built-in trusted keys from file '/etc/bind/bind.keys'
May 3 13:10:08 sid named[893]: using default UDP/IPv4 port range: [1024, 65535]
May 3 13:10:08 sid named[893]: using default UDP/IPv6 port range: [1024, 65535]
May 3 13:10:08 sid named[893]: not listening on any interfaces
May 3 13:10:08 sid named[893]: generating session key for dynamic DNS
May 3 13:10:08 sid named[893]: could not get query source dispatcher (62.212.1.11#0)
May 3 13:10:08 sid named[893]: loading configuration: address not available
May 3 13:10:08 sid named[893]: exiting (due to fatal error)

It may be triggered by the fact that I explicitly bind to specific network addresses; either way, it is still present.

Launchpad Janitor (janitor) wrote :

[Expired for bind9 (Ubuntu) because there has been no activity for 60 days.]

Changed in bind9 (Ubuntu):
status: Incomplete → Expired
Vihai (daniele-orlandi) wrote :

This problem is still very much present on Maverick.

A notable fact might be that I explicitly bind to certain IPv6 addresses instead of the default "all interfaces".

Vihai (daniele-orlandi)
Changed in bind9 (Ubuntu):
status: Expired → New
Chuck Short (zulcss)
Changed in bind9 (Ubuntu):
status: New → Triaged
mibus (mibus) wrote :

I have a near-identical issue. During boot, this gets logged:

named[770]: could not get query source dispatcher (2001:xxxx:xxxx:xxxx::xxx#0)
named[770]: loading configuration: address not available
named[770]: exiting (due to fatal error)

Manually starting BIND fixes the issue.

This is the named.conf line in question:
query-source-v6 address 2001:xxxx:xxxx:xxxx::xxx port *;

I believe it's because the IPv6 address is marked 'tentative' while the IPv6 duplicate-address-detection process is in progress, and BIND is starting before it's finished.
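
A rough way to check this theory is to look at the flags that `ip -6 addr show` prints for the address: while duplicate address detection is running the address carries the `tentative` flag, and `dadfailed` if detection failed. The Python sketch below classifies an address from the one-line (`-o`) output of that command and polls until it is usable. The function names, the interface name and the timeout values are all assumptions made for illustration; this is not how named or the Ubuntu packaging handles it.

```python
import ipaddress
import subprocess
import time


def address_state(ip_output, address):
    """Classify `address` from `ip -o -6 addr show` text.

    Returns 'missing', 'tentative', 'dadfailed' or 'ready'.
    """
    wanted = ipaddress.ip_address(address)
    for line in ip_output.splitlines():
        fields = line.split()
        # -o layout: "<index>:", ifname, "inet6", "<addr>/<prefix>", flags...
        if len(fields) < 4 or fields[2] != "inet6":
            continue
        try:
            found = ipaddress.ip_interface(fields[3]).ip
        except ValueError:
            continue  # skip anything that does not parse as addr/prefix
        if found != wanted:
            continue
        if "dadfailed" in fields:
            return "dadfailed"
        if "tentative" in fields:
            return "tentative"
        return "ready"
    return "missing"


def wait_until_ready(address, dev, timeout=10.0, interval=0.2):
    """Poll `ip` until `address` on `dev` is usable or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = subprocess.run(
            ["ip", "-o", "-6", "addr", "show", "dev", dev],
            capture_output=True, text=True, check=False,
        )
        if address_state(result.stdout, address) == "ready":
            return True
        time.sleep(interval)
    return False
```

Running something like `wait_until_ready("2001:db8::1", "eth0")` (placeholder address and interface) right before starting named would show whether the address is still tentative at the moment the daemon tries to bind.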

Stéphane Loeuillet (leroutier) wrote :

bind9 starts at boot, but before all interfaces are up.
This is because of the parallel boot (upstart).

Vihai (daniele-orlandi) wrote :

Just upgraded to oneiric, bug persists.... sigh....

Alessio Bravi (twstr) wrote :

The problem is still present, and it is really annoying.

I can't understand why it has been classified as "Low" importance.

Giuseppe Ravasio (giuseppe-ravasio) wrote :

I was having the same problem on a new DNS server (Ubuntu 10.04 LTS) but not on an old one (also Ubuntu 10.04 LTS).
I noticed that the two /etc/network/interfaces files had the iface stanzas in a different order, and moving the inet6 stanza before the IPv4 IP aliases solved the problem.

### WORKING ONE ###
auto lo
iface lo inet loopback
auto eth0
iface eth0 inet static
        address 93.xxxxx
        netmask 255.255.254.0
        network 93.xxxxxx
        broadcast 93.xxxxxx
iface eth0 inet6 static
        address 2a01:xxxxxxxxxxxxxxxx
        netmask 64
        gateway 2a01:xxxxxxxxxxxxxxxx
        up ip -6 addr add 2a01:xxxxxxxxxxxxx/64 dev eth0
auto eth0:NS1
iface eth0:NS1 inet static
        address 93.xxxxxx
        netmask 255.255.254.0
######### END ##############

### BROKEN ONE ###
auto lo
iface lo inet loopback
auto eth0
iface eth0 inet static
        address 93.xxxxx
        netmask 255.255.254.0
        network 93.xxxxxx
        broadcast 93.xxxxxx
auto eth0:NS1
iface eth0:NS1 inet static
        address 93.xxxxxx
        netmask 255.255.254.0

iface eth0 inet6 static
        address 2a01:xxxxxxxxxxxxxxxx
        netmask 64
        gateway 2a01:xxxxxxxxxxxxxxxx
        up ip -6 addr add 2a01:xxxxxxxxxxxxx/64 dev eth0
############ END #############

Rami Lehti (ramilehti) wrote :

This bug is still present in 20.04.
It does not matter whether I have listen-on-ipv6 set to a specific address or to any; the described behaviour is seen either way.

As a workaround, if I set a static IPv6 address on the interface, it works. But if the interface is set to auto, or there is otherwise a slight delay in setting the address, the problem reappears.

IMHO the solution is that network.target should not be reached until the IPv6 auto address mechanism has had a chance to run. This will delay the boot by a second or two, but it will ensure that the interface has an address when daemons start to bind to it.

Robie Basak (racb) wrote :

I'm not sure if everyone is having the issue here. But if in named.conf you're explicitly binding to a particular address, then you need to *also* configure named to start only after that interface is configured. See https://lists.ubuntu.com/archives/ubuntu-devel/2021-May/041455.html for some fairly recent discussion on this. The exact nature of this has changed over the years since this bug was first opened; since systemd was introduced in 16.04, it's systemd that is relevant. Some users might find that adding After=network-online.target works for them. The ML thread goes more into why this isn't the default.

tags: added: network-online-ordering
Simon Déziel (sdeziel) wrote :

I tested on a Jammy machine running bind9 1:9.18.1-1ubuntu1 and there, bind9 won't complain if the IPv6 address it is supposed to listen on is missing. Bind9 will simply start listening when the IP finally shows up. This makes it more resilient to IPv6 DAD taking time.

Paride Legovini (paride) wrote :

@Simon thanks for testing this on Jammy. AIUI Jammy's bind9 behaves in the best possible way:

 - Doesn't spam the logs with errors if an address is missing;
 - Starts listening on the address as soon as it becomes available.

This looks like a Fix Released to me, but I'd like confirmation from you and/or Vihai. I'm marking this as Incomplete for now.

I don't think this is SRU material, but we can consider adding Triaged+Low tasks for the stable releases for tracking.

Changed in bind9 (Ubuntu):
status: Triaged → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for bind9 (Ubuntu) because there has been no activity for 60 days.]

Changed in bind9 (Ubuntu):
status: Incomplete → Expired