ntpd does not listen on 127.0.1.1, the IP address associated with the system hostname

Reported by Brian Burch on 2010-07-11
This bug affects 6 people
Affects          Status  Importance  Assigned to  Milestone
bind9 (Ubuntu)           Undecided   Unassigned
ntp (Ubuntu)             Medium      Unassigned

Bug Description

Binary package hint: ifupdown

release: Ubuntu 10.04 LTS (but probably all releases back to Hardy and forward to Maverick)
package: ifupdown: Installed: 0.6.8ubuntu29

Since upgrading my systems to lucid and cleaning up the configurations, I've noticed strange behaviour where client tcp or udp sessions to local tcp/ip servers have timed out even though the servers were running and netstat showed they were listening on all interfaces. Wireshark traces showed that successful connections went to 127.0.0.1, while any connection attempt to 127.0.1.1 would fail.

The systems are fully controlled by NetworkManager <offtopic>a long and painful story!</offtopic> and so /etc/network/interfaces contains only two lines:
    auto lo
    iface lo inet loopback

This "minimal configuration" is not only highly recommended to avoid weird behaviour with NetworkManager, but also to avoid problems with upstart (e.g. https://bugs.launchpad.net/ubuntu/+source/ifupdown/+bug/497299). All the other network interfaces are managed by NM.

"In the old days", a typical /etc/hosts file would define all the local hostnames as follows:
    127.0.0.1 myhostname.localdomain myhostname localhost.localdomain localhost

For several releases (back as far as Hardy), ubuntu installations have been creating a more sophisticated version of the hosts file (presumably because of http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=316099), e.g.
    127.0.0.1 localhost
    127.0.1.1 myhostname.mydomain myhostname

This modern variant allows the system to always resolve its own hostname and canonical name, even when there is no external network.

It appears to work really well... e.g. "ping hostname" and "ping localhost" both succeed. "hostname" and "hostname -f" return the expected values. (note: I don't fully understand why ping hostname actually works - perhaps because the class A 127.0.0.0 subnet has been defined for ever as a loopback subnet?).
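On the question of why pinging the hostname works: RFC 1122 reserves the whole 127.0.0.0/8 block for loopback, and Linux treats every address in that block as local to lo, so 127.0.1.1 answers even though only 127.0.0.1 is configured on the interface. A small sketch using Python's standard ipaddress module (the helper name is_loopback is mine, purely for illustration):

```python
import ipaddress

# The block reserved for loopback by RFC 1122.
LOOPBACK_NET = ipaddress.ip_network("127.0.0.0/8")


def is_loopback(addr: str) -> bool:
    """Return True if addr falls inside 127.0.0.0/8."""
    return ipaddress.ip_address(addr) in LOOPBACK_NET


if __name__ == "__main__":
    # Addresses taken from the hosts file and netstat output in this report.
    for addr in ("127.0.0.1", "127.0.1.1", "10.1.252.115"):
        print(addr, is_loopback(addr))
```

This only shows that the address is routable to the local machine. Whether a server actually receives traffic sent there depends on which addresses its sockets are bound to.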

The problem comes when a local client tries to connect to a local server... a well-behaved server will listen on "all available network interfaces", e.g.
    brian@myhostname:~$ netstat -ln | grep 53
    tcp 0 0 10.1.252.115:53 0.0.0.0:* LISTEN
    tcp 0 0 192.168.252.115:53 0.0.0.0:* LISTEN
    tcp 0 0 127.0.0.1:53 0.0.0.0:* LISTEN
    udp 0 0 10.1.252.115:53 0.0.0.0:*
    udp 0 0 192.168.252.115:53 0.0.0.0:*
    udp 0 0 127.0.0.1:53 0.0.0.0:*

... so, when a client tries to connect to the local server (in my not very realistic example you would have to edit /etc/resolv.conf to point to myhostname instead of the default localhost), the nslookup connection would never be received by the local bind server and so will eventually time out.

If we are to keep the "new style" hosts file structure (and I think we should), then we MUST ensure that local servers are able to access all available interfaces, including BOTH of the loopback addresses mentioned in the hosts file, so we need to define an additional interface for the myhostname loopback address.

This can be easily demonstrated by defining a second (virtual) address for the lo interface:
    sudo ifconfig lo:0 127.0.1.1 netmask 255.0.0.0

Restart the server and it will discover the extra interface:
    brian@myhostname:~$ netstat -ln | grep 53
    tcp 0 0 10.1.252.115:53 0.0.0.0:* LISTEN
    tcp 0 0 192.168.252.115:53 0.0.0.0:* LISTEN
    tcp 0 0 127.0.1.1:53 0.0.0.0:* LISTEN
    tcp 0 0 127.0.0.1:53 0.0.0.0:* LISTEN
    udp 0 0 10.1.252.115:53 0.0.0.0:*
    udp 0 0 192.168.252.115:53 0.0.0.0:*
    udp 0 0 127.0.1.1:53 0.0.0.0:*
    udp 0 0 127.0.0.1:53 0.0.0.0:*

Of course, this circumvention will only last until the next reboot. The solution can be made permanent by updating the /etc/network/interfaces file to automatically bring up both loopback interfaces early in the boot process:

    # This file describes the network interfaces available on your system
    # and how to activate them. For more information, see interfaces(5).

    # automatically bring up both these interfaces at boot (ifup -a == all)
    auto lo lo:0

    # The standard loopback network interface
    iface lo inet loopback

    # another loopback interface for that pesky dual loopback hosts file
    iface lo:0 inet static
      address 127.0.1.1
      netmask 255.0.0.0

I do not know if there is a more elegant solution to the same problem (but remember that some of the server processes are started very early, hence the upstart bug reference above).

If acceptable, this bug should be pushed upstream to encompass the other distros that use the same multiple loopback interface /etc/hosts file organisation.

Nick Burch (ubuntu-gagravarr) wrote :

A possibly cleaner fix would be to set the iface entry to:

    auto lo
    iface lo inet loopback
      post-up ip addr add 127.0.1.1/8 dev lo

However, if you want to fix the problem for good, apply this patch to ifupdown and rebuild the package:

--- ifupdown-0.6.8ubuntu29/ifupdown.nw.sav 2010-07-17 13:31:19.155758540 +0100
+++ ifupdown-0.6.8ubuntu29/ifupdown.nw 2010-07-17 13:30:32.587758481 +0100
@@ -4031,6 +4031,7 @@

   up
     ifconfig %iface% 127.0.0.1 up
+    ifconfig %iface%:0 127.0.1.1 up
     route add -net 127.0.0.0 if ( mylinuxver() < mylinux(2,1,100) )

   down

Brian Burch (brian-pingtoo) wrote :

I built a local ifupdown package including Nick's patch. It works perfectly, so as far as I'm concerned the bug has an acceptable permanent fix. Thanks for your help!

Because of the debian bug 316099 (link above), I presume Nick's fix needs to be pushed upstream (at least) that far.

tags: added: patch
Harald Meland (hmeland) wrote :

Any ETA on when this will be fixed in 10.04?

I think this problem is likely why I've had problems getting e.g. "mvn jetty:run" to work on my laptop -- but only when a VPN connection (split-tunnel, I think) to my workplace is in place.

Brian Burch (brian-pingtoo) wrote :

I'm disappointed that my fix hasn't been implemented yet. I suppose most people running local TCP/IP servers have already kludged their hosts files and so don't encounter this problem...

Local clients that connect to "standard" local servers (such as bind and ntpd) seem to historically hardcode either "localhost" or "127.0.0.1" as the address of their local server. This works because the ubuntu/debian default /etc/hosts file defines 127.0.0.1 as localhost and so the definition will be available no matter what the network interfaces might look like at the time.

However, any client that takes a less stone-age approach (especially java programs) will use their own hostname to address their local server. The new-format default hosts file will ALWAYS resolve the local hostname to 127.0.1.1. UNLESS we provide an interface that corresponds to this "new" local address, the server will never know it should be listening on that address! My fix defines a secondary address for the loopback interface, so the local server will discover it and listen on it.
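To make the resolution step concrete, here is a rough sketch of a hosts-file lookup applied to the new-format file quoted earlier in this report. The resolve function is a simplified stand-in written for illustration; it is not how glibc or Java actually perform the lookup, but it returns the same address for these two entries.

```python
# The "new style" default hosts file quoted in the bug description.
HOSTS = """\
127.0.0.1 localhost
127.0.1.1 myhostname.mydomain myhostname
"""


def resolve(name, hosts_text=HOSTS):
    """Return the first address whose hosts entry lists name, else None."""
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split into address + names.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and name in fields[1:]:
            return fields[0]
    return None


if __name__ == "__main__":
    for name in ("localhost", "myhostname", "myhostname.mydomain"):
        print(name, "->", resolve(name))
```

A client that hardcodes localhost ends up at 127.0.0.1, while one that looks up its own hostname ends up at 127.0.1.1, which is the address the per-interface listeners never see.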

Since opening this bug, ifupdown-0.6.8ubuntu29.1 has been released and it still does not implement my fix. Therefore, I have attached an improved patch for the latest version of the package. Please would someone apply it?

Brian Burch (brian-pingtoo) wrote :
tags: added: udd-find
Changed in ifupdown (Debian):
status: Unknown → New
Brian Burch (brian-pingtoo) wrote :

Encouraging progress with the experimental version of ifupdown which appears to resolve this problem, see...
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=617268

Steve Langasek (vorlon) wrote :

I don't believe there's any bug in ifupdown here. As mentioned in the upstream Debian bug, you do not need an explicit 127.0.1.1 network interface to receive requests on that address, *as long as* you are listening on the "any" address. This is indeed what openssh is doing by default; being able to ssh to 127.0.1.1 has nothing to do with any ifupdown changes.
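The point about the "any" address can be checked with two throwaway UDP sockets. This is only a sketch, assuming a Linux host with lo up; the function name reachable_via and the 0.5 second timeout are illustrative choices and have nothing to do with ifupdown or bind9 themselves.

```python
import socket


def reachable_via(bind_addr: str, dest_addr: str, timeout: float = 0.5) -> bool:
    """Return True if a datagram sent to dest_addr reaches a socket bound to bind_addr."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        server.bind((bind_addr, 0))      # port 0: let the kernel pick a free port
        server.settimeout(timeout)
        port = server.getsockname()[1]
        try:
            client.sendto(b"ping", (dest_addr, port))
            data, _ = server.recvfrom(16)
        except OSError:                  # timeout or unreachable destination
            return False
        return data == b"ping"


if __name__ == "__main__":
    # Wildcard listener versus a listener bound only to 127.0.0.1.
    for bind_addr in ("0.0.0.0", "127.0.0.1"):
        print(bind_addr, "<- 127.0.1.1:", reachable_via(bind_addr, "127.0.1.1"))
```

On Linux I would expect the wildcard listener to receive the datagram sent to 127.0.1.1 and the listener bound to 127.0.0.1 to miss it, which matches the difference between openssh and the per-address services described in this report.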

Out of the many services I run, the only ones I find that have problems with 127.0.1.1 are bind, ntpd, and nmbd. nmbd is entirely uninteresting, because it's only relevant on broadcast interfaces (which is part of why it must bind by IP). ntpd's behavior could be considered a bug, as could bind9. Since you mention bind9 explicitly, I think reassigning this report to the bind9 package is the reasonable course of action here.

affects: ifupdown (Ubuntu) → bind9 (Ubuntu)
Changed in bind9 (Ubuntu):
status: New → Invalid
status: Invalid → Won't Fix
status: Won't Fix → New
tags: removed: patch

On 10/08/11 06:42, Steve Langasek wrote:
> I don't believe there's any bug in ifupdown here. As mentioned in the
> upstream Debian bug, you do not need an explicit 127.0.1.1 network
> interface to receive requests on that address, *as long as* you are
> listening on the "any" address. This is indeed what openssh is doing by
> default; being able to ssh to 127.0.1.1 has nothing to do with any
> ifupdown changes.
> Out of the many services I run, the only ones I find that have problems
> with 127.0.1.1 are bind, ntpd, and nmbd. nmbd is entirely
> uninteresting, because it's only relevant on broadcast interfaces (which
> is part of why it must bind by IP). ntpd's behavior could be considered
> a bug, as could bind9. Since you mention bind9 explicitly, I think
> reassigning this report to the bind9 package is the reasonable course of
> action here.
> ** Package changed: ifupdown (Ubuntu) => bind9 (Ubuntu)
> ** Tags removed: patch

Steve,

Thanks for taking interest. I went quiet because I am very worried about
causing undesirable side effects by rushing into conclusions about a
fix. I have several different production systems that I can test
carefully, but I don't have a lab system to play with.

There has been a lot of update activity on ifupdown since our latest
stable release, 0.6.10ubuntu4 (0.6.10 on debian sid). The code is
currently a moving target in the form of the debian experimental
version. (In particular, it seems all interface changes are now made
with the ip program, rather than ifconfig and route). ifupdown handles
so many different kinds of interface and I can't test most of them.

My bug relates specifically to handling of the IPv4 loopback addresses
and that logic HAS changed in the new version. The current version
explicitly creates lo 127.0.0.1, so my patch explicitly creates lo:0 as
127.0.1.1 to deal with the "new style" debian hosts file. Once I
stripped down my own customised hosts files, my simplistic patch really
works for all of my own network server software.

However, the experimental ifupdown does "the right thing" (in agreement
with your statement above) because (I think) it doesn't explicitly
define ANY loopback interfaces at all. Server sockets listening on "any
interface" (0.0.0.0) certainly work fine and I haven't //yet// found any
applications that don't work as desired.

Nevertheless, it seems foolhardy to take a snapshot of the experimental
package and toss it into any production ubuntu repository. I'm not yet
confident enough to simply strip out and backport the IPv4 loopback
logic from the current alpha version either. I suspect many people have
hacked their hosts files to get round problems in the past, and there
are a lot of applications that might have been hacked too. All of these
could be at risk if we release a fix - even if we are making the package
"work properly at last".

I think we should keep this bug filed against ifupdown - it will
certainly be changed in the next release and will certainly fix this
"bug". I honestly can't say yet whether bind (amongst other
applications) is also at fault for listening on the "wrong" local
addresses. I think we should leave that...


Clint Byrum wrote :

Marking importance in bind as medium. I can see a definite need for the "FQDN" of the machine to always be addressable for services, and bind would need to work the same as other services that listen on "0.0.0.0". There are workarounds, and this is only some use cases, so Medium seems appropriate.

Changed in bind9 (Ubuntu):
importance: Undecided → Medium

On 11/08/11 19:06, Clint Byrum wrote:
> Marking importance in bind as medium. I can see a definite need for the
> "FQDN" of the machine to always be addressable for services, and bind
> would need to work the same as other services that listen on "0.0.0.0".
> There are workarounds, and this is only some use cases, so Medium seems
> appropriate.
>
> ** Changed in: bind9 (Ubuntu)
> Importance: Undecided => Medium

The Bind Manual from the latest ubuntu bind9-doc package says:

   If no listen-on is specified, the server will listen on
   port 53 on all IPv4 interfaces.

This statement is ambiguous. However, my /etc/bind/named.conf.options
DID NOT contain a listen-on clause and yet it does not listen on
0.0.0.0:53 (or :::53). It listens only on the explicit addresses of
localhost (127.0.0.1) and my ethernet interface.

The Bind Manual goes on to say:

   When { any; } is specified as the address_match_list for the
   listen-on-v6 option, the server does not bind a separate socket
   to each IPv6 interface address as it does for IPv4....

This implies that the IPv4 listen addresses will be selected after
enumerating the available interfaces. If true, then bind9 will not
discover the 127.0.1.1 address assigned in the default hosts file
because it isn't defined as an interface.

I tried a bypass of explicitly coding within the options section:

   listen-on { 127.0.0.1; 127.0.1.1; 10.1.252.11; };

After I restarted bind9, I was disappointed to see only:

tcp 0 0 10.1.252.11:53 0.0.0.0:* LISTEN 15451/named
tcp 0 0 127.0.0.1:53 0.0.0.0:* LISTEN 15451/named
udp 0 0 10.1.252.11:53 0.0.0.0:* 15451/named
udp 0 0 127.0.0.1:53 0.0.0.0:* 15451/named

... it still isn't listening on 127.0.1.1!
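The same check can be done mechanically on the netstat output. The sketch below assumes the column layout shown above (protocol, two queue counters, local address, foreign address); the helper names listening_addresses and covers are hypothetical.

```python
# netstat lines copied from the output above.
NETSTAT = """\
tcp 0 0 10.1.252.11:53 0.0.0.0:* LISTEN 15451/named
tcp 0 0 127.0.0.1:53 0.0.0.0:* LISTEN 15451/named
udp 0 0 10.1.252.11:53 0.0.0.0:* 15451/named
udp 0 0 127.0.0.1:53 0.0.0.0:* 15451/named
"""


def listening_addresses(netstat_text, port):
    """Collect the local IPv4 addresses that have a socket on the given port."""
    addrs = set()
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0] not in ("tcp", "udp"):
            continue
        host, _, local_port = fields[3].rpartition(":")
        if local_port == str(port):
            addrs.add(host)
    return addrs


def covers(addrs, target):
    """A target is covered by an exact match or by a wildcard listener."""
    return target in addrs or "0.0.0.0" in addrs


if __name__ == "__main__":
    addrs = listening_addresses(NETSTAT, 53)
    print(sorted(addrs), "covers 127.0.1.1:", covers(addrs, "127.0.1.1"))
```

With only per-address sockets and no wildcard entry, nothing covers 127.0.1.1, so a query sent to the hostname's address has no listener to receive it.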

p.s. cvsd doesn't listen on 0.0.0.0:2401, but it probably doesn't matter.

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in bind9 (Ubuntu):
status: New → Confirmed
Thomas Hood (jdthood) wrote :

With reference to comment #8, does this issue still affect ntpd and should it therefore be additionally-assigned accordingly?

Thomas Hood (jdthood) on 2012-04-04
summary: - network servers do not listen on 127.0.1.1
+ Certain services do not listen on 127.0.1.1

I am convinced this bug has more-or-less become irrelevant now that most applications have been reworked to use the dual ip stack, so I think it can be closed against ifupdown.

With specific reference to ntp, I have looked carefully at the behaviour of ntpd on oneiric 11.10 version 1:4.2.6.p2+dfsg-1ubuntu12.

It is listening on udp 0.0.0.0:123, which confirms that it would accept a packet addressed to 127.0.1.1. However, I note that my own ntpd is also listening on udp 127.0.0.1:123, which is redundant.

This is undoubtedly triggered by my /etc/ntp.conf, which uses one of the recommended set of restrict statements as follows:

# By default, exchange time with everybody, but don't allow configuration.
restrict -4 default kod notrap nomodify nopeer noquery
restrict -6 default kod notrap nomodify nopeer noquery

# Local users may interrogate the ntp server more closely.
restrict 127.0.0.1
restrict ::1

It seems as if ntpd uses specific sockets to implement its restrict rules, which is a very peculiar design decision. After reading the documentation, I tried changing my configuration as follows:

restrict 127.0.0.0 mask 255.0.0.0

... but that was not 100% successful - here is the syslog:

Apr 6 09:15:04 schizo ntpd[24872]: ntpd 4.2.6p2@1.2194 Fri Sep 2 18:37:15 UTC 2011 (1)
Apr 6 09:15:04 schizo ntpd[24873]: proto: precision = 0.596 usec
Apr 6 09:15:04 schizo ntpd[24873]: ntp_io: estimated max descriptors: 1024, initial socket boundary: 16
Apr 6 09:15:04 schizo ntpd[24873]: Listen and drop on 0 v4wildcard 0.0.0.0 UDP 123
Apr 6 09:15:04 schizo ntpd[24873]: Listen and drop on 1 v6wildcard :: UDP 123
Apr 6 09:15:04 schizo ntpd[24873]: Listen normally on 2 lo 127.0.0.1 UDP 123
Apr 6 09:15:04 schizo ntpd[24873]: Listen normally on 3 eth0 10.1.252.200 UDP 123
Apr 6 09:15:04 schizo ntpd[24873]: Listen normally on 4 eth0 fe80::218:f3ff:fe43:7e4f UDP 123
Apr 6 09:15:04 schizo ntpd[24873]: Listen normally on 5 lo ::1 UDP 123
Apr 6 09:15:04 schizo ntpd[24873]: attempt to configure invalid address 127.0.1.1

I do not have any restrict statements for the individual interfaces, so it isn't clear to me why ntpd needs to have different sockets for each of its implicit and explicit restrict rules.

Curiously, my attempt to supply the correct mask for the lo interface was partially acceptable, because ntpd discovered the 127.0.1.1 interface and subsequently found something wrong with it. Without looking at the code, I can't say whether there is a bug in ntpd.
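One detail in the syslog is worth pulling out: ntpd reports "Listen and drop" for the wildcard sockets and "Listen normally" only for the per-interface ones. A rough sketch that classifies the lines, assuming the message format stays as shown in the log above (the regular expression and the listen_modes name are mine):

```python
import re

# Matches the "Listen ... on N iface addr UDP port" messages in the log above.
LISTEN_RE = re.compile(
    r"Listen (?P<mode>and drop|normally) on \d+ (?P<iface>\S+) (?P<addr>\S+) UDP \d+"
)

# A few lines copied from the syslog excerpt.
SAMPLE = """\
Apr 6 09:15:04 schizo ntpd[24873]: Listen and drop on 0 v4wildcard 0.0.0.0 UDP 123
Apr 6 09:15:04 schizo ntpd[24873]: Listen normally on 2 lo 127.0.0.1 UDP 123
Apr 6 09:15:04 schizo ntpd[24873]: Listen normally on 3 eth0 10.1.252.200 UDP 123
"""


def listen_modes(syslog_text):
    """Map each listen address to 'drop' or 'normal' as reported by ntpd."""
    modes = {}
    for line in syslog_text.splitlines():
        match = LISTEN_RE.search(line)
        if match:
            mode = "drop" if match.group("mode") == "and drop" else "normal"
            modes[match.group("addr")] = mode
    return modes


if __name__ == "__main__":
    for addr, mode in listen_modes(SAMPLE).items():
        print(addr, mode)
```

If the wildcard socket really does discard what it receives, as the log wording suggests, then a packet addressed to 127.0.1.1 would arrive there and go unanswered, because no "Listen normally" socket exists for that address.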

Thomas Hood (jdthood) wrote :

See comment #8.

affects: bind9 (Ubuntu) → ntp (Ubuntu)
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in bind9 (Ubuntu):
status: New → Confirmed
Thomas Hood (jdthood) on 2012-12-04
affects: ifupdown (Debian) → bind9 (Ubuntu)
Changed in bind9 (Ubuntu):
importance: Unknown → Undecided
Thomas Hood (jdthood) wrote :

I assign this back to bind9 only because, so far as I can tell, I inadvertently de-assigned it from bind9 a while back. I don't mean to imply that I think that there is a problem with bind9.

I'll go further. Now that the NetworkManager-controlled dnsmasq process listens at 127.0.1.1, it's an advantage that named doesn't listen at 127.0.1.1. Because of this, bind9 doesn't conflict with nm-dnsmasq.

To check that there is no conflict I just installed bind9 alongside nm-dnsmasq on Ubuntu 12.10 and ran "netstat -nl4p".

    udp 0 0 127.0.1.1:53 0.0.0.0:* 7365/dnsmasq
    udp 0 0 127.0.0.1:53 0.0.0.0:* 8898/named
    [...]

Named listens at all addresses assigned to interfaces but not on the wildcard address.

Now, I know that dnsmasq has a new mode of operation called "bind-dynamic" which is like "bind-interfaces" mode except that it updates its listen addresses when network interfaces get configured and deconfigured.

It appears that bind9 operates in the same way. I added a virtual interface eth0:0 with a bogus address and named started listening on that without being restarted. I brought up a wireless interface and named started listening on that, too. Nice.

Thomas Hood (jdthood) on 2012-12-06
Changed in bind9 (Ubuntu):
status: New → Incomplete
Thomas Hood (jdthood) wrote :

For the reason explained in comment #17, setting to Invalid for bind9.

Changed in bind9 (Ubuntu):
status: Incomplete → Invalid
Changed in ntp (Ubuntu):
status: Confirmed → Incomplete
Thomas Hood (jdthood) wrote :

@Brian Burch: What is your opinion now about this report, insofar as it affects ntp?

summary: - Certain services do not listen on 127.0.1.1
+ ntpd does not listen on 127.0.1.1, the IP address associated with the
+ system hostname

On 17/12/12 14:34, Thomas Hood wrote:
> For the reason explained in comment #17, setting to Invalid for bind9.
>
> ** Changed in: bind9 (Ubuntu)
> Status: Incomplete => Invalid
>
> ** Changed in: ntp (Ubuntu)
> Status: Confirmed => Incomplete
>

Sorry about the long delay. I have just worked through the explanation
in comment #17 and can confirm that I see the same (desirable) behaviour
on a reasonably standard ubuntu 12.10 system. I agree that the original
bind9 problem is no longer relevant.

Thanks,

Brian

Brian Burch (brian-pingtoo) wrote :

On 17/12/12 14:38, Thomas Hood wrote:
> @Brian Burch: What is your opinion now about this report, insofar as it
> affects ntp?
>
> ** Summary changed:
>
> - Certain services do not listen on 127.0.1.1
> + ntpd does not listen on 127.0.1.1, the IP address associated with the system hostname
>

I did a quick check on a 12.10 ubuntu system and it superficially looks
as if the problem is still valid.

ntpd is listening on 127.0.0.1:123 and 0.0.0.0:123 (among others)

ntpq -p works OK.
ntpq -p localhost works OK.
ntpq -p myhostname fails with "timeout, nothing received".

/etc/hosts has localhost defined as 127.0.0.1 and myhostname defined as
127.0.1.1. Both of these hostnames can be pinged successfully, as you
would expect.

I don't want to be pedantic, because this bug doesn't affect me. It
simply leads to illogical behaviour from ntpd. If there was a way to
close it with a status "nobody cares", I wouldn't complain. On the other
hand, "invalid" doesn't feel right. What about "wontfix"?

Brian

Thomas Hood (jdthood) wrote :

Yes, I'd say that "wontfix" is appropriate unless someone comes forward with a reason why ntpd *should* listen at 127.0.1.1.

P.S. I earlier wrote the following.

> It appears that bind9 operates in the same way.
> I added a virtual interface eth0:0 with a bogus address
> and named started listening on that without being
> restarted.

I later discovered that this occurs because bind9 has a hook script in /etc/network/if-up.d/ which does "rndc reconfig".
