mdns lookups fail over ipv6

Bug #369008 reported by Matt LaPlante
This bug affects 6 people
Affects            Status     Importance  Assigned to  Milestone
nss-mdns (Ubuntu)  Confirmed  Undecided   Unassigned

Bug Description

Binary package hint: libnss-mdns

I have avahi running on a couple of hosts (Jaunty) in my dual-stack LAN. All IPv4 mDNS resolution seems to work fine, but IPv6 resolution is failing, even though resolving the same IPv6 hosts works fine using BIND. I'm not sure exactly where in the chain the fault lies. Additionally, a Mac running Bonjour has no problem resolving the same hosts using IPv6.

/etc/avahi/avahi-daemon.conf
[server]
#host-name=foo
#domain-name=local
#browse-domains=0pointer.de, zeroconf.org
use-ipv4=yes
use-ipv6=yes
#check-response-ttl=no
#use-iff-running=no
#enable-dbus=yes
#disallow-other-stacks=no
#allow-point-to-point=no

[wide-area]
enable-wide-area=yes

[publish]
#disable-publishing=no
#disable-user-service-publishing=no
#add-service-cookie=no
#publish-addresses=yes
#publish-hinfo=yes
#publish-workstation=yes
#publish-domain=yes
#publish-dns-servers=192.168.50.1, 192.168.50.2
#publish-resolv-conf-dns-servers=yes
#publish-aaaa-on-ipv4=yes
publish-a-on-ipv6=yes

[reflector]
#enable-reflector=no
#reflect-ipv=no
...

----------------

root@host1:/# ping host2
PING host2 (192.168.4.1) 56(84) bytes of data.
64 bytes from host2 (192.168.4.1): icmp_seq=1 ttl=64 time=0.243 ms
^C
--- host2 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.243/0.243/0.243/0.000 ms

root@host1:/# ping host2.local
PING host2.local (192.168.4.1) 56(84) bytes of data.
64 bytes from host2 (192.168.4.1): icmp_seq=1 ttl=64 time=0.212 ms
^C
--- host2.local ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.212/0.212/0.212/0.000 ms

root@host1:/# ping6 host2
PING host2(2001:X:X:X::1) 56 data bytes
64 bytes from 2001:X:X:X::1: icmp_seq=1 ttl=64 time=0.189 ms
^C
--- host2 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.189/0.189/0.189/0.000 ms

root@host1:/# ping6 host2.local
unknown host

----------------

The kicker is that the Mac has no problem with this last lookup, meaning the issue has to be with my Linux client implementation:

macbookpro:~$ ping6 host2.local
PING6(56=40+8+8 bytes) 2001:X:X:X:X:X:X:X --> 2001:X:X:X::1
16 bytes from 2001:X:X:X::1, icmp_seq=0 hlim=64 time=98.522 ms
16 bytes from 2001:X:X:X::1, icmp_seq=1 hlim=64 time=2.524 ms
^C
--- host2.local ping6 statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 2.524/50.523/98.522 ms

Tags: patch
AndrewD (andrewd-lists) wrote :

This works for me:

in /etc/avahi/avahi-daemon.conf [server] section:

use-ipv6=yes

Also change nsswitch.conf:

$ diff -u nsswitch.conf /etc/nsswitch.conf
--- nsswitch.conf 2009-04-16 11:30:17.132488848 +1000
+++ /etc/nsswitch.conf 2009-04-16 11:30:28.000000000 +1000
@@ -8,7 +8,7 @@
group: compat
shadow: compat

-hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4
+hosts: files mdns_minimal [NOTFOUND=return] dns mdns
networks: files

It would make sense for these changes to be the default. Avahi works well over IPv6, and IPv6 is perfect for zero-configuration link-local devices.

AndrewD (andrewd-lists) wrote :

Note that I have reported Bug 374674, which is indirectly related to this: it prevents using IPv6 addresses in some cases.

greg (grigorig) wrote :

+1 for this. I'd welcome a working IPv6 mDNS - and this is really trivial to implement, see AndrewD's post.

Jens Jorgensen (jorgensen) wrote :

Actually I'm a bit surprised to hear that the patch mentioned in 374674 fixes this bug. The problem with nss-mdns for .local addresses is that even when it finds the address via avahi (which works just fine), getaddrinfo returns the IPv6 address but always leaves scope_id empty. Linux doesn't like you trying to talk to an IPv6 link-local (fe80::) address without a valid scope_id, which makes sense because the same fe80:: address can be used on multiple links.
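To make the scope_id point concrete, here is a minimal Python sketch. It only checks whether a sockaddr tuple of the shape getaddrinfo returns (host, port, flowinfo, scope_id) pairs a link-local address with a zero scope ID. The helper name has_usable_scope and the sample addresses are made up for illustration; nothing here is taken from nss-mdns itself.

```python
import ipaddress


def has_usable_scope(sockaddr):
    """Return False for a link-local IPv6 sockaddr whose scope ID is 0.

    `sockaddr` is a 4-tuple (host, port, flowinfo, scope_id), the shape
    Python's socket.getaddrinfo() returns for AF_INET6 entries.
    """
    host, _port, _flowinfo, scope_id = sockaddr
    # Drop any "%zone" suffix (for example "%eth0") before parsing.
    address = ipaddress.IPv6Address(host.split("%", 1)[0])
    return not (address.is_link_local and scope_id == 0)


# Hypothetical sample values, for illustration only.
print(has_usable_scope(("fe80::1", 0, 0, 0)))       # False: link-local, no scope
print(has_usable_scope(("fe80::1%eth0", 0, 0, 2)))  # True: scope ID present
print(has_usable_scope(("2001:db8::1", 0, 0, 0)))   # True: global address
```

A global address passes with scope ID 0 because only link-local addresses are ambiguous across interfaces.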

On the NSS internal side, only _nss_gethostbyname4_r has a place to return the scope ID. I've written a patch to the nss-mdns.c code and submitted it to the Debian bug tracking this problem (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=644912), but after a week I've had no response. If you're interested in trying out my fix, you could 'apt-get source nss-mdns', apply my patch, and see how it works for you. I've had it in place on my Ubuntu machine for a couple of weeks now and everything is working smoothly. FYI, I also sent the patch upstream to the nss-mdns author but have had no response there yet either :-(

Jens Jorgensen (jorgensen) wrote :

I should also note that nss-mdns is normally configured so that only IPv4 lookups are done:

hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4

so you need to remove the "4" from mdns in order to get the code that will do ipv6 lookups.
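As a rough way to check which case a machine is in, the following Python sketch scans the text of an nsswitch.conf for the hosts: line and reports whether any listed mdns module is not one of the IPv4-only mdns4 variants. The function names are hypothetical, and the parsing is deliberately simple (first hosts: line only, # comments stripped).

```python
def mdns_modules(nsswitch_text):
    """Return the mdns* service names found on the first hosts: line."""
    for line in nsswitch_text.splitlines():
        line = line.split("#", 1)[0].strip()  # ignore comments
        if line.startswith("hosts:"):
            words = line[len("hosts:"):].split()
            return [w for w in words if w.startswith("mdns")]
    return []


def ipv6_mdns_enabled(nsswitch_text):
    """True if at least one mdns module is not an IPv4-only mdns4 variant."""
    return any(not m.startswith("mdns4") for m in mdns_modules(nsswitch_text))


default_line = "hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4\n"
changed_line = "hosts: files mdns_minimal [NOTFOUND=return] dns mdns\n"
print(ipv6_mdns_enabled(default_line))  # False
print(ipv6_mdns_enabled(changed_line))  # True
```

On a real system the text would come from reading /etc/nsswitch.conf; the two sample lines above are the ones quoted in this report.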

Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "nss-mdns-ipv6-scope-id-patch.diff" of this bug report has been identified as being a patch. The ubuntu-reviewers team has been subscribed to the bug report so that they can review the patch. In the event that this is in fact not a patch you can resolve this situation by removing the tag 'patch' from the bug report and editing the attachment so that it is not flagged as a patch. Additionally, if you are member of the ubuntu-reviewers team please also unsubscribe the team from this bug report.

[This is an automated message performed by a Launchpad user owned by Brian Murray. Please contact him regarding any issues with the action taken in this bug report.]

tags: added: patch
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nss-mdns (Ubuntu):
status: New → Confirmed
Craig McQueen (cmcqueen1975) wrote :

I've tried applying the patch. I can confirm via Python socket.getaddrinfo() that the IPv6 scope ID is now being set. So that seems to be a good fix.

However, if I try ping6 or ssh -6 to my own PC craig-linux.local, I still get an "Invalid argument" error:
$ ping6 craig-linux.local
connect: Invalid argument
$ ssh -6 craig-linux.local
ssh: connect to host craig-linux.local port 22: Invalid argument

But this works:
$ ping6 -I eth2 craig-linux.local
PING craig-linux.local(fe80::21b:21ff:febb:73b2) from fe80::21b:21ff:febb:73b2 eth2: 56 data bytes
...

If I strace it, I see:
connect(3, {sa_family=AF_INET6, sin6_port=htons(1025), inet_pton(AF_INET6, "fe80::21b:21ff:febb:73b2", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 EINVAL (Invalid argument)

So the scope ID is still set to 0.
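For an application that knows which interface to use, the effect of "ping6 -I eth2" can be approximated by filling in the scope ID before calling connect(). The sketch below is only an illustration: with_scope is a hypothetical helper, and the if_nametoindex parameter exists so the lookup can be swapped out on a machine that has no interface called eth2.

```python
import socket


def with_scope(sockaddr, ifname, if_nametoindex=socket.if_nametoindex):
    """Return a copy of an IPv6 sockaddr with scope_id set from `ifname`.

    `sockaddr` is the 4-tuple (host, port, flowinfo, scope_id) used by
    AF_INET6 sockets; the interface index becomes the new scope ID.
    """
    host, port, flowinfo, _old_scope = sockaddr
    return (host, port, flowinfo, if_nametoindex(ifname))


# Demonstration with a stand-in lookup that maps every name to index 3,
# matching the scope ID shown for eth2 elsewhere in this report.
addr = ("fe80::21b:21ff:febb:73b2", 22, 0, 0)
print(with_scope(addr, "eth2", if_nametoindex=lambda name: 3))
```

With the default lookup, sock.connect(with_scope(addr, "eth2")) would hand the kernel a non-zero sin6_scope_id, provided an interface named eth2 exists.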

Craig McQueen (cmcqueen1975) wrote :

By the way, I'm currently testing on Ubuntu 13.04. I guess I should try 13.10 or 14.04.

Craig McQueen (cmcqueen1975) wrote :

Okay, I think I've found a problem with getaddrinfo() when the AF_INET6 family is specified. For example, Python's socket.getaddrinfo() provides the scope ID if _all_ families are looked up, but it returns a scope ID of 0 if only the IPv6 family is looked up. E.g. in Python:

>>> import socket
>>> from pprint import pprint
>>> a = socket.getaddrinfo("craig-linux.local", None)
>>> pprint(a)
[(10, 1, 6, '', ('fe80::21b:21ff:febb:73b2%eth2', 0, 0, 3)),
 (10, 2, 17, '', ('fe80::21b:21ff:febb:73b2%eth2', 0, 0, 3)),
 (10, 3, 0, '', ('fe80::21b:21ff:febb:73b2%eth2', 0, 0, 3)),
 (2, 1, 6, '', ('192.168.5.3', 0)),
 (2, 2, 17, '', ('192.168.5.3', 0)),
 (2, 3, 0, '', ('192.168.5.3', 0))]
>>> a = socket.getaddrinfo("craig-linux.local", None, socket.AF_INET6)
>>> pprint(a)
[(10, 1, 6, '', ('fe80::21b:21ff:febb:73b2', 0, 0, 0)),
 (10, 2, 17, '', ('fe80::21b:21ff:febb:73b2', 0, 0, 0)),
 (10, 3, 0, '', ('fe80::21b:21ff:febb:73b2', 0, 0, 0))]

So for some reason the scope ID is present when doing all-families look-up, but lacking in the IPv6-only family look-up.

ping6 fails because it is specifying AF_INET6.

I've got a test server listening on both IPv4 and IPv6. If I connect to it with "telnet craig-linux.local 12100" then it is able to connect successfully to the IPv6 address. But if I specify the -6 switch, "telnet -6 craig-linux.local 12100" then it fails to connect.
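Based on the behaviour observed above, one possible application-side workaround is to do the lookup with the family left unspecified and then keep only the IPv6 entries. This is a sketch under the assumption that the unspecified-family path keeps returning scope IDs, as it did in the output shown in this comment. The helper name ipv6_addrinfo and the resolver parameter are hypothetical.

```python
import socket


def ipv6_addrinfo(host, port=None, resolver=socket.getaddrinfo):
    """Resolve `host` without naming a family, then keep AF_INET6 results.

    Asking for AF_INET6 directly was observed to lose the scope ID, so the
    family filter is applied after the lookup instead of during it.
    """
    results = resolver(host, port)  # family left unspecified
    return [entry for entry in results if entry[0] == socket.AF_INET6]


def fake_resolver(host, port):
    """Stand-in for getaddrinfo, shaped like the output shown above."""
    return [
        (socket.AF_INET6, 1, 6, "", ("fe80::21b:21ff:febb:73b2%eth2", 0, 0, 3)),
        (socket.AF_INET, 1, 6, "", ("192.168.5.3", 0)),
    ]


for entry in ipv6_addrinfo("craig-linux.local", resolver=fake_resolver):
    print(entry[4])  # sockaddr tuple with its scope ID intact
```

This does not help unmodified tools such as ping6 or "telnet -6", which request AF_INET6 themselves.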

Craig McQueen (cmcqueen1975) wrote :

Debugging the case of scope ID == 0 when family is set to AF_INET6...

I'm finding that when family is set to AF_INET6, _nss_gethostbyname2_r is called instead of _nss_gethostbyname4_r. This decision is made inside the C library's getaddrinfo(). I did 'apt-get source libc6', which got me the eglibc-2.17 source (this is on Ubuntu 13.10 at the moment). In sysdeps/posix/getaddrinfo.c, I indeed see that it only calls _nss_gethostbyname4_r when family is set to AF_UNSPEC. A comment says "gethostbyname4_r sends out parallel A and AAAA queries and is thus only suitable for PF_UNSPEC."

So that looks like a clash of requirements:

a) _nss_gethostbyname4_r is needed for IPv6 scope ID
b) _nss_gethostbyname4_r is allegedly not suitable for IPv6-only queries, because it sends out parallel A and AAAA queries.

Surely _nss_gethostbyname4_r SHOULD be used in order to get IPv6 scope ID.

What should we do next?

Craig McQueen (cmcqueen1975) wrote :

In eglibc SVN repository revision 20392, Tue Aug 28 14:14:43 2012 UTC, the getaddrinfo() code was changed to call _nss_gethostbyname4_r only for family AF_UNSPEC. The commit comment just says "Merge changes between r20213 and r20391 from /fsf/trunk."

Looking at /fsf/trunk/libc/sysdeps/posix/getaddrinfo.c, I trace it to revision 20296. The commit comment just says "Import glibc-mainline for 2012-08-23".

This I trace to glibc commit 8479f23aa1:
http://repo.or.cz/w/glibc.git/commit/8479f23aa1d5e5477a37f46823856bdafaedfa46

I think this commit is ultimately problematic because it breaks IPv6 scope ID retrieval. I think it should be reverted, so that _nss_gethostbyname4_r is always called for IPv6. Whatever problem it was aiming to solve, it should probably be solved a different way.

Gregory P Smith (gpshead) wrote :

Just a note that Andrew's comment #1 (https://bugs.launchpad.net/ubuntu/+source/nss-mdns/+bug/369008/comments/1) is still useful as the fix.

On Ubuntu 18.04, /etc/nsswitch.conf still defaults to mdns4_minimal and mdns4, rather than the much more useful mdns_minimal and mdns, which don't cripple IPv6 name resolution.

macOS comes with mDNS .local resolution supporting IPv6 by default these days. Surely Ubuntu should as well.
