Don't write search domains to resolv.conf in the case of split DNS

Bug #1592721 reported by Mathieu Trudel-Lapierre on 2016-06-15
This bug affects 5 people
Affects                            Status         Importance   Assigned to
network-manager (Ubuntu)           Fix Released   Medium       Mathieu Trudel-Lapierre
network-manager (Ubuntu Xenial)    Fix Released   Medium       Mathieu Trudel-Lapierre

Bug Description

[Impact]
This affects all VPN users who rely on split-tunnelling in situations where leaking DNS data to the internet about what might be behind their VPN is undesirable.

[Test case]
1) connect to VPN
2) Use dig to request a name that should be on the VPN
3) Verify (send kill -USR1 to the dnsmasq process spawned by NM, which makes it log its statistics) that the request has only gone through the VPN nameservers (only their request counter should have increased by one)
4) Use dig to request a name off-VPN, such as google.com.
5) Verify (kill -USR1 again) that the request has caused the non-VPN nameserver request counter to increase, and that the VPN counter has not increased.

It is easier to verify this when there is as little traffic as possible on the system, to avoid spurious DNS requests which would make it harder to validate the counters.
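The counter comparison in steps 3 and 5 can be scripted. The following is a minimal sketch, assuming the per-server statistics lines that dnsmasq logs on SIGUSR1 have the shape "server ADDRESS#PORT: queries sent N, retried or failed M". The function names and the log excerpts (including the counter values) are illustrative and not taken from this report.

```python
import re

# Assumed shape of the per-server statistics line dnsmasq logs on SIGUSR1.
SERVER_STATS = re.compile(
    r"server (?P<server>\S+)#(?P<port>\d+): queries sent (?P<sent>\d+)"
)


def queries_sent(log_text):
    """Map each upstream server address to its 'queries sent' counter."""
    counts = {}
    for match in SERVER_STATS.finditer(log_text):
        counts[match.group("server")] = int(match.group("sent"))
    return counts


def counter_deltas(before, after):
    """Return how much each server's counter grew between two dumps."""
    return {server: after[server] - before.get(server, 0) for server in after}


# Illustrative excerpts with made-up counter values (hypothetical data).
BEFORE = """\
dnsmasq[1226]: server 192.168.0.1#53: queries sent 12, retried or failed 0
dnsmasq[1226]: server 10.10.10.12#53: queries sent 4, retried or failed 0
"""
AFTER = """\
dnsmasq[1226]: server 192.168.0.1#53: queries sent 12, retried or failed 0
dnsmasq[1226]: server 10.10.10.12#53: queries sent 5, retried or failed 0
"""

deltas = counter_deltas(queries_sent(BEFORE), queries_sent(AFTER))
print(deltas)  # {'192.168.0.1': 0, '10.10.10.12': 1}
```

For a name that lives on the VPN, only the VPN nameserver delta should be non-zero; for an off-VPN name such as google.com it should be the other way round.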

[Regression potential]
This change modifies the way in which DNS nameservers and search domains are passed to dnsmasq. One might therefore observe incorrect behavior if a VPN is configured in a non-standard way and intended to be used to resolve all network requests (which is typically not the case for split-tunnelling), or if the public network is intended to always resolve all requests while the VPN still provides search domains.

Possible failure cases would include failure to resolve names correctly (resulting in NXDOMAIN or REFUSED from dnsmasq) or resolving to the incorrect values (if the wrong nameserver is used).

---

Currently, NM always writes all search domains both to any running DNS-handling plugin and to resolv.conf / resolvconf.

The issue is that, in the split-DNS case on VPNs, a lookup might get a negative response from all nameservers, after which glibc sends a new request with a search domain tacked on, again to the nameservers. This might cause DNS requests for "private" resources (say, on the VPN) to be sent to external, untrusted resolvers, or DNS queries not meant for the VPN nameservers to be sent through the VPN anyway.
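To illustrate the retry behaviour, here is a simplified model of how a stub resolver such as glibc's expands a name using the search list from resolv.conf. It is a sketch under stated assumptions rather than glibc's actual code: the function name is hypothetical, only the "search" and "ndots" settings are modelled, and ndots defaults to 1.

```python
def candidate_queries(name, search_domains, ndots=1):
    """List the names a stub resolver would try, in order (simplified model)."""
    # A trailing dot marks the name as fully qualified: no search list is applied.
    if name.endswith("."):
        return [name]

    as_is = name + "."
    with_search = [f"{name}.{domain}." for domain in search_domains]

    # Names with at least `ndots` dots are tried as-is first and the search
    # list is only used if that fails; shorter names go the other way round.
    if name.count(".") >= ndots:
        return [as_is] + with_search
    return with_search + [as_is]


print(candidate_queries("wiki", ["example.local"]))
print(candidate_queries("www.google.com", ["example.local"]))
```

The second call shows the leak: if the lookup of www.google.com fails, the next query is www.google.com.example.local., which carries the VPN search domain and may be sent to whichever nameserver the resolver is talking to.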

This is fixable in the case where we have a caching plugin running (such as dnsmasq). dnsmasq will already know about the search domains and use that to limit queries to the right nameservers when a VPN is running. Writing search domains to resolv.conf is unnecessary in this case.

We should still write search domains if no caching gets done, as we then need to expect glibc to send requests as it otherwise would.
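The intended rule can be sketched as follows. This is not the NetworkManager patch itself, only an illustration of the behaviour described here; the function name and the "domains" / "split_dns" dictionary keys are hypothetical.

```python
def search_domains_for_resolv_conf(configs, caching_plugin_active):
    """Pick the search domains that should end up in resolv.conf.

    configs: list of dicts with hypothetical keys 'domains' (list of str)
    and 'split_dns' (bool, True for a split-tunnelled VPN connection).
    """
    selected = []
    for config in configs:
        # With a caching plugin such as dnsmasq running, the plugin already
        # knows the split-DNS domains, so they are left out of resolv.conf.
        if caching_plugin_active and config["split_dns"]:
            continue
        for domain in config["domains"]:
            if domain not in selected:
                selected.append(domain)
    return selected


# Illustrative connection data (not taken from a real configuration).
configs = [
    {"domains": ["home.lan"], "split_dns": False},
    {"domains": ["example.local"], "split_dns": True},
]

print(search_domains_for_resolv_conf(configs, caching_plugin_active=True))
print(search_domains_for_resolv_conf(configs, caching_plugin_active=False))
```

With the plugin active only home.lan is written; without it both domains are written, since glibc then has to do the search-list expansion itself.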

Changed in network-manager (Ubuntu):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Mathieu Trudel-Lapierre (cyphermox)
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package network-manager - 1.2.2-0ubuntu4

---------------
network-manager (1.2.2-0ubuntu4) yakkety; urgency=medium

  [ Mathieu Trudel-Lapierre ]
  * debian/patches/libnm-Check-self-still-NMManager-or-not.patch: updated and
    refreshed to make gbp pq happy.
  * debian/patches/Read-config-from-run.patch: also read configuration from
    /run, which is to override whatever might be shipped in /usr/lib; and be
    overridden by /etc or command-line arguments. (LP: #1591898)
  * debian/10-globally-managed-devices.conf: ship a default config to
    explicitly unmanage anything that is not wifi or wwan: we definitely want
    NM to manage wifi and mobile data; and probably don't want it to touch
    wired in many cases.
  * debian/network-manager.postinst: on upgrade from previous versions of NM,
    make sure we migrate from no global "unmanaged" policy to something
    equivalent where we may have a global policy, but explicitly override it
    to be disabled; so that on upgrade users do not suddenly see some of their
    network devices no longer being handled by NM.
  * debian/patches/dns-manager-don-t-merge-split-DNS-search-domains.patch: do
    not add split DNS search domains to resolv.conf; doing so would risk
    leaking names to non-VPN DNS nameservers when attempting to resolve non-
    FQDN names. (LP: #1592721)

  [ Martin Pitt ]
  * debian/NetworkManager.conf: Put back dns=dnsmasq for now. Some important
    applications such as Chrome don't use NSS but reimplement DNS resolution,
    for those we need a local DNS server. Until resolved gets one, we continue
    to use the NM specific dnsmasq on the desktop. Correspondingly, revert
    libnss-resolve recommends back to dnsmasq-base depends.

 -- Mathieu Trudel-Lapierre <email address hidden> Thu, 16 Jun 2016 09:54:02 +0300

Changed in network-manager (Ubuntu):
status: In Progress → Fix Released
description: updated
Changed in network-manager (Ubuntu Xenial):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Mathieu Trudel-Lapierre (cyphermox)

Hello Mathieu, or anyone else affected,

Accepted network-manager into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/network-manager/1.2.2-0ubuntu0.16.04.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in network-manager (Ubuntu Xenial):
status: In Progress → Fix Committed
tags: added: verification-needed
Andy Whitcroft (apw) wrote :

Hello Mathieu, or anyone else affected,

Accepted network-manager into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/network-manager/1.2.2-0ubuntu0.16.04.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

James Troup (elmo) wrote :

I'm afraid this didn't work for me. I installed network-manager
1.2.2-0ubuntu0.16.04.3 from xenial-proposed, ran 'systemctl restart
NetworkManager' as root and reconnected to the VPN and I see the same
behaviour (i.e. I get DNS resolution failure for non-VPN domains). I
am on wifi only and both v4 and v6 are set to route all outbound
traffic.

Martin Pitt (pitti) wrote :

James: I'm not familiar with this specific fix, but I know that restarting NetworkManager will not kill/restart the spawned dnsmasq instances (by its own choice: KillMode=process). Does it help to stop NM, kill the dnsmasqs, and then restart it? (Or just a reboot)

Martin Pitt (pitti) wrote :

I'm releasing the SRU now to land the fixes for the other two bugs. I'm fairly convinced that this at least does not regress the situation here.

If it is not fixed for you after rebooting/restarting dnsmasq, can you please reopen? Thanks!

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package network-manager - 1.2.2-0ubuntu0.16.04.3

---------------
network-manager (1.2.2-0ubuntu0.16.04.3) xenial; urgency=medium

  * debian/tests/wpa-dhclient: Don't assume that the IPv6 prefix length from
    the DHCP server is /64. (LP: #1609898)

network-manager (1.2.2-0ubuntu0.16.04.2) xenial; urgency=medium

  [ Martin Pitt ]
  * Read config and system connections from /run/NetworkManager/ to support
    netplan (LP: #1627641)
  * debian/gbp.conf: Set debian-branch to xenial

  [ Mathieu Trudel-Lapierre ]
  * Add dns-manager-don-t-merge-split-DNS-search-domains.patch: do not add
    split DNS search domains to resolv.conf; doing so would risk leaking names
    to non-VPN DNS nameservers when attempting to resolve non-FQDN names.
    (LP: #1592721)

 -- Martin Pitt <email address hidden> Tue, 27 Sep 2016 16:29:22 +0200

Changed in network-manager (Ubuntu Xenial):
status: Fix Committed → Fix Released
Thomas M Steenholdt (tmus) wrote :

This bug seems to bite me in a slightly different way. Please let me know if you feel that this is really a separate bug...

Also, this is happening on Yakkety - network-manager-1.2.4-0ubuntu1

When I connect to my work VPN, I'm not using split-tunnelling. Still the DNS resolution is split, causing DNS resolution to only be correct for my primary VPN search domain.

This log snippet seems to explain it all:
- 192.168.0.1 is my home ADSL DNS.
- example.local is the search suffix provided by my VPN.
- 10.10.10.12 and 10.10.10.13 are my VPN provided DNS servers.

Oct 12 06:43:04 bar14860 NetworkManager[870]: <info> [1476261784.8455] dns-mgr: Writing DNS information to /sbin/resolvconf
Oct 12 06:43:04 bar14860 dnsmasq[1226]: setting upstream servers from DBus
Oct 12 06:43:04 bar14860 dnsmasq[1226]: using nameserver 192.168.0.1#53(via enp0s31f6)
Oct 12 06:43:04 bar14860 dnsmasq[1226]: using nameserver 10.10.10.12#53 for domain example.local
Oct 12 06:43:04 bar14860 dnsmasq[1226]: using nameserver 10.10.10.12#53 for domain 20.10.10.in-addr.arpa
Oct 12 06:43:04 bar14860 dnsmasq[1226]: using nameserver 10.10.10.12#53 for domain 21.10.10.in-addr.arpa
Oct 12 06:43:04 bar14860 dnsmasq[1226]: using nameserver 10.10.10.13#53 for domain example.local
Oct 12 06:43:04 bar14860 dnsmasq[1226]: using nameserver 10.10.10.13#53 for domain 20.10.10.in-addr.arpa
Oct 12 06:43:04 bar14860 dnsmasq[1226]: using nameserver 10.10.10.13#53 for domain 21.10.10.in-addr.arpa
Oct 12 06:43:04 bar14860 NetworkManager[870]: <info> [1476261784.9543] policy: set 'vpn0' (vpn0) as default for IPv4 routing and DNS

The result of this is that all my other internal work domains do not work at all, and may, as the bug describes, leak onto the internet as well.
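The dnsmasq lines in the log snippet can be read as a routing table. The sketch below parses three of those lines and picks the upstream servers for a name by most specific domain suffix, which is a simplified model of how dnsmasq chooses servers for per-domain entries. The function names are hypothetical, and "other.internal" stands in for a second internal work domain that is not named in this report.

```python
import re

# Patterns for the two kinds of "using nameserver" lines seen in the log.
DOMAIN_LINE = re.compile(r"using nameserver (\S+?)#\d+ for domain (\S+)")
DEFAULT_LINE = re.compile(r"using nameserver (\S+?)#\d+\(via \S+\)")


def routing_table(log_text):
    """Return (per-domain servers, default servers) parsed from the log."""
    per_domain = {}
    for server, domain in DOMAIN_LINE.findall(log_text):
        per_domain.setdefault(domain.lower(), []).append(server)
    return per_domain, DEFAULT_LINE.findall(log_text)


def servers_for(name, per_domain, default):
    """Pick servers by the longest matching domain suffix, else the default."""
    labels = name.rstrip(".").lower().split(".")
    for start in range(len(labels)):
        suffix = ".".join(labels[start:])
        if suffix in per_domain:
            return per_domain[suffix]
    return default


LOG = """\
dnsmasq[1226]: using nameserver 192.168.0.1#53(via enp0s31f6)
dnsmasq[1226]: using nameserver 10.10.10.12#53 for domain example.local
dnsmasq[1226]: using nameserver 10.10.10.13#53 for domain example.local
"""

per_domain, default = routing_table(LOG)
print(servers_for("host.example.local", per_domain, default))
print(servers_for("host.other.internal", per_domain, default))
```

Under this model only names below example.local reach 10.10.10.12 and 10.10.10.13; any other internal name falls through to the home resolver at 192.168.0.1, which matches the behaviour described above.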

If you need more info, let me know.

gpothier (gpothier) wrote :

I think I am observing a regression caused by this fix: after disconnecting/reconnecting a VPN connection, DNS resolution is broken. Here are the details:

- VPN is set up as OpenVPN with split-tunneling ("Use this connection only for resources on its network" is checked). The VPN's DNS domain is ozone.caligrafix.cl, and the DNS server is 192.168.0.2. The local (non-VPN) DNS server is 192.168.50.2.

- Right after boot, and after connecting to the VPN for the first time, I can ping a host on the VPN's network (ping somehost.ozone.caligrafix.cl).

- If I disconnect and reconnect to the VPN, I cannot ping the same host by name (I get Name or service not known). I can ping it by IP.

Strangely enough, dnsmasq says it does use the VPN's resolver, as shown by this syslog extract:

Nov 1 23:09:28 tadzim3 dnsmasq[1671]: setting upstream servers from DBus
Nov 1 23:09:28 tadzim3 dnsmasq[1671]: using nameserver 192.168.50.2#53(via wlan0)
Nov 1 23:09:28 tadzim3 dnsmasq[1671]: using nameserver 192.168.0.2#53 for domain ozone.caligrafix.cl
Nov 1 23:09:28 tadzim3 dnsmasq[1671]: using nameserver 192.168.0.2#53 for domain 1.8.10.in-addr.arpa
Nov 1 23:09:28 tadzim3 dnsmasq[1671]: using nameserver 192.168.0.2#53 for domain 0.168.192.in-addr.arpa
Nov 1 23:09:28 tadzim3 dnsmasq[1450]: reading /etc/resolv.conf
Nov 1 23:09:28 tadzim3 dnsmasq[1450]: using nameserver 127.0.1.1#53
^C
gpothier@tadzim3:~$ ping somehost.ozone.caligrafix.cl
ping: somehost.ozone.caligrafix.cl: Name or service not known

gpothier (gpothier) wrote :

PS: Restarting network-manager and reconnecting to the VPN works around the issue.

Is there a way to backport this to 16.04? Thanks!

Caleb (robotrising) wrote :

Yes, please bring to 16.04
