DNS domain search paths not updated when VPN started

Bug #1726124 reported by Paul Smith on 2017-10-22
This bug affects 9 people
Affects                            Importance  Assigned to
network-manager (Ubuntu)           High        Unassigned
network-manager-openvpn (Ubuntu)   High        Unassigned
systemd (Ubuntu)                   High        Unassigned

Bug Description

I connect to work with openvpn through network-manager-openvpn. I'm selecting automatic (DHCP) to get an IP address, and "Use this connection only for resources on its network" to support split tunneling.

In the last few versions of Ubuntu I used, this all worked fine. In Ubuntu 17.10 (fresh install, not upgrade) I can access hosts on both my VPN network and the internet, BUT I have to use FQDN for my VPN network hosts: the updates to the DNS search path provided by my VPN DHCP server are never being applied.

Investigating the system I see that /etc/resolv.conf is pointing to /run/systemd/resolve/stub-resolv.conf and that resolv.conf does not have any of the VPN's search path settings in it:

  # This file is managed by man:systemd-resolved(8). Do not edit.
  #
  # 127.0.0.53 is the systemd-resolved stub resolver.
  # run "systemd-resolve --status" to see details about the actual nameservers.
  nameserver 127.0.0.53

  search home
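
As a rough illustration of the mismatch described here, the following Python sketch parses the two resolv.conf variants quoted in this report and lists the search domains that NetworkManager wrote but the systemd-resolved stub file lacks. The helper name parse_resolv_conf is hypothetical, and the parser handles only the "nameserver" and "search" keywords; it is not a full resolv.conf(5) parser.

```python
def parse_resolv_conf(text):
    """Return (nameservers, search_domains) from resolv.conf-style text."""
    nameservers = []
    search = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments ('#' or ';').
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) > 1:
            nameservers.append(fields[1])
        elif fields[0] == "search":
            # If several 'search' lines exist, the last one wins.
            search = fields[1:]
    return nameservers, search


# Contents as quoted in this report.
STUB = """\
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# 127.0.0.53 is the systemd-resolved stub resolver.
# run "systemd-resolve --status" to see details about the actual nameservers.
nameserver 127.0.0.53

search home
"""

NM = """\
# Generated by NetworkManager
search internal.mycorp.com other.mycorp.com home
nameserver 127.0.1.1
"""

stub_ns, stub_search = parse_resolv_conf(STUB)
nm_ns, nm_search = parse_resolv_conf(NM)

# Domains NetworkManager knows about but the stub file does not list.
missing = [d for d in nm_search if d not in stub_search]
print(missing)  # ['internal.mycorp.com', 'other.mycorp.com']
```

On a live system the same function could be fed the contents of /etc/resolv.conf and /run/NetworkManager/resolv.conf instead of the quoted strings.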

In previous versions of Ubuntu, where NetworkManager controlled the resolver not systemd, /etc/resolv.conf pointed to /run/NetworkManager/resolv.conf and there was a local dnsmasq instance that managed all the complexity. In Ubuntu 17.10 when I look in /run/NetworkManager/resolv.conf file, I see that the search paths ARE properly updated there:

  $ cat /run/NetworkManager/resolv.conf
  # Generated by NetworkManager
  search internal.mycorp.com other.mycorp.com home
  nameserver 127.0.1.1

However this file isn't being used, and also there's no dnsmasq running on the system so if I switch my /etc/resolv.conf to point to this file instead, then all lookups fail.

Strangely, if I look at the systemd-resolve status, I see that systemd-resolve does seem to know about the proper search paths:

  $ systemd-resolve --status
     ...
  Link 3 (tun0)
        Current Scopes: DNS LLMNR/IPv4 LLMNR/IPv6
         LLMNR setting: yes
  MulticastDNS setting: no
        DNSSEC setting: no
      DNSSEC supported: no
           DNS Servers: 10.3.0.10
                        10.8.42.2
            DNS Domain: ~internal.mycorp.com
                        ~other.mycorp.com

but for whatever reason the search domains are not getting put into the resolv.conf file:

  $ host mydesk
  ;; connection timed out; no servers could be reached

  $ host mydesk.internal.mycorp.com
  mydesk.internal.mycorp.com has address 10.8.37.74

(BTW, the timeout in the failed attempt above takes 10s: it is SUPER frustrating when all your host lookups are taking that long just to fail).

ProblemType: Bug
DistroRelease: Ubuntu 17.10
Package: systemd 234-2ubuntu12
ProcVersionSignature: Ubuntu 4.13.0-16.19-generic 4.13.4
Uname: Linux 4.13.0-16-generic x86_64
ApportVersion: 2.20.7-0ubuntu3
Architecture: amd64
CurrentDesktop: GNOME
Date: Sun Oct 22 15:08:57 2017
InstallationDate: Installed on 2017-10-21 (1 days ago)
InstallationMedia: Ubuntu 17.10 "Artful Aardvark" - Release amd64 (20171018)
MachineType: System manufacturer System Product Name
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.13.0-16-generic root=UUID=4384306c-5fed-4b48-97a6-a6d594c4f72b ro quiet splash vt.handoff=7
SourcePackage: systemd
SystemdDelta:
 [EXTENDED] /lib/systemd/system/rc-local.service → /lib/systemd/system/rc-local.service.d/debian.conf
 [EXTENDED] /lib/systemd/system/user@.service → /lib/systemd/system/user@.service.d/timeout.conf

 2 overridden configuration files found.
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 12/02/2014
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 2101
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: M5A78L-M/USB3
dmi.board.vendor: ASUSTeK Computer INC.
dmi.board.version: Rev X.0x
dmi.chassis.asset.tag: Asset-1234567890
dmi.chassis.type: 3
dmi.chassis.vendor: Chassis Manufacture
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr2101:bd12/02/2014:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKComputerINC.:rnM5A78L-M/USB3:rvrRevX.0x:cvnChassisManufacture:ct3:cvrChassisVersion:
dmi.product.family: To Be Filled By O.E.M.
dmi.product.name: System Product Name
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer

Dimitri John Ledkov (xnox) wrote :

network-manager has systemd-resolved integration: it pushes search domains to resolved, and they are updated in /etc/resolv.conf.

This should continue to work in 17.10. Hence I am marking this bug as also affecting network-manager and network-manager-openvpn.

Paul Smith (psmith-gnu) wrote :

To be clear, they are NOT updated in /etc/resolv.conf. They ARE updated in /run/NetworkManager/resolv.conf, but that has no effect since we're not using that for resolution: instead we're using systemd-resolve.

And as I noted above, it does appear that systemd-resolve is notified about the search domains, because "systemd-resolve --status" does show them associated with the openvpn tun0 device.

However, those search domains are never manifested in the /run/systemd/resolve/stub-resolv.conf which is what /etc/resolv.conf is pointing to, so they never take effect.

Dimitri John Ledkov (xnox) wrote :

Thank you for the update. I will check the code, and will try to resolve this as soon as possible.

Dimitri John Ledkov (xnox) wrote :

Does everything work, if you route all traffic via VPN?

Dimitri John Ledkov (xnox) wrote :

Right, so it seems that NM checks the never-default setting and, when it is set, doesn't push the domains to resolved as search domains =/ I will test things out to be sure.

Specifically:

$ nmcli c show YOUR_VPN_CONNECTION_NAME | grep never-default
ipv4.never-default: yes
ipv6.never-default: yes

And if it is never-default, it doesn't push search domains out, only routing domains.
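
To make the distinction concrete: in the "systemd-resolve --status" output quoted earlier in this report, both tun0 domains carry a "~" prefix, which systemd-resolved uses to mark routing-only domains (used to choose a DNS server, not to expand short names). A minimal Python sketch of that classification follows; the function name split_domains is hypothetical.

```python
def split_domains(entries):
    """Split resolved-style domain entries into (search, routing_only).

    An entry starting with '~' is routing-only: it selects which DNS
    servers get a query, but is not appended to short host names.
    """
    search = []
    routing_only = []
    for entry in entries:
        if entry.startswith("~"):
            routing_only.append(entry[1:])
        else:
            search.append(entry)
    return search, routing_only


# The "DNS Domain" entries shown for tun0 in this report.
tun0_domains = ["~internal.mycorp.com", "~other.mycorp.com"]
search, routing_only = split_domains(tun0_domains)
print(search)        # []
print(routing_only)  # ['internal.mycorp.com', 'other.mycorp.com']
```

With no plain search entry on the link, a short name such as "mydesk" is never expanded to "mydesk.internal.mycorp.com", which matches the behaviour reported above.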

I believe this might actually be the correctly intended configuration of openvpn, network-manager, and resolved. Since not all traffic is routed to the VPN connection, its domain names should not be used for short-name resolution. E.g. resolving "foo.company.example.net" should work and should end up being resolved via the nameserver on the VPN, without leaking DNS resolution to a public DNS server, and one should be able to connect to that host. But resolving "foo" should fail.

I understand this is a change of behaviour, thus does seem like a regression. I will consult on what is correct behaviour and how to best fix it.

tags: added: regression-release
Paul Smith (psmith-gnu) wrote :

I'm not sure I understand what you mean. In a typical configuration you never send "foo" to the nameservice, there's always a search domain and those lookups are always tried first (because the default value for the ndots is 1). This is handled by the libc resolver linked into every program, it's not handled by a central service.

I don't have any idea how systemd-resolve works. But, my understanding of how it used to work with dnsmasq using split tunneling was that the nameservice mapped domains to DNS servers, and the resolver would only forward requests to the DNS server that matched the domain for the host being requested.

Suppose you have a VPN interface that adds "mycorp.com" to the search line in /etc/resolv.conf, so that it contains "mycorp.com localdomain", for example. Then when someone tries to resolve "myhost", the libc resolver sees that the name contains no dots, so it starts appending search domains to the hostname, in order, and sending the results to the DNS service to look up. So first it will send "myhost.mycorp.com" to the resolver. The resolver sees that the hostname ends in "mycorp.com" and knows that the VPN DNS servers "own" that domain, so it forwards that request to those servers for lookup.

If that doesn't match, then the libc resolver will try to look up "myhost.localdomain". That does NOT match the VPN domain, so it will not try to forward that to the DNS servers for the VPN connection and instead use a different resolver.
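
The expansion described above can be sketched in Python as follows. This is a simplified model of the libc search-list behaviour documented in resolv.conf(5), not the actual glibc code; the function name candidate_names is hypothetical. One detail the sketch assumes beyond the description above: after the search list is exhausted, the bare name is tried as a last resort.

```python
def candidate_names(name, search, ndots=1):
    """Return the names a stub resolver would try, in order.

    Simplified model: a trailing dot means the name is already absolute;
    a name with at least `ndots` dots is tried as-is first; otherwise the
    search domains are tried first and the bare name last.
    """
    if name.endswith("."):
        return [name[:-1]]
    expanded = [f"{name}.{domain}" for domain in search]
    if name.count(".") >= ndots:
        return [name] + expanded
    return expanded + [name]


search_list = ["mycorp.com", "localdomain"]
print(candidate_names("myhost", search_list))
# ['myhost.mycorp.com', 'myhost.localdomain', 'myhost']
```

A name that already contains a dot, such as "myhost.mycorp.com", is tried unmodified first under the default ndots of 1.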

I don't think there's any information leakage here.

tags: added: rls-bb-incoming
Changed in network-manager (Ubuntu):
importance: Undecided → High
Changed in network-manager-openvpn (Ubuntu):
importance: Undecided → High
Sebastien Bacher (seb128) wrote :

It would probably be a good idea to forward it to n-m upstream as well on bugzilla.gnome.org

Changed in systemd (Ubuntu):
importance: Undecided → High
Paul Smith (psmith-gnu) wrote :

Hi Sebastien, thanks for your interest. This issue is causing me extreme pain. I believe I'm going to have to reconfigure N-M to use the old-style dnsmasq setup and throw out systemd-resolve pretty soon.

It's not clear to me whether this is an N-M bug vs. a systemd bug, but I'm happy to file more bugs. Did you want me to file a bug with N-M?

Paul Smith, what you describe is information leakage and shouldn't IMHO work as you say by default.

Consider that I'm connected to a corporate network and have an (untrusted) VPN active which I only want to use to access resources on its network (never-default: yes). Having the resolver add the domain of the VPN network to short name lookups could then leak those local names to the remote VPN (depending on the order in which the lookups are performed) and potentially allow the untrusted network to take over internal services that are accessed using short names. This could also happen by mistake (such as setting "mail" as your SMTP server if the remote network uses the same name).

I don't think the order of the lookups can be controlled to prevent this. For example, what should determine the order when you have two VPNs active?

BTW, the long timeout you see on short name lookup failures is most likely due to LLMNR being on by default. I think this is insane and I always switch it off in /etc/systemd/resolved.conf.

Paul Smith (psmith-gnu) wrote :

Andreas: unfortunately disallowing short name lookups is not acceptable: many environments use short names in embedded URLs all over the place, and without domain search paths the entire environment is rendered completely unusable (e.g., URLs are simply https://tools/foo or https://wiki/foo or whatever; this means no cross-links work).

Connecting to an untrusted VPN is a tiny sliver of a minority of the usage of VPN; 99.99% of the time when someone connects to a VPN they're trying to create a secure link to another trusted network (e.g., working remotely).

If you want to provide support for both modes of handling short names with some kind of checkbox to select between them that's fine with me, but the current behavior is a serious loss of functionality from previous solutions.

Steve Langasek (vorlon) wrote :

This change in behavior is deliberate. There are two mutually incompatible interpretations of DNS search lists provided via a VPN connection. One is for split DNS, to say "this is the list of domains for which you should send lookups to the accompanying DNS server". The other is to use it as a search list for resolv.conf. Unfortunately, interpreting wrongly in either direction breaks client configs. But whereas there are other ways that one can configure the behavior of resolv.conf to add search domains, the only reasonable way to configure split DNS is to do so by providing this information directly from network-manager-openvpn to systemd-resolved.

It may be that network-manager-openvpn needs an additional configuration option, to allow the user to declare which of these two ways (or both, or neither) they want to use the VPN server-provided DNS search list.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in network-manager (Ubuntu):
status: New → Confirmed
Changed in network-manager-openvpn (Ubuntu):
status: New → Confirmed
Changed in systemd (Ubuntu):
status: New → Confirmed
Jeroen Hoek (mail-jeroenhoek) wrote :

I hope something can be worked out before this lands in the next LTS-release. This breaks one of the most common uses of a VPN (remote work).

Right now a workaround is using this script after connecting to a VPN-server:

https://github.com/jonathanio/update-systemd-resolved

It calls systemd-resolved via DBus to add the proper settings after the OpenVPN connection is established.

Of course this means abandoning NetworkManager for use with OpenVPN, because there is also no way to specify --up and --down parameters.

Paul Smith (psmith-gnu) wrote :

I'm not sure why it's being asserted that these two uses are mutually incompatible, especially since I've been using them in a coordinated way forever and can still do so (at the expense of junking systemd-resolve and going back to using dnsmasq and a custom configuration, which is obviously a big pain). I've even, in the past, used MULTIPLE VPNs, even at the same time, and it still just works.

If the resolver library (e.g., gethostbyname etc.) gets an unqualified hostname, it uses the search path just as it always has including ndots and all that stuff, to generate FQ hostnames. No change there.

When the local resolver caching service (dnsmasq, systemd-resolve) gets a FQ hostname, it looks through the domains provided by the VPN DHCP information, and if the hostname matches one of them it forwards the lookup to the DNS server for that VPN. If the hostname doesn't match a given VPN's domains, the request is not forwarded to that VPN's servers. If it doesn't match any of the VPN search paths, the request goes to the default DNS servers.
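
A minimal Python sketch of the forwarding rule described in this comment, under stated assumptions: the names matches_domain and pick_servers are hypothetical, the default server address 192.168.1.1 is an invented placeholder, and the longest-suffix tie-break is an illustrative choice rather than a claim about what dnsmasq or systemd-resolved actually do. The VPN domains and servers are the ones shown for tun0 in this report.

```python
def matches_domain(fqdn, domain):
    """True if fqdn equals domain or is a subdomain of it (label boundary)."""
    fqdn = fqdn.rstrip(".").lower()
    domain = domain.rstrip(".").lower()
    return fqdn == domain or fqdn.endswith("." + domain)


def pick_servers(fqdn, routes, default_servers):
    """Choose DNS servers for fqdn: longest matching domain, else default."""
    best = None
    for domain in routes:
        if matches_domain(fqdn, domain):
            if best is None or len(domain) > len(best):
                best = domain
    return routes[best] if best is not None else default_servers


VPN_SERVERS = ["10.3.0.10", "10.8.42.2"]   # from the tun0 status output
ROUTES = {
    "internal.mycorp.com": VPN_SERVERS,
    "other.mycorp.com": VPN_SERVERS,
}
DEFAULT_SERVERS = ["192.168.1.1"]          # hypothetical home router

print(pick_servers("mydesk.internal.mycorp.com", ROUTES, DEFAULT_SERVERS))
# ['10.3.0.10', '10.8.42.2']
print(pick_servers("www.example.org", ROUTES, DEFAULT_SERVERS))
# ['192.168.1.1']
```

Matching on a label boundary matters: "notinternal.mycorp.com" must not be treated as part of "internal.mycorp.com".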

I honestly don't understand why we're considering these uses incompatible. They seem to me to be exactly compatible and exactly what you want to do, at least the vast majority of the time.

tags: added: id-5a7491099adc12270ee9c94d
Mister Hippo (mista.hippo) wrote :

I would also like to chime in, that it is very useful to have the ability to append a DNS-suffix when doing short-name DNS lookups on a split-tunnel VPN connection.

It would be great if the devs could find a way to include this functionality in the upcoming LTS! :)

Will Cooke (willcooke) wrote :

We won't be able to add the option to select between the two behaviours this cycle.

tags: removed: rls-bb-incoming
Iain Lane (laney) on 2018-03-13
tags: added: rls-bb-notfixing