Name resolution stops working after resume from suspend

Bug #1631241 reported by Michael Gratton
This bug affects 28 people
Affects                     Status         Importance   Assigned to   Milestone
dnsmasq                     Fix Released   Unknown
dnsmasq (Ubuntu)            Confirmed      Undecided    Unassigned
network-manager (Ubuntu)    Confirmed      High         Unassigned

Bug Description

After upgrading to Yakkety, when my Ubuntu GNOME laptop resumes from suspend DNS resolution stops working.

* Also affects Xenial (Unity/GNOME Flashback) since the NetworkManager stack was upgraded to 1.2.6.

After resuming, systemd-resolved is running and libnss-resolve is installed, but /etc/resolv.conf contains 127.0.1.1 as the only name server. The dnsmasq-base package is installed since it is pulled in by both network-manager and lxc1, and both NM and libvirt have spawned instances of dnsmasq:

> 1155 pts/2 S+ 0:00 grep dnsmasq
> 2724 ? S 0:00 dnsmasq -u lxc-dnsmasq --strict-order --bind-interfaces --pid-file=/run/lxc/dnsmasq.pid --listen-address 10.0.3.1 --dhcp-range 10.0.3.2,10.0.3.254 --dhcp-lease-max=253 --dhcp-no-override --except-interface=lo --interface=lxcbr0 --dhcp-leasefile=/var/lib/misc/dnsmasq.lxcbr0.leases --dhcp-authoritative
> 2992 ? S 0:00 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/lib/libvirt/libvirt_leaseshelper
> 2993 ? S 0:00 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/lib/libvirt/libvirt_leaseshelper
> 22879 ? S 0:00 /usr/sbin/dnsmasq --no-resolv --keep-in-foreground --no-hosts --bind-interfaces --pid-file=/var/run/NetworkManager/dnsmasq.pid --listen-address=127.0.1.1 --cache-size=0 --conf-file=/dev/null --proxy-dnssec --enable-dbus=org.freedesktop.NetworkManager.dnsmasq --conf-dir=/etc/NetworkManager/dnsmasq.d

Let me know if you need any extra info.

ProblemType: Bug
DistroRelease: Ubuntu 16.10
Package: libnss-resolve 231-9git1
ProcVersionSignature: Ubuntu 4.8.0-17.19-generic 4.8.0-rc7
Uname: Linux 4.8.0-17-generic x86_64
ApportVersion: 2.20.3-0ubuntu7
Architecture: amd64
CurrentDesktop: GNOME
Date: Fri Oct 7 13:52:40 2016
InstallationDate: Installed on 2015-07-22 (443 days ago)
InstallationMedia: Ubuntu-GNOME 15.04 "Vivid Vervet" - Release amd64 (20150422)
SourcePackage: systemd
UpgradeStatus: Upgraded to yakkety on 2016-10-05 (1 days ago)

Revision history for this message
Martin Pitt (pitti) wrote :

With NetworkManager we switched back to the "dnsmasq" plugin, so it is indeed correct that resolv.conf only contains 127.0.1.1. It also does that without suspending.

So what is actually not working?

Changed in systemd (Ubuntu):
status: New → Incomplete
Revision history for this message
Michael Gratton (mjog) wrote :

Name resolution, in everything I tried (ping, Epiphany, Geary).

Is it normal for dnsmasq to not listen on UDP?

> mjg@payens:~$ host vee.net 127.0.1.1
> Using domain server:
> Name: 127.0.1.1
> Address: 127.0.1.1#53
> Aliases:
>
> Host vee.net not found: 5(REFUSED)
> mjg@payens:~$ host -T vee.net 127.0.1.1
> Using domain server:
> Name: 127.0.1.1
> Address: 127.0.1.1#53
> Aliases:
>
> vee.net has address 203.18.245.244
> vee.net mail is handled by 10 mail.quuxo.net.
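
The two host runs above differ only in transport: the first goes over UDP and is refused, the second forces TCP with -T and succeeds. A minimal sketch of the same comparison using dig follows. The name check_transports is hypothetical, and the 2-second timeout and single try are arbitrary choices, not something taken from this report.

```shell
# Hypothetical helper: report whether SERVER answers a query for NAME
# over UDP and over TCP.
# Usage: check_transports NAME SERVER
check_transports() {
    name=$1
    server=$2
    for proto in udp tcp; do
        case $proto in
            udp) flag=+notcp ;;  # plain UDP query
            tcp) flag=+tcp ;;    # force TCP, like "host -T"
        esac
        # +short prints only the answer data; an empty result (for example
        # on REFUSED) or a non-zero exit status is treated as a failure.
        if answer=$(dig "$flag" +short +time=2 +tries=1 "$name" "@$server" 2>/dev/null) &&
            [ -n "$answer" ]; then
            echo "$proto: ok"
        else
            echo "$proto: FAILED"
        fi
    done
}
```

Running check_transports vee.net 127.0.1.1 prints one line per transport. On an affected machine the expectation, based on the output above, is a failure for UDP only.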

Revision history for this message
Martin Pitt (pitti) wrote :

Are you sure that you are actually online? What does "nmcli g" say? Does "dig A www.ubuntu.com @8.8.8.8" work?

Revision history for this message
Michael Gratton (mjog) wrote :

Well I'm posting this from the computer with the problem, so yes I'm online. :)

To fix the problem I need to manually edit /etc/resolv.conf and replace dnsmasq's 127.0.0.1 with resolve's 127.0.0.53 (or any other reachable name server).

> mjg@payens:~$ nmcli g
> STATE CONNECTIVITY WIFI-HW WIFI WWAN-HW WWAN
> connected full enabled enabled enabled enabled
> mjg@payens:~$ dig A www.ubuntu.com @8.8.8.8
>
> ; <<>> DiG 9.10.3-P4-Ubuntu <<>> A www.ubuntu.com @8.8.8.8
> ;; global options: +cmd
> ;; Got answer:
> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16454
> ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
>
> ;; OPT PSEUDOSECTION:
> ; EDNS: version: 0, flags:; udp: 512
> ;; QUESTION SECTION:
> ;www.ubuntu.com. IN A
>
> ;; ANSWER SECTION:
> www.ubuntu.com. 164 IN A 91.189.90.58
>
> ;; Query time: 6 msec
> ;; SERVER: 8.8.8.8#53(8.8.8.8)
> ;; WHEN: Fri Oct 07 17:51:58 AEDT 2016
> ;; MSG SIZE rcvd: 59

Revision history for this message
Michael Gratton (mjog) wrote :

Err, I mean "dnsmasq's 127.0.1.1"
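
The manual edit described above (swapping dnsmasq's 127.0.1.1 for resolved's 127.0.0.53) could be sketched roughly as below. The function name swap_nameserver is hypothetical, and the sketch assumes the file has a plain "nameserver ADDRESS" line. The change will most likely be overwritten the next time the file is regenerated, so treat it as a temporary workaround at best.

```shell
# Hypothetical helper: replace one name server address with another in a
# resolv.conf-style file.
# Usage: swap_nameserver FILE OLD NEW
swap_nameserver() {
    file=$1
    old=$2
    new=$3
    # Escape dots so the old address is matched literally
    old_re=$(printf '%s' "$old" | sed 's/\./\\./g')
    tmp=$(mktemp) || return 1
    # Write back with cat so that a symlinked resolv.conf stays a symlink
    sed "s/^nameserver[[:space:]][[:space:]]*${old_re}[[:space:]]*\$/nameserver ${new}/" \
        "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}
```

On the affected system this would be run as root, for example: swap_nameserver /etc/resolv.conf 127.0.1.1 127.0.0.53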

Revision history for this message
Martin Pitt (pitti) wrote :

> To fix the problem

Please tell us what actually *is* the problem.

> I need to manually edit /etc/resolv.conf and replace dnsmasq's 127.0.0.1 with resolve's 127.0.0.53

So are you trying to not use NM's dnsmasq instance because that stops working after suspend/resume?

Revision history for this message
Michael Gratton (mjog) wrote : Re: [Bug 1631241] Re: Name resolution stops working after resume from suspend

On Fri, Oct 7, 2016 at 10:44 PM, Martin Pitt <email address hidden>
wrote:
>> To fix the problem
>
> Please tell us what actually *is* the problem.

My apologies, I should have said "To work around the problem" in the
last comment.

>> I need to manually edit /etc/resolv.conf and replace dnsmasq's
>> 127.0.0.1 with resolve's 127.0.0.53
>
> So are you trying to not use NM's dnsmasq instance because that stops
> working after suspend/resume?

It looks like it, yes. As I (obliquely) mentioned in comment 3, dnsmasq
seems to stop responding to UDP requests on 127.0.1.1 after resume.

--
⊨ Michael Gratton, Percept Wrangler.
⚙ <http://mjog.vee.net/>

Revision history for this message
Martin Pitt (pitti) wrote :

OK. This could be because NM does not feed it with correct DNS data, or dnsmasq itself gets confused after resuming. Assigning to NetworkManager for now.

affects: systemd (Ubuntu) → network-manager (Ubuntu)
Changed in network-manager (Ubuntu):
status: Incomplete → New
Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in network-manager (Ubuntu):
status: New → Confirmed
Revision history for this message
Michael Gratton (mjog) wrote :

Some data points about this:

- Name resolution always works fine after first boot
- When name resolution is failing, dnsmasq seems to have stopped answering UDP queries on 127.0.1.1 (TCP seems to work fine)
- I can reproduce the issue connecting to wifi at work - WPA2 enterprise wifi network
- I cannot reproduce the issue on the home wifi - WPA2 personal

Revision history for this message
drplix (pjr-1060) wrote :

Hi Martin - not sure if this is related. But this bug report is also indicating problems with dnsmasq when establishing a VPN connection...

https://bugs.launchpad.net/ubuntu/+source/openvpn-systemd-resolved/+bug/1636395

Thanks for any help with this.

Revision history for this message
Dominik (dominalien) wrote :

I have the same problem, but only some DNS queries stop working. For instance, google.com works fine, while duckduckgo.com doesn't.

When trying to ping a server that doesn't work, it fails immediately with the message (translating from Polish, sorry): This name or service is unknown.

I have seen workarounds like restarting dnsmasq, or disabling it in /etc/NetworkManager/NetworkManager.conf, but these don't work for me. What works is doing dpkg-reconfigure resolvconf.

Revision history for this message
Rocko (rockorequin) wrote :

I have noticed that doing a "sudo service network-manager restart" usually fixes things, but not always and not for everything. For instance, on my laptop at the moment Firefox can't find any webpages ("Server not found"), but Chrome is working fine; ntpd is complaining twice a minute in the log that it can't find 0.ubuntu.pool.ntp.org etc., even though "nslookup 0.ubuntu.pool.ntp.org" gives a valid address; "ping google.com" says "Name or service not known", even though "nslookup google.com" returns a valid address. "dig google.com" is working.
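
One possible reason for the split described above is that ping resolves names through the system resolver (NSS, via getaddrinfo), while nslookup and dig send DNS queries themselves. A rough sketch for checking both paths side by side follows. The name compare_lookup is hypothetical, and the sketch assumes getent and dig are installed.

```shell
# Hypothetical helper: resolve NAME once through NSS (the path ping uses)
# and once with a direct DNS query (the path dig and nslookup use).
# Usage: compare_lookup NAME
compare_lookup() {
    name=$1
    # getent ahosts goes through getaddrinfo and therefore nsswitch.conf
    if getent ahosts "$name" >/dev/null 2>&1; then
        echo "nss: ok"
    else
        echo "nss: FAILED"
    fi
    # dig talks to the servers listed in /etc/resolv.conf directly
    if answer=$(dig +short "$name" 2>/dev/null) && [ -n "$answer" ]; then
        echo "dns: ok"
    else
        echo "dns: FAILED"
    fi
}
```

If compare_lookup google.com reports a failure for nss but not for dns, the resolver library path is the part that is broken, which would match the ping versus nslookup behaviour described above.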

Aron Xu (happyaron)
tags: added: nm-improvements
Revision history for this message
Sebastien Bacher (seb128) wrote :

The issue reported there seems a bit similar to the one in https://bugzilla.redhat.com/show_bug.cgi?id=1373485

The Red Hat report has been fixed in dnsmasq with this commit:
http://pkgs.fedoraproject.org/cgit/rpms/dnsmasq.git/commit/?id=cfdd2cf7648814c9e2c3f938458a2221721dc0c9

Changed in network-manager (Ubuntu):
importance: Undecided → High
Revision history for this message
Ernst Persson (ernstp) wrote :

The Red Hat patch is the same one that fixes this Debian bug: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=834722

That patch is already upstream, which also seems to be where the Debian package is maintained:
http://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=commitdiff;h=0682b7795cd0c25553b91de89c2a3b3d3ff17014;hp=b637d7815da89b5fb04c27b1d9a361fe5b2622a0

Revision history for this message
Ernst Persson (ernstp) wrote :

2.76-4.1 should fix this. I uploaded a build to my PPA: https://launchpad.net/~ernstp/+archive/ubuntu/ppa

affects: network-manager → dnsmasq
Changed in dnsmasq:
status: Unknown → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in dnsmasq (Ubuntu):
status: New → Confirmed
Revision history for this message
Travis Downs (travis-downs) wrote :

I just started seeing this recently in 16.04. I used the workaround suggested here:

http://askubuntu.com/questions/837575/dns-resolution-fails-after-wakeup-from-standby-ubuntu-16-10

of commenting out the dnsmasq entry in /etc/NetworkManager/NetworkManager.conf
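
A minimal sketch of that workaround, assuming the entry in question is a line reading dns=dnsmasq (this is an assumption about the file's contents, so check your own copy first). The function name disable_nm_dnsmasq is hypothetical.

```shell
# Hypothetical helper: comment out the "dns=dnsmasq" line in a
# NetworkManager.conf-style file, keeping a backup next to it as FILE.bak.
# Usage: disable_nm_dnsmasq [FILE]
disable_nm_dnsmasq() {
    conf=${1:-/etc/NetworkManager/NetworkManager.conf}
    # Only a line consisting of dns=dnsmasq is touched; everything else is kept
    sed -i.bak 's/^[[:space:]]*dns=dnsmasq[[:space:]]*$/#dns=dnsmasq/' "$conf"
}
```

This needs root when run against the real file, and NetworkManager has to be restarted afterwards (for example with "sudo service network-manager restart") before the change takes effect.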

Revision history for this message
Kostadin Stoilov (kmstoilov) wrote :

I have also been affected by this bug after upgrading network-manager and related components to 1.2.6-0ubuntu0.16.04.1 from 1.2.2-0ubuntu0.16.04.4

My current workaround is to do:

sudo apt install network-manager=1.2.2-0ubuntu0.16.04.4
sudo apt install libnm-glib-vpn1=1.2.2-0ubuntu0.16.04.4
sudo apt install libnm-glib4=1.2.2-0ubuntu0.16.04.4
sudo apt install libnm0=1.2.2-0ubuntu0.16.04.4
sudo apt install libnm-util2=1.2.2-0ubuntu0.16.04.4

Followed by:
sudo apt-mark hold libnm-glib-vpn1
sudo apt-mark hold libnm-glib4
sudo apt-mark hold libnm0
sudo apt-mark hold network-manager
sudo apt-mark hold libnm-util2

Revision history for this message
Lauri Laht (2iii7) wrote :

Same problem on a Dell XPS 9550 with 16.04.2, kernel 4.8.0-44.

Revision history for this message
Henrik Nilsson (it-henrik) wrote :

I see this bug was marked as "Fix Released" in January and that it should be fixed in 2.76-4.1. However, I only see version 2.76-4, even when I set "Ubuntu Software" to download from "Main Server" in the "Software & Updates" settings.

$ apt list dnsmasq
Listing... Done
dnsmasq/yakkety,yakkety 2.76-4 all

Or am I misunderstanding something?

Revision history for this message
Ernst Persson (ernstp) wrote :

It's fixed in Zesty but not in Yakkety.
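
To check a given package version against the 2.76-4.1 upload mentioned in this thread, dpkg's own version comparison can be used. This is only a sketch: has_dnsmasq_fix is a hypothetical name, and an Ubuntu-specific revision could carry the patch under a version string that sorts lower, so the package changelog remains the authoritative source.

```shell
# Hypothetical helper: succeed if VERSION sorts at or above 2.76-4.1,
# the dnsmasq upload named in this thread as containing the fix.
# Usage: has_dnsmasq_fix VERSION
has_dnsmasq_fix() {
    dpkg --compare-versions "$1" ge 2.76-4.1
}
```

For the installed package this could be called as: has_dnsmasq_fix "$(dpkg-query -W -f='${Version}' dnsmasq-base)" && echo "at or above 2.76-4.1"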

Revision history for this message
pgramond (pgramond) wrote :

Hi, I have the same problem when I disconnect and reconnect a PPP connection (mobile broadband, actually).
I've found a workaround: killing dnsmasq in a NetworkManager pre-up script (dnsmasq then gets restarted automatically).
The problem appeared when going from network-manager 1.2.2 to 1.2.6 (while dnsmasq remained unchanged).

Will NetworkManager be patched, or should we wait for a dnsmasq upgrade in Yakkety?
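
A rough sketch of such a pre-up hook is below. The pid file path is the one shown on the dnsmasq command line in the bug description. The function name kill_nm_dnsmasq is hypothetical, and so is the assumption that the script would be installed under /etc/NetworkManager/dispatcher.d/pre-up.d/. This is not pgramond's actual script.

```shell
# Hypothetical pre-up hook body: terminate NetworkManager's dnsmasq instance
# so that NetworkManager respawns it.
# Usage: kill_nm_dnsmasq [PIDFILE]
kill_nm_dnsmasq() {
    pidfile=${1:-/var/run/NetworkManager/dnsmasq.pid}
    # Nothing to do if the pid file is missing or unreadable
    [ -r "$pidfile" ] || return 0
    pid=$(cat "$pidfile")
    # Only send a signal if the file holds a plain number
    case $pid in
        '' | *[!0-9]*) return 0 ;;
    esac
    # Ignore the error if the process has already exited
    kill "$pid" 2>/dev/null || true
}
```

Installed as an executable script, the file would start with a #!/bin/sh line and end with a call to kill_nm_dnsmasq.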

Revision history for this message
Eugene San (eugenesan) wrote :

Confirming the bug in Xenial.

tags: added: xenial
removed: amd64
description: updated
Revision history for this message
windowsguy (something-f) wrote :

Also in Artful. Bug #1639776 is marked as "fixed", which is fake news.

Like the other guy above, I often see this after disconnecting from a VPN (Cisco AnyConnect).

My version:
dnsmasq/artful,artful 2.78-1

I installed dnscrypt-proxy, checked the IP in /etc/dnscrypt-proxy/dnscrypt-proxy.conf and set the system to use that IP. We'll see if that helps.

Revision history for this message
Adriano dos Santos Fernandes (adrianosf-gmail) wrote :

Has the fix been released to Xenial or not? I've had this problem for years already...

I'm on Xenial and still have the problem.
