dnsmasq and resolvconf hangs on start

Bug #1778073 reported by Thomas
This bug affects 2 people
Affects            Status      Importance  Assigned to  Milestone
dnsmasq (Debian)   New         Unknown
dnsmasq (Ubuntu)   Incomplete  Undecided   Unassigned

Bug Description

I installed dnsmasq today, and I use resolvconf in the background.

The problem is that systemd waits a minute or so after the service starts and then reports:

root@proxy:~# service dnsmasq status

 dnsmasq.service - dnsmasq - A lightweight DHCP and caching DNS server
   Loaded: loaded (/lib/systemd/system/dnsmasq.service; enabled; vendor preset: enabled)
  Drop-In: /run/systemd/generator/dnsmasq.service.d
           50-dnsmasq-$named.conf, 50-insserv.conf-$named.conf
   Active: failed (Result: timeout) since Do 2018-06-21 15:58:13 CEST; 2min 10s ago
  Process: 3295 ExecStop=/etc/init.d/dnsmasq systemd-stop-resolvconf (code=killed, signal=TERM)
  Process: 3865 ExecStartPost=/etc/init.d/dnsmasq systemd-start-resolvconf (code=killed, signal=TERM)
  Process: 3837 ExecStart=/etc/init.d/dnsmasq systemd-exec (code=exited, status=0/SUCCESS)
  Process: 3825 ExecStartPre=/usr/sbin/dnsmasq --test (code=exited, status=0/SUCCESS)
 Main PID: 3862 (code=exited, status=0/SUCCESS)

Jun 21 15:56:43 proxy dnsmasq[3862]: using nameserver 192.168.23.1#53
Jun 21 15:56:43 proxy dnsmasq[3865]: * Awakening mail retriever agent:
Jun 21 15:56:43 proxy dnsmasq[3865]: ...done.
Jun 21 15:56:43 proxy postfix[3951]: Postfix is running with backwards-compatible default settings
Jun 21 15:56:43 proxy postfix[3951]: See http://www.postfix.org/COMPATIBILITY_README.html for details
Jun 21 15:56:43 proxy postfix[3951]: To disable backwards compatibility use "postconf compatibility_level=2" and "postfix reload"
Jun 21 15:58:13 proxy systemd[1]: dnsmasq.service: Start-post operation timed out. Stopping.
Jun 21 15:58:13 proxy systemd[1]: Failed to start dnsmasq - A lightweight DHCP and caching DNS server.
Jun 21 15:58:13 proxy systemd[1]: dnsmasq.service: Unit entered failed state.
Jun 21 15:58:13 proxy systemd[1]: dnsmasq.service: Failed with result 'timeout'.

When I look into the start script /etc/init.d/dnsmasq, there is a function systemd-start-resolvconf which points to start-resolvconf.

There is this part:

        for interface in $DNSMASQ_EXCEPT
        do
                [ $interface = lo ] && return
        done

Previously I had not defined DNSMASQ_EXCEPT in /etc/default/dnsmasq. The problem is that this part MUST be faulty! When I comment it out, I can start dnsmasq! It looks like it loops forever there?!

Also, if I set DNSMASQ_EXCEPT to my listen interface, it works - but is that really needed?
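
To check whether the quoted loop can hang by itself, here is a minimal sketch that runs the same loop in isolation. Only the for loop comes from the script quoted above. The function name start_resolvconf_sketch and the echo lines are hypothetical placeholders, and the sketch does not reproduce whatever the real function does after the loop.

```shell
#!/bin/sh
# Stand-in for the loop quoted from /etc/init.d/dnsmasq.
# Hypothetical: the function name and the echo lines are placeholders added
# for illustration; the real function's remaining body is not shown here.
start_resolvconf_sketch() {
    for interface in $DNSMASQ_EXCEPT
    do
        # Same test as the original, with the variable quoted.
        [ "$interface" = lo ] && { echo "returned early"; return 0; }
    done
    echo "loop finished, continuing"
}

DNSMASQ_EXCEPT=""            # undefined or empty: the loop body never runs
start_resolvconf_sketch

DNSMASQ_EXCEPT="eth0"        # one word that is not lo: one iteration
start_resolvconf_sketch

DNSMASQ_EXCEPT="eth0 lo"     # contains lo: the function returns before the rest
start_resolvconf_sketch
```

Under these assumptions the loop always terminates, because it iterates over a finite word list. That suggests the hang is more likely in whatever runs after the loop than in the loop itself.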

I found another user who had the same problem:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=871958

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: dnsmasq 2.75-1ubuntu0.16.04.4 [modified: etc/default/dnsmasq]
ProcVersionSignature: Ubuntu 4.15.0-23.25~16.04.1-generic 4.15.18
Uname: Linux 4.15.0-23-generic x86_64
ApportVersion: 2.20.1-0ubuntu2.18
Architecture: amd64
Date: Thu Jun 21 16:12:14 2018
InstallationDate: Installed on 2017-02-27 (479 days ago)
InstallationMedia: Ubuntu-Server 16.04.2 LTS "Xenial Xerus" - Release amd64 (20170215.8)
PackageArchitecture: all
ProcEnviron:
 TERM=xterm
 SHELL=/bin/bash
 PATH=(custom, no user)
 LANG=de_DE.UTF-8
SourcePackage: dnsmasq
UpgradeStatus: No upgrade log present (probably fresh install)
mtime.conffile..etc.default.dnsmasq: 2018-06-21T16:07:24.818774

Thomas (t.c) wrote :
description: updated
Changed in dnsmasq (Debian):
status: Unknown → New
Christian Ehrhardt  (paelzer) wrote :

Sounds related / similar to bug 1761096

tags: added: server-next
Christian Ehrhardt  (paelzer) wrote :

The other bug was reported as resolved in later versions.
Can you try the same on, e.g., 18.04 to confirm the scope of this?

Bryce Harrington (bryce) wrote :

Hi Thomas, have you had a chance to re-test this on 18.04? From Christian's comment #3, sounds like this issue may already be resolved, but if not we can investigate further.

Changed in dnsmasq (Ubuntu):
status: New → Incomplete
tags: removed: server-next
Ciaby (ciaby) wrote :

I ran into this bug and found the cause:
When dnsmasq starts, it calls resolvconf to update the server entries. If postfix is installed, the /etc/resolvconf/update-libc.d/postfix script is called, which tries to reload postfix.
The problem is that dnsmasq is part of nss-lookup.target, while postfix requires nss-lookup.target to be active before it can restart. That creates a deadlock: the ExecStartPost=/etc/init.d/dnsmasq systemd-start-resolvconf script in dnsmasq times out and the service is not started.
This is only triggered when both dnsmasq and postfix are installed. If nfs-common is also installed, rpc-statd is also part of nss-lookup.target and the deadlock is not triggered.
I hope somebody finds this useful; it took me days to debug :D
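
The circular wait described above can be written down as a list of "must finish before" pairs and checked for a cycle. The sketch below is only a model of that reasoning, not a diagnostic for a live system. The node labels (dnsmasq-start-job, postfix-reload, dnsmasq-ExecStartPost) are illustrative names rather than systemd unit names, and it assumes the GNU coreutils tsort, which exits non-zero when its input contains a loop.

```shell
#!/bin/sh
# Each line "A B" reads "A must finish before B can proceed".
# The labels are illustrative (hypothetical), derived from the analysis above.
wait_for_edges() {
    cat <<'EOF'
dnsmasq-start-job nss-lookup.target
nss-lookup.target postfix-reload
postfix-reload dnsmasq-ExecStartPost
dnsmasq-ExecStartPost dnsmasq-start-job
EOF
}

# Reads edge pairs on stdin; returns 0 if they contain a cycle.
# Assumption: tsort is installed and fails on cyclic input (GNU behaviour).
has_cycle() {
    if tsort >/dev/null 2>&1; then
        return 1
    fi
    return 0
}

if wait_for_edges | has_cycle; then
    echo "deadlock: ordering contains a cycle"
else
    echo "ok: ordering is acyclic"
fi
```

In this model, removing any one edge makes the ordering acyclic, which is consistent with the start-post step running into its timeout only when the whole chain is present.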

Bryce Harrington (bryce)
tags: added: server-next
Utkarsh Gupta (utkarsh) wrote :

Hello Ciaby,

Thanks for your debugging, I am sure that'd be helpful!
I've forwarded this to the Debian bug: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=871958.

Simon should hopefully have some time to take a look at this. Thanks!

Paride Legovini (paride)
Changed in dnsmasq (Ubuntu):
assignee: nobody → Paride Legovini (paride)
Paride Legovini (paride) wrote :

Hi, I had another look at this one and Ciaby's analysis looks correct to me. However, I want to set up a full reproducer before attempting a fix. I assigned the bug to myself.

Paride Legovini (paride)
tags: removed: server-next
Paride Legovini (paride)
Changed in dnsmasq (Ubuntu):
assignee: Paride Legovini (paride) → nobody