dnsmasq crash when no servers in resolv.conf

Bug #2045570 reported by Alfred
This bug affects 1 person
Affects            Status         Importance   Assigned to        Milestone
dnsmasq (Ubuntu)   Fix Released   Undecided    Unassigned
Jammy              Fix Released   Undecided    Andreas Hasenack

Bug Description

[ Impact ]

dnsmasq "keeps an eye" on /etc/resolv.conf, and reloads it whenever the file is updated. When that happens and for some reason there were no "nameserver" declarations in the updated file, dnsmasq can crash.
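As a rough illustration of the triggering condition, the sketch below counts the active (uncommented) "nameserver" lines in a resolv.conf-style file. The function name count_nameservers is hypothetical and is not part of dnsmasq; a result of 0 corresponds to the state described above.

```shell
# Hypothetical helper (not part of dnsmasq): print the number of active,
# uncommented "nameserver" lines in the given file.
# Defaults to /etc/resolv.conf when no argument is passed.
count_nameservers() {
    # grep -c prints the match count; it exits non-zero when the count is 0,
    # so "|| true" keeps the function usable under "set -e".
    grep -cE '^[[:space:]]*nameserver[[:space:]]+[^[:space:]]' \
        "${1:-/etc/resolv.conf}" || true
}
```

Running count_nameservers with no arguments on an affected system right after the "no servers found" log message should print 0.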

Here is a log of a reproducer:
$ dig +short @127.0.0.1 ubuntu.com
;; communications error to 127.0.0.1#53: timed out
;; communications error to 127.0.0.1#53: connection refused
;; communications error to 127.0.0.1#53: connection refused
;; no servers could be reached

The log shows the startup, then resolv.conf being read again with no nameservers found, and finally the crash:
Jan 03 13:57:13 j-dnsmasq-2045570 dnsmasq[1507]: started, version 2.86 cachesize 150
Jan 03 13:57:13 j-dnsmasq-2045570 dnsmasq[1507]: DNS service limited to local subnets
Jan 03 13:57:13 j-dnsmasq-2045570 dnsmasq[1507]: compile time options: IPv6 GNU-getopt DBus no-UBus i18n IDN2 DHCP DHCPv6 no-Lua TFTP conntrack ipset auth cryptohash DNSSEC loop-detect inotify dumpfile
Jan 03 13:57:13 j-dnsmasq-2045570 dnsmasq[1507]: reading /etc/resolv.conf
Jan 03 13:57:13 j-dnsmasq-2045570 dnsmasq[1507]: using nameserver 10.0.100.1#53
Jan 03 13:57:13 j-dnsmasq-2045570 dnsmasq[1507]: read /etc/hosts - 7 addresses
Jan 03 13:57:13 j-dnsmasq-2045570 systemd[1]: Started dnsmasq - A lightweight DHCP and caching DNS server.
Jan 03 13:58:01 j-dnsmasq-2045570 dnsmasq[1507]: no servers found in /etc/resolv.conf, will retry
Jan 03 13:58:22 j-dnsmasq-2045570 systemd[1]: dnsmasq.service: Main process exited, code=dumped, status=11/SEGV
Jan 03 13:58:22 j-dnsmasq-2045570 systemd[1]: dnsmasq.service: Failed with result 'core-dump'.

dnsmasq has provisions for this situation: the 13:58:01 message says it will retry. Due to this bug, however, it crashes instead.
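A minimal sketch for spotting this crash in captured journal output. The function name has_crash_signature is hypothetical; the patterns it matches are taken from the systemd messages quoted in this report, and other failure modes (such as the 100% CPU spin mentioned later in this bug) would not be detected.

```shell
# Hypothetical helper: read journal text on stdin and succeed (exit 0)
# if it contains the crash signature quoted in this report.
has_crash_signature() {
    grep -qE "status=11/SEGV|Failed with result 'core-dump'"
}
```

It could be fed from the journal, for example: journalctl -u dnsmasq.service --no-pager | has_crash_signature && echo "dnsmasq crashed"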

The problem was introduced[1] in version 2.86, and fixed in 2.87, so only jammy is affected.

1. https://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=patch;h=d290630d31f4517ab26392d00753d1397f9a4114;hp=d2ad5dc073aaacaf22b117f16106282a73586803
The commit message says:
"""
This problem was introduced in 2.86.
"""

And indeed, I wasn't able to crash 2.80 shipped in focal.

[ Test Plan ]
It might take a few tries to reproduce the bug, but here is the general outline. Also keep in mind that it's important to use a DNS name that isn't cached already by a previous query.

# Create a jammy lxd container

lxc launch ubuntu-daily:jammy j-dnsmasq-2045570

# Enter the container

lxc shell j-dnsmasq-2045570

# From now on, all commands should be executed in the container.
# Install dnsmasq

apt update && apt install -y dnsmasq

# Disable systemd-resolved, and start dnsmasq

systemctl disable --now systemd-resolved
systemctl enable --now dnsmasq

# In one terminal inside the container, watch the dnsmasq logs:

journalctl -u dnsmasq.service -f

# In another terminal, remove /etc/resolv.conf and create a new one with a single nameserver
rm /etc/resolv.conf
echo "nameserver 1.1.1.1" > /etc/resolv.conf

# restart dnsmasq
systemctl restart dnsmasq.service

# Perform a dns query

dig @127.0.0.1 +short linux.com

# Comment out the nameserver directive in resolv.conf
echo "#nameserver 1.1.1.1" > /etc/resolv.conf

# Observe in the dnsmasq logs that it notices the change with a message like:

Jan 03 14:14:51 j-dnsmasq-2045570 dnsmasq[2274]: no servers found in /etc/resolv.conf, will retry

# Perform a *different* DNS query

dig @127.0.0.1 +short ubuntu.com

# Observe in the dnsmasq logs that it crashes.
Jan 03 13:58:22 j-dnsmasq-2045570 systemd[1]: dnsmasq.service: Main process exited, code=dumped, status=11/SEGV
Jan 03 13:58:22 j-dnsmasq-2045570 systemd[1]: dnsmasq.service: Failed with result 'core-dump'.

If it doesn't crash right away, repeat these steps a few times, but using a different domain name each time:
- add "nameserver 127.0.0.1" to /etc/resolv.conf
- observe that dnsmasq notices the change to the file
- perform a query for some random domain using "dig @127.0.0.1 +short <domain-of-your-choosing>"
- remove "nameserver" from /etc/resolv.conf, observe that dnsmasq noticed the change
- perform a query for another random domain

The fixed version from proposed will not crash. That last query with no "nameserver" lines in resolv.conf won't work, but it won't crash the server.
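The retry steps above could be scripted roughly as follows. This is only a sketch under assumptions: the variables RESOLV and NS, the function names, and the 2-second pauses are illustrative and not part of the original test plan, and the upstream server defaults to the 1.1.1.1 used earlier in the plan. It must run as root inside the container.

```shell
# Illustrative settings (assumptions, not from the original test plan).
RESOLV="${RESOLV:-/etc/resolv.conf}"
NS="${NS:-1.1.1.1}"

# Write a resolv.conf with one active nameserver line.
enable_nameserver()  { printf 'nameserver %s\n'  "$NS" > "$RESOLV"; }
# Write a resolv.conf whose only nameserver line is commented out.
disable_nameserver() { printf '#nameserver %s\n' "$NS" > "$RESOLV"; }

# try_once WARMUP_DOMAIN TRIGGER_DOMAIN
# Runs one iteration of the retry steps. Returns 0 if dnsmasq.service is
# still active afterwards, non-zero otherwise (for example after a crash).
try_once() {
    enable_nameserver
    sleep 2                                  # let dnsmasq notice the change
    dig @127.0.0.1 +short "$1" > /dev/null   # query while a server is configured
    disable_nameserver
    sleep 2                                  # wait for "no servers found"
    dig @127.0.0.1 +short "$2" > /dev/null   # query with no upstream servers
    sleep 2
    systemctl is-active --quiet dnsmasq.service
}
```

For example, try_once linux.com ubuntu.com || echo "dnsmasq is no longer active" runs a single attempt. Use two domains that have not been queried before on each call, since cached names are not expected to trigger the bug.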

[ Where problems could occur ]

This is doing some pointer/memory manipulation that could introduce memory leaks or other crashes. In fact, this is exactly what happened in the 2.86 release, which the upstream changelog describes as, and I quote, a "Major rewrite of the DNS server and domain handling code. This should be largely transparent, but it drastically improves performance and reduces memory foot-print"[2]. 2.88 was then released with the fix used in this SRU (the commit is also in the 2.87 tag, but the upstream release notes only mention it in 2.88).

2. https://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=blob;f=CHANGELOG;h=2ce53a81079810ae43588607f43851dabb5db38d;hb=HEAD#l224

[ Other Info ]
Not at this time.

[ Original description ]

upstream discussion: https://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2022q3/016563.html

In my journal, my DNS service crashes and restarts just after:
Dec 04 17:18:38 dnsmasq[199333]: no servers found in /run/NetworkManager/no-stub-resolv.conf, will retry

oops report: https://errors.ubuntu.com/oops/29cf5e2e-92b1-11ee-9bdf-fa163ec44ecd

ubuntu jammy, dnsmasq-base 2.86-1.1ubuntu0.3


Alfred (alf-redyoung)
description: updated
Revision history for this message
Lucas Kanashiro (lucaskanashiro) wrote :

Thanks for taking the time to report this bug and trying to make Ubuntu better.

Also thanks for the pointers. According to the upstream discussion this is the needed fix:

https://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=commit;h=d290630d31f4517ab26392d00753d1397f9a4114

It is included in version 2.87 onward, so it affects only Jammy.

Changed in dnsmasq (Ubuntu):
status: New → Triaged
tags: added: server-todo
Changed in dnsmasq (Ubuntu Jammy):
status: New → Triaged
Changed in dnsmasq (Ubuntu):
status: Triaged → Fix Released
Changed in dnsmasq (Ubuntu Jammy):
assignee: nobody → Andreas Hasenack (ahasenack)
Andreas Hasenack (ahasenack) wrote :

I was able to reproduce this after a few attempts. Good enough for a test plan/case.

Alfred (alf-redyoung) wrote :

P.S.
Since it is a UAF (use-after-free), the result is unpredictable. Twice I have seen it not crash, but instead enter an endless loop and use 100% CPU.

Dec 05 06:11:38 dnsmasq[359491]: read /etc/hosts - 7 addresses
Dec 06 07:58:41 dnsmasq[359491]: no servers found in /run/NetworkManager/no-stub-resolv.conf, will retry
Dec 06 21:23:52 systemd[1]: Stopping My DNS caching server for lxd and vms...
Dec 06 21:24:09 systemd[1]: my-dns.service: Main process exited, code=killed, status=9/KILL
Dec 06 21:24:09 systemd[1]: my-dns.service: Failed with result 'signal'.
Dec 06 21:24:09 systemd[1]: Stopped My DNS caching server for lxd and vms.
Dec 06 21:24:09 systemd[1]: my-dns.service: Consumed 13h 25min 27.822s CPU time.
Dec 06 21:24:09 systemd[1]: Started My DNS caching server for lxd and vms.

Dec 09 13:07:28 dnsmasq[464230]: read /etc/hosts - 7 addresses
Dec 11 03:44:37 dnsmasq[464230]: no servers found in /run/NetworkManager/no-stub-resolv.conf, will retry
Dec 11 06:28:39 systemd[1]: Stopping My DNS caching server for lxd and vms...
Dec 11 06:28:48 systemd[1]: my-dns.service: Main process exited, code=killed, status=9/KILL
Dec 11 06:28:48 systemd[1]: my-dns.service: Failed with result 'signal'.
Dec 11 06:28:48 systemd[1]: Stopped My DNS caching server for lxd and vms.
Dec 11 06:28:48 systemd[1]: my-dns.service: Consumed 2h 44min 11.010s CPU time.
Dec 11 06:28:48 systemd[1]: Started My DNS caching server for lxd and vms.

In both cases, I ended up restarting it manually.

description: updated
Changed in dnsmasq (Ubuntu Jammy):
status: Triaged → In Progress
description: updated
Andreas Hasenack (ahasenack) wrote :

So with the upstream patch, I see no crash, but dnsmasq starts spinning CPU at 100%. This is not a good enough fix.

description: updated
Andreas Hasenack (ahasenack) wrote :

Ah, never mind, I built the package incorrectly. It's not a quilt package... :/

Andreas Hasenack (ahasenack) wrote :

Yep, false alarm, the patch works.

Andreas Hasenack (ahasenack) wrote :

Uploaded to jammy unapproved, waiting on SRU team now.

Steve Langasek (vorlon) wrote : Please test proposed package

Hello Alfred, or anyone else affected,

Accepted dnsmasq into jammy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/dnsmasq/2.86-1.1ubuntu0.5 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-jammy to verification-done-jammy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-jammy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in dnsmasq (Ubuntu Jammy):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-jammy
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (dnsmasq/2.86-1.1ubuntu0.5)

All autopkgtests for the newly accepted dnsmasq (2.86-1.1ubuntu0.5) for jammy have finished running.
The following regressions have been reported in tests triggered by the package:

systemd/249.11-0ubuntu3.11 (ppc64el)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/jammy/update_excuses.html#dnsmasq

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Andreas Hasenack (ahasenack) wrote :

While performing the jammy verification, I once again hit what I first saw in https://bugs.launchpad.net/ubuntu/+source/dnsmasq/+bug/2045570/comments/4: dnsmasq doesn't crash, but starts spinning at 100% CPU and becomes unresponsive.

I double checked that the source package contains the patch applied correctly.

I'm repeating the test multiple times now, and not observing the 100% cpu usage anymore...

Andreas Hasenack (ahasenack) wrote :

Ah, I now understand what happened in the comment above. I only upgraded the bin:dnsmasq package, and not bin:dnsmasq-base. Both need to be upgraded, i.e., like a normal "apt upgrade" or "apt dist-upgrade" would do.
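To guard against this kind of partial upgrade, a small check along these lines may help. It is a sketch: the function name versions_match is hypothetical, and it only compares whatever version strings it is given, one per line.

```shell
# Hypothetical helper: read version strings on stdin, one per line, and
# succeed (exit 0) only if there is at least one line and all are identical.
versions_match() {
    awk 'NR == 1 { first = $0 }
         $0 != first { differ = 1 }
         END { exit (NR == 0 || differ) }'
}
```

It could be combined with dpkg-query, for example: dpkg-query -W -f='${Version}\n' dnsmasq dnsmasq-base | versions_match || echo "dnsmasq and dnsmasq-base versions differ"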

Andreas Hasenack (ahasenack) wrote :

Reproducing the bug

root@j-dnsmasq-2045570:~# apt-cache policy dnsmasq
dnsmasq:
  Installed: 2.86-1.1ubuntu0.4
  Candidate: 2.86-1.1ubuntu0.4
  Version table:
 *** 2.86-1.1ubuntu0.4 500
        500 http://br.archive.ubuntu.com/ubuntu jammy-updates/universe amd64 Packages
        100 /var/lib/dpkg/status
     2.86-1.1ubuntu0.3 500
        500 http://br.archive.ubuntu.com/ubuntu jammy-security/universe amd64 Packages
     2.86-1.1 500
        500 http://br.archive.ubuntu.com/ubuntu jammy/universe amd64 Packages

# dig @127.0.0.1 +short linux.com
23.185.0.3

# echo "#nameserver 1.1.1.1" > /etc/resolv.conf
#

Log:
Jan 23 16:57:40 j-dnsmasq-2045570 dnsmasq[2222]: no servers found in /etc/resolv.conf, will retry

root@j-dnsmasq-2045570:~# dig @127.0.0.1 +short ubuntu.com
;; communications error to 127.0.0.1#53: timed out
;; communications error to 127.0.0.1#53: connection refused
;; communications error to 127.0.0.1#53: connection refused
;; no servers could be reached

And the log shows a crash:
Jan 23 17:03:12 j-dnsmasq-2045570 dnsmasq[253]: no servers found in /etc/resolv.conf, will retry
Jan 23 17:03:16 j-dnsmasq-2045570 systemd[1]: dnsmasq.service: Main process exited, code=dumped, status=11/SEGV
Jan 23 17:03:16 j-dnsmasq-2045570 systemd[1]: dnsmasq.service: Failed with result 'core-dump'.

(it took about 3 attempts, but it crashed)

With the new package from jammy-proposed:
root@j-dnsmasq-2045570:~# apt-cache policy dnsmasq
dnsmasq:
  Installed: 2.86-1.1ubuntu0.5
  Candidate: 2.86-1.1ubuntu0.5
  Version table:
 *** 2.86-1.1ubuntu0.5 500
        500 http://br.archive.ubuntu.com/ubuntu jammy-proposed/universe amd64 Packages
        100 /var/lib/dpkg/status
     2.86-1.1ubuntu0.4 500
        500 http://br.archive.ubuntu.com/ubuntu jammy-updates/universe amd64 Packages
     2.86-1.1ubuntu0.3 500
        500 http://br.archive.ubuntu.com/ubuntu jammy-security/universe amd64 Packages
     2.86-1.1 500
        500 http://br.archive.ubuntu.com/ubuntu jammy/universe amd64 Packages

When I perform the same test as before, I get an immediate empty result, and no crash, when resolv.conf contains no server:
root@j-dnsmasq-2045570:~# dig @127.0.0.1 +short ubuntu.com
root@j-dnsmasq-2045570:~#

And the previous result, which was cached, is still there:
root@j-dnsmasq-2045570:~# dig @127.0.0.1 +short linux.com
23.185.0.3

The log shows only the retry message, and no crash:
Jan 23 17:08:14 j-dnsmasq-2045570 dnsmasq[1350]: no servers found in /etc/resolv.conf, will retry

If I revert resolv.conf to a working content:
root@j-dnsmasq-2045570:~# echo "nameserver 1.1.1.1" > /etc/resolv.conf

The log notices that:
Jan 23 17:57:47 j-dnsmasq-2045570 dnsmasq[7370]: reading /etc/resolv.conf
Jan 23 17:57:47 j-dnsmasq-2045570 dnsmasq[7370]: using nameserver 1.1.1.1#53

And the server resumes working:
root@j-dnsmasq-2045570:~# dig @127.0.0.1 +short ubuntu.com
185.125.190.21
185.125.190.29
185.125.190.20
root@j-dnsmasq-2045570:~# dig @127.0.0.1 +short linux.com
23.185.0.3
root@j-dnsmasq-2045570:~#

Jammy verification succeeded.

tags: added: verification-done-jammy
removed: verification-needed-jammy
Alfred (alf-redyoung) wrote :

I confirm that the proposed package fixes the problem.

Bryce Harrington (bryce)
tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package dnsmasq - 2.86-1.1ubuntu0.5

---------------
dnsmasq (2.86-1.1ubuntu0.5) jammy; urgency=medium

  * src/dnsmasq.c: Fix a crash that can happen when an empty resolv.conf is
    reloaded (LP: #2045570)
  * src/helper.c: Fix wrong client address for dhcp-script when DHCPv4 relay
    in use (LP: #2042587)

 -- Andreas Hasenack <email address hidden> Thu, 11 Jan 2024 09:21:27 -0300

Changed in dnsmasq (Ubuntu Jammy):
status: Fix Committed → Fix Released
Robie Basak (racb) wrote : Update Released

The verification of the Stable Release Update for dnsmasq has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
