After updating Ubuntu, the network to which the subnet address is assigned does not become active in KVM.

Bug #2055776 reported by Kazuhiro MIYASHITA
This bug affects 5 people
Affects           Status   Importance  Assigned to  Milestone
dnsmasq (Ubuntu)  Invalid  Undecided   Unassigned

Bug Description

Phenomenon:
  After updating Ubuntu, the network to which the subnet address is assigned does not become active in KVM.

Cause:
  The dnsmasq upgrade shown below, performed by apt's automatic update (unattended-upgrade), triggers the error. The network started properly with dnsmasq 2.80, but does not with 2.90.

$ cat /var/log/apt/history.log
(snip)
Start-Date: 2024-02-27 06:17:31
Commandline: /usr/bin/unattended-upgrade
Upgrade: dnsmasq-base:amd64 (2.80-1.1ubuntu1.7, 2.90-0ubuntu0.20.04.1)
End-Date: 2024-02-27 06:17:44
(snip)
$

Cause details:
  As background, bind-dynamic is set in the dnsmasq config file that libvirt generates for KVM. Below is an example.

$ cat default.conf
##WARNING: THIS IS AN AUTO-GENERATED FILE. CHANGES TO IT ARE LIKELY TO BE
##OVERWRITTEN AND LOST. Changes to this configuration should be made using:
## virsh net-edit default
## or other application using the libvirt API.
##
## dnsmasq conf file created by libvirt
strict-order
user=libvirt-dnsmasq
pid-file=/run/libvirt/network/default.pid
except-interface=lo
bind-dynamic
interface=virbr0
dhcp-range=192.168.122.2,192.168.122.254,255.255.255.0
dhcp-no-override
dhcp-authoritative
dhcp-lease-max=253
dhcp-hostsfile=/var/lib/libvirt/dnsmasq/default.hostsfile
addn-hosts=/var/lib/libvirt/dnsmasq/default.addnhosts
$

When starting the network with KVM (virsh net-start), dnsmasq started from KVM executes the make_sock function twice as shown below.

   $ cat network.c
   (snip)
   1087 static struct listener *create_listeners(union mysockaddr *addr, int do_tftp, int dienow)
   1088 {
   1089 struct listener *l = NULL;
   1090 int fd = -1, tcpfd = -1, tftpfd = -1;
   1091
   1092 (void)do_tftp;
   1093
   1094 if (daemon->port != 0)
   1095 {
   1096 fd = make_sock(addr, SOCK_DGRAM, dienow);
   1097 tcpfd = make_sock(addr, SOCK_STREAM, dienow);
   1098 }
   (snip)

The following code causes an issue with the update made in dnsmasq 2.90.

   $ cat network.c
   (snip)
    895 static int make_sock(union mysockaddr *addr, int type, int dienow)
    896 {
    (snip)
    934 if (!option_bool(OPT_CLEVERBIND) || errno != EADDRNOTAVAIL)
    935 {
    936 if (dienow)
    937 die(s, daemon->addrbuff, EC_BADNET);
    938 else
    939 my_syslog(LOG_WARNING, s, daemon->addrbuff, strerror(errno));
    940 }
    (snip)

The make_sock call at network.c:1096 binds a socket to 192.168.122.1/24, and the make_sock call at network.c:1097 then tries to bind to the same address. When that bind fails with errno==98 (EADDRINUSE), the condition at network.c:934 is true, so network.c:937 is executed and dnsmasq dies with a startup error. As a result, virsh net-start fails.
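
The bind sequence described above can be reproduced in isolation. The following is a minimal sketch, not dnsmasq code: bind_loopback and second_bind_errno are hypothetical helpers, the loopback address with a kernel-chosen port stands in for 192.168.122.1 port 53, and SO_REUSEADDR is deliberately left unset. It illustrates that UDP and TCP have separate port spaces, so a SOCK_DGRAM bind followed by a SOCK_STREAM bind on the same address and port does not fail by itself; EADDRINUSE (errno 98 on Linux x86) only appears when a socket of the same type already holds that address and port.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not part of dnsmasq): bind a socket of the given
   type to 127.0.0.1:*port. A *port of 0 lets the kernel pick a free port;
   the port actually bound is written back through the pointer.
   Returns the file descriptor, or -1 with errno set. */
int bind_loopback(int type, unsigned short *port)
{
    struct sockaddr_in sa;
    socklen_t len = sizeof(sa);
    int fd, saved;

    assert(port != NULL);

    fd = socket(AF_INET, type, 0);
    if (fd == -1)
        return -1;

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sa.sin_port = htons(*port);

    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) == -1 ||
        getsockname(fd, (struct sockaddr *)&sa, &len) == -1) {
        saved = errno;   /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }

    *port = ntohs(sa.sin_port);
    return fd;
}

/* Bind one socket of first_type, then try to bind a second socket of
   second_type to the same address and port while the first is still open.
   Returns 0 if the second bind succeeds, otherwise the errno it failed
   with (or the errno of the first bind if that one failed). */
int second_bind_errno(int first_type, int second_type)
{
    unsigned short port = 0;
    int first, second, result = 0;

    first = bind_loopback(first_type, &port);
    if (first == -1)
        return errno;

    second = bind_loopback(second_type, &port);
    if (second == -1)
        result = errno;
    else
        close(second);

    close(first);
    return result;
}
```

Under these assumptions, second_bind_errno(SOCK_DGRAM, SOCK_STREAM) is expected to return 0, while second_bind_errno(SOCK_STREAM, SOCK_STREAM) is expected to return EADDRINUSE. A failure in the second make_sock call therefore suggests that some other TCP socket already holds the address and port.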

As a temporary workaround, the network starts if dnsmasq is patched so that it does not die at this point.

$ diff -u network_c_back network.c
--- network_c_back 2024-02-29 15:36:05.156467935 +0000
+++ network.c 2024-02-29 15:36:38.733324350 +0000
@@ -934,7 +934,8 @@
       if (!option_bool(OPT_CLEVERBIND) || errno != EADDRNOTAVAIL)
  {
    if (dienow)
- die(s, daemon->addrbuff, EC_BADNET);
+ my_syslog(LOG_WARNING, s, daemon->addrbuff, strerror(errno));
+ //die(s, daemon->addrbuff, EC_BADNET);
    else
      my_syslog(LOG_WARNING, s, daemon->addrbuff, strerror(errno));
  }
$

dnsmasq should be modified so that, when bind-dynamic is set, it keeps working even if errno==98 (EADDRINUSE).

For reference, in the case of dnsmasq 2.80, the code is as follows, so no error occurs.

    network.c
    699 static int make_sock(union mysockaddr *addr, int type, int dienow)
    700 {
    701 int family = addr->sa.sa_family;
    702 int fd, rc, opt = 1;
    (snip)
    715 err:
    716 errsave = errno;
    717 port = prettyprint_addr(addr, daemon->addrbuff);
    718 if (!option_bool(OPT_NOWILD) && !option_bool(OPT_CLEVERBIND))
    719 sprintf(daemon->addrbuff, "port %d", port);
    720 s = _("failed to create listening socket for %s: %s");
    721
    722 if (fd != -1)
    723 close (fd);
    724
    725 errno = errsave;
    726
    727 if (dienow)
    728 {
    729 /* failure to bind addresses given by --listen-address at this point
    730 is OK if we're doing bind-dynamic */
    731 if (!option_bool(OPT_CLEVERBIND))
    732 die(s, daemon->addrbuff, EC_BADNET);
    733 }
    734 else
    735 my_syslog(LOG_WARNING, s, daemon->addrbuff, strerror(errno));
    736
    737 return -1;
    738 }

If bind-dynamic is set (option_bool(OPT_CLEVERBIND)==true), it will not die.
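
The difference between the two excerpts comes down to when make_sock calls die(). The following sketch restates only that decision as two pure functions, based solely on the 2.80 and 2.90 excerpts quoted in this report. The function names and the bind_dynamic parameter (standing in for option_bool(OPT_CLEVERBIND)) are hypothetical, and the rest of make_sock is omitted.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical restatement of the 2.80 excerpt: with dienow set, die()
   is only reached when bind-dynamic is NOT in effect. errno plays no
   part in the decision. Returns 1 if die() would be called, else 0. */
int dies_in_2_80(int dienow, int bind_dynamic, int err)
{
    (void)err;
    return dienow && !bind_dynamic;
}

/* Hypothetical restatement of the 2.90 excerpt: the failure is only
   tolerated when bind-dynamic is in effect AND errno is EADDRNOTAVAIL.
   Any other errno (for example EADDRINUSE) reaches die() when dienow
   is set. Returns 1 if die() would be called, else 0. */
int dies_in_2_90(int dienow, int bind_dynamic, int err)
{
    if (!bind_dynamic || err != EADDRNOTAVAIL)
        return dienow != 0;
    return 0;
}

/* Encodes the case from this report: dienow set, bind-dynamic enabled,
   bind() failing with EADDRINUSE. */
void self_check(void)
{
    assert(dies_in_2_80(1, 1, EADDRINUSE) == 0);    /* 2.80 carries on */
    assert(dies_in_2_90(1, 1, EADDRINUSE) == 1);    /* 2.90 dies */
    assert(dies_in_2_90(1, 1, EADDRNOTAVAIL) == 0); /* still tolerated */
}
```

With dienow set, bind-dynamic enabled and errno equal to EADDRINUSE, the 2.80 logic returns -1 without dying, whereas the 2.90 logic dies, which matches the behaviour change reported here.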

Kazuhiro MIYASHITA (miyakz1192) wrote :

My environment is as follows.

$ cat /etc/issue
Ubuntu 20.04.5 LTS \n \l

$ virsh --version
6.0.0
$ /usr/sbin/dnsmasq --version
Dnsmasq version 2.90 Copyright (c) 2000-2024 Simon Kelley
Compile time options: IPv6 GNU-getopt no-DBus no-UBus no-i18n no-IDN DHCP DHCPv6 no-Lua TFTP no-conntrack ipset no-nftset auth no-cryptohash no-DNSSEC loop-detect inotify dumpfile

This software comes with ABSOLUTELY NO WARRANTY.
Dnsmasq is free software, and you are welcome to redistribute it
under the terms of the GNU General Public License, version 2 or 3.
$

Kazuhiro MIYASHITA (miyakz1192) wrote :

The error I got is as follows.

$ virsh net-start default
error: Failed to start network default
error: internal error: Child process (VIR_BRIDGE_NAME=virbr0 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/lib/libvirt/libvirt_leaseshelper) unexpected exit status 2:
dnsmasq: failed to create listening socket for 192.168.122.1: Address already in use
$

Marc Deslauriers (mdeslaur) wrote :

Thanks for filing this bug, and the excellent analysis.

So it looks like the dnsmasq change was introduced here:
https://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=commit;h=744231d99505cdead314d13506b5ff8c44a13088

That was in response to this mailing list discussion:
https://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2023q4/017333.html

I think we need to report this issue upstream; perhaps we can revert that commit in the meantime.

Marc Deslauriers (mdeslaur) wrote :

I will prepare updates for testing with the problematic commit reverted.

Marc Deslauriers (mdeslaur) wrote :

Out of curiosity, what are the contents of your /etc/dnsmasq.d directory? Is there a symlink in there to /etc/dnsmasq.d-available/libvirt-daemon? What are the contents of that file?

Marc Deslauriers (mdeslaur) wrote :

Do you know what else could be listening on that interface? What's the output of "sudo netstat --tcp --udp --listening --programs --numeric"?

information type: Public → Public Security
Kazuhiro MIYASHITA (miyakz1192) wrote :

> Out of curiosity, what are the contents of your /etc/dnsmasq.d directory? Is there a symlink in there to /etc/dnsmasq.d-available/libvirt-daemon? What are the contents of that file?

$ ls -l /etc/dnsmasq.d
total 0
lrwxrwxrwx 1 root root 39 Aug 29 2020 libvirt-daemon -> /etc/dnsmasq.d-available/libvirt-daemon
$ cat /etc/dnsmasq.d-available/libvirt-daemon
bind-interfaces
except-interface=virbr0
$

> Do you know what else could be listening on that interface? What's the output of "sudo netstat --tcp --udp --listening --programs --numeric"?

$ sudo netstat --tcp --udp --listening --programs --numeric
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 192.168.0.2:53 0.0.0.0:* LISTEN 1078/named
tcp 0 0 192.168.0.2:53 0.0.0.0:* LISTEN 1078/named
tcp 0 0 192.168.0.2:53 0.0.0.0:* LISTEN 1078/named
tcp 0 0 192.168.0.2:53 0.0.0.0:* LISTEN 1078/named
tcp 0 0 192.168.0.2:53 0.0.0.0:* LISTEN 1078/named
tcp 0 0 192.168.0.2:53 0.0.0.0:* LISTEN 1078/named
tcp 0 0 192.168.0.2:53 0.0.0.0:* LISTEN 1078/named
tcp 0 0 192.168.0.2:53 0.0.0.0:* LISTEN 1078/named
tcp 0 0 192.168.0.2:53 0.0.0.0:* LISTEN 1078/named
tcp 0 0 192.168.0.2:53 0.0.0.0:* LISTEN 1078/named
tcp 0 0 192.168.0.2:53 0.0.0.0:* LISTEN 1078/named
tcp 0 0 192.168.0.2:53 0.0.0.0:* LISTEN 1078/named
tcp 0 0 192.168.0.2:53 0.0.0.0:* LISTEN 1078/named
tcp 0 0 192.168.0.2:53 0.0.0.0:* LISTEN 1078/named
tcp 0 0 192.168.0.2:53 0.0.0.0:* LISTEN 1078/named
tcp 0 0 192.168.0.2:53 0.0.0.0:* LISTEN 1078/named
tcp 0 0 127.0.0.1:53 0.0.0.0:* LISTEN 1078/named
tcp 0 0 127.0.0.1:53 0.0.0.0:* LISTEN 1078/named
tcp 0 0 127.0.0.1:53 0.0.0.0:* LISTEN 1078/named
tcp 0 0 127.0.0.1:53 0.0.0.0:* LISTEN 1078/named
tcp 0 0 127.0.0.1:53 0.0.0.0:* LISTEN 1078/named
tcp 0 0 127.0.0.1:53 0.0.0.0:* LISTEN 1078/named
tcp 0 0 127.0.0.1:53 0.0.0.0:* LISTEN 1078/named
tcp 0 0 127.0.0.1:53 0.0.0.0:* LISTEN 1078/named
tcp 0 0 12...

Marc Deslauriers (mdeslaur) wrote :

By default bind will listen on all interfaces. I don't understand why we're not seeing anything listening on 192.168.122.1 but you are still getting the error message.

I suggest adding a listen-on directive to your /etc/bind/named.conf.options file, restarting bind, and seeing if libvirt will now successfully listen on virbr0.

Kazuhiro MIYASHITA (miyakz1192) wrote :

After starting named with the named.conf.options shown below, I manually created virbr0 with brctl and assigned it the IP address 192.168.122.1; named then started listening on 192.168.122.1 over TCP.

$ cat /etc/bind/named.conf.options
options {
  directory "/var/cache/bind";

  // If there is a firewall between you and nameservers you want
  // to talk to, you may need to fix the firewall to allow multiple
  // ports to talk. See http://www.kb.cert.org/vuls/id/800113

  // If your ISP provided one or more IP addresses for stable
  // nameservers, you probably want to use them as forwarders.
  // Uncomment the following block, and insert the addresses replacing
  // the all-0's placeholder.

  // forwarders {
  // 0.0.0.0;
  // };

  //========================================================================
  // If BIND logs error messages about the root key being expired,
  // you will need to update your keys. See https://www.isc.org/bind-keys
  //========================================================================
  dnssec-validation auto;
  listen-on port 53 { localhost; 192.168.122.0/24; };
  allow-query { localhost; 192.168.122.0/24; };
};
$

Because of this behavior, I think dnsmasq and named conflicted, resulting in an error on the dnsmasq side (the second make_sock() call, with SOCK_STREAM).

This named instance is running because I need it, but I understand that a VM host should not run many processes, so I'm planning to move the name resolution function off the VM host to another server.

I understood where the problem is. Thank you very much for your cooperation.
I initially reported it as a bug in dnsmasq, but it turned out to be a problem in my environment.

Marc Deslauriers (mdeslaur) wrote :

I am marking this bug as "invalid" per your last comment. Thanks!

Changed in dnsmasq (Ubuntu):
status: New → Invalid
Joe Blow (saturn1c) wrote :

Adding the lines:

  listen-on port 53 { localhost; 192.168.122.0/24; };
  allow-query { localhost; 192.168.122.0/24; };

to /etc/bind/named.conf.options helped

Tim Chevalier (tchevalier) wrote :

I found this bug report after experiencing the same problem. The error message I got was exactly the same as the one in comment #2. The fix suggested by Kazuhiro in comment #9 fixed the problem.

I think that since this bug affects more than one person, it's not merely a problem with one person's local environment. I didn't have occasion to use KVM between 2024-02-27 and today, but it was working for me before then and I didn't manually change anything in my environment. I don't know enough about Ubuntu networking to say whether it's a bug in dnsmasq or another component, but surely a package upgrade shouldn't break something like this.

$ cat /var/log/apt/history.log.1
(snip)
Start-Date: 2024-02-27 06:19:38
Commandline: /usr/bin/unattended-upgrade
Upgrade: dnsmasq-base:amd64 (2.89-1, 2.90-0ubuntu0.23.10.1)
End-Date: 2024-02-27 06:19:42
(snip)

$ cat /etc/issue
Ubuntu 23.10 \n \l

$ virsh --version
9.6.0
$ /usr/sbin/dnsmasq --version
Dnsmasq version 2.90 Copyright (c) 2000-2024 Simon Kelley
Compile time options: IPv6 GNU-getopt DBus no-UBus i18n IDN2 DHCP DHCPv6 no-Lua TFTP conntrack ipset nftset auth cryptohash DNSSEC loop-detect inotify dumpfile

This software comes with ABSOLUTELY NO WARRANTY.
Dnsmasq is free software, and you are welcome to redistribute it
under the terms of the GNU General Public License, version 2 or 3
$

Mitchell Dzurick (mitchdz) wrote :

Thanks Tim! With more people having a seemingly similar issue I re-opened it as New and subscribed ubuntu-server to take a look at this again.

Changed in dnsmasq (Ubuntu):
status: Invalid → New
Jeff Lane (bladernr) wrote :

Something in a recent dnsmasq update broke KVM. I hit the same problem and found a workaround that gets me rolling again with KVM guests at least. Here's a quick shell summary:

bladernr@galactica:~$ sudo virsh net-start default
error: Failed to start network default
error: internal error: Child process (VIR_BRIDGE_NAME=virbr0 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/lib/libvirt/libvirt_leaseshelper) unexpected exit status 2:
dnsmasq: failed to create listening socket for 192.168.123.1: Address already in use

bladernr@galactica:~$ sudo virsh net-list
 Name State Autostart Persistent
----------------------------------------

bladernr@galactica:~$ sudo apt-cache policy dnsmasq-base
dnsmasq-base:
  Installed: 2.90-0ubuntu0.23.10.1
  Candidate: 2.90-0ubuntu0.23.10.1
  Version table:
 *** 2.90-0ubuntu0.23.10.1 500
        500 http://archive.ubuntu.com/ubuntu mantic-updates/main amd64 Packages
        500 http://security.ubuntu.com/ubuntu mantic-security/main amd64 Packages
        100 /var/lib/dpkg/status
     2.89-1 500
        500 http://archive.ubuntu.com/ubuntu mantic/main amd64 Packages

bladernr@galactica:~$ sudo apt install dnsmasq-base=2.89-1
bladernr@galactica:~$ sudo virsh net-start default
Network default started

bladernr@galactica:~$ sudo virsh net-list
 Name State Autostart Persistent
--------------------------------------------
 default active yes yes

As you can see, reverting to the original Mantic version works around the problem: the NAT network starts and I can once again launch KVM guests on my machine.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in dnsmasq (Ubuntu):
status: New → Confirmed
tags: added: server-triage-discuss
Bryce Harrington (bryce)
Changed in dnsmasq (Ubuntu):
assignee: nobody → Sergio Durigan Junior (sergiodj)
tags: added: server-todo
removed: server-triage-discuss
Sergio Durigan Junior (sergiodj) wrote :

I was able to reproduce the bug on Focal, and since we seem to carry the same version on Jammy/Mantic (and likely Noble), it's probable that the bug also happens in those releases.

For future reference:

# apt install -y libvirt-daemon-system bind9 dnsmasq

Reboot, and try bringing up the "default" network on libvirt:

# virsh net-start default
error: Failed to start network default
error: internal error: Child process (VIR_BRIDGE_NAME=virbr0 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/lib/libvirt/libvirt_leaseshelper) unexpected exit status 2:
dnsmasq: failed to create listening socket for 192.168.122.1: Address already in use

Robie Basak (racb)
tags: added: regression-update
Sergio Durigan Junior (sergiodj) wrote :

I talked to Marc to understand whether the security team had any plans to "fix" this problem, and he raised a valid point: from his perspective (and the Security team's as well, I gather), this is not a bug because we have two services trying to listen on the same port. The "fix" here is to adjust the local configuration, as mentioned in the comments above.

I'm reverting this bug's state to Invalid, then.

Changed in dnsmasq (Ubuntu):
assignee: Sergio Durigan Junior (sergiodj) → nobody
Athos Ribeiro (athos-ribeiro) wrote :

Moving to invalid as per the last comment.

Changed in dnsmasq (Ubuntu):
status: Confirmed → Invalid
Bryce Harrington (bryce)
tags: removed: server-todo
tags: removed: regression-update
Marc Olzheim (zlo-zlo) wrote :

I'm sorry, but if this means that this no longer works in the default configuration, how is this not a regression?

Should the default configuration not allow both bind9 and libvirtd to be installed and used without issue, as was the case before the dnsmasq update?

Breaking this within an LTS release does not sound right to me.

Seth Arnold (seth-arnold) wrote :

Two different services attempting to bind on a single (IP, port) tuple is going to lead to a failure of one of the services. Prior to this update it was a silent failure, which serves only to make debugging problems more difficult.

I can empathize with the feeling that things shouldn't break in an LTS release, but it was quietly broken already.

Thanks

Marc Olzheim (zlo-zlo) wrote :

Well, the main impact for me was that after the update, the autostart of all virtual machines suddenly failed after a reboot, which to me is a serious POLA violation and not something I would expect to happen within an LTS release.

That said, you are correct that it was already broken, and that bind should not have been installed in the first place (it was installed as suggested by samba).
