dnsmasq >= 2.81 not responding to DHCP requests with current q-dhcp configs

Bug #1896945 reported by Lee Yarwood
Affects: neutron
Status: Fix Released
Importance: High
Assigned to: Dan Radez
Milestone: (none listed)

Bug Description

* High level description:

I've been attempting to enable Fedora 32 support in devstack and encountered an issue where dnsmasq, as configured by q-dhcp, isn't responding to DHCP requests from clients:

https://review.opendev.org/#/c/750292/

Looking at tcpdump and strace, it appears that dnsmasq can see the requests but doesn't reply, suggesting a configuration issue caused either by q-dhcp *or* by a regression in dnsmasq itself:

$ openstack server reboot --hard test && sudo ip netns exec qdhcp-df64061e-0784-4bbe-909b-ae1c5f466981 tcpdump -i tapee679459-e1 -n port 67 or port 68
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tapee679459-e1, link-type EN10MB (Ethernet), capture size 262144 bytes
18:40:24.070796 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:16:07:a0, length 300
18:41:24.118961 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:16:07:a0, length 300
18:42:24.192716 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:16:07:a0, length 300

$ openstack server reboot --hard test && sudo ip netns exec qdhcp-df64061e-0784-4bbe-909b-ae1c5f466981 strace -p 196856
strace: Process 196856 attached
restart_syscall(<... resuming interrupted read ...>) = 1
recvmsg(4, {msg_name={sa_family=AF_INET, sin_port=htons(68), sin_addr=inet_addr("0.0.0.0")}, msg_namelen=16, msg_iov=[{iov_base="\1\1\6\0\0041\326S\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\372\26>\26"..., iov_len=548}], msg_iovlen=1, msg_control=[{cmsg_len=28, cmsg_level=SOL_IP, cmsg_type=IP_PKTINFO, cmsg_data={ipi_ifindex=if_nametoindex("tapee679459-e1"), ipi_spec_dst=inet_addr("10.0.0.2"), ipi_addr=inet_addr("255.255.255.255")}}], msg_controllen=32, msg_flags=0}, MSG_PEEK|MSG_TRUNC) = 300
recvmsg(4, {msg_name={sa_family=AF_INET, sin_port=htons(68), sin_addr=inet_addr("0.0.0.0")}, msg_namelen=16, msg_iov=[{iov_base="\1\1\6\0\0041\326S\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\372\26>\26"..., iov_len=548}], msg_iovlen=1, msg_control=[{cmsg_len=28, cmsg_level=SOL_IP, cmsg_type=IP_PKTINFO, cmsg_data={ipi_ifindex=if_nametoindex("tapee679459-e1"), ipi_spec_dst=inet_addr("10.0.0.2"), ipi_addr=inet_addr("255.255.255.255")}}], msg_controllen=32, msg_flags=0}, 0) = 300
ioctl(4, SIOCGIFNAME, {ifr_index=9, ifr_name="tapee679459-e1"}) = 0
ioctl(4, SIOCGIFFLAGS, {ifr_name="tapee679459-e1", ifr_flags=IFF_UP|IFF_BROADCAST|IFF_RUNNING|IFF_MULTICAST}) = 0
ioctl(4, SIOCGIFADDR, {ifr_name="tapee679459-e1", ifr_addr={sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("10.0.0.2")}}) = 0
poll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=9, events=POLLIN}, {fd=10, events=POLLIN}, {fd=11, events=POLLIN}, {fd=14, events=POLLIN}], 8, -1

The current configs are listed below:

http://paste.openstack.org/show/798334/

I was able to work around this by downgrading dnsmasq on F32 to 2.80:

$ sudo dnf downgrade dnsmasq -y
[..]
$ rpm -qa | grep dnsmasq
dnsmasq-2.80-14.fc32.x86_64
$ sudo killall dnsmasq && sudo systemctl restart devstack@q-*
$ openstack server reboot --hard test && sudo ip netns exec qdhcp-df64061e-0784-4bbe-909b-ae1c5f466981 tcpdump -i tapee679459-e1 -n port 67 or port 68
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tapee679459-e1, link-type EN10MB (Ethernet), capture size 262144 bytes
12:06:57.028953 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:16:07:a0, length 300
12:06:57.029994 IP 10.0.0.2.bootps > 10.0.0.49.bootpc: BOOTP/DHCP, Reply, length 328
12:06:57.042300 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:16:07:a0, length 300
12:06:57.047014 IP 10.0.0.2.bootps > 10.0.0.49.bootpc: BOOTP/DHCP, Reply, length 344
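
The difference between the two captures above is whether any "Reply" line follows the "Request" lines. As a rough illustration only, the following Python sketch counts both kinds of line in saved tcpdump text output. The function names are hypothetical, and the regular expression assumes the exact "BOOTP/DHCP, Request" / "BOOTP/DHCP, Reply" wording shown in this report.

```python
import re

# Assumption: tcpdump prints DHCP traffic as "BOOTP/DHCP, Request ..." or
# "BOOTP/DHCP, Reply ...", as in the captures quoted in this report.
DHCP_LINE = re.compile(r"BOOTP/DHCP, (Request|Reply)\b")


def summarize_capture(lines):
    """Count DHCP request and reply lines in tcpdump text output."""
    counts = {"Request": 0, "Reply": 0}
    for line in lines:
        match = DHCP_LINE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def server_is_silent(lines):
    """Return True when requests were captured but no reply was seen."""
    counts = summarize_capture(lines)
    return counts["Request"] > 0 and counts["Reply"] == 0


# Lines copied from the capture taken with dnsmasq >= 2.81.
failing = [
    "18:40:24.070796 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:16:07:a0, length 300",
    "18:41:24.118961 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:16:07:a0, length 300",
]

# Lines copied from the capture taken after the downgrade to 2.80.
working = [
    "12:06:57.028953 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:16:07:a0, length 300",
    "12:06:57.029994 IP 10.0.0.2.bootps > 10.0.0.49.bootpc: BOOTP/DHCP, Reply, length 328",
]

print(server_is_silent(failing), server_is_silent(working))
```

This only classifies what was seen on the wire; it says nothing about why dnsmasq stayed silent.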

* Pre-conditions:

F32 with dnsmasq >= 2.81 installed.
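
One way to check this pre-condition on a host is to parse the first line of `dnsmasq --version`. The sketch below is a minimal illustration in Python. The function names are hypothetical, and it assumes the "Dnsmasq version X.Y ..." banner format quoted in a later comment of this report; a build whose banner differs would need a different pattern.

```python
import re


def parse_dnsmasq_version(version_output):
    """Extract (major, minor) from the banner printed by `dnsmasq --version`."""
    # Assumption: the banner starts with "Dnsmasq version <major>.<minor>".
    match = re.search(r"Dnsmasq version (\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("unrecognized dnsmasq --version output")
    return int(match.group(1)), int(match.group(2))


def in_reported_range(version_output, first_affected=(2, 81)):
    """Return True when the parsed version is at or above the reported one."""
    return parse_dnsmasq_version(version_output) >= first_affected


banner = "Dnsmasq version 2.82 Copyright (c) 2000-2020 Simon Kelley"
print(parse_dnsmasq_version(banner), in_reported_range(banner))
```

A later comment in this report describes 2.82 working on Ubuntu 20.10, so the version number alone may not decide whether a host is affected.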

* Step-by-step reproduction steps:

Deploy F32 with dnsmasq >= 2.81 installed, then attempt to spawn an instance attached to a subnet with DHCP enabled.

* Expected output:

dnsmasq responds to DHCP request from instance.

* Actual output:

dnsmasq doesn't respond to DHCP request from instance.

* Version:
  ** OpenStack version (Specific stable branch, or git hash if from trunk);

  Neutron @ 0fdcc4b1b63dc90fbc9f46f5947f84626f8e5b41

  ** Linux distro, kernel. For a distro, it’s also worth knowing specific versions of client and server;

  Fedora 32 with kernel 5.8.10-200.fc32.x86_64

  ** DevStack or other _deployment_ mechanism?

  Devstack @ https://review.opendev.org/#/c/750292/

* Environment: what types of services are you running (core services like DB and AMQP broker, as well as Nova/hypervisor if it matters), and which type of deployment (clustered servers)? Multi-node or single node, etc.

  Single node devstack env.

* Perceived severity: is this a blocker for you?

  High, assuming other distros will be impacted by this once they move to dnsmasq >= 2.81

tags: added: l3-ipam-dhcp
removed: ovs
Lee Yarwood (lyarwood) wrote :

I forgot to add that I also tried 2.82 from Fedora 33 with the same results as 2.81.

Slawek Kaplonski (slaweq) wrote :

At first glance this looks similar to https://bugs.launchpad.net/neutron/+bug/1876094, but we will have to look into it more closely, as that bug should already be fixed.

Changed in neutron:
importance: Undecided → High
Changed in neutron:
status: New → Confirmed
Changed in neutron:
status: Confirmed → Incomplete
Lajos Katona (lajos-katona) wrote :

I checked with groovy (Ubuntu 20.10), and it works there with the latest master.

$ uname -a
Linux dnsmasqtest 5.8.0-18-generic #19-Ubuntu SMP Wed Aug 26 15:26:32 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu Groovy Gorilla (development branch)
Release: 20.10
Codename: groovy

$ dnsmasq --version
Dnsmasq version 2.82 Copyright (c) 2000-2020 Simon Kelley
Compile time options: IPv6 GNU-getopt DBus no-UBus i18n IDN2 DHCP DHCPv6 no-Lua TFTP conntrack ipset auth DNSSEC loop-detect inotify dumpfile

This software comes with ABSOLUTELY NO WARRANTY.
Dnsmasq is free software, and you are welcome to redistribute it
under the terms of the GNU General Public License, version 2 or 3.

$ sudo dpkg -l |grep dnsmasq
ii dnsmasq-base 2.82-1ubuntu1 amd64 Small caching DNS proxy and DHCP/TFTP server
ii dnsmasq-utils 2.82-1ubuntu1 amd64 Utilities for manipulating DHCP leases

I started cirros-0.5.1 on the default private network, which has both IPv4 and IPv6 addresses.

Lajos Katona (lajos-katona) wrote :

Tomorrow I will try changing the kernel to 5.8.10; perhaps that will help.

Slawek Kaplonski (slaweq) wrote :

I can confirm this on Fedora 32. I think it is related to https://bugzilla.redhat.com/show_bug.cgi?id=1834454 and the fix for it, which was included in dnsmasq-2.81-3 in F32.
I will investigate why this happens.

Changed in neutron:
status: Incomplete → Confirmed
assignee: nobody → Slawek Kaplonski (slaweq)
Dan Radez (dradez)
Changed in neutron:
assignee: Slawek Kaplonski (slaweq) → Dan Radez (dradez)
status: Confirmed → In Progress
Lajos Katona (lajos-katona) wrote :
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/755356
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=704576e54e041340ed9c2964b110815e074239a6
Submitter: Zuul
Branch: master

commit 704576e54e041340ed9c2964b110815e074239a6
Author: Dan Radez <email address hidden>
Date: Wed Sep 30 11:00:07 2020 -0400

    Default dnsmasq --conf-file to /dev/null

    Passing --conf-file= with no value has no effect on the dnsmasq
    process. Intended effect here is for the default system dnsmasq.conf
    file not to be read and included in configuring the process. For
    that to happen some value has to be passed to --conf-file. Passing
    /dev/null will invoke the desired outcome to skip the system
    default conf file.

    Closes-Bug: #1896945
    Change-Id: I22570a44f84d14a792633747c04d7426ab231009
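
The idea in the commit message above can be sketched in a few lines of Python: when no dnsmasq config file is configured, fall back to /dev/null so that `--conf-file` always carries a value. This is an illustration under stated assumptions, not the code from the neutron change. The function names and the choice of extra options are hypothetical; `--no-hosts`, `--conf-file` and `--interface` are existing dnsmasq options.

```python
def conf_file_option(configured_path):
    """Return the --conf-file argument for a dnsmasq command line.

    An empty or whitespace-only path falls back to /dev/null so the
    option is never passed without a value.
    """
    path = (configured_path or "").strip()
    return "--conf-file=%s" % (path or "/dev/null")


def build_dnsmasq_args(interface, configured_path=""):
    """Assemble a minimal, illustrative dnsmasq argument list."""
    return [
        "dnsmasq",
        "--no-hosts",
        conf_file_option(configured_path),
        "--interface=%s" % interface,
    ]


# The interface name is taken from the captures in this report.
print(build_dnsmasq_args("tapee679459-e1"))
```

An operator-supplied path is passed through unchanged, so only the unset case changes behaviour.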

Changed in neutron:
status: In Progress → Fix Released
tags: added: neutron-proactive-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/756693

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/756921

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/756922

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/train)

Reviewed: https://review.opendev.org/756922
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=8654eb2d183cf13699e50b8b967a6c52412602ac
Submitter: Zuul
Branch: stable/train

commit 8654eb2d183cf13699e50b8b967a6c52412602ac
Author: Dan Radez <email address hidden>
Date: Wed Sep 30 11:00:07 2020 -0400

    Default dnsmasq --conf-file to /dev/null

    Passing --conf-file= with no value has no effect on the dnsmasq
    process. Intended effect here is for the default system dnsmasq.conf
    file not to be read and included in configuring the process. For
    that to happen some value has to be passed to --conf-file. Passing
    /dev/null will invoke the desired outcome to skip the system
    default conf file.

    Closes-Bug: #1896945
    Change-Id: I22570a44f84d14a792633747c04d7426ab231009
    (cherry picked from commit 704576e54e041340ed9c2964b110815e074239a6)

tags: added: in-stable-train
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/victoria)

Reviewed: https://review.opendev.org/756693
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=1da19b3c9eb6a670f6a6845cdaa8564c21ae7399
Submitter: Zuul
Branch: stable/victoria

commit 1da19b3c9eb6a670f6a6845cdaa8564c21ae7399
Author: Dan Radez <email address hidden>
Date: Wed Sep 30 11:00:07 2020 -0400

    Default dnsmasq --conf-file to /dev/null

    Passing --conf-file= with no value has no effect on the dnsmasq
    process. Intended effect here is for the default system dnsmasq.conf
    file not to be read and included in configuring the process. For
    that to happen some value has to be passed to --conf-file. Passing
    /dev/null will invoke the desired outcome to skip the system
    default conf file.

    Closes-Bug: #1896945
    Change-Id: I22570a44f84d14a792633747c04d7426ab231009
    (cherry picked from commit 704576e54e041340ed9c2964b110815e074239a6)

tags: added: in-stable-victoria
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/ussuri)

Reviewed: https://review.opendev.org/756921
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=3447613efae7e291e43404ee7e8a9770faef4118
Submitter: Zuul
Branch: stable/ussuri

commit 3447613efae7e291e43404ee7e8a9770faef4118
Author: Dan Radez <email address hidden>
Date: Wed Sep 30 11:00:07 2020 -0400

    Default dnsmasq --conf-file to /dev/null

    Passing --conf-file= with no value has no effect on the dnsmasq
    process. Intended effect here is for the default system dnsmasq.conf
    file not to be read and included in configuring the process. For
    that to happen some value has to be passed to --conf-file. Passing
    /dev/null will invoke the desired outcome to skip the system
    default conf file.

    Closes-Bug: #1896945
    Change-Id: I22570a44f84d14a792633747c04d7426ab231009
    (cherry picked from commit 704576e54e041340ed9c2964b110815e074239a6)

tags: added: in-stable-ussuri
tags: removed: neutron-proactive-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 15.3.1

This issue was fixed in the openstack/neutron 15.3.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 16.3.0

This issue was fixed in the openstack/neutron 16.3.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 17.1.0

This issue was fixed in the openstack/neutron 17.1.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 18.0.0.0rc1

This issue was fixed in the openstack/neutron 18.0.0.0rc1 release candidate.
