neutron dnsmasq keeps pushing the default gateway

Bug #1979528 reported by kay
This bug affects 1 person
Affects   Status   Importance   Assigned to   Milestone
neutron   New      Undecided    Unassigned

Bug Description

neutron dnsmasq keeps pushing the default gateway via "0.0.0.0/0" even when an empty router option has been defined for a port.

Here are the dnsmasq opts as generated on the agent side:

$ cat /var/lib/neutron/dhcp/b7b18782-1419-4fdc-8395-619ae8a8372a/opts
tag:subnet-6bafc9a4-2181-42bf-ae35-fba7fa249256,option:classless-static-route,10.100.0.0/26,10.180.0.1,10.10.136.0/24,10.180.0.1,10.10.22.0/24,10.180.0.1,10.10.28.0/24,10.180.0.1,10.10.46.0/24,10.180.0.1,123.123.1.123/32,10.180.0.1,123.123.2.32/27,10.180.0.1,169.254.169.254/32,10.180.0.3,0.0.0.0/0,10.180.0.1
tag:subnet-6bafc9a4-2181-42bf-ae35-fba7fa249256,249,10.100.0.0/26,10.180.0.1,10.10.136.0/24,10.180.0.1,10.10.22.0/24,10.180.0.1,10.10.28.0/24,10.180.0.1,10.10.46.0/24,10.180.0.1,123.123.1.123/32,10.180.0.1,123.123.2.32/27,10.180.0.1,169.254.169.254/32,10.180.0.3,0.0.0.0/0,10.180.0.1
tag:subnet-6bafc9a4-2181-42bf-ae35-fba7fa249256,option:router,10.180.0.1
tag:port-07769b80-b493-40c9-bf78-5cd78a5f9af0,option:router
tag:port-2dc4792c-5d7e-4c10-9dcf-6b81e5dd4cd2,option:router
tag:port-438a608a-badd-439a-be51-d4e10c1aaafc,option:router
tag:port-54b4cc66-4eba-4b59-84a8-cc3582f76d5f,option:router
tag:port-5b9b05e2-a247-4ca4-9e90-5cd4d0825772,option:router
tag:port-691ea1ba-0ab0-4210-b6f3-a2281ee15f05,option:router
tag:port-8cd9ec95-b9ee-4ede-be06-2143acb28513,option:router
tag:port-8e86dd29-e274-4ddb-b6c6-8fbb80936d6b,option:router
tag:port-9d38b6d8-7f44-4ab6-a3a9-814455b52340,option:router
tag:port-ecfd965f-4961-4b65-9c79-0a26c31b836d,option:router
tag:port-f9413026-f384-4b91-9021-b1e0139e2fb9,option:router
tag:subnet-6bafc9a4-2181-42bf-ae35-fba7fa249256,option:dns-server,10.180.0.3,10.180.0.2
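
To make the conflict in the opts file easier to spot, here is a minimal Python sketch that scans such a file for default routes inside the classless-static-route options (both "option:classless-static-route" and its "249" twin) and for tags that carry an empty router option. The function names and the shortened sample text are illustrative assumptions, not part of neutron or dnsmasq; only the line layout is taken from the output above.

```python
def parse_opts_line(line):
    """Split one opts line into (tag, option name, list of values)."""
    fields = line.strip().split(",")
    tag = fields[0].split(":", 1)[1]          # drop the leading "tag:"
    option = fields[1]
    if option.startswith("option:"):          # numeric options such as 249 have no prefix
        option = option[len("option:"):]
    return tag, option, fields[2:]


def summarize(opts_text):
    """Collect default routes and empty router options from opts text."""
    default_routes = []       # (tag, router) for every 0.0.0.0/0 destination
    empty_router_tags = []    # tags whose router option has no value
    for line in opts_text.splitlines():
        if not line.strip():
            continue
        tag, option, values = parse_opts_line(line)
        if option in ("classless-static-route", "249"):
            # Values alternate: destination, router, destination, router, ...
            for dest, router in zip(values[0::2], values[1::2]):
                if dest == "0.0.0.0/0":
                    default_routes.append((tag, router))
        elif option == "router" and not values:
            empty_router_tags.append(tag)
    return default_routes, empty_router_tags


# Shortened sample in the same layout as the opts file above.
SUBNET = "subnet-6bafc9a4-2181-42bf-ae35-fba7fa249256"
PORT = "port-5b9b05e2-a247-4ca4-9e90-5cd4d0825772"
SAMPLE = "\n".join([
    f"tag:{SUBNET},option:classless-static-route,10.100.0.0/26,10.180.0.1,0.0.0.0/0,10.180.0.1",
    f"tag:{SUBNET},249,10.100.0.0/26,10.180.0.1,0.0.0.0/0,10.180.0.1",
    f"tag:{SUBNET},option:router,10.180.0.1",
    f"tag:{PORT},option:router",
])

default_routes, empty_router_tags = summarize(SAMPLE)
print(default_routes)
print(empty_router_tags)
```

In this sample the port tag clears only the router option, while the default route is attached to the subnet tag through the static-route options, which matches the behavior described in this report.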

Here is a port definition:

$ openstack port show 5b9b05e2-a247-4ca4-9e90-5cd4d0825772 -f value -c extra_dhcp_opts
[{'opt_name': 'router', 'opt_value': '', 'ip_version': 4}]

Here is a subnet definition:

$ openstack subnet show 6bafc9a4-2181-42bf-ae35-fba7fa249256
+----------------------+------------------------------------------------------+
| Field | Value |
+----------------------+------------------------------------------------------+
| allocation_pools | 10.180.0.2-10.180.0.254 |
| cidr | 10.180.0.0/24 |
| created_at | 2022-02-18T17:12:29Z |
| description | |
| dns_nameservers | |
| dns_publish_fixed_ip | None |
| enable_dhcp | True |
| gateway_ip | 10.180.0.1 |
| host_routes | destination='10.100.0.0/26', gateway='10.180.0.1' |
| | destination='10.10.136.0/24', gateway='10.180.0.1' |
| | destination='10.10.22.0/24', gateway='10.180.0.1' |
| | destination='10.10.28.0/24', gateway='10.180.0.1' |
| | destination='10.10.46.0/24', gateway='10.180.0.1' |
| | destination='123.123.1.123/32', gateway='10.180.0.1' |
| | destination='123.123.2.32/27', gateway='10.180.0.1' |
| id | 6bafc9a4-2181-42bf-ae35-fba7fa249256 |
| ip_version | 4 |
| ipv6_address_mode | None |
| ipv6_ra_mode | None |
| name | subnet-data |
| network_id | b7b18782-1419-4fdc-8395-619ae8a8372a |
| project_id | ba3fddd34b014b208e55a8047574fb60 |
| revision_number | 36 |
| segment_id | None |
| service_types | |
| subnetpool_id | None |
| tags | skip-monitoring |
| updated_at | 2022-06-20T14:30:39Z |
+----------------------+------------------------------------------------------+

The subnet clearly has a gateway_ip, but I don't want it pushed to the VM, which has multiple network interfaces. I also configured a custom dnsmasq option, an empty "option:router", but dnsmasq still emits "0.0.0.0/0,10.180.0.1", which the OS (Ubuntu 20.04) interprets as a default gateway:

$ cat /run/systemd/netif/leases/2 | grep '0\.0\.0\.0'
ROUTES=10.100.0.0/26,10.180.0.1 10.10.136.0/24,10.180.0.1 10.10.22.0/24,10.180.0.1 10.10.28.0/24,10.180.0.1 10.10.46.0/24,10.180.0.1 123.123.1.123/32,10.180.0.1 123.123.2.32/27,10.180.0.1 169.254.169.254/32,10.180.0.2 0.0.0.0/0,10.180.0.1
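
For checking this on the guest without grep, a small Python sketch can split the ROUTES line of the lease file into (destination, gateway) pairs. It assumes only the layout visible above: space-separated entries, each of the form "destination,gateway". The helper name and the shortened sample are hypothetical.

```python
def parse_lease_routes(lease_text):
    """Return (destination, gateway) pairs from the ROUTES= line, if any."""
    for line in lease_text.splitlines():
        if line.startswith("ROUTES="):
            entries = line[len("ROUTES="):].split()
            return [tuple(entry.split(",", 1)) for entry in entries]
    return []


# Shortened sample in the same layout as the lease output above.
SAMPLE_LEASE = (
    "ROUTES=10.100.0.0/26,10.180.0.1 "
    "169.254.169.254/32,10.180.0.2 "
    "0.0.0.0/0,10.180.0.1\n"
)

routes = parse_lease_routes(SAMPLE_LEASE)
default_gateways = [gw for dest, gw in routes if dest == "0.0.0.0/0"]
print(default_gateways)
```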

There should be a way to get rid of 0.0.0.0/0 destination.

Bence Romsics (bence-romsics) wrote :

It seems to me I can get neutron to not push a default route.

The subnet gateway can actually be unset. The API allows it:

https://docs.openstack.org/api-ref/network/v2/?expanded=create-subnet-detail#create-subnet
"To specify a subnet without a gateway, set the gateway_ip attribute to null in the request body."

With neutron client you could (and still can though neutron client is deprecated) leave the subnet gateway unset:

neutron subnet-create NET CIDR --no-gateway

I believe it is a bug of openstack client that leaving the gateway unset is not possible. I will open another bug report for that.

But if I do that and set the port's dhcp option 'router' to empty then I believe no default route is pushed:

openstack network create net0
neutron subnet-create net0 10.0.4.0/24 --name subnet0 --no-gateway
openstack port create port0 --network net0
openstack port set port0 --extra-dhcp-option name=router,value=,ip-version=4
openstack server create --flavor ds1G --image u1804 --nic port-id=port0 --wait vm0

Then I log in to this vm and inside:

# dhclient ens2

and in parallel I see this dhcp reply:

# tcpdump -n -vvv -i ens2 port 67 or port 68
...
    10.0.4.1.67 > 10.0.4.55.68: [udp sum ok] BOOTP/DHCP, Reply, length 333, xid 0xbc39bb14, Flags [none] (0x0000)
          Your-IP 10.0.4.55
          Server-IP 10.0.4.1
          Client-Ethernet-Address fa:16:3e:ac:bd:06
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: ACK
            Server-ID Option 54, length 4: 10.0.4.1
            Lease-Time Option 51, length 4: 86400
            RN Option 58, length 4: 43200
            RB Option 59, length 4: 75600
            Subnet-Mask Option 1, length 4: 255.255.255.0
            BR Option 28, length 4: 10.0.4.255
            Domain-Name-Server Option 6, length 4: 10.0.4.1
            Domain-Name Option 15, length 14: "openstacklocal"
            Hostname Option 12, length 14: "host-10-0-4-55"
            Classless-Static-Route Option 121, length 9: (169.254.169.254/32:10.0.4.1)
            MTU Option 26, length 2: 1450
            END Option 255, length 0

I believe here we are only pushing a route for metadata but no default route.
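
The "length 9" in the capture can be cross-checked against the RFC 3442 wire format, in which each route is one prefix-length octet, then only the significant octets of the destination, then four octets for the router. The following Python sketch of that encoding is an illustration using the standard ipaddress module, not code from neutron or dnsmasq; the function names are made up for this example.

```python
import ipaddress


def encode_route(destination, router):
    """Encode one route in the RFC 3442 classless static route format."""
    net = ipaddress.ip_network(destination)
    significant = (net.prefixlen + 7) // 8    # destination octets actually sent
    return (
        bytes([net.prefixlen])
        + net.network_address.packed[:significant]
        + ipaddress.ip_address(router).packed
    )


def encode_option_121(routes):
    """Concatenate the encoded routes into the option payload."""
    return b"".join(encode_route(dest, router) for dest, router in routes)


# The metadata route seen in the capture above.
metadata_only = encode_option_121([("169.254.169.254/32", "10.0.4.1")])

# What the payload would look like if a default route were appended as well.
with_default = encode_option_121([
    ("169.254.169.254/32", "10.0.4.1"),
    ("0.0.0.0/0", "10.0.4.1"),
])

print(len(metadata_only), metadata_only.hex())
print(len(with_default), with_default.hex())
```

A single /32 route takes 1 + 4 + 4 = 9 octets, consistent with the capture. A default route would add 5 more octets (prefix length 0, no destination octets, 4 router octets), so a 9-octet payload has no room for one.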

I have not yet tested which part is necessary: unsetting the subnet gateway and/or using the empty router DHCP option. Please give it a try. I hope it works for you as well.

I will link to the openstack client bug from here once I have opened it. Until then, neutron client should work.

Bence Romsics (bence-romsics) wrote :

Looking into osc code I realized it is actually possible to leave a subnet's gateway unset with the following arcane syntax:

$ openstack subnet create --network net0 --subnet-range CIDR --gateway none NAME

And this is actually documented:

$ openstack subnet create -h
...
  --gateway <gateway>
      Specify a gateway for the subnet. The three options are: <ip-address>: Specific IP address to use as the gateway, 'auto': Gateway address should automatically be chosen from within the subnet itself, 'none': This subnet will not use a gateway, e.g.: --gateway 192.168.9.1, --gateway auto, --gateway none (default is 'auto').
...

Changed in neutron:
status: New → Invalid
kay (kay-diam) wrote :

Hi Bence, thanks for the update. The problem with "--gateway none" is that it also removes the actual gateway from the agent's subnet, which makes all DNS requests from VMs to dnsmasq fail. This trick works only temporarily, until the agent is "refreshed" (new config), a new port is added, etc.

I found the source for the "0.0.0.0/0" route: https://bugs.launchpad.net/neutron/+bug/1317935
I'm not a network expert, but here is a quote from https://datatracker.ietf.org/doc/html/rfc3442

Local Subnet Routes

   In some cases more than one IP subnet may be configured on a link.
   In such cases, a host whose IP address is in one IP subnet in the
   link could communicate directly with a host whose IP address is in a
   different IP subnet on the same link. In cases where a client is
   being assigned an IP address on an IP subnet on such a link, for each
   IP subnet in the link other than the IP subnet on which the client
   has been assigned the DHCP server MAY be configured to specify a
   router IP address of 0.0.0.0.

   For example, consider the case where there are three IP subnets
   configured on a link: 10.0.0/24, 192.168.0/24, 10.0.21/24. If the
   client is assigned an IP address of 10.0.21.17, then the server could
   include a route with a destination of 10.0.0/24 and a router address
   of 0.0.0.0, and also a route with a destination of 192.168.0/24 and a
   router address of 0.0.0.0.

   A DHCP client whose underlying TCP/IP stack does not provide this
   capability MUST ignore routes in the Classless Static Routes option
   whose router IP address is 0.0.0.0. Please note that the behavior
   described here only applies to the Classless Static Routes option,
   not to the Static Routes option nor the Router option.

For me 0.0.0.0/0 is not the same as 0.0.0.0 (or 0.0.0.0/32). Please correct me if I'm wrong.

kay (kay-diam)
Changed in neutron:
status: Invalid → New
Bence Romsics (bence-romsics) wrote :

Hi Kay,

Now I better understand the problem you have. But it still seems possible to me to get what you need without changing neutron behavior.

We can set the dns servers also from dhcp options. Consider this:

openstack network create net0
openstack subnet create --network net0 --subnet-range 10.0.4.0/24 --gateway none subnet0
openstack port create port0 --network net0
openstack port set port0 --extra-dhcp-option name=domain-name-servers,value=...,ip-version=4

We can find the IPs of the dnsmasq instances like this (of course, in the general case you may have more than one DHCP port and more than one IP per port):

openstack port list --network net0 --device-owner network:dhcp -f value -c id | head -1 | xargs -r -n1 openstack port show -f json -c fixed_ips | jq -r '.fixed_ips[0].ip_address'
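
The one-liner above takes only the first IP of the first DHCP port. For the general case with several DHCP ports and several IPs per port, a Python sketch like the following could collect all of them from the JSON printed by "openstack port show -f json -c fixed_ips". The JSON shape is assumed from the jq expression above; the helper name, the subnet_id placeholder, and the second address are hypothetical.

```python
import json


def dhcp_port_ips(port_show_outputs):
    """Collect every fixed IP from a list of 'port show' JSON documents."""
    ips = []
    for raw in port_show_outputs:
        port = json.loads(raw)
        for fixed_ip in port.get("fixed_ips", []):
            ips.append(fixed_ip["ip_address"])
    return ips


# Hypothetical outputs for two DHCP ports; only the key names follow
# the jq expression used above, the values are made up.
SAMPLE_OUTPUTS = [
    '{"fixed_ips": [{"subnet_id": "example-subnet", "ip_address": "10.0.4.1"}]}',
    '{"fixed_ips": [{"subnet_id": "example-subnet", "ip_address": "10.0.4.2"}]}',
]

ips = dhcp_port_ips(SAMPLE_OUTPUTS)
print(ips)
```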

Could this work for you? I believe what we see in the API with this setup is a better representation for the behavior you want, because nobody needs to know (and interpret) which piece of information overrides another piece.

> This trick works temporarily until the agent is "refreshed" (new config) or a new port is added, etc.

What exactly do you mean by this? Have you found a case where dnsmasq configuration is properly generated first but is later lost somehow without the API content changing? If yes, please tell me how to reproduce that. That sounds like a bug.

> ... but here is a quote from https://datatracker.ietf.org/doc/html/rfc3442

To be honest, I don't see how this quote is relevant here. I read that quote as enabling dhcp to tell a client that multiple subnets are available on the same network. But '0.0.0.0' here is not a default route. The special value '0.0.0.0' is used in the gateway's place.
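
The distinction between the two uses of zeros can be illustrated with Python's standard ipaddress module. This is only a sketch of the terminology: a default route is identified by its destination (0.0.0.0/0), whereas the RFC 3442 passage is about the router field being 0.0.0.0. The helper names are made up for this example.

```python
import ipaddress


def is_default_route(destination):
    """True when the destination prefix covers all of IPv4 (0.0.0.0/0)."""
    return ipaddress.ip_network(destination).prefixlen == 0


def is_on_link_route(router):
    """True when the router field is the unspecified address 0.0.0.0."""
    return ipaddress.ip_address(router).is_unspecified


routes = [
    ("0.0.0.0/0", "10.180.0.1"),   # default route via a real gateway
    ("10.0.0.0/24", "0.0.0.0"),    # RFC 3442 style local subnet route
]

for destination, router in routes:
    print(destination, router,
          "default" if is_default_route(destination) else "specific",
          "on-link" if is_on_link_route(router) else "via router")

# The two prefixes written with zeros also differ in size.
print(ipaddress.ip_network("0.0.0.0/0").num_addresses)
print(ipaddress.ip_network("0.0.0.0/32").num_addresses)
```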

kay (kay-diam) wrote :

Hi Bence,

Apologies for the late response. I had a lot to do...

> We can set the dns servers also from dhcp options

This is true, but then you lose dnsmasq's caching. If I set upstream DNS servers, they may be bombarded by DNS clients in the private networks.

> What exactly do you mean by this?

It's hard to explain, because I haven't dug into this deeper. My actual "trick" is this one:

openstack subnet set subnet-data --gateway 10.180.0.1 # sets the default GW on the agent and DHCP option
openstack subnet set subnet-data --gateway none # unsets it, but it is still kept on the agent for some time

The default GW on the agent disappears after some time and DNS requests become unusable again.

> To be honest, I don't see how this quote is relevant here.

The initial change (https://bugs.launchpad.net/neutron/+bug/1317935) introduced the "0.0.0.0/0,10.180.0.1" route option, which basically means the default gateway. RFC 3442 does not say that the "0.0.0.0/0" CIDR must be present; it mentions only the "0.0.0.0" address.

Long story short: DHCP agents have no default gateway when "--gateway none" is set, and they become useless.

kay (kay-diam) wrote :

Hi Bence,

Do you have any updates on this issue?
