Toggling DHCP on and off in a subnet causes new instances to be unreachable

Bug #1918914 reported by Arnoud de Jonge
This bug affects 2 people

Affects: neutron
Status: Invalid
Importance: Medium
Assigned to: Unassigned

Bug Description

After DHCP was turned on and off again on our network, new instances were not reachable. We found that they were still trying to get their network configuration via DHCP after that.

We run OpenStack Ussuri installed with OpenStack Kolla, with OVN networking enabled. force_config_drive is also set to true.
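
For reference, the relevant Nova option as set in our deployment (force_config_drive is a [DEFAULT] option in nova.conf):

  [DEFAULT]
  force_config_drive = true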

Steps to reproduce:

  openstack network create test
  openstack subnet create --no-dhcp --subnet-range 192.168.0.0/24 --network test test
  openstack router create test
  openstack router set test --external-gateway public
  openstack router add subnet test test

  openstack server create --network test --image e83d66e7-776a-4b59-a583-97dfcc5799f6 --flavor s3.small --key-name noudssh test-1

Network metadata:

{
   "links" : [
      {
         "ethernet_mac_address" : "fa:16:3e:b1:f6:ee",
         "id" : "tap7608d5b5-bd",
         "mtu" : 8942,
         "type" : "ovs",
         "vif_id" : "7608d5b5-bdc5-4215-a39c-acd8fa1318c2"
      }
   ],
   "networks" : [
      {
         "id" : "network0",
         "ip_address" : "192.168.0.237",
         "link" : "tap7608d5b5-bd",
         "netmask" : "255.255.255.0",
         "network_id" : "66a6378c-3e2d-4814-9412-4a784a81e516",
         "routes" : [
            {
               "gateway" : "192.168.0.1",
               "netmask" : "0.0.0.0",
               "network" : "0.0.0.0"
            }
         ],
         "services" : [],
         "type" : "ipv4"
      }
   ],
   "services" : []
}

Toggle DHCP and create new server:

  openstack subnet set --dhcp test
  openstack subnet set --no-dhcp test
  openstack server create --network test --image e83d66e7-776a-4b59-a583-97dfcc5799f6 --flavor s3.small --key-name noudssh test-2
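
Before booting test-2, the toggle can be confirmed with a hedged check (enable_dhcp is the subnet field to look at; it should print False here):

  openstack subnet show test -c enable_dhcp -f value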

Network metadata:

{
   "links" : [
      {
         "type" : "ovs",
         "id" : "tapee8f020a-1f",
         "vif_id" : "ee8f020a-1f2e-4db3-aab5-f6387fb45ba6",
         "ethernet_mac_address" : "fa:16:3e:94:05:35",
         "mtu" : 8942
      }
   ],
   "services" : [],
   "networks" : [
      {
         "network_id" : "66a6378c-3e2d-4814-9412-4a784a81e516",
         "link" : "tapee8f020a-1f",
         "type" : "ipv4_dhcp",
         "id" : "network0"
      }
   ]
}

As DHCP is now off, this instance stays unreachable.

I tried the same in a cluster with OVN disabled and it worked without any problem, so this seems to be OVN-related.
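
One way to see what an instance actually received is to read network_data.json straight off the config drive inside the guest (a hedged sketch; config drives carry the config-2 filesystem label and the file lives under openstack/latest/):

  sudo mount -o ro /dev/disk/by-label/config-2 /mnt
  python3 -m json.tool /mnt/openstack/latest/network_data.json
  sudo umount /mnt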

Tags: ovn
description: updated
Revision history for this message
Brian Haley (brian-haley) wrote :

Since this report is against Ussuri or later, I'm moving it to the neutron component, since that's where the OVN driver code now lives.

affects: networking-ovn → neutron
tags: added: ovn
Revision history for this message
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello Arnoud:

I tried the steps you provided. When I create the first VM (test-1), there is no DHCP reply to the client. Once I enable DHCP in the subnet, the VM receives an IP address.

Then I disable the DHCP option in the subnet again and create the second VM. Of course, the second VM (test-2) does not receive an IP address, but the first one does not lose the one it was given and still has connectivity.

Am I forgetting something when trying to reproduce this issue?

BTW, the metadata namespace is correctly created and the metadata agent is serving data from this namespace (at least to the first VM, which receives an IP once DHCP is enabled again).
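
For anyone reproducing this, those checks boil down to something like the following (a hedged sketch; ovnmeta-<network_id> is the OVN metadata agent's namespace naming convention, and 169.254.169.254 is the standard metadata address):

  # On the compute node hosting the VM
  ip netns list | grep ovnmeta
  # From inside the guest
  curl http://169.254.169.254/openstack/latest/network_data.json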

Regards.

Revision history for this message
Arnoud de Jonge (arnoud-dejonge-4) wrote :

We're using the config drive, so test-1 and test-2 should get their IP through the config drive. In our case test-1 gets an IP as expected. Then I toggle DHCP on and off and launch test-2. I would expect test-2 to get an IP the same way as test-1, but it doesn't, and when I check the config drive for this one (see the JSON in my original post) it is now set to DHCP, which of course will not work.

When I delete the DHCP port from the subnet and launch a new server, it does get a static IP again, the same way I saw for test-1.
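
A hedged sketch of that workaround with the openstack CLI (network name "test" as in the reproduction steps; list first, then delete):

  # The DHCP port is the one owned by network:dhcp
  openstack port list --network test --device-owner network:dhcp -f value -c ID
  # Delete it, substituting the ID printed above
  openstack port delete <port-id>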

Hongbin Lu (hongbin.lu)
Changed in neutron:
status: New → Confirmed
importance: Undecided → Medium
Revision history for this message
Max Khon (fjoe) wrote (last edit):

When you turn DHCP off, the instance is expected to get its IP address via the metadata service.

What are the Port_Binding properties of the localport for your network in this scenario?

In my case I see that the localport has an empty external_ids:neutron:cidrs property, which is why neutron-ovn-metadata-agent ignores it:

---
root@eq-os1:~# ovn-sbctl find Port_Binding type=localport
_uuid : b6329cbe-e80f-48a3-921d-e1031afd85d8
chassis : []
datapath : 097732e0-85d1-4744-a9c6-bafa0d861700
encap : []
external_ids : {"neutron:cidrs"="", "neutron:device_id"=ovnmeta-81954d74-51e6-4598-b6b6-3da3832f20df, "neutron:device_owner"="network:dhcp", "neutron:network_name"=neutron-81954d74-51e6-4598-b6b6-3da3832f20df, "neutron:port_name"="", "neutron:project_id"=f11221fbfbb844209cd49c7ca3a12a00, "neutron:revision_number"="1", "neutron:security_group_ids"=""}
gateway_chassis : []
ha_chassis_group : []
logical_port : "a557f47a-dae7-4150-96c2-71abbf48b84b"
mac : ["fa:16:3e:06:ed:9b"]
nat_addresses : []
options : {requested-chassis=""}
parent_port : []
tag : []
tunnel_key : 2
type : localport
virtual_parent : []
root@eq-os1:~#
---
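
A hedged one-liner to pull just that property out of every localport (--bare and --columns are standard ovn-sbctl output options; the grep pattern is illustrative):

  ovn-sbctl --bare --columns=external_ids find Port_Binding type=localport \
      | tr ',' '\n' | grep 'neutron:cidrs'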

Corresponding code in neutron-ovn-metadata-agent is:

--- neutron/agent/ovn/metadata/agent.py ---
        # If there's no metadata port or it doesn't have a MAC or IP
        # addresses, then tear the namespace down if needed. This might happen
        # when there are no subnets yet created so metadata port doesn't have
        # an IP address.
        if not (port and port.mac and
                port.external_ids.get(ovn_const.OVN_CIDRS_EXT_ID_KEY, None)):
            LOG.debug("There is no metadata port for network %s or it has no "
                      "MAC or IP addresses configured, tearing the namespace "
                      "down if needed", net_name)
            self.teardown_datapath(datapath, net_name)
            return
---

When I enable DHCP on this subnet (of an external provider network), neutron:cidrs becomes non-empty and metadata gets correctly provisioned.

Revision history for this message
yatin (yatinkarel) wrote :

It seems the original issue is a duplicate of [1], already fixed with [2][3] in Ussuri and available with tag 16.4.2. I think this can be closed, and it can be reopened if it still happens with >= 16.4.2.

The issue mentioned by @Max Khon seems to be expected behavior (no metadata provisioning if DHCP is disabled); at least the workaround is to use a config drive. Anyway, there is already a bug for it [4], so it can be discussed separately to see whether this use case can be supported.

[1] https://bugs.launchpad.net/networking-ovn/+bug/1950180
[2] https://review.opendev.org/c/openstack/neutron/+/813411
[3] https://review.opendev.org/c/openstack/neutron/+/812337
[4] https://bugs.launchpad.net/neutron/+bug/1976366

Changed in neutron:
status: Confirmed → Invalid