OVN metadata uses the same IP as the LB health monitor

Bug #2004238 reported by Ching Kuo
This bug affects 1 person
Affects: neutron
Status: Fix Released
Importance: High
Assigned to: Fernando Royo

Bug Description

Recently I ran into an issue where instances are unable to get metadata. After some debugging I found that the LB health monitor replies to ARP requests for the OVN metadata agent's IP, so requests meant for the metadata agent are sent to a different MAC address.

This seems to be related to https://bugs.launchpad.net/bugs/1956034, but that bug was closed by mistake.

From the sb_db logical flow dump I found the flow below:

_uuid : df93e919-c7f5-4d04-be6f-796e175e266f
actions : "eth.dst = eth.src; eth.src = 5e:69:e7:47:38:7b; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha = 5e:69:e7:47:38:7b; arp.tpa = arp.spa; arp.spa = 10.0.0.2; outport = inport; flags.loopback = 1; output;"
controller_meter : []
external_ids : {source="northd.c:7971", stage-hint="5b11468a", stage-name=ls_in_arp_rsp}
logical_datapath : f714c506-3043-4366-8065-d6af4d2a08de
logical_dp_group : []
match : "arp.tpa == 10.0.0.2 && arp.op == 1"
pipeline : ingress
priority : 110
table_id : 18
tags : {}
hash : 0

A priority 110 flow with this action should come from the LB health monitor, according to the ovn-northd documentation [1].

However, the ovn-metadata agent also uses the same IP (10.0.0.2):

+-------------------------+---------------------------------------------------------------------------------------------+
| Field | Value |
+-------------------------+---------------------------------------------------------------------------------------------+
| admin_state_up | UP |
| allowed_address_pairs | |
| binding_host_id | |
| binding_profile | |
| binding_vif_details | |
| binding_vif_type | unbound |
| binding_vnic_type | normal |
| created_at | 2023-01-17T08:04:35Z |
| data_plane_status | None |
| description | |
| device_id | ovnmeta-5640d643-e490-46b3-a724-0478aef3722e |
| device_owner | network:distributed |
| device_profile | None |
| dns_assignment | fqdn='host-10-0-0-2.infra.cloudnative.tw.', hostname='host-10-0-0-2', ip_address='10.0.0.2' |
| dns_domain | |
| dns_name | |
| extra_dhcp_opts | |
| fixed_ips | ip_address='10.0.0.2', subnet_id='3000b8bf-f687-406d-b000-55d38f365e33' |
| id | c1ba8ceb-6f02-48e6-9da3-3cb9b628b771 |
| ip_allocation | None |
| mac_address | fa:16:3e:7c:b8:1c |
| name | |
| network_id | 5640d643-e490-46b3-a724-0478aef3722e |
| numa_affinity_policy | None |
| port_security_enabled | False |
| project_id | 2fbc86f895ed4ef1ad036b7e4a068b50 |
| propagate_uplink_status | None |
| qos_network_policy_id | None |
| qos_policy_id | None |
| resource_request | None |
| revision_number | 2 |
| security_group_ids | |
| status | DOWN |
| tags | |
| trunk_details | None |
| updated_at | 2023-01-17T08:04:36Z |
+-------------------------+---------------------------------------------------------------------------------------------+

This causes all packets that should go to 10.0.0.2 (the metadata agent) to be sent to the MAC address 5e:69:e7:47:38:7b instead, which is incorrect.
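
The conflict described above can be checked mechanically. The following is a minimal Python sketch, not part of OVN or Neutron. It assumes the Logical_Flow row has already been loaded into a dict with `match` and `actions` strings as in the dump above; the helper names `arp_responder` and `conflicts_with_port` are hypothetical. It extracts the IP and MAC that an ARP-responder flow answers with and compares them with the Neutron port.

```python
import re

# Patterns for the two fields of interest in a Southbound Logical_Flow row.
MATCH_TPA = re.compile(r"arp\.tpa == (\d+\.\d+\.\d+\.\d+)")
ACTION_SHA = re.compile(
    r"arp\.sha = ((?:[0-9a-f]{2}:){5}[0-9a-f]{2})", re.IGNORECASE
)


def arp_responder(flow):
    """Return the (ip, mac) an ARP-responder flow answers with, or None."""
    ip = MATCH_TPA.search(flow["match"])
    mac = ACTION_SHA.search(flow["actions"])
    if ip is None or mac is None:
        return None
    return ip.group(1), mac.group(1).lower()


def conflicts_with_port(flow, port_ip, port_mac):
    """True if the flow answers ARP for port_ip with a MAC other than port_mac."""
    answer = arp_responder(flow)
    if answer is None:
        return False
    ip, mac = answer
    return ip == port_ip and mac != port_mac.lower()


# Values copied from the flow dump and the port listing in this report.
flow = {
    "match": "arp.tpa == 10.0.0.2 && arp.op == 1",
    "actions": "eth.dst = eth.src; eth.src = 5e:69:e7:47:38:7b; arp.op = 2; "
               "arp.tha = arp.sha; arp.sha = 5e:69:e7:47:38:7b; "
               "arp.tpa = arp.spa; arp.spa = 10.0.0.2; outport = inport; "
               "flags.loopback = 1; output;",
}
print(conflicts_with_port(flow, "10.0.0.2", "fa:16:3e:7c:b8:1c"))
```

With the values from this report the check reports a conflict, because the flow answers for 10.0.0.2 with a MAC that is not the metadata port's.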

Expected result:

The OVN LB health monitor should send ARP replies for an unused IP, or a port with an IP should be reserved specifically for the health monitor.

[1] https://www.ovn.org/support/dist-docs/ovn-northd.8.html

Mamatisa Nurmatov (isabek) wrote :

Hi! Which release are you using? Have you tried this fix: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/834345 ?

Changed in neutron:
status: New → Incomplete
Ching Kuo (genekuo) wrote :

Hi,

I use the latest Zed release deployed with Kolla-Ansible.

I've just quickly checked the code. It seems to be designed to use the metadata/dhcp port, which causes the issue.

https://opendev.org/openstack/ovn-octavia-provider/src/branch/master/ovn_octavia_provider/helper.py#L2331-L2343

_ensure_hm_ovn_port returns the metadata/dhcp port, and _update_hm_members then uses that port's IP as the source IP for the health monitor, causing the issue I am reporting.

https://opendev.org/openstack/ovn-octavia-provider/src/branch/master/ovn_octavia_provider/helper.py#L137-L146

We should probably create a dedicated port, or find another solution that avoids using the metadata/dhcp port IP for the health monitor.
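
The dedicated-port idea can be illustrated with a small Python sketch. It is not the provider's code. It assumes IPv4 only and the OVN Northbound `ip_port_mappings` layout, where each backend IP maps to a `<logical port name>:<source IP>` string. The function names, the member port name, and the address 10.0.0.6 are hypothetical.

```python
def pick_hm_source_ip(candidate_ip, reserved_ips):
    """Refuse a source IP that another service on the subnet already owns."""
    if candidate_ip in reserved_ips:
        raise ValueError(f"{candidate_ip} is already used on this subnet")
    return candidate_ip


def build_ip_port_mappings(members, source_ip):
    """Map each backend IP to "<logical port name>:<source IP>".

    members: iterable of (backend_ip, logical_port_name) pairs.
    source_ip: address the health checks are sent from.
    """
    return {ip: f"{port}:{source_ip}" for ip, port in members}


# 10.0.0.2 is the metadata port IP from this report; the rest is made up.
reserved = {"10.0.0.2"}
source = pick_hm_source_ip("10.0.0.6", reserved)
print(build_ip_port_mappings([("10.0.0.11", "member-port-1")], source))
```

Passing 10.0.0.2 as the candidate raises `ValueError`, which is the situation this bug describes.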

Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

The bug looks legit.

Changed in neutron:
status: Incomplete → New
Ching Kuo (genekuo) wrote :

This seems like an ovn-octavia-provider bug. Should I move the bug there?

tags: added: ovn-octavia-provider
Fernando Royo (froyoredhat) wrote :

Just to confirm the issue described here: as soon as an HM is added to the pool, the servers (members) change the MAC for the OVN metadata port in their ARP tables to the one used by the HM for its health checks.

Changed in neutron:
assignee: nobody → Fernando Royo (froyoredhat)
Ching Kuo (genekuo) wrote :

Yes, you are correct. I think the flow for the HM ARP reply has a higher priority. Servers therefore receive an ARP reply with the MAC address configured for the HM, and metadata stops working because request packets are sent to the wrong MAC address.
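
The priority argument can be sketched in a few lines of Python. This is a simplified model, not OVN code. It assumes that when several logical flows in one table match a packet, the one with the highest priority is applied. Priority 110 comes from the dump in the bug description; the priority 50 for the ordinary ARP responder of the metadata port is an assumption made for illustration.

```python
def winning_flow(matching_flows):
    """Return the matching flow with the highest priority."""
    return max(matching_flows, key=lambda f: f["priority"])


# Both flows match an ARP request for 10.0.0.2.
flows = [
    {"priority": 110, "reply_mac": "5e:69:e7:47:38:7b",
     "origin": "LB health monitor"},
    {"priority": 50, "reply_mac": "fa:16:3e:7c:b8:1c",
     "origin": "metadata port"},  # priority 50 is assumed, not from the dump
]
print(winning_flow(flows)["reply_mac"])
```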

OpenStack Infra (hudson-openstack) wrote : Fix proposed to ovn-octavia-provider (master)
Changed in neutron:
status: New → In Progress
Changed in neutron:
importance: Undecided → High
OpenStack Infra (hudson-openstack) wrote : Fix merged to ovn-octavia-provider (master)

Reviewed: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/873426
Committed: https://opendev.org/openstack/ovn-octavia-provider/commit/54d96ca07237dff8bccdaf4a9d5492f9103ea261
Submitter: "Zuul (22348)"
Branch: master

commit 54d96ca07237dff8bccdaf4a9d5492f9103ea261
Author: Fernando Royo <email address hidden>
Date: Fri Feb 10 19:18:09 2023 +0100

    Avoid use of ovn metadata port IP for HM checks

    For every backend IP in the load balancer for which health
    check is configured, a new row in the Service_Monitor table
    is created and according to that ovn-controller will
    periodically sends out the service monitor packets.

    In this patch we create a new port for this purpose,
    instead of use the ovn_metadata_port to configure the
    backends in the field ip_port_mappings, this mapping is
    the info used to be translated to Service_Monitor
    entries (more details [1]).

    [1] https://github.com/ovn-org/ovn/blob/24cd3267c452f6b687e8c03344693709b1c7ae9f/northd/ovn-northd.8.xml#L1431

    Closes-Bug: #2004238
    Change-Id: I11c4d9671eee002b15080d055a18a4d3f4d7c540

Changed in neutron:
status: In Progress → Fix Released
Ching Kuo (genekuo) wrote :

Are we backporting the changes to the currently maintained versions?

OpenStack Infra (hudson-openstack) wrote : Fix proposed to ovn-octavia-provider (stable/zed)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ovn-octavia-provider (stable/yoga)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ovn-octavia-provider (stable/xena)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ovn-octavia-provider (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/874273

Ching Kuo (genekuo) wrote :

I can verify that the patch works for newly created health monitors.

+--------------------------------------+-------------------------------------------------+-------------------+------------------------------------------------------------------------------+--------+
| ID                                   | Name                                            | MAC Address       | Fixed IP Addresses                                                           | Status |
+--------------------------------------+-------------------------------------------------+-------------------+------------------------------------------------------------------------------+--------+
| 0bdea4a7-6407-4fa6-bbad-f119b0e1e2cb | ovn-lb-hm-57d76ad6-dae2-4415-8b55-e792eea85b06  | fa:16:3e:09:21:69 | ip_address='192.168.0.6', subnet_id='57d76ad6-dae2-4415-8b55-e792eea85b06'   | DOWN   |
+--------------------------------------+-------------------------------------------------+-------------------+------------------------------------------------------------------------------+--------+

However, for existing load balancers with health monitors, the port isn't created.

Fernando Royo (froyoredhat) wrote :

Yeah, in order to create the port you need to do one of the following:

- Add a member
- Remove a member
- Recreate the HM
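
To find load balancers that still need one of the actions above, one could check which subnets lack a health monitor port. The Python sketch below is only an illustration. It assumes the port is named `ovn-lb-hm-<subnet_id>`, as in the port listing earlier in this thread, and that the port names and subnet IDs have already been collected by other means. The function name is hypothetical.

```python
HM_PORT_PREFIX = "ovn-lb-hm-"


def subnets_missing_hm_port(port_names, subnet_ids):
    """Return the subnet IDs that have no "ovn-lb-hm-<subnet_id>" port."""
    present = {
        name[len(HM_PORT_PREFIX):]
        for name in port_names
        if name.startswith(HM_PORT_PREFIX)
    }
    return [subnet for subnet in subnet_ids if subnet not in present]


# Port name and subnet IDs are taken from this report; treating the second
# subnet as one with a pre-existing health monitor is only for illustration.
ports = ["ovn-lb-hm-57d76ad6-dae2-4415-8b55-e792eea85b06"]
subnets = [
    "57d76ad6-dae2-4415-8b55-e792eea85b06",
    "3000b8bf-f687-406d-b000-55d38f365e33",
]
print(subnets_missing_hm_port(ports, subnets))
```

Any subnet returned by the check belongs to a pool whose health monitor predates the fix and has not been touched since.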

Ching Kuo (genekuo) wrote :

Got it.
IMO, it would be better to add this information to the release notes. Thank you for the quick fix.

OpenStack Infra (hudson-openstack) wrote : Fix merged to ovn-octavia-provider (stable/zed)

Reviewed: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/874270
Committed: https://opendev.org/openstack/ovn-octavia-provider/commit/fac557f9f7b03dd6c77cf37f869fd169d3033e87
Submitter: "Zuul (22348)"
Branch: stable/zed

commit fac557f9f7b03dd6c77cf37f869fd169d3033e87
Author: Fernando Royo <email address hidden>
Date: Fri Feb 10 19:18:09 2023 +0100

    Avoid use of ovn metadata port IP for HM checks

    For every backend IP in the load balancer for which health
    check is configured, a new row in the Service_Monitor table
    is created and according to that ovn-controller will
    periodically sends out the service monitor packets.

    In this patch we create a new port for this purpose,
    instead of use the ovn_metadata_port to configure the
    backends in the field ip_port_mappings, this mapping is
    the info used to be translated to Service_Monitor
    entries (more details [1]).

    [1] https://github.com/ovn-org/ovn/blob/24cd3267c452f6b687e8c03344693709b1c7ae9f/northd/ovn-northd.8.xml#L1431

    Closes-Bug: #2004238
    Change-Id: I11c4d9671eee002b15080d055a18a4d3f4d7c540
    (cherry picked from commit 54d96ca07237dff8bccdaf4a9d5492f9103ea261)

tags: added: in-stable-zed
OpenStack Infra (hudson-openstack) wrote : Fix merged to ovn-octavia-provider (stable/yoga)

Reviewed: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/874271
Committed: https://opendev.org/openstack/ovn-octavia-provider/commit/776e96bf8b47ce3e8db08d19ee104657539b6144
Submitter: "Zuul (22348)"
Branch: stable/yoga

commit 776e96bf8b47ce3e8db08d19ee104657539b6144
Author: Fernando Royo <email address hidden>
Date: Fri Feb 10 19:18:09 2023 +0100

    Avoid use of ovn metadata port IP for HM checks

    For every backend IP in the load balancer for which health
    check is configured, a new row in the Service_Monitor table
    is created and according to that ovn-controller will
    periodically sends out the service monitor packets.

    In this patch we create a new port for this purpose,
    instead of use the ovn_metadata_port to configure the
    backends in the field ip_port_mappings, this mapping is
    the info used to be translated to Service_Monitor
    entries (more details [1]).

    [1] https://github.com/ovn-org/ovn/blob/24cd3267c452f6b687e8c03344693709b1c7ae9f/northd/ovn-northd.8.xml#L1431

    Closes-Bug: #2004238
    Change-Id: I11c4d9671eee002b15080d055a18a4d3f4d7c540
    (cherry picked from commit 54d96ca07237dff8bccdaf4a9d5492f9103ea261)

tags: added: in-stable-yoga
tags: added: in-stable-xena
OpenStack Infra (hudson-openstack) wrote : Fix merged to ovn-octavia-provider (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/874272
Committed: https://opendev.org/openstack/ovn-octavia-provider/commit/1c6024dd8aaa76476d205a11a3b1791515fa2da0
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 1c6024dd8aaa76476d205a11a3b1791515fa2da0
Author: Fernando Royo <email address hidden>
Date: Fri Feb 10 19:18:09 2023 +0100

    Avoid use of ovn metadata port IP for HM checks

    For every backend IP in the load balancer for which health
    check is configured, a new row in the Service_Monitor table
    is created and according to that ovn-controller will
    periodically sends out the service monitor packets.

    In this patch we create a new port for this purpose,
    instead of use the ovn_metadata_port to configure the
    backends in the field ip_port_mappings, this mapping is
    the info used to be translated to Service_Monitor
    entries (more details [1]).

    [1] https://github.com/ovn-org/ovn/blob/24cd3267c452f6b687e8c03344693709b1c7ae9f/northd/ovn-northd.8.xml#L1431

    Closes-Bug: #2004238
    Change-Id: I11c4d9671eee002b15080d055a18a4d3f4d7c540
    (cherry picked from commit 54d96ca07237dff8bccdaf4a9d5492f9103ea261)

tags: added: in-stable-wallaby
OpenStack Infra (hudson-openstack) wrote : Fix merged to ovn-octavia-provider (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/874273
Committed: https://opendev.org/openstack/ovn-octavia-provider/commit/82a46915948dc8e8700a978033bbe69e26835f56
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 82a46915948dc8e8700a978033bbe69e26835f56
Author: Fernando Royo <email address hidden>
Date: Fri Feb 10 19:18:09 2023 +0100

    Avoid use of ovn metadata port IP for HM checks

    For every backend IP in the load balancer for which health
    check is configured, a new row in the Service_Monitor table
    is created and according to that ovn-controller will
    periodically sends out the service monitor packets.

    In this patch we create a new port for this purpose,
    instead of use the ovn_metadata_port to configure the
    backends in the field ip_port_mappings, this mapping is
    the info used to be translated to Service_Monitor
    entries (more details [1]).

    [1] https://github.com/ovn-org/ovn/blob/24cd3267c452f6b687e8c03344693709b1c7ae9f/northd/ovn-northd.8.xml#L1431

    Conflicts:
          ovn_octavia_provider/helper.py

    Closes-Bug: #2004238
    Change-Id: I11c4d9671eee002b15080d055a18a4d3f4d7c540
    (cherry picked from commit 54d96ca07237dff8bccdaf4a9d5492f9103ea261)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ovn-octavia-provider 4.0.0.0rc1

This issue was fixed in the openstack/ovn-octavia-provider 4.0.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ovn-octavia-provider 1.3.0

This issue was fixed in the openstack/ovn-octavia-provider 1.3.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ovn-octavia-provider 2.1.0

This issue was fixed in the openstack/ovn-octavia-provider 2.1.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ovn-octavia-provider 3.1.0

This issue was fixed in the openstack/ovn-octavia-provider 3.1.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ovn-octavia-provider wallaby-eom

This issue was fixed in the openstack/ovn-octavia-provider wallaby-eom release.
