Restarting OVS with DVR creates a network loop

Bug #2028795 reported by Jakub Libosvar
Affects: neutron
Status: Fix Released
Importance: Undecided
Assigned to: Jakub Libosvar
Milestone: (not set)

Bug Description

Restarting the OVS agent with DVR enabled creates a network loop between the external network and a tunneling network for a very short period of time. This causes serious problems when two agents are restarted at the same time.

Steps to reproduce:
1) Have ml2/ovs with DVR enabled
2) Have a VM with a FIP on compute node A
3) Have a gw port for SNAT traffic on network node B
4) Ping the FIP with the -i 0.1 option to send an ICMP request every 0.1 seconds
5) Restart the OVS agents on both compute node A and network node B at the same time
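
To put a number on the outage seen in step 4, the saved ping output can be scanned for gaps in the sequence numbers. The following is a minimal sketch, not part of the original report: it assumes iputils-style reply lines containing `icmp_seq=N`, and the function name `find_outages` and its parameters are hypothetical. It ignores sequence-number wrap-around.

```python
import re

# Matches the sequence number in iputils-style ping reply lines, e.g.
# "64 bytes from 203.0.113.10: icmp_seq=12 ttl=63 time=0.41 ms"
_SEQ_RE = re.compile(r"icmp_seq=(\d+)")


def find_outages(ping_lines, interval=0.1, min_lost=2):
    """Return (first_lost_seq, lost_count, approx_seconds) for each gap.

    interval is the value passed to "ping -i"; min_lost filters out
    isolated single-packet losses. Both defaults are assumptions.
    """
    seqs = []
    for line in ping_lines:
        match = _SEQ_RE.search(line)
        # Only count successful replies, not error or summary lines.
        if match and "bytes from" in line:
            seqs.append(int(match.group(1)))

    outages = []
    for prev, cur in zip(seqs, seqs[1:]):
        lost = cur - prev - 1
        if lost >= min_lost:
            outages.append((prev + 1, lost, round(lost * interval, 3)))
    return outages


if __name__ == "__main__":
    sample = [
        "64 bytes from 203.0.113.10: icmp_seq=1 ttl=63 time=0.40 ms",
        "64 bytes from 203.0.113.10: icmp_seq=2 ttl=63 time=0.41 ms",
        "64 bytes from 203.0.113.10: icmp_seq=9 ttl=63 time=0.39 ms",
    ]
    for first, lost, seconds in find_outages(sample):
        print(f"lost {lost} replies starting at seq {first} (~{seconds}s)")
```

With an interval of 0.1 seconds, an outage of 3-5 minutes would show up as a single gap of roughly 1800-3000 missing sequence numbers.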

Replies to the FIP traffic are now dropped on compute node A for about 3-5 minutes, because the loop causes the local OVS on compute node A to learn that the GW port MAC is on the tunneling interface. All reply traffic uses that MAC as its destination address, so the OVS NORMAL action no longer floods such traffic; following its FDB entry, it forwards the traffic to the patch port between br-int and br-tun, where it is dropped until the FDB entry expires.
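
The FDB poisoning described here can be illustrated with a toy MAC-learning switch. This is a simplified sketch and not OVS code: the class `LearningSwitch`, the port names, and the MAC addresses are all hypothetical, and the 300-second ageing time is an assumption chosen to be in line with the 3-5 minute outage reported.

```python
class LearningSwitch:
    """Toy model of MAC learning with FDB ageing (illustration only)."""

    def __init__(self, ports, max_age=300.0):
        self.ports = set(ports)
        self.max_age = max_age  # assumed ageing time in seconds
        self.fdb = {}  # mac -> (port, time the entry was last learned)

    def receive(self, in_port, src_mac, dst_mac, now):
        """Learn src_mac on in_port, then return the set of output ports."""
        self.fdb[src_mac] = (in_port, now)
        entry = self.fdb.get(dst_mac)
        if entry is not None and now - entry[1] <= self.max_age:
            port = entry[0]
            # Known destination: forward to exactly one port.
            return set() if port == in_port else {port}
        # Unknown or expired destination: drop the stale entry and flood.
        self.fdb.pop(dst_mac, None)
        return self.ports - {in_port}


if __name__ == "__main__":
    VM_MAC, GW_MAC, BCAST = "fa:16:3e:00:00:01", "fa:16:3e:00:00:02", "ff:ff:ff:ff:ff:ff"
    br_int = LearningSwitch(["vm-port", "patch-tun", "patch-ex"])

    # Before the loop: the GW MAC is unknown, so replies are flooded.
    print(sorted(br_int.receive("vm-port", VM_MAC, GW_MAC, now=0)))
    # During the loop: a frame sourced from the GW MAC arrives via the tunnel side.
    br_int.receive("patch-tun", GW_MAC, BCAST, now=1)
    # After the loop: replies go only to patch-tun, where they would be dropped.
    print(sorted(br_int.receive("vm-port", VM_MAC, GW_MAC, now=2)))
    # Once the entry has aged out, flooding resumes and traffic recovers.
    print(sorted(br_int.receive("vm-port", VM_MAC, GW_MAC, now=302)))
```

A single looped frame is enough to pin the destination to the wrong port; nothing corrects the entry until it ages out, which is why the outage lasts far longer than the loop itself.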

Lajos Katona (lajos-katona) wrote :
Changed in neutron:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/890230
Committed: https://opendev.org/openstack/neutron/commit/489cdf5f16a97d0cc63eb88d031e780ed1f75cff
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 489cdf5f16a97d0cc63eb88d031e780ed1f75cff
Author: Jakub Libosvar <email address hidden>
Date: Wed Jul 26 18:28:29 2023 +0000

    dvr: Avoid installing non-dvr openflow rule on startup

    The tunneling bridge uses different openflow rules depending if the
    agent is running in DVR mode or not. With DVR enabled initial rule was
    installed that caused traffic coming from the integration bridge to be
    flooded to all tunnels. After a few milliseconds this flow was replaced
    by a DVR specific flow, correctly dropping the traffic. This small time
    window caused a network loop on the compute node with restarted agent.

    This patch skips installing the non-dvr specific flow in case OVS agent
    is working in DVR mode. Hence the traffic is never flooded to the
    tunnels.

    Closes-bug: #2028795

    Conflicts:
            neutron/plugins/ml2/drivers/openvswitch/agent/ovs_dvr_neutron_agent.py

    Signed-off-by: Jakub Libosvar <email address hidden>
    Change-Id: I3ce026054286c8e28ec1500f1a4aa607fe73f337
    (cherry picked from commit ba6f7bf83e6f17048a97f781aa16bf4a643a53d2)
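
The startup race described in this commit message can be sketched as a toy model. This is not the neutron agent code: the function `startup_states`, its parameters, and the action labels are hypothetical and only mirror the wording of the commit message, reducing each flow to the action it applies to traffic coming from the integration bridge.

```python
def startup_states(dvr_enabled, skip_generic_flow_in_dvr):
    """Return, in order, the action applied to br-int traffic on the
    tunneling bridge after each startup step of a hypothetical agent.

    skip_generic_flow_in_dvr=False models the behaviour before the patch,
    True models the behaviour after it.
    """
    states = []
    if not (dvr_enabled and skip_generic_flow_in_dvr):
        # Generic (non-DVR) flow: traffic is flooded to all tunnels.
        states.append("flood-to-tunnels")
    if dvr_enabled:
        # DVR-specific flow installed afterwards: traffic is dropped.
        states.append("drop")
    return states


def has_flood_window(states):
    """True if a DVR agent passes through a flooding state before dropping."""
    return "drop" in states and "flood-to-tunnels" in states


if __name__ == "__main__":
    before = startup_states(dvr_enabled=True, skip_generic_flow_in_dvr=False)
    after = startup_states(dvr_enabled=True, skip_generic_flow_in_dvr=True)
    print(before, has_flood_window(before))
    print(after, has_flood_window(after))
```

In this model the non-DVR case is unchanged by the patch, while the DVR case never enters the flooding state, however briefly.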

tags: added: in-stable-xena
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/yoga)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/890413
Committed: https://opendev.org/openstack/neutron/commit/7692e3ae6c3e73d58dcca54ef199dceb1250daf7
Submitter: "Zuul (22348)"
Branch: stable/yoga

commit 7692e3ae6c3e73d58dcca54ef199dceb1250daf7
Author: Jakub Libosvar <email address hidden>
Date: Wed Jul 26 18:28:29 2023 +0000

    dvr: Avoid installing non-dvr openflow rule on startup

    The tunneling bridge uses different openflow rules depending if the
    agent is running in DVR mode or not. With DVR enabled initial rule was
    installed that caused traffic coming from the integration bridge to be
    flooded to all tunnels. After a few milliseconds this flow was replaced
    by a DVR specific flow, correctly dropping the traffic. This small time
    window caused a network loop on the compute node with restarted agent.

    This patch skips installing the non-dvr specific flow in case OVS agent
    is working in DVR mode. Hence the traffic is never flooded to the
    tunnels.

    Closes-bug: #2028795

    Conflicts:
            neutron/plugins/ml2/drivers/openvswitch/agent/ovs_dvr_neutron_agent.py

    Signed-off-by: Jakub Libosvar <email address hidden>
    Change-Id: I3ce026054286c8e28ec1500f1a4aa607fe73f337
    (cherry picked from commit ba6f7bf83e6f17048a97f781aa16bf4a643a53d2)

tags: added: in-stable-yoga
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/zed)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/890229
Committed: https://opendev.org/openstack/neutron/commit/b96dc966ce0f1f6fa0705ec75ac0a33f62be2e2c
Submitter: "Zuul (22348)"
Branch: stable/zed

commit b96dc966ce0f1f6fa0705ec75ac0a33f62be2e2c
Author: Jakub Libosvar <email address hidden>
Date: Wed Jul 26 18:28:29 2023 +0000

    dvr: Avoid installing non-dvr openflow rule on startup

    The tunneling bridge uses different openflow rules depending if the
    agent is running in DVR mode or not. With DVR enabled initial rule was
    installed that caused traffic coming from the integration bridge to be
    flooded to all tunnels. After a few milliseconds this flow was replaced
    by a DVR specific flow, correctly dropping the traffic. This small time
    window caused a network loop on the compute node with restarted agent.

    This patch skips installing the non-dvr specific flow in case OVS agent
    is working in DVR mode. Hence the traffic is never flooded to the
    tunnels.

    Closes-bug: #2028795

    Signed-off-by: Jakub Libosvar <email address hidden>
    Change-Id: I3ce026054286c8e28ec1500f1a4aa607fe73f337
    (cherry picked from commit ba6f7bf83e6f17048a97f781aa16bf4a643a53d2)

tags: added: in-stable-zed
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/890228
Committed: https://opendev.org/openstack/neutron/commit/6f8dc124c4ec1ca026ebe1e5f00aef21cbaa70dc
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit 6f8dc124c4ec1ca026ebe1e5f00aef21cbaa70dc
Author: Jakub Libosvar <email address hidden>
Date: Wed Jul 26 18:28:29 2023 +0000

    dvr: Avoid installing non-dvr openflow rule on startup

    The tunneling bridge uses different openflow rules depending if the
    agent is running in DVR mode or not. With DVR enabled initial rule was
    installed that caused traffic coming from the integration bridge to be
    flooded to all tunnels. After a few milliseconds this flow was replaced
    by a DVR specific flow, correctly dropping the traffic. This small time
    window caused a network loop on the compute node with restarted agent.

    This patch skips installing the non-dvr specific flow in case OVS agent
    is working in DVR mode. Hence the traffic is never flooded to the
    tunnels.

    Closes-bug: #2028795

    Signed-off-by: Jakub Libosvar <email address hidden>
    Change-Id: I3ce026054286c8e28ec1500f1a4aa607fe73f337
    (cherry picked from commit ba6f7bf83e6f17048a97f781aa16bf4a643a53d2)

tags: added: in-stable-wallaby
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/890231
Committed: https://opendev.org/openstack/neutron/commit/494f0e01e8fd5c58239eed580eec607c6db18f1e
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 494f0e01e8fd5c58239eed580eec607c6db18f1e
Author: Jakub Libosvar <email address hidden>
Date: Wed Jul 26 18:28:29 2023 +0000

    dvr: Avoid installing non-dvr openflow rule on startup

    The tunneling bridge uses different openflow rules depending if the
    agent is running in DVR mode or not. With DVR enabled initial rule was
    installed that caused traffic coming from the integration bridge to be
    flooded to all tunnels. After a few milliseconds this flow was replaced
    by a DVR specific flow, correctly dropping the traffic. This small time
    window caused a network loop on the compute node with restarted agent.

    This patch skips installing the non-dvr specific flow in case OVS agent
    is working in DVR mode. Hence the traffic is never flooded to the
    tunnels.

    Closes-bug: #2028795

    Conflicts:
            neutron/plugins/ml2/drivers/openvswitch/agent/ovs_dvr_neutron_agent.py

    Signed-off-by: Jakub Libosvar <email address hidden>
    Change-Id: I3ce026054286c8e28ec1500f1a4aa607fe73f337
    (cherry picked from commit ba6f7bf83e6f17048a97f781aa16bf4a643a53d2)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 23.0.0.0b3

This issue was fixed in the openstack/neutron 23.0.0.0b3 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 22.1.0

This issue was fixed in the openstack/neutron 22.1.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 20.5.0

This issue was fixed in the openstack/neutron 20.5.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 21.2.0

This issue was fixed in the openstack/neutron 21.2.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron wallaby-eom

This issue was fixed in the openstack/neutron wallaby-eom release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron xena-eom

This issue was fixed in the openstack/neutron xena-eom release.

Changed in neutron:
status: In Progress → Fix Released