[OVS agent] Physical bridges can't be initialized if there is no connectivity to rabbitmq

Bug #1840443 reported by Slawek Kaplonski on 2019-08-16
This bug affects 1 person
Affects: neutron
Importance: Medium
Assigned to: Slawek Kaplonski

Bug Description

In some deployments the same external bridge (br-ex, for example) provides data plane connectivity for VMs and also carries control plane traffic, e.g. the neutron openvswitch agent uses it to connect to rabbitmq.
This can lead to a deadlock, e.g. after a host reboot. The neutron agent sets br-ex to fail_mode=secure, which means that if no OpenFlow rules have been added to the bridge, it will not process any packets. Because there is then no connection to rabbitmq, neutron-ovs-agent fails in the setup_rpc method (here: https://github.com/openstack/neutron/blob/30a60d04f098581340f83b38b7a79104308c66bc/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L198) and so never configures the initial rules for the physical bridges, which is done here: https://github.com/openstack/neutron/blob/30a60d04f098581340f83b38b7a79104308c66bc/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L212

To fix this problem, we should initialize the physical bridges before calling setup_rpc.
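
The ordering problem can be illustrated with a small toy model. This is only a sketch and not the agent's real code: FakeBridge, ToyAgent, BrokerUnreachable and the phys_bridges_first flag are hypothetical names made up for the example, the flow string is illustrative, and the real setup_rpc does not necessarily raise an exception the way this one does.

```python
class BrokerUnreachable(Exception):
    """Raised by the toy setup_rpc when the bridge forwards no traffic."""


class FakeBridge:
    """Hypothetical stand-in for a physical bridge such as br-ex."""

    def __init__(self, name):
        self.name = name
        self.fail_mode = "secure"  # as configured by the agent
        self.flows = []            # no OpenFlow rules, e.g. after a reboot

    def forwards_traffic(self):
        # In secure mode a bridge without any flows processes no packets.
        return self.fail_mode != "secure" or bool(self.flows)


class ToyAgent:
    """Hypothetical agent whose control plane traffic crosses the bridge."""

    def __init__(self, bridge, phys_bridges_first):
        self.bridge = bridge
        self.rpc_connected = False
        if phys_bridges_first:
            # Order proposed in this report.
            self.setup_physical_bridges()
            self.setup_rpc()
        else:
            # Order described as the cause of the problem.
            self.setup_rpc()
            self.setup_physical_bridges()

    def setup_physical_bridges(self):
        # Illustrative initial rule; the real agent installs its own flows.
        self.bridge.flows.append("priority=0,actions=NORMAL")

    def setup_rpc(self):
        if not self.bridge.forwards_traffic():
            raise BrokerUnreachable(
                "no path to rabbitmq through %s" % self.bridge.name)
        self.rpc_connected = True
```

With phys_bridges_first=False the toy agent never reaches setup_physical_bridges, which mirrors the deadlock described above; with phys_bridges_first=True the bridge has a rule before the RPC connection is attempted.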

Fix proposed to branch: master
Review: https://review.opendev.org/676949

Changed in neutron:
status: Confirmed → In Progress

Reviewed: https://review.opendev.org/676949
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=d41bd58f31e259fe408c8c059b31299fdfe81127
Submitter: Zuul
Branch: master

commit d41bd58f31e259fe408c8c059b31299fdfe81127
Author: Slawek Kaplonski <email address hidden>
Date: Fri Aug 16 13:44:09 2019 +0000

    Initialize phys bridges before setup_rpc

    neutron-ovs-agent configures physical bridges to work in
    fail_mode=secure. This means that only packets which match an
    OpenFlow rule in the bridge are processed.
    This may cause problems on hosts with only one physical NIC,
    where the same bridge provides both control plane connectivity
    (such as the connection to rabbitmq) and data plane connectivity
    for VMs.
    After e.g. a host reboot, the bridge will still be in
    fail_mode=secure but will have no OpenFlow rules on it, so there
    will be no communication with rabbitmq.

    With the current order of actions in the __init__ method of the
    OVSNeutronAgent class, the agent first tries to establish a
    connection to rabbitmq and only later configures the physical
    bridges with some initial OpenFlow rules.
    In the case described above this fails, as there is no connectivity
    to rabbitmq through the physical bridge.

    This patch changes the order of actions in __init__ so that it first
    sets up the physical bridges and then configures the rpc connection.

    Change-Id: I41c02b0164537c5b1c766feab8117cc88487bc77
    Closes-Bug: #1840443
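
To check whether a host is in the state described in the commit message, one option is to compare the bridge's fail mode with its flow table. The helper below is a minimal sketch and not part of the patch: count_flows, drops_everything and check_bridge are hypothetical names, and the parsing assumes that each flow line printed by `ovs-ofctl dump-flows` contains "actions=". Depending on the OpenFlow versions enabled on the bridge, `ovs-ofctl` may need an extra protocol option.

```python
import subprocess


def count_flows(dump_flows_output):
    """Count flow entries in the text printed by `ovs-ofctl dump-flows`."""
    # Assumption: every flow line contains "actions=", the header does not.
    return sum(1 for line in dump_flows_output.splitlines()
               if "actions=" in line)


def drops_everything(fail_mode, dump_flows_output):
    """True if the bridge is in secure mode and has no flows at all."""
    return (fail_mode.strip() == "secure"
            and count_flows(dump_flows_output) == 0)


def check_bridge(bridge="br-ex"):
    """Query a live bridge; requires Open vSwitch tools and privileges."""
    fail_mode = subprocess.run(
        ["ovs-vsctl", "get-fail-mode", bridge],
        capture_output=True, text=True, check=True).stdout
    flows = subprocess.run(
        ["ovs-ofctl", "dump-flows", bridge],
        capture_output=True, text=True, check=True).stdout
    return drops_everything(fail_mode, flows)
```

If check_bridge("br-ex") returns True on a host where rabbitmq is reached through br-ex, the agent cannot reach the broker until some rule is installed on the bridge.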

Changed in neutron:
status: In Progress → Fix Released

Reviewed: https://review.opendev.org/677054
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=3a2842bdd8d8d59e445393c7c7e7a9793357df08
Submitter: Zuul
Branch: stable/stein

commit 3a2842bdd8d8d59e445393c7c7e7a9793357df08
Author: Slawek Kaplonski <email address hidden>
Date: Fri Aug 16 13:44:09 2019 +0000

    Initialize phys bridges before setup_rpc

    Change-Id: I41c02b0164537c5b1c766feab8117cc88487bc77
    Closes-Bug: #1840443
    (cherry picked from commit d41bd58f31e259fe408c8c059b31299fdfe81127)

tags: added: in-stable-stein
tags: added: in-stable-rocky

Reviewed: https://review.opendev.org/677055
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=f9473566d559bf25c1baac2fcef505b0d2609fa1
Submitter: Zuul
Branch: stable/rocky

commit f9473566d559bf25c1baac2fcef505b0d2609fa1
Author: Slawek Kaplonski <email address hidden>
Date: Fri Aug 16 13:44:09 2019 +0000

    Initialize phys bridges before setup_rpc

    Conflicts:
        neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py

    Change-Id: I41c02b0164537c5b1c766feab8117cc88487bc77
    Closes-Bug: #1840443
    (cherry picked from commit d41bd58f31e259fe408c8c059b31299fdfe81127)
    (cherry picked from commit 3a2842bdd8d8d59e445393c7c7e7a9793357df08)

tags: added: in-stable-queens

Reviewed: https://review.opendev.org/677056
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=6618a917d86add7ee5c1005a21046d1bc57ac95c
Submitter: Zuul
Branch: stable/queens

commit 6618a917d86add7ee5c1005a21046d1bc57ac95c
Author: Slawek Kaplonski <email address hidden>
Date: Fri Aug 16 13:44:09 2019 +0000

    Initialize phys bridges before setup_rpc

    Conflicts:
        neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py

    Change-Id: I41c02b0164537c5b1c766feab8117cc88487bc77
    Closes-Bug: #1840443
    (cherry picked from commit d41bd58f31e259fe408c8c059b31299fdfe81127)
    (cherry picked from commit 3a2842bdd8d8d59e445393c7c7e7a9793357df08)

This issue was fixed in the openstack/neutron 15.0.0.0b1 development milestone.

Reviewed: https://review.opendev.org/687095
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=fb515a75d6c055a5a4a1de42f48bb7eaf393c5d4
Submitter: Zuul
Branch: stable/pike

commit fb515a75d6c055a5a4a1de42f48bb7eaf393c5d4
Author: Slawek Kaplonski <email address hidden>
Date: Fri Aug 16 13:44:09 2019 +0000

    Initialize phys bridges before setup_rpc

    Conflicts:
        neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py

    Change-Id: I41c02b0164537c5b1c766feab8117cc88487bc77
    Closes-Bug: #1840443
    (cherry picked from commit d41bd58f31e259fe408c8c059b31299fdfe81127)
    (cherry picked from commit 3a2842bdd8d8d59e445393c7c7e7a9793357df08)

tags: added: in-stable-pike