[OVN] If "chassis" register is deleted, "chassis_private" can have 0 "chassis" associated

Bug #1951149 reported by Rodolfo Alonso
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
neutron | Fix Released | Medium | Rodolfo Alonso |

Bug Description

When an OVN SB "chassis" register is deleted, the "chassis_private" register will have no "chassis" associated.

In a healthy environment, with the ovn-controller service running, if the host "chassis" register is deleted, the ovn-controller will create it again.

If the ovn-controller does not exit gracefully, both registers are left behind in the SB DB. To clean up the environment, an administrator should delete both registers ("chassis_private" and "chassis") from the SB DB and delete the OVN agents from Neutron.

However, if those steps are done incorrectly, the Neutron server can return an exception. Steps to reproduce this error:
- Kill the ovn-controller.
- Delete the "chassis" register.
- Restart the Neutron server. Neutron does not handle "chassis" events, so the server must be restarted to retrieve the "chassis_private" SB register with "chassis=[]" [1].
- List the network agents. This method [2] will fall through to the return at L50 of the linked file and hand back its value as if it were a "chassis" register. This is incorrect and will fail with [3].

[1]https://paste.opendev.org/show/811040/
[2]https://github.com/openstack/neutron/blob/83c6d23308c475e2cd7f1af858e26d14d0cba8fb/neutron/plugins/ml2/drivers/ovn/agent/neutron_agent.py#L44-L50
[3]https://paste.opendev.org/show/811041/
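
The fallback described in the steps above can be illustrated with a minimal sketch. It is modelled loosely on the linked property, not copied from Neutron: `AgentSketch` and the `SimpleNamespace` rows are hypothetical stand-ins, and the assumption that the caller later reads a Chassis-only column such as `hostname` is mine, not taken from the paste in this report.

```python
from types import SimpleNamespace


class AgentSketch:
    """Simplified stand-in for the agent object; not the Neutron class."""

    def __init__(self, chassis_private):
        self.chassis_private = chassis_private

    @property
    def chassis(self):
        try:
            # Normal case: Chassis_Private references exactly one Chassis row.
            return self.chassis_private.chassis[0]
        except (AttributeError, IndexError):
            # Fallback intended for deployments without Chassis_Private, where
            # the stored row is already a Chassis row. An empty "chassis"
            # reference (IndexError) lands here too.
            return self.chassis_private


# A Chassis_Private row whose "chassis" reference is empty (chassis=[]).
orphan = SimpleNamespace(name="host-1", chassis=[], external_ids={})
agent = AgentSketch(orphan)

returned = agent.chassis
returned_is_private_row = returned is orphan    # the private row is handed back
has_hostname = hasattr(returned, "hostname")    # Chassis-only column is missing
```

In this sketch the caller receives the "chassis_private" row itself, so any later access to a column that only exists in the "chassis" table would raise AttributeError.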

Changed in neutron:
assignee: nobody → Rodolfo Alonso (rodolfo-alonso-hernandez)
importance: Undecided → Medium
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/818132

Changed in neutron:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/818132
Committed: https://opendev.org/openstack/neutron/commit/f1a5511e9094d6aae7a61cd858c59c706033ca95
Submitter: "Zuul (22348)"
Branch: master

commit f1a5511e9094d6aae7a61cd858c59c706033ca95
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Wed Nov 17 17:33:29 2021 +0000

    [OVN] Handle OVN agents when "Chassis" register is deleted

    If an "ovn-controller" does not end gracefully, the node "Chassis" and
    "Chassis_Private" registers will remain in the OVN SB database.
    Because there is no mandatory procedure to delete the "Chassis"
    and "Chassis_Private" registers, the administrator can manually
    delete, from the OVN SB database, any register in any order.

    If the "Chassis" register is deleted and the Neutron server restarted,
    the updated "Chassis_Private" register will be read from the database.
    That won't contain the "Chassis" information as this register has been
    deleted. In this case, the ``NeutronAgent`` returns ``DeletedChassis``,
    an empty chassis register with no information.

    NOTE: the sequence of actions ("Chassis" register deletion, Neutron
    server restart) must be followed to reproduce this issue. If the
    "Chassis" register is deleted, the Neutron server OVN agent local cache
    won't update the stored information and will keep the previous value.
    Only when the Neutron server is restarted is the OVN agent local
    cache retrieved again; at this time the "Chassis_Private" register
    won't have any related "Chassis" register.

    Closes-Bug: #1951149

    Change-Id: I17aa53cea6aba8ea83187c99102a6f25fd33cfff
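
The behaviour described in the commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged code: only the name `DeletedChassis` comes from the commit message, while its attributes, the placeholder strings and the helper `chassis_of` are hypothetical.

```python
from types import SimpleNamespace


class DeletedChassis:
    """Placeholder for a "Chassis" register that no longer exists.

    The attribute set and values are assumptions made for this sketch.
    """
    name = '("Chassis" register deleted)'
    hostname = '("Chassis" register deleted)'
    external_ids = {}


def chassis_of(chassis_private):
    """Hypothetical helper: resolve the Chassis for a stored row."""
    try:
        return chassis_private.chassis[0]
    except AttributeError:
        # No "chassis" column: the row is already a Chassis row.
        return chassis_private
    except IndexError:
        # Empty "chassis" column: the Chassis register was deleted.
        return DeletedChassis


healthy = SimpleNamespace(
    name="host-1",
    chassis=[SimpleNamespace(name="host-1", hostname="compute-1",
                             external_ids={})])
orphan = SimpleNamespace(name="host-2", chassis=[])
legacy = SimpleNamespace(name="host-3", hostname="compute-3", external_ids={})

# Every row now resolves to something that exposes "hostname".
resolved = [chassis_of(row).hostname for row in (healthy, orphan, legacy)]
```

Separating the two exceptions is the point of the sketch: a row without a "chassis" column and a row with an empty one are different situations, and only the second maps to the placeholder.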

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/yoga)

Fix proposed to branch: stable/yoga
Review: https://review.opendev.org/c/openstack/neutron/+/839022

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/xena)

Fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/neutron/+/839023

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/neutron/+/839024

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/neutron/+/839027

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/yoga)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/839022
Committed: https://opendev.org/openstack/neutron/commit/669d7ccee0b2368e0b00971a12c4fb55c33a0bc3
Submitter: "Zuul (22348)"
Branch: stable/yoga

commit 669d7ccee0b2368e0b00971a12c4fb55c33a0bc3
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Wed Nov 17 17:33:29 2021 +0000

    [OVN] Handle OVN agents when "Chassis" register is deleted

    If an "ovn-controller" does not end gracefully, the node "Chassis" and
    "Chassis_Private" registers will remain in the OVN SB database.
    Because there is no mandatory procedure to delete the "Chassis"
    and "Chassis_Private" registers, the administrator can manually
    delete, from the OVN SB database, any register in any order.

    If the "Chassis" register is deleted and the Neutron server restarted,
    the updated "Chassis_Private" register will be read from the database.
    That won't contain the "Chassis" information as this register has been
    deleted. In this case, the ``NeutronAgent`` returns ``DeletedChassis``,
    an empty chassis register with no information.

    NOTE: the sequence of actions ("Chassis" register deletion, Neutron
    server restart) must be followed to reproduce this issue. If the
    "Chassis" register is deleted, the Neutron server OVN agent local cache
    won't update the stored information and will keep the previous value.
    Only when the Neutron server is restarted is the OVN agent local
    cache retrieved again; at this time the "Chassis_Private" register
    won't have any related "Chassis" register.

    Closes-Bug: #1951149

    Change-Id: I17aa53cea6aba8ea83187c99102a6f25fd33cfff
    (cherry picked from commit f1a5511e9094d6aae7a61cd858c59c706033ca95)

tags: added: in-stable-yoga
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/839023
Committed: https://opendev.org/openstack/neutron/commit/9b8a75f6e7f48c964b5af68f3caa94dedd98ef92
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 9b8a75f6e7f48c964b5af68f3caa94dedd98ef92
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Wed Nov 17 17:33:29 2021 +0000

    [OVN] Handle OVN agents when "Chassis" register is deleted

    If an "ovn-controller" does not end gracefully, the node "Chassis" and
    "Chassis_Private" registers will remain in the OVN SB database.
    Because there is no mandatory procedure to delete the "Chassis"
    and "Chassis_Private" registers, the administrator can manually
    delete, from the OVN SB database, any register in any order.

    If the "Chassis" register is deleted and the Neutron server restarted,
    the updated "Chassis_Private" register will be read from the database.
    That won't contain the "Chassis" information as this register has been
    deleted. In this case, the ``NeutronAgent`` returns ``DeletedChassis``,
    an empty chassis register with no information.

    NOTE: the sequence of actions ("Chassis" register deletion, Neutron
    server restart) must be followed to reproduce this issue. If the
    "Chassis" register is deleted, the Neutron server OVN agent local cache
    won't update the stored information and will keep the previous value.
    Only when the Neutron server is restarted is the OVN agent local
    cache retrieved again; at this time the "Chassis_Private" register
    won't have any related "Chassis" register.

    Closes-Bug: #1951149

    Change-Id: I17aa53cea6aba8ea83187c99102a6f25fd33cfff
    (cherry picked from commit f1a5511e9094d6aae7a61cd858c59c706033ca95)

tags: added: in-stable-xena
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/839024
Committed: https://opendev.org/openstack/neutron/commit/29998538cec9a696b01d84bd9573f2ebbbb15f90
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 29998538cec9a696b01d84bd9573f2ebbbb15f90
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Wed Nov 17 17:33:29 2021 +0000

    [OVN] Handle OVN agents when "Chassis" register is deleted

    If an "ovn-controller" does not end gracefully, the node "Chassis" and
    "Chassis_Private" registers will remain in the OVN SB database.
    Because there is no mandatory procedure to delete the "Chassis"
    and "Chassis_Private" registers, the administrator can manually
    delete, from the OVN SB database, any register in any order.

    If the "Chassis" register is deleted and the Neutron server restarted,
    the updated "Chassis_Private" register will be read from the database.
    That won't contain the "Chassis" information as this register has been
    deleted. In this case, the ``NeutronAgent`` returns ``DeletedChassis``,
    an empty chassis register with no information.

    NOTE: the sequence of actions ("Chassis" register deletion, Neutron
    server restart) must be followed to reproduce this issue. If the
    "Chassis" register is deleted, the Neutron server OVN agent local cache
    won't update the stored information and will keep the previous value.
    Only when the Neutron server is restarted is the OVN agent local
    cache retrieved again; at this time the "Chassis_Private" register
    won't have any related "Chassis" register.

    Closes-Bug: #1951149

    Conflicts:
      neutron/tests/functional/plugins/ml2/drivers/ovn/mech_driver/test_mech_driver.py

    Change-Id: I17aa53cea6aba8ea83187c99102a6f25fd33cfff
    (cherry picked from commit f1a5511e9094d6aae7a61cd858c59c706033ca95)

tags: added: in-stable-wallaby
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/victoria)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/839027
Committed: https://opendev.org/openstack/neutron/commit/494cd8ca5dc924dbaa3b3a185110371dfe2e59c1
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit 494cd8ca5dc924dbaa3b3a185110371dfe2e59c1
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Wed Nov 17 17:33:29 2021 +0000

    [OVN] Handle OVN agents when "Chassis" register is deleted

    If an "ovn-controller" does not end gracefully, the node "Chassis" and
    "Chassis_Private" registers will remain in the OVN SB database.
    Because there is no mandatory procedure to delete the "Chassis"
    and "Chassis_Private" registers, the administrator can manually
    delete, from the OVN SB database, any register in any order.

    If the "Chassis" register is deleted and the Neutron server restarted,
    the updated "Chassis_Private" register will be read from the database.
    That won't contain the "Chassis" information as this register has been
    deleted. In this case, the ``NeutronAgent`` returns ``DeletedChassis``,
    an empty chassis register with no information.

    NOTE: the sequence of actions ("Chassis" register deletion, Neutron
    server restart) must be followed to reproduce this issue. If the
    "Chassis" register is deleted, the Neutron server OVN agent local cache
    won't update the stored information and will keep the previous value.
    Only when the Neutron server is restarted is the OVN agent local
    cache retrieved again; at this time the "Chassis_Private" register
    won't have any related "Chassis" register.

    Closes-Bug: #1951149

    Conflicts:
      neutron/tests/functional/plugins/ml2/drivers/ovn/mech_driver/test_mech_driver.py

    Change-Id: I17aa53cea6aba8ea83187c99102a6f25fd33cfff
    (cherry picked from commit f1a5511e9094d6aae7a61cd858c59c706033ca95)
    (cherry picked from commit 29998538cec9a696b01d84bd9573f2ebbbb15f90)

tags: added: in-stable-victoria
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 18.4.0

This issue was fixed in the openstack/neutron 18.4.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 19.3.0

This issue was fixed in the openstack/neutron 19.3.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 20.1.0

This issue was fixed in the openstack/neutron 20.1.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 21.0.0.0rc1

This issue was fixed in the openstack/neutron 21.0.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/neutron/+/907239

OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (stable/ussuri)

Change abandoned by "Elod Illes <email address hidden>" on branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/neutron/+/907239
Reason: stable/ussuri branch of openstack/neutron is about to be deleted. To be able to do that, all open patches need to be abandoned.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron victoria-eom

This issue was fixed in the openstack/neutron victoria-eom release.
