After restart of an ovn-controller the agent is still down

Bug #1997982 reported by Felix Huettner
This bug affects 1 person
Affects: neutron
Status: Fix Released
Importance: Undecided
Assigned to: Felix Huettner

Bug Description

Assume a neutron setup with the ml2 ovn plugin.
Further assume that no changes are made through the user API for the duration of this issue, so that nb_cfg at the start of the issue is equal to nb_cfg at the end of the issue:

1. Take any ovn-controller that you have and run `openstack network agent show` on its agent; this should report "up" and a valid "heartbeat_timestamp"
2. Restart the ovn-controller
3. The openstack output should now say down, with the Unix timestamp 0 as heartbeat
4. Make any change that causes nb_cfg to increase
5. The agent is now up with a proper timestamp

Issue is caused by https://opendev.org/openstack/neutron/src/commit/0384b3193b11eb6cc849c4511d2e539d42b6d3f9/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py#L339

In step 2 the southbound database will emit two events:
1. When the ovn-controller first starts: the addition of a Chassis_Private row where nb_cfg and nb_cfg_timestamp are 0
2. When the ovn-controller has finished syncing: an update that sets nb_cfg to the value in SB_GLOBAL and nb_cfg_timestamp to the current timestamp

However, the second event is currently filtered out by the `match_fn`, because `old.nb_cfg` is `0` at this point. In the condition, `0` evaluates to `False`, so the event is ignored.
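
The filtering can be reproduced in isolation. The following is a minimal sketch, not the neutron code: the function name `nb_cfg_looks_updated` and the `SimpleNamespace` stand-ins for the `old` row are hypothetical, and only the `getattr(old, 'nb_cfg', False)` pattern is modeled. It assumes that `old` carries only the columns that changed in the update.

```python
from types import SimpleNamespace


def nb_cfg_looks_updated(old):
    """Truthiness check in the style of getattr(old, 'nb_cfg', False)."""
    return bool(getattr(old, "nb_cfg", False))


# Hypothetical stand-ins for the `old` row delivered with an update event.
old_without_nb_cfg = SimpleNamespace()          # nb_cfg did not change
old_regular_update = SimpleNamespace(nb_cfg=7)  # nb_cfg changed from 7
old_after_restart = SimpleNamespace(nb_cfg=0)   # nb_cfg changed from 0

results = {
    "without_nb_cfg": nb_cfg_looks_updated(old_without_nb_cfg),
    "regular_update": nb_cfg_looks_updated(old_regular_update),
    "after_restart": nb_cfg_looks_updated(old_after_restart),
}
print(results)
```

The update following a restart is treated exactly like an update that never touched nb_cfg, which is why the agent stays down until nb_cfg changes again from a non-zero value.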

This issue might be the same as https://bugs.launchpad.net/neutron/+bug/1955503

Changed in neutron:
assignee: nobody → Felix Huettner (felix.huettner)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/865697

Changed in neutron:
status: New → In Progress
tags: added: ovn
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/865697
Committed: https://opendev.org/openstack/neutron/commit/4cc611d319d0afe1ee04df6e4419014f1133df09
Submitter: "Zuul (22348)"
Branch: master

commit 4cc611d319d0afe1ee04df6e4419014f1133df09
Author: Felix Huettner <email address hidden>
Date: Fri Nov 25 16:39:31 2022 +0100

    Fix handling the restart of ovn-controllers

    The previous `getattr(old, 'nb_cfg', False)` would evaluate to `False`
    if the `old` row either did not contain a `nb_cfg` value or if the value
    was 0.

    As 0 is the value set on startup of the ovn-controller this causes the
    neutron-api to ignore any event a ovn-controller directly sends after
    startup. In turn this causes us to miss the information that the agent
    is synchronized, causing the agent to appear as down, until something
    bumps the `nb_cfg` value globally.

    Closes-Bug: #1997982

    Change-Id: Icec8fee93e64b871999f38674e305238e9705fd4
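
One way to express the idea from the commit message above is to test for the presence of `nb_cfg` on the `old` row instead of its truthiness. This is a sketch under assumptions, not the merged diff: the names `nb_cfg_changed` and `should_handle` are hypothetical, the event is modeled as a plain string, and the other conditions of the real `match_fn` are left out.

```python
from types import SimpleNamespace


def nb_cfg_changed(old):
    """Presence check: True whenever the old row carries nb_cfg, even 0."""
    return hasattr(old, "nb_cfg")


def should_handle(event, old=None):
    """Hypothetical, simplified predicate: accept creates and nb_cfg updates."""
    return event == "create" or nb_cfg_changed(old)


# Update sent right after a restart: the previous nb_cfg was 0.
first_sync = should_handle("update", SimpleNamespace(nb_cfg=0))
# Update that did not touch nb_cfg at all.
unrelated = should_handle("update", SimpleNamespace())
print(first_sync, unrelated)
```

With a presence check, the first synchronization after a restart is handled, while updates that leave nb_cfg untouched are still skipped.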

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/zed)

Fix proposed to branch: stable/zed
Review: https://review.opendev.org/c/openstack/neutron/+/868181

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/yoga)

Fix proposed to branch: stable/yoga
Review: https://review.opendev.org/c/openstack/neutron/+/868182

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/xena)

Fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/neutron/+/868263

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/neutron/+/868264

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/zed)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/868181
Committed: https://opendev.org/openstack/neutron/commit/99b7acc505598971fb5086c4217e0cd059f22b51
Submitter: "Zuul (22348)"
Branch: stable/zed

commit 99b7acc505598971fb5086c4217e0cd059f22b51
Author: Felix Huettner <email address hidden>
Date: Fri Nov 25 16:39:31 2022 +0100

    Fix handling the restart of ovn-controllers

    The previous `getattr(old, 'nb_cfg', False)` would evaluate to `False`
    if the `old` row either did not contain a `nb_cfg` value or if the value
    was 0.

    As 0 is the value set on startup of the ovn-controller this causes the
    neutron-api to ignore any event a ovn-controller directly sends after
    startup. In turn this causes us to miss the information that the agent
    is synchronized, causing the agent to appear as down, until something
    bumps the `nb_cfg` value globally.

    Closes-Bug: #1997982

    Change-Id: Icec8fee93e64b871999f38674e305238e9705fd4
    (cherry picked from commit 4cc611d319d0afe1ee04df6e4419014f1133df09)

tags: added: in-stable-zed
tags: added: in-stable-yoga
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/yoga)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/868182
Committed: https://opendev.org/openstack/neutron/commit/0d3fe4f7a2b241544623b90f86c9344cb9cfc91d
Submitter: "Zuul (22348)"
Branch: stable/yoga

commit 0d3fe4f7a2b241544623b90f86c9344cb9cfc91d
Author: Felix Huettner <email address hidden>
Date: Fri Nov 25 16:39:31 2022 +0100

    Fix handling the restart of ovn-controllers

    The previous `getattr(old, 'nb_cfg', False)` would evaluate to `False`
    if the `old` row either did not contain a `nb_cfg` value or if the value
    was 0.

    As 0 is the value set on startup of the ovn-controller this causes the
    neutron-api to ignore any event a ovn-controller directly sends after
    startup. In turn this causes us to miss the information that the agent
    is synchronized, causing the agent to appear as down, until something
    bumps the `nb_cfg` value globally.

    Closes-Bug: #1997982

    Change-Id: Icec8fee93e64b871999f38674e305238e9705fd4
    (cherry picked from commit 4cc611d319d0afe1ee04df6e4419014f1133df09)

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/868263
Committed: https://opendev.org/openstack/neutron/commit/70f947e05249917992c6b31b50b627f725b25a72
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 70f947e05249917992c6b31b50b627f725b25a72
Author: Felix Huettner <email address hidden>
Date: Fri Nov 25 16:39:31 2022 +0100

    Fix handling the restart of ovn-controllers

    The previous `getattr(old, 'nb_cfg', False)` would evaluate to `False`
    if the `old` row either did not contain a `nb_cfg` value or if the value
    was 0.

    As 0 is the value set on startup of the ovn-controller this causes the
    neutron-api to ignore any event a ovn-controller directly sends after
    startup. In turn this causes us to miss the information that the agent
    is synchronized, causing the agent to appear as down, until something
    bumps the `nb_cfg` value globally.

    Closes-Bug: #1997982

    Conflicts:
        neutron/tests/functional/plugins/ml2/drivers/ovn/mech_driver/ovsdb/test_ovsdb_monitor.py

    Change-Id: Icec8fee93e64b871999f38674e305238e9705fd4
    (cherry picked from commit 4cc611d319d0afe1ee04df6e4419014f1133df09)

tags: added: in-stable-xena
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/868264
Committed: https://opendev.org/openstack/neutron/commit/0f1811afa57f4959d6454865a72109f63424b250
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 0f1811afa57f4959d6454865a72109f63424b250
Author: Felix Huettner <email address hidden>
Date: Fri Nov 25 16:39:31 2022 +0100

    Fix handling the restart of ovn-controllers

    The previous `getattr(old, 'nb_cfg', False)` would evaluate to `False`
    if the `old` row either did not contain a `nb_cfg` value or if the value
    was 0.

    As 0 is the value set on startup of the ovn-controller this causes the
    neutron-api to ignore any event a ovn-controller directly sends after
    startup. In turn this causes us to miss the information that the agent
    is synchronized, causing the agent to appear as down, until something
    bumps the `nb_cfg` value globally.

    Closes-Bug: #1997982

    Conflicts:
        neutron/tests/functional/plugins/ml2/drivers/ovn/mech_driver/ovsdb/test_ovsdb_monitor.py

    Change-Id: Icec8fee93e64b871999f38674e305238e9705fd4
    (cherry picked from commit 4cc611d319d0afe1ee04df6e4419014f1133df09)
    (cherry picked from commit 61c6adf5fed415ae82dbe15e38015f7ad5c9f3e8)

tags: added: in-stable-wallaby
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 19.5.0

This issue was fixed in the openstack/neutron 19.5.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 22.0.0.0rc1

This issue was fixed in the openstack/neutron 22.0.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 20.3.0

This issue was fixed in the openstack/neutron 20.3.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 21.1.0

This issue was fixed in the openstack/neutron 21.1.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron wallaby-eom

This issue was fixed in the openstack/neutron wallaby-eom release.
