[ovn] Agents' alive status is wrong after restarting neutron server

Bug #1938478 reported by ZhouHeng
This bug affects 2 people
Affects: neutron
Status: In Progress
Importance: Medium
Assigned to: ZhouHeng
Milestone: (none listed)

Bug Description

I have 3 ovn-controller nodes, all running normally.
Running 'openstack network agent list' shows 3 agents alive.

I then simulate a node failure by stopping 1 ovn-controller. A minute later, the agent list shows that node as down, which is expected.

If I restart neutron-server at this point and list the agents again, all 3 agents are shown as alive. This appears to be a bug.
A minute later, the agent list shows the stopped node as down again, which is the expected state.
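
The timeline above can be reproduced with a small model. This is only a sketch of one plausible reading of the symptom, not neutron's actual code: it assumes the server keeps each agent's last-update time in process memory and falls back to its own start time for agents it has not heard from since starting. The class name, the 60-second `AGENT_DOWN_TIME` and the timestamps are all hypothetical.

```python
from dataclasses import dataclass, field

AGENT_DOWN_TIME = 60  # seconds; hypothetical timeout used for this sketch


@dataclass
class InMemoryAgentTracker:
    """Hypothetical tracker that only knows what this process has seen."""
    started_at: float
    last_seen: dict = field(default_factory=dict)

    def heartbeat(self, chassis, now):
        # Record the time at which this process observed an update.
        self.last_seen[chassis] = now

    def is_alive(self, chassis, now):
        # Assumption: a chassis not yet seen by this process is treated as
        # if it had been updated when the process started.
        seen = self.last_seen.get(chassis, self.started_at)
        return now - seen < AGENT_DOWN_TIME


def simulate():
    tracker = InMemoryAgentTracker(started_at=0)
    for chassis in ("node1", "node2", "node3"):
        tracker.heartbeat(chassis, now=100)

    # node3's ovn-controller is stopped after t=100; the others keep updating.
    tracker.heartbeat("node1", now=165)
    tracker.heartbeat("node2", now=165)
    before_restart = tracker.is_alive("node3", now=170)

    # The server restarts at t=170 and loses its in-memory state.
    tracker = InMemoryAgentTracker(started_at=170)
    right_after_restart = tracker.is_alive("node3", now=171)

    # One timeout period later the stopped node is reported down again.
    tracker.heartbeat("node1", now=230)
    tracker.heartbeat("node2", now=230)
    a_minute_later = tracker.is_alive("node3", now=235)
    return before_restart, right_after_restart, a_minute_later
```

Under these assumptions `simulate()` returns `(False, True, False)`, which matches the down, alive, down sequence described in the report.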

Changed in neutron:
status: New → In Progress
Changed in neutron:
importance: Undecided → High
importance: High → Medium
tags: added: ovn
Slawek Kaplonski (slaweq) wrote :

The bug's status is "In Progress" but nobody is assigned to it. If someone is working on it, please assign it to yourself. If a related patch has already been proposed, please add a link to it in a comment on this bug.

ZhouHeng (zhouhenglc)
Changed in neutron:
assignee: nobody → ZhouHeng (zhouhenglc)
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by "Slawek Kaplonski <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/802834
Reason: This review is > 4 weeks without comment, and failed Zuul jobs the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/802834
Committed: https://opendev.org/openstack/neutron/commit/9e263dcf00b7ddc102a54f5e681bcf09931f3a72
Submitter: "Zuul (22348)"
Branch: master

commit 9e263dcf00b7ddc102a54f5e681bcf09931f3a72
Author: zhouhenglc <email address hidden>
Date: Thu Jul 29 15:49:39 2021 +0800

    [ovn]support read chassis update time from nb_cfg_timestamp

    nb_cfg_timestamp: The timestamp when ovn-controller finishes
    processing the change corresponding to nb_cfg[1]. it can better
    reflect the status of chassis.

    This patch updated some unit tests. ensure mock 'time.time' is
    stopped after test. if not stop, may affect "timeutils.utcnow_ts"
    to obtain the real time, cause test case
    'test_agent_with_nb_cfg_timestamp_not_timeout' failure.

    Partial-bug: #1938478
    [1] https://www.ovn.org/support/dist-docs/ovn-sb.5.html

    Change-Id: Ia74a9404411862dc88b48c4a198d5c53f5f52704
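
The idea in the commit message can be illustrated with a short sketch. It is not the merged patch: the function names and the 60-second timeout are hypothetical, and it assumes that `nb_cfg_timestamp` is expressed in milliseconds since the epoch. Because the timestamp is read from the database rather than kept in the server's memory, the result does not depend on when the server was started. The second function shows one way to keep a `time.time` mock from leaking into later tests, which is the unit-test concern the commit message raises.

```python
import time
from unittest import mock

AGENT_DOWN_TIME = 60  # seconds; hypothetical timeout used for this sketch


def chassis_is_alive(nb_cfg_timestamp_ms, now=None, down_time=AGENT_DOWN_TIME):
    """Decide liveness from a persisted timestamp (assumed to be in ms)."""
    if now is None:
        now = time.time()
    age_seconds = now - nb_cfg_timestamp_ms / 1000.0
    return age_seconds < down_time


def check_with_frozen_clock():
    # The context manager stops the patch on exit, so the mocked clock
    # cannot affect code that runs afterwards.
    with mock.patch("time.time", return_value=1000.0):
        fresh = chassis_is_alive(990_000)  # updated 10 s before "now"
        stale = chassis_is_alive(900_000)  # updated 100 s before "now"
    clock_restored = not isinstance(time.time, mock.Mock)
    return fresh, stale, clock_restored
```

In a `unittest.TestCase`, starting the patcher with `patcher.start()` and registering `self.addCleanup(patcher.stop)` achieves the same cleanup as the `with` block used here.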

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/yoga)

Fix proposed to branch: stable/yoga
Review: https://review.opendev.org/c/openstack/neutron/+/848131

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/yoga)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/848131
Committed: https://opendev.org/openstack/neutron/commit/cfc0678caf39a5ec002fc56a878275a94dc11b94
Submitter: "Zuul (22348)"
Branch: stable/yoga

commit cfc0678caf39a5ec002fc56a878275a94dc11b94
Author: zhouhenglc <email address hidden>
Date: Thu Jul 29 15:49:39 2021 +0800

    [ovn]support read chassis update time from nb_cfg_timestamp

    nb_cfg_timestamp: The timestamp when ovn-controller finishes
    processing the change corresponding to nb_cfg[1]. it can better
    reflect the status of chassis.

    This patch updated some unit tests. ensure mock 'time.time' is
    stopped after test. if not stop, may affect "timeutils.utcnow_ts"
    to obtain the real time, cause test case
    'test_agent_with_nb_cfg_timestamp_not_timeout' failure.

    Partial-bug: #1938478
    [1] https://www.ovn.org/support/dist-docs/ovn-sb.5.html

    Conflicts:
      neutron/plugins/ml2/drivers/ovn/agent/neutron_agent.py
      neutron/tests/unit/plugins/ml2/drivers/ovn/mech_driver/test_mech_driver.py

    Change-Id: Ia74a9404411862dc88b48c4a198d5c53f5f52704
    (cherry picked from commit 9e263dcf00b7ddc102a54f5e681bcf09931f3a72)

tags: added: in-stable-yoga
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/xena)

Fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/neutron/+/868498

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/neutron/+/869349

Brian Haley (brian-haley) wrote :

So did the patch here fix the issue? It was only tagged as Partial-bug. Just trying to close old bugs.

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/868498
Committed: https://opendev.org/openstack/neutron/commit/538712635cdccb5d121ead816018ca6efcc78f56
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 538712635cdccb5d121ead816018ca6efcc78f56
Author: zhouhenglc <email address hidden>
Date: Thu Jul 29 15:49:39 2021 +0800

    [ovn]support read chassis update time from nb_cfg_timestamp

    nb_cfg_timestamp: The timestamp when ovn-controller finishes
    processing the change corresponding to nb_cfg[1]. it can better
    reflect the status of chassis.

    This patch updated some unit tests. ensure mock 'time.time' is
    stopped after test. if not stop, may affect "timeutils.utcnow_ts"
    to obtain the real time, cause test case
    'test_agent_with_nb_cfg_timestamp_not_timeout' failure.

    Partial-bug: #1938478
    [1] https://www.ovn.org/support/dist-docs/ovn-sb.5.html

    Conflicts:
      neutron/plugins/ml2/drivers/ovn/agent/neutron_agent.py
      neutron/tests/unit/plugins/ml2/drivers/ovn/mech_driver/test_mech_driver.py

    Change-Id: Ia74a9404411862dc88b48c4a198d5c53f5f52704
    (cherry picked from commit 9e263dcf00b7ddc102a54f5e681bcf09931f3a72)
    (cherry picked from commit cfc0678caf39a5ec002fc56a878275a94dc11b94)

tags: added: in-stable-xena
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/869349
Committed: https://opendev.org/openstack/neutron/commit/d34afc5c13b0251c51f8c493b331900a46737512
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit d34afc5c13b0251c51f8c493b331900a46737512
Author: zhouhenglc <email address hidden>
Date: Thu Jul 29 15:49:39 2021 +0800

    [ovn]support read chassis update time from nb_cfg_timestamp

    nb_cfg_timestamp: The timestamp when ovn-controller finishes
    processing the change corresponding to nb_cfg[1]. it can better
    reflect the status of chassis.

    This patch updated some unit tests. ensure mock 'time.time' is
    stopped after test. if not stop, may affect "timeutils.utcnow_ts"
    to obtain the real time, cause test case
    'test_agent_with_nb_cfg_timestamp_not_timeout' failure.

    Partial-bug: #1938478
    [1] https://www.ovn.org/support/dist-docs/ovn-sb.5.html

    Conflicts:
      neutron/plugins/ml2/drivers/ovn/agent/neutron_agent.py
      neutron/tests/unit/plugins/ml2/drivers/ovn/mech_driver/test_mech_driver.py
      neutron/tests/unit/agent/l2/extensions/dhcp/test_ipv6.py

    Change-Id: Ia74a9404411862dc88b48c4a198d5c53f5f52704
    (cherry picked from commit 9e263dcf00b7ddc102a54f5e681bcf09931f3a72)
    (cherry picked from commit cfc0678caf39a5ec002fc56a878275a94dc11b94)
    (cherry picked from commit 538712635cdccb5d121ead816018ca6efcc78f56)

tags: added: in-stable-wallaby