Timeout while getting bridge datapath id crashes ovs agent

Bug #1837380 reported by Slawek Kaplonski
Affects: neutron
Status: Fix Released
Importance: High
Assigned to: Slawek Kaplonski
Milestone: (none)

Bug Description

When a bridge is recreated, for example, the ovs-agent has to reconfigure it. If a timeout occurs while getting the bridge's datapath id during that reconfiguration, the whole ovs agent crashes.

This causes fullstack test failures, e.g.: http://logs.openstack.org/81/671881/1/check/neutron-fullstack/c6b2e08/testr_results.html.gz
See the agent's logs: http://logs.openstack.org/81/671881/1/check/neutron-fullstack/c6b2e08/controller/logs/dsvm-fullstack-logs/TestLegacyL3Agent.test_north_south_traffic/neutron-openvswitch-agent--2019-07-22--07-48-17-892074_log.txt.gz#_2019-07-22_07_49_28_698
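
For illustration only, a timeout like this usually comes from a helper that polls for a value and gives up after a deadline. The sketch below is not neutron's code: `get_datapath_id`, `read_dpid`, and the timeout values are hypothetical names picked for the example. It only shows how such a lookup can end in a RuntimeError when the bridge is not ready in time.

```python
import time


def get_datapath_id(read_dpid, timeout=1.0, interval=0.1):
    """Poll read_dpid() until it returns a value or the timeout expires.

    read_dpid is a hypothetical callable that returns the bridge's
    datapath id, or None while the bridge is not ready yet.
    """
    deadline = time.monotonic() + timeout
    while True:
        dpid = read_dpid()
        if dpid is not None:
            return dpid
        if time.monotonic() >= deadline:
            # Nothing catches this here, so it propagates to the caller.
            raise RuntimeError("timed out waiting for bridge datapath_id")
        time.sleep(interval)
```

If the caller is the agent's main loop and nothing there handles the exception, the process exits, which matches the crash described above.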

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/672018

Changed in neutron:
assignee: nobody → Slawek Kaplonski (slaweq)
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/672018
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=b63809715a0b9b8ea0f354dfd6f1b3fd7a713352
Submitter: Zuul
Branch: master

commit b63809715a0b9b8ea0f354dfd6f1b3fd7a713352
Author: Slawek Kaplonski <email address hidden>
Date: Mon Jul 22 13:29:28 2019 +0200

    Don't crash ovs agent during reconfigure of phys bridges

    In case when physical bridge was recreated on host, ovs agent
    is trying to reconfigure it.
    If there will be e.g. timeout while getting bridge's datapath_id,
    RuntimeError will be raised and that caused crash of whole agent.

    This patch changes that to not crash agent in such case but try to
    reconfigure everything in next rpc_loop iteration once again.

    Change-Id: Ic9b17a420068c0c76748e2c24d97be1ed7c460c7
    Closes-Bug: #1837380

Changed in neutron:
status: In Progress → Fix Released
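
As a rough sketch of the approach described in the commit message above (catch the RuntimeError and retry on the next loop iteration instead of crashing), the pattern might look like the following. This is not the actual patch: the `Agent` class, `setup_bridge`, `bridge_recreated`, and `loop_iteration` are hypothetical names used only to illustrate the idea.

```python
class Agent:
    """Toy stand-in for an agent main loop; all names are hypothetical."""

    def __init__(self, setup_bridge):
        # setup_bridge(name) reconfigures one bridge and may raise
        # RuntimeError, e.g. on a datapath id timeout.
        self.setup_bridge = setup_bridge
        self.pending = set()  # bridges that still need reconfiguration

    def bridge_recreated(self, name):
        self.pending.add(name)

    def loop_iteration(self):
        """Run one pass; return the bridges that still need a retry."""
        for name in sorted(self.pending):
            try:
                self.setup_bridge(name)
            except RuntimeError as exc:
                # Do not crash: leave the bridge pending for the next pass.
                print(f"reconfigure of {name} failed ({exc}); will retry")
            else:
                self.pending.discard(name)
        return set(self.pending)
```

With a `setup_bridge` that fails once and then succeeds, the first `loop_iteration()` leaves the bridge pending and the second one clears it, so a transient timeout costs one extra iteration rather than the whole process.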
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/673145

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.opendev.org/673161

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.opendev.org/673171

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/stein)

Reviewed: https://review.opendev.org/673145
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=8d8f66eddd43658e59de157dfcbd0f9dc9c716c6
Submitter: Zuul
Branch: stable/stein

commit 8d8f66eddd43658e59de157dfcbd0f9dc9c716c6
Author: Slawek Kaplonski <email address hidden>
Date: Mon Jul 22 13:29:28 2019 +0200

    Don't crash ovs agent during reconfigure of phys bridges

    In case when physical bridge was recreated on host, ovs agent
    is trying to reconfigure it.
    If there will be e.g. timeout while getting bridge's datapath_id,
    RuntimeError will be raised and that caused crash of whole agent.

    This patch changes that to not crash agent in such case but try to
    reconfigure everything in next rpc_loop iteration once again.

    Change-Id: Ic9b17a420068c0c76748e2c24d97be1ed7c460c7
    Closes-Bug: #1837380
    (cherry picked from commit b63809715a0b9b8ea0f354dfd6f1b3fd7a713352)

tags: added: in-stable-stein
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/queens)

Reviewed: https://review.opendev.org/673171
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=fa2863be864f75cd4a11584bcbeeb761a5380883
Submitter: Zuul
Branch: stable/queens

commit fa2863be864f75cd4a11584bcbeeb761a5380883
Author: Slawek Kaplonski <email address hidden>
Date: Mon Jul 22 13:29:28 2019 +0200

    Don't crash ovs agent during reconfigure of phys bridges

    In case when physical bridge was recreated on host, ovs agent
    is trying to reconfigure it.
    If there will be e.g. timeout while getting bridge's datapath_id,
    RuntimeError will be raised and that caused crash of whole agent.

    This patch changes that to not crash agent in such case but try to
    reconfigure everything in next rpc_loop iteration once again.

    Conflicts:
        neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py
        neutron/tests/unit/plugins/ml2/drivers/openvswitch/agent/test_ovs_neutron_agent.py

    Change-Id: Ic9b17a420068c0c76748e2c24d97be1ed7c460c7
    Closes-Bug: #1837380
    (cherry picked from commit b63809715a0b9b8ea0f354dfd6f1b3fd7a713352)

tags: added: in-stable-queens
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/rocky)

Reviewed: https://review.opendev.org/673161
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=41db09c434281977fb02196278613b1284990525
Submitter: Zuul
Branch: stable/rocky

commit 41db09c434281977fb02196278613b1284990525
Author: Slawek Kaplonski <email address hidden>
Date: Mon Jul 22 13:29:28 2019 +0200

    Don't crash ovs agent during reconfigure of phys bridges

    In case when physical bridge was recreated on host, ovs agent
    is trying to reconfigure it.
    If there will be e.g. timeout while getting bridge's datapath_id,
    RuntimeError will be raised and that caused crash of whole agent.

    This patch changes that to not crash agent in such case but try to
    reconfigure everything in next rpc_loop iteration once again.

    Conflicts:
        neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py
        neutron/tests/unit/plugins/ml2/drivers/openvswitch/agent/test_ovs_neutron_agent.py

    Change-Id: Ic9b17a420068c0c76748e2c24d97be1ed7c460c7
    Closes-Bug: #1837380
    (cherry picked from commit b63809715a0b9b8ea0f354dfd6f1b3fd7a713352)

tags: added: in-stable-rocky
tags: added: neutron-proactive-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 15.0.0.0b1

This issue was fixed in the openstack/neutron 15.0.0.0b1 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 14.0.3

This issue was fixed in the openstack/neutron 14.0.3 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 13.0.5

This issue was fixed in the openstack/neutron 13.0.5 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 12.1.1

This issue was fixed in the openstack/neutron 12.1.1 release.

tags: removed: neutron-proactive-backport-potential