OVS agent hangs on rpc calls if neutron-server is down and ovs-agent received SIGTERM

Bug #1408334 reported by Jakub Libosvar
This bug affects 1 person
Affects   Status         Importance   Assigned to      Milestone
neutron   Fix Released   Medium       Jakub Libosvar
Juno      Fix Released   Medium       Jakub Libosvar

Bug Description

The OVS agent runs an infinite loop controlled by a single variable. If the OVS agent receives SIGTERM while the loop is running, it must wait until execution reaches the check of the loop control variable. If neutron-server is down at the same time, the agent still uses rpc call() methods and waits for responses from neutron-server, so several rpc timeouts must elapse before the OVS agent quits. If the whole exit process takes more than 90 seconds, systemd by default sends SIGKILL to the ovs-agent process, which means ovs-agent does not exit with exit code 0. RPC calls are not necessary once we know the agent is going to shut down.
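
To make the timing concrete, here is a minimal sketch of the behaviour described above. It is not the agent's real code: the class, method, and attribute names are hypothetical, the 60-second rpc timeout and three calls per iteration are assumed values, and the blocked time is only accumulated in a counter instead of actually sleeping.

```python
import signal


class FakeRpcTimeout(Exception):
    """Hypothetical stand-in for an rpc messaging timeout."""


class SketchAgent:
    """Toy model of an agent loop driven by one control variable."""

    def __init__(self, rpc_timeout=60, calls_per_iteration=3):
        self.run_daemon_loop = True          # the loop control variable
        self.rpc_timeout = rpc_timeout       # assumed value, in seconds
        self.calls_per_iteration = calls_per_iteration  # assumed value
        self.seconds_blocked = 0             # simulated time spent waiting

    def handle_sigterm(self, signum, frame):
        # Only flips the flag; the running iteration is not interrupted.
        self.run_daemon_loop = False

    def rpc_call(self):
        # neutron-server is down: the call blocks for the full timeout
        # (simulated by the counter) and then fails.
        self.seconds_blocked += self.rpc_timeout
        raise FakeRpcTimeout()

    def loop(self, on_iteration_start=None):
        while self.run_daemon_loop:
            if on_iteration_start is not None:
                on_iteration_start()
            for _ in range(self.calls_per_iteration):
                try:
                    self.rpc_call()
                except FakeRpcTimeout:
                    pass  # keep going, as the remaining calls still run


agent = SketchAgent()
# Simulate SIGTERM arriving right at the start of an iteration.
agent.loop(
    on_iteration_start=lambda: agent.handle_sigterm(signal.SIGTERM, None)
)
print(agent.seconds_blocked)
```

Under these assumed numbers the agent spends 3 x 60 = 180 simulated seconds waiting after SIGTERM before the loop condition is checked again, which is well past the 90 seconds after which systemd sends SIGKILL by default.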

Changed in neutron:
assignee: nobody → Jakub Libosvar (libosvar)
status: New → Confirmed
Changed in neutron:
importance: Undecided → Medium
Changed in neutron:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/145529
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=d3af7b0d2ebb2dd4da7e4a620eca8f554c124ec6
Submitter: Jenkins
Branch: master

commit d3af7b0d2ebb2dd4da7e4a620eca8f554c124ec6
Author: Jakub Libosvar <email address hidden>
Date: Fri Jan 30 18:30:22 2015 +0100

    Decrease rpc timeout after agent receives SIGTERM

    The patch sets different timeout to rpc api objects in OVS agent after
    SIGTERM is received. Given timeout is configurable. This action prevents
    long waiting for rpc call() methods to timeout and decreases amount of time
    needed to successfully stopping OVS agent.

    DocImpact
    Change-Id: I3026775e813a74bad9e0bca3be1f535212a2e417
    Closes-Bug: 1408334
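
The idea in the commit message can be sketched as follows. This is an illustration only, not the merged patch: the class names, the `timeout` attribute on the rpc api objects, and the `quitting_rpc_timeout` parameter with its value of 10 seconds are all assumptions made for the example.

```python
import signal


class RpcApiStub:
    """Hypothetical rpc api object that carries a per-call timeout."""

    def __init__(self, timeout):
        self.timeout = timeout


class SketchAgent:
    """Toy agent that shortens rpc timeouts once SIGTERM is received."""

    def __init__(self, rpc_apis, quitting_rpc_timeout=10):
        self.run_daemon_loop = True
        self.rpc_apis = list(rpc_apis)
        # Assumed configurable value, in seconds; 0 means "leave as is".
        self.quitting_rpc_timeout = quitting_rpc_timeout

    def set_rpc_timeout(self, timeout):
        for api in self.rpc_apis:
            api.timeout = timeout

    def handle_sigterm(self, signum, frame):
        # Stop the loop and make any remaining rpc calls fail fast.
        self.run_daemon_loop = False
        if self.quitting_rpc_timeout:
            self.set_rpc_timeout(self.quitting_rpc_timeout)


apis = [RpcApiStub(60), RpcApiStub(60)]
agent = SketchAgent(apis)
# A real process would register the handler with
# signal.signal(signal.SIGTERM, agent.handle_sigterm); it is called
# directly here so the example does not depend on signal delivery.
agent.handle_sigterm(signal.SIGTERM, None)
print([api.timeout for api in apis])
```

With three pending calls, as an example, the shutdown wait in this sketch drops from 3 x 60 = 180 seconds to 3 x 10 = 30 seconds, which stays under the 90 seconds mentioned in the bug description.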

Changed in neutron:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/juno)

Fix proposed to branch: stable/juno
Review: https://review.openstack.org/153150

Thierry Carrez (ttx)
Changed in neutron:
milestone: none → kilo-2
status: Fix Committed → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/juno)

Reviewed: https://review.openstack.org/153150
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=7cd3d0c1b76dff9151f5cee34ea3e313c1b19ecd
Submitter: Jenkins
Branch: stable/juno

commit 7cd3d0c1b76dff9151f5cee34ea3e313c1b19ecd
Author: Jakub Libosvar <email address hidden>
Date: Fri Jan 30 18:30:22 2015 +0100

    Decrease rpc timeout after agent receives SIGTERM

    The patch sets different timeout to rpc api objects in OVS agent after
    SIGTERM is received. Given timeout is configurable. This action prevents
    long waiting for rpc call() methods to timeout and decreases amount of time
    needed to successfully stopping OVS agent.

    DocImpact
    Closes-Bug: 1408334

    Change-Id: I3026775e813a74bad9e0bca3be1f535212a2e417
    (cherry picked from commit d3af7b0d2ebb2dd4da7e4a620eca8f554c124ec6)

tags: added: in-stable-juno
Thierry Carrez (ttx)
Changed in neutron:
milestone: kilo-2 → 2015.1.0