os.kill(SIGTERM) does not finish and times out

Bug #1921154 reported by Rodolfo Alonso
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
neutron | Fix Released | Low | Rodolfo Alonso |
Changed in neutron:
assignee: nobody → Rodolfo Alonso (rodolfo-alonso-hernandez)
Lajos Katona (lajos-katona) wrote :
Lajos Katona (lajos-katona) wrote :
Changed in neutron:
importance: Undecided → Critical
status: New → Confirmed
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello:

I'm developing a possible solution in oslo.privsep. [1] implements a privsep decorator with a timeout. When the client side times out, the method raises a timeout exception that Neutron can catch.
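A minimal sketch of the idea of a decorator with a client-side timeout. This is not the oslo.privsep patch: `with_timeout`, `CallTimeout`, `slow_command` and the module-level thread pool are hypothetical names used only for illustration.

```python
import concurrent.futures
import functools
import time

# Hypothetical stand-in for the threads that execute commands;
# this is not oslo.privsep code.
_POOL = concurrent.futures.ThreadPoolExecutor(max_workers=4)


class CallTimeout(Exception):
    """Hypothetical exception raised when the caller stops waiting."""


def with_timeout(seconds):
    """Stop waiting for the wrapped call after `seconds` and raise."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            future = _POOL.submit(func, *args, **kwargs)
            try:
                return future.result(timeout=seconds)
            except concurrent.futures.TimeoutError:
                # Only the wait is abandoned; the call itself keeps running.
                raise CallTimeout(func.__name__) from None
        return wrapper
    return decorator


@with_timeout(0.2)
def slow_command(duration):
    time.sleep(duration)
    return "done"


def call_and_handle(duration):
    # The caller (Neutron, in this report) catches the timeout exception.
    try:
        return slow_command(duration)
    except CallTimeout:
        return "timed out"
```

The caller gets control back after the timeout, but nothing in this sketch stops the work that was already submitted.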

The problem: the commands will continue running in the daemon, and nothing will stop the thread running a command until that command returns. This design is unstable because we can run out of threads very quickly.
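The thread exhaustion can be reproduced with a small, self-contained sketch. It assumes a fixed-size thread pool as a stand-in for the daemon's workers; `demo_thread_exhaustion` and `stuck_command` are hypothetical names, and an event plays the role of "the command finally returns".

```python
import concurrent.futures
import threading


def demo_thread_exhaustion(max_workers=2):
    release = threading.Event()        # set when the stuck commands "return"
    started = threading.Semaphore(0)   # counts commands that began running
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=max_workers)

    def stuck_command():
        started.release()
        release.wait()
        return "finished"

    # Occupy every worker thread with a command that does not return.
    stuck = [pool.submit(stuck_command) for _ in range(max_workers)]
    for _ in range(max_workers):
        started.acquire()

    # The client side times out on each call...
    timed_out = 0
    for future in stuck:
        try:
            future.result(timeout=0.05)
        except concurrent.futures.TimeoutError:
            timed_out += 1

    # ...but a future that is already running cannot be cancelled.
    any_cancelled = any(future.cancel() for future in stuck)

    # A new, quick command now waits in the queue behind the stuck ones.
    quick = pool.submit(lambda: "quick")
    try:
        quick.result(timeout=0.05)
        starved = False
    except concurrent.futures.TimeoutError:
        starved = True

    # Only when the stuck commands return does the quick one get a thread.
    release.set()
    result_after = quick.result(timeout=5)
    pool.shutdown()
    return timed_out, any_cancelled, starved, result_after
```

With two workers, both calls time out, neither can be cancelled, and the quick command is starved until the stuck commands return.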

A possible alternative could be to execute those commands by spawning a new process instead of a new thread. This process can be monitored and, if needed, killed by the daemon if it does not reply in time. But this is something that I need to propose in this patch.
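A rough sketch of the process-based alternative, assuming the command can be run as a separate child process. `run_with_deadline` is a hypothetical helper, not part of oslo.privsep or Neutron; it uses only the standard `subprocess` module.

```python
import subprocess
import sys


def run_with_deadline(argv, timeout):
    """Run argv in a child process and kill it if it misses the deadline.

    Returns (finished_in_time, returncode, stdout).
    """
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, text=True)
    try:
        out, _ = proc.communicate(timeout=timeout)
        return True, proc.returncode, out
    except subprocess.TimeoutExpired:
        # Unlike a thread, a child process can be forcibly stopped.
        proc.kill()
        out, _ = proc.communicate()   # reap the child and drain its output
        return False, proc.returncode, out


if __name__ == "__main__":
    # A command that never replies in time is killed by the monitor.
    print(run_with_deadline(
        [sys.executable, "-c", "import time; time.sleep(30)"], timeout=0.5))
```

The trade-off is the cost of starting one process per command, in exchange for being able to reclaim the resources of a command that hangs.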

Regards.

[1] https://review.opendev.org/c/openstack/oslo.privsep/+/782981

Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

The problem is still present even with the os.kill method revert [1].

[1] https://review.opendev.org/c/openstack/neutron/+/782972

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/791243

Changed in neutron:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/791986

OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/791986
Committed: https://opendev.org/openstack/neutron/commit/54420d04dd2abc2bdeb78b9490032e1b1ebd3fac
Submitter: "Zuul (22348)"
Branch: master

commit 54420d04dd2abc2bdeb78b9490032e1b1ebd3fac
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Tue May 18 14:27:17 2021 +0000

    Skip "test_keepalived_spawns_conflicting_*" tests

    The SIGHUP signal command used to stop those processes does not
    return, and the test (and eventually the whole test suite) times out.

    Until a proper fix is implemented, these tests will be skipped
    temporarily.

    Change-Id: I592d8b7bb2afe5cd93cbb0d0ea7062f1b724ff2a
    Related-Bug: #1921154

tags: added: neutron-proactive-backport-potential
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by "Rodolfo Alonso <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/791243
Reason: This is a bad idea

Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

The CI issue is now solved, lowering the importance of the bug.

Changed in neutron:
importance: Critical → Low
Changed in neutron:
status: In Progress → Fix Released