Local executors don't send action heartbeats
Bug #1852722 reported by Renat Akhmerov
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Mistral | Fix Released | High | Renat Akhmerov | | |
Bug Description
Local executors never send action heartbeats. This was an architectural flaw in the initial implementation. As a result, long-running actions are failed automatically after the configured timeout (60 minutes by default), whether or not they are still being processed normally.
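The failure mode described above can be illustrated with a minimal sketch. The names `HEARTBEAT_TIMEOUT` and `is_expired` are hypothetical and are not taken from Mistral's code; the sketch only assumes that an action is considered dead once its last heartbeat is older than the configured timeout.

```python
from datetime import datetime, timedelta

# Hypothetical constant: the 60-minute timeout mentioned in the report.
HEARTBEAT_TIMEOUT = timedelta(minutes=60)


def is_expired(last_heartbeat, now, timeout=HEARTBEAT_TIMEOUT):
    """Return True if no heartbeat has arrived within the timeout."""
    return now - last_heartbeat > timeout


started = datetime(2019, 11, 13, 9, 0)

# A local executor never refreshes the heartbeat, so the timestamp stays
# at the start time and a healthy long-running action eventually expires.
print(is_expired(started, started + timedelta(minutes=59)))  # False
print(is_expired(started, started + timedelta(minutes=61)))  # True
```

Under this assumption, any action that runs longer than the timeout on a local executor is failed even though it is still making progress.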
Changed in mistral:
assignee: nobody → Renat Akhmerov (rakhmerov)
importance: Undecided → High
status: New → Confirmed
milestone: none → ussuri-1
Changed in mistral:
status: Confirmed → In Progress
Reviewed: https://review.opendev.org/694023
Committed: https://git.openstack.org/cgit/openstack/mistral/commit/?id=7ec4f26744ac17151adae4c06cd8a17b71f409a7
Submitter: Zuul
Branch: master
commit 7ec4f26744ac17151adae4c06cd8a17b71f409a7
Author: Renat Akhmerov <email address hidden>
Date: Wed Nov 13 16:34:44 2019 +0700
Make action heartbeats work for all executor types
* Previously action heartbeats didn't work in case of using local
executors because the component responsible for sending heartbeats
was started by the executor RPC server which doesn't make sense to
initialize for a local executor. This patch refactors the code
so that now heartbeats get sent for any type of executors. For
local executors it is also useful because a cluster node that
runs an engine and a local executor may also crash. With this
change, remaining cluster nodes will be able to understand that
the action will never complete and one of them will time it out.
If all is fine with the node where the local executor is running
then heartbeats will be sent normally and the action won't time
out. Before this change, in case of local executors a long running
action would always time out after a configured amount of time
(by default, 60 mins) just because local executors never sent
heartbeats.
* Renamed many things to make clear what each component is
responsible for.
* Wrote the tests that check the heartbeat sender, both positive
and negative scenarios for local and remote executor types.
Closes-Bug: #1852722
Change-Id: I4d0fdff54de9bee70aeaf10a4ef483ad7000840b
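The idea behind the fix, a heartbeat sender that does not depend on the executor RPC server, can be sketched roughly as follows. This is not the code from the patch: `HeartbeatSender`, `run_action_locally`, and the `report` callback are hypothetical names chosen for illustration, and the sketch assumes heartbeats are sent periodically for every action currently running on the node.

```python
import threading
import time


class HeartbeatSender:
    """Hypothetical sender that works without any RPC server."""

    def __init__(self, report, interval):
        self._report = report      # callback receiving a list of action ids
        self._interval = interval  # seconds between heartbeats
        self._running = set()
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._loop, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def add_action(self, action_id):
        with self._lock:
            self._running.add(action_id)

    def remove_action(self, action_id):
        with self._lock:
            self._running.discard(action_id)

    def _loop(self):
        # Event.wait returns False on timeout and True once stop() is called.
        while not self._stop.wait(self._interval):
            with self._lock:
                ids = sorted(self._running)
            if ids:
                self._report(ids)


def run_action_locally(sender, action_id, action):
    """Run an action in-process while the sender reports it as alive."""
    sender.add_action(action_id)
    try:
        return action()
    finally:
        sender.remove_action(action_id)


reports = []
sender = HeartbeatSender(reports.append, interval=0.01)
sender.start()
result = run_action_locally(sender, "a1", lambda: time.sleep(0.2) or "done")
sender.stop()
```

Because the sender is started on its own rather than by an RPC server, the same component can serve both local and remote executors. If the node crashes, heartbeats simply stop and another node can time the action out.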