This happened after Masakari marked the node "compute-server-4" as on_maintenance. The instances are still reported as running on "compute-server-4". We found the following logs:
2022-05-24 13:44:55.520 242 INFO masakari.engine.manager [req-5d4b847e-7b9a-490c-ab5d-abb5bf81273f 43c85077ed654be2b03defffeaa7c224 25b37f01745246bb92c370e13de3654f - - -] Processing notification 9971bb37-7e97-4781-af4e-690ad5853caa of type: COMPUTE_HOST
2022-05-24 13:44:58.785 242 INFO masakari.compute.nova [req-8e93ae2b-8b5a-40bd-be64-d23a8557f33a masakari - - - -] Disable nova-compute on compute-server-4
2022-05-24 13:45:00.571 242 INFO masakari.engine.drivers.taskflow.host_failure [req-8e93ae2b-8b5a-40bd-be64-d23a8557f33a masakari - - - -] Sleeping 60 sec before starting recovery thread until nova recognizes the node down.
2022-05-24 13:46:00.637 242 INFO masakari.compute.nova [req-15694540-b45b-48a0-979f-9d118f843279 masakari - - - -] Fetch Server list on compute-server-4
2022-05-24 13:46:01.799 242 INFO masakari.compute.nova [req-44bfaa9d-bf06-4226-95ff-770fc5aa9435 masakari - - - -] Call get server command for instance 901eef3f-80bb-428d-9d87-b7c7459620ee
2022-05-24 13:46:02.631 242 INFO masakari.compute.nova [req-c6e2b956-67fb-462f-b27e-527cbcf25a01 masakari - - - -] Call get server command for instance 901eef3f-80bb-428d-9d87-b7c7459620ee
2022-05-24 13:46:02.652 242 INFO masakari.compute.nova [req-4ac53b28-39df-48df-98c5-c9353be641d9 masakari - - - -] Call get server command for instance 7f3476c2-1636-4a22-a28c-9cb844d6716b
2022-05-24 13:46:03.438 242 INFO masakari.compute.nova [req-d2cdc6c9-b3d8-4e83-924b-d7df022770a5 masakari - - - -] Call get server command for instance 7f3476c2-1636-4a22-a28c-9cb844d6716b
2022-05-24 13:46:03.459 242 INFO masakari.compute.nova [req-828eef7a-b84b-4b5d-be96-880bbd6d4ef7 masakari - - - -] Call lock server command for instance 901eef3f-80bb-428d-9d87-b7c7459620ee
2022-05-24 13:46:03.474 242 INFO masakari.compute.nova [req-0bbb9493-2abc-4255-8b1a-00a33191f2e6 masakari - - - -] Call get server command for instance c9e26c8a-8a5e-4693-ba37-dc0712288764
2022-05-24 13:46:04.050 242 INFO masakari.compute.nova [req-0bbb9493-2abc-4255-8b1a-00a33191f2e6 masakari - - - -] Call evacuate command for instance 901eef3f-80bb-428d-9d87-b7c7459620ee on host None
2022-05-24 13:46:04.317 242 INFO masakari.compute.nova [req-9b796652-e985-4683-819c-3a2de5f484a1 masakari - - - -] Call get server command for instance c9e26c8a-8a5e-4693-ba37-dc0712288764
2022-05-24 13:46:04.339 242 INFO masakari.compute.nova [req-fdcb45ae-5449-41cf-a7de-34c5306052b1 masakari - - - -] Call get server command for instance 94f5d43e-20bb-4e2a-ab99-011b9c648a81
2022-05-24 13:46:04.346 242 INFO masakari.compute.nova [req-26fd2fff-4794-448b-af8d-ca0c4cdf89ad masakari - - - -] Call lock server command for instance 7f3476c2-1636-4a22-a28c-9cb844d6716b
2022-05-24 13:46:04.904 242 INFO masakari.compute.nova [req-26fd2fff-4794-448b-af8d-ca0c4cdf89ad masakari - - - -] Call evacuate command for instance 7f3476c2-1636-4a22-a28c-9cb844d6716b on host None
2022-05-24 13:46:05.310 242 INFO masakari.compute.nova [req-5976edb2-c0db-442d-a8c8-660d04d4c49c masakari - - - -] Call unlock server command for instance 901eef3f-80bb-428d-9d87-b7c7459620ee
2022-05-24 13:46:05.450 242 INFO masakari.compute.nova [req-a6658fad-5f95-4e74-ae74-a48c633f0050 masakari - - - -] Call lock server command for instance c9e26c8a-8a5e-4693-ba37-dc0712288764
2022-05-24 13:46:05.853 242 INFO masakari.compute.nova [req-6a117999-e1f8-4d16-bba1-f38f0141fd4e masakari - - - -] Call get server command for instance 94f5d43e-20bb-4e2a-ab99-011b9c648a81
2022-05-24 13:46:05.990 242 INFO masakari.compute.nova [req-6a117999-e1f8-4d16-bba1-f38f0141fd4e masakari - - - -] Call evacuate command for instance c9e26c8a-8a5e-4693-ba37-dc0712288764 on host None
2022-05-24 13:46:06.622 242 INFO masakari.compute.nova [req-2338436b-ec44-4668-a98b-645acd9ba20b masakari - - - -] Call unlock server command for instance 7f3476c2-1636-4a22-a28c-9cb844d6716b
2022-05-24 13:46:06.636 242 INFO masakari.compute.nova [req-aabc3be4-eee5-4e7b-a6b1-5dc7a3e17e7c masakari - - - -] Call lock server command for instance 94f5d43e-20bb-4e2a-ab99-011b9c648a81
2022-05-24 13:46:06.980 242 INFO masakari.compute.nova [req-cbefd7e9-ad16-4e1b-a3f3-df68fc5c026d masakari - - - -] Call unlock server command for instance c9e26c8a-8a5e-4693-ba37-dc0712288764
2022-05-24 13:46:07.175 242 INFO masakari.compute.nova [req-cbefd7e9-ad16-4e1b-a3f3-df68fc5c026d masakari - - - -] Call evacuate command for instance 94f5d43e-20bb-4e2a-ab99-011b9c648a81 on host None
2022-05-24 13:46:08.119 242 INFO masakari.compute.nova [req-46496c6d-f689-4262-a91e-cf6bae252458 masakari - - - -] Call unlock server command for instance 94f5d43e-20bb-4e2a-ab99-011b9c648a81
2022-05-24 13:46:08.674 242 WARNING masakari.engine.drivers.taskflow.driver [req-46496c6d-f689-4262-a91e-cf6bae252458 masakari - - - -] Task 'EvacuateInstancesTask' (438f0a0e-bd9e-4a59-a2e0-1af6058c56ff) transitioned into state 'FAILURE' from state 'RUNNING'
4 predecessors (most recent first):
Flow 'post_tasks'
|__Flow 'main_tasks'
|__Flow 'pre_tasks'
|__Flow 'instance_evacuate_engine': masakari.exception.HostRecoveryFailureException: Failed to evacuate instances '901eef3f-80bb-428d-9d87-b7c7459620ee,7f3476c2-1636-4a22-a28c-9cb844d6716b,c9e26c8a-8a5e-4693-ba37-dc0712288764,94f5d43e-20bb-4e2a-ab99-011b9c648a81' from host 'compute-server-4'
2022-05-24 13:46:08.674 242 ERROR masakari.engine.drivers.taskflow.driver Traceback (most recent call last):
2022-05-24 13:46:08.674 242 ERROR masakari.engine.drivers.taskflow.driver File "/usr/lib/python3/dist-packages/taskflow/engines/action_engine/executor.py", line 53, in _execute_task
2022-05-24 13:46:08.674 242 ERROR masakari.engine.drivers.taskflow.driver result = task.execute(**arguments)
2022-05-24 13:46:08.674 242 ERROR masakari.engine.drivers.taskflow.driver File "/usr/lib/python3/dist-packages/masakari/engine/drivers/taskflow/host_failure.py", line 387, in execute
2022-05-24 13:46:08.674 242 ERROR masakari.engine.drivers.taskflow.driver _do_evacuate(self.context, host_name, instance_list)
2022-05-24 13:46:08.674 242 ERROR masakari.engine.drivers.taskflow.driver File "/usr/lib/python3/dist-packages/masakari/engine/drivers/taskflow/host_failure.py", line 368, in _do_evacuate
2022-05-24 13:46:08.674 242 ERROR masakari.engine.drivers.taskflow.driver message=msg)
2022-05-24 13:46:08.674 242 ERROR masakari.engine.drivers.taskflow.driver masakari.exception.HostRecoveryFailureException: Failed to evacuate instances '901eef3f-80bb-428d-9d87-b7c7459620ee,7f3476c2-1636-4a22-a28c-9cb844d6716b,c9e26c8a-8a5e-4693-ba37-dc0712288764,94f5d43e-20bb-4e2a-ab99-011b9c648a81' from host 'compute-server-4'
2022-05-24 13:46:08.674 242 ERROR masakari.engine.drivers.taskflow.driver
2022-05-24 13:46:08.756 242 WARNING masakari.engine.drivers.taskflow.driver [req-46496c6d-f689-4262-a91e-cf6bae252458 masakari - - - -] Task 'EvacuateInstancesTask' (438f0a0e-bd9e-4a59-a2e0-1af6058c56ff) transitioned into state 'REVERTED' from state 'REVERTING'
2022-05-24 13:46:08.787 242 WARNING masakari.engine.drivers.taskflow.driver [req-46496c6d-f689-4262-a91e-cf6bae252458 masakari - - - -] Task 'PrepareHAEnabledInstancesTask' (49a4c7a5-e2bd-4402-8b2a-9ff3721590fa) transitioned into state 'REVERTED' from state 'REVERTING'
2022-05-24 13:46:08.821 242 WARNING masakari.engine.drivers.taskflow.driver [req-46496c6d-f689-4262-a91e-cf6bae252458 masakari - - - -] Task 'DisableComputeServiceTask' (ad317739-d4a2-4607-981b-4db053713263) transitioned into state 'REVERTED' from state 'REVERTING'
2022-05-24 13:46:08.834 242 WARNING masakari.engine.drivers.taskflow.driver [req-46496c6d-f689-4262-a91e-cf6bae252458 masakari - - - -] Flow 'instance_evacuate_engine' (5edd2314-3868-48f2-8387-9cc9b8aa12a9) transitioned into state 'REVERTED' from state 'RUNNING'
2022-05-24 13:46:08.835 242 ERROR masakari.engine.manager [req-46496c6d-f689-4262-a91e-cf6bae252458 masakari - - - -] Failed to process notification '9971bb37-7e97-4781-af4e-690ad5853caa'. Reason: Failed to evacuate instances '901eef3f-80bb-428d-9d87-b7c7459620ee,7f3476c2-1636-4a22-a28c-9cb844d6716b,c9e26c8a-8a5e-4693-ba37-dc0712288764,94f5d43e-20bb-4e2a-ab99-011b9c648a81' from host 'compute-server-4': masakari.exception.HostRecoveryFailureException: Failed to evacuate instances '901eef3f-80bb-428d-9d87-b7c7459620ee,7f3476c2-1636-4a22-a28c-9cb844d6716b,c9e26c8a-8a5e-4693-ba37-dc0712288764,94f5d43e-20bb-4e2a-ab99-011b9c648a81' from host 'compute-server-4'
2022-05-24 13:46:08.835 242 ERROR oslo_messaging.rpc.server [req-46496c6d-f689-4262-a91e-cf6bae252458 masakari - - - -] Exception during message handling: IndexError: list index out of range
2022-05-24 13:46:08.835 242 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2022-05-24 13:46:08.835 242 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming
2022-05-24 13:46:08.835 242 ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
2022-05-24 13:46:08.835 242 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/oslo_messaging/rpc/dispatcher.py", line 276, in dispatch
2022-05-24 13:46:08.835 242 ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
2022-05-24 13:46:08.835 242 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/oslo_messaging/rpc/dispatcher.py", line 196, in _do_dispatch
2022-05-24 13:46:08.835 242 ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args)
2022-05-24 13:46:08.835 242 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/masakari/engine/manager.py", line 331, in process_notification
2022-05-24 13:46:08.835 242 ERROR oslo_messaging.rpc.server self._process_notification(context, notification)
2022-05-24 13:46:08.835 242 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/masakari/engine/manager.py", line 327, in _process_notification
2022-05-24 13:46:08.835 242 ERROR oslo_messaging.rpc.server do_process_notification(notification)
2022-05-24 13:46:08.835 242 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/masakari/utils.py", line 267, in inner
2022-05-24 13:46:08.835 242 ERROR oslo_messaging.rpc.server return f(*args, **kwargs)
2022-05-24 13:46:08.835 242 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/masakari/engine/manager.py", line 309, in do_process_notification
2022-05-24 13:46:08.835 242 ERROR oslo_messaging.rpc.server context, notification)
2022-05-24 13:46:08.835 242 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/masakari/engine/manager.py", line 258, in _handle_notification_type_host
2022-05-24 13:46:08.835 242 ERROR oslo_messaging.rpc.server tb=tb)
2022-05-24 13:46:08.835 242 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/masakari/engine/utils.py", line 47, in notify_about_notification_update
2022-05-24 13:46:08.835 242 ERROR oslo_messaging.rpc.server fault, priority = _get_fault_and_priority_from_exc_and_tb(exception, tb)
2022-05-24 13:46:08.835 242 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/masakari/engine/utils.py", line 31, in _get_fault_and_priority_from_exc_and_tb
2022-05-24 13:46:08.835 242 ERROR oslo_messaging.rpc.server exception, tb)
2022-05-24 13:46:08.835 242 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/masakari/notifications/objects/exception.py", line 39, in from_exc_and_traceback
2022-05-24 13:46:08.835 242 ERROR oslo_messaging.rpc.server trace = inspect.trace()[-1]
2022-05-24 13:46:08.835 242 ERROR oslo_messaging.rpc.server IndexError: list index out of range
2022-05-24 13:46:08.835 242 ERROR oslo_messaging.rpc.server
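
A possible reading of the final traceback, offered as an assumption rather than a confirmed diagnosis: the last frame shows `inspect.trace()[-1]` raising IndexError, which is what happens when `inspect.trace()` returns an empty list. The standard library only returns frame records there while an exception is actively being handled, so a call made after the `except` block has finished gets `[]`. The sketch below reproduces that behavior in isolation. The function names (`last_trace_frame`, `inside_handler`, `outside_handler`) are hypothetical and are not taken from the Masakari code.

```python
import inspect


def last_trace_frame():
    """Mimic the failing line from the traceback: inspect.trace()[-1]."""
    return inspect.trace()[-1]


def inside_handler():
    # Called while an exception is being handled, so inspect.trace()
    # contains at least the frame in which the exception was raised.
    try:
        raise RuntimeError("evacuation failed")
    except RuntimeError:
        return last_trace_frame().function


def outside_handler():
    # The exception object is saved, but the helper runs only after the
    # except block has ended. No exception is being handled at that point,
    # so inspect.trace() returns [] and indexing it raises IndexError.
    saved = None
    try:
        raise RuntimeError("evacuation failed")
    except RuntimeError as exc:
        saved = exc
    try:
        last_trace_frame()
    except IndexError as err:
        return f"{type(err).__name__}: {err}"
    return f"no error for {saved!r}"


if __name__ == "__main__":
    print(inside_handler())
    print(outside_handler())
```

If this is what is happening here, the IndexError would be a secondary failure in the notification reporting path, separate from the HostRecoveryFailureException that caused the evacuation to fail in the first place.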