Fail evacuate flow with deleted VM

Bug #2023464 reported by VO LE HUY
Affects: masakari
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Today my lab environment hit this exception: "Resource could not be found". The case is rare: the compute server rebooted right after the request to delete a VM was issued, before it could process that request. The sequence of events, which involves concurrency, is as follows:

1) The 'controller' server receives a request to delete a VM (a normal VM or an amphora).
2) The 'compute' server is down, so the request stays unprocessed in the waiting queue. Meanwhile, the 'Masakari Engine' on the 'controller' server has already listed the VMs located on the server that just crashed.
3) The 'compute' server comes back up and processes the queued request, deleting the VM.
4) The 'Masakari Engine' on the 'controller' server continues to the step called 'Task Evacuate' in the source code. The line that fetches the VM information through the Nova SDK, just before the line that spawns the evacuation for that VM, raises an error saying that the resource cannot be found.
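
The ordering of these steps can be reproduced without a real cloud. The following is only a minimal sketch with an in-memory stand-in: `FakeNova`, `NotFound`, `reproduce`, and the instance IDs are hypothetical names made up for illustration, not Masakari or novaclient APIs.

```python
class NotFound(Exception):
    """Stand-in for the 'Resource could not be found' error."""


class FakeNova:
    """Hypothetical in-memory replacement for the Nova API."""

    def __init__(self, servers):
        # Maps instance ID -> name of the host the instance runs on.
        self._servers = dict(servers)

    def list_ids_on_host(self, host):
        return [i for i, h in self._servers.items() if h == host]

    def delete(self, instance_id):
        self._servers.pop(instance_id, None)

    def get_server(self, instance_id):
        if instance_id not in self._servers:
            raise NotFound("Resource could not be found.")
        return {"id": instance_id, "host": self._servers[instance_id]}


def reproduce():
    nova = FakeNova({"vm-1": "compute", "vm-2": "compute"})
    # Step 2: the engine snapshots the instances on the failed host.
    instance_list = nova.list_ids_on_host("compute")
    # Step 3: the host comes back and completes the queued delete.
    nova.delete("vm-1")
    # Step 4: the engine looks up each instance from its stale snapshot.
    results = {}
    for instance_id in instance_list:
        try:
            nova.get_server(instance_id)
            results[instance_id] = "found"
        except NotFound:
            results[instance_id] = "not found"
    return results
```

The point of the sketch is that the list built in step 2 is a snapshot, so any delete that completes before step 4 leaves a stale ID in it.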

Note: I am working from the Yoga branch.

https://opendev.org/openstack/masakari/src/branch/stable/yoga/masakari/engine/drivers/taskflow/host_failure.py#L349
-----------------------------------------------------------------------
for instance_id in instance_list:
        msg = "Evacuation of instance started: '%s'" % instance_id
        self.update_details(msg, 0.5)
->      instance = self.novaclient.get_server(self.context,
                                              instance_id)
        thread_pool.spawn_n(self._evacuate_and_confirm, context,
                            instance, host_name,
                            failed_evacuation_instances,
                            reserved_host)
-----------------------------------------------------------------------

...
2023-06-11 05:02:04.326 7 ERROR masakari.engine.drivers.taskflow.driver File "/var/lib/kolla/venv/lib/python3.8/site-packages/masakari/compute/nova.py", ler
2023-06-11 05:02:04.326 7 ERROR masakari.engine.drivers.taskflow.driver return nova.servers.get(uuid)
2023-06-11 05:02:04.326 7 ERROR masakari.engine.drivers.taskflow.driver File "/var/lib/kolla/venv/lib/python3.8/site-packages/novaclient/v2/servers.py", l
2023-06-11 05:02:04.326 7 ERROR masakari.engine.drivers.taskflow.driver return self._get("/servers/%s" % base.getid(server), "server")
2023-06-11 05:02:04.326 7 ERROR masakari.engine.drivers.taskflow.driver File "/var/lib/kolla/venv/lib/python3.8/site-packages/novaclient/base.py", line 35
2023-06-11 05:02:04.326 7 ERROR masakari.engine.drivers.taskflow.driver resp, body = self.api.client.get(url)
2023-06-11 05:02:04.326 7 ERROR masakari.engine.drivers.taskflow.driver File "/var/lib/kolla/venv/lib/python3.8/site-packages/keystoneauth1/adapter.py", l
2023-06-11 05:02:04.326 7 ERROR masakari.engine.drivers.taskflow.driver return self.request(url, 'GET', **kwargs)
2023-06-11 05:02:04.326 7 ERROR masakari.engine.drivers.taskflow.driver File "/var/lib/kolla/venv/lib/python3.8/site-packages/novaclient/client.py", line
2023-06-11 05:02:04.326 7 ERROR masakari.engine.drivers.taskflow.driver raise exceptions.from_response(resp, body, url, method)
2023-06-11 05:02:04.326 7 ERROR masakari.engine.drivers.taskflow.driver masakari.exception.NotFound: Resource could not be found.
...

VO LE HUY (huyvl3) wrote :
description: updated
tags: added: before evacuate
tags: added: not-found-resource-before-evacuate
removed: before evacuate found not resource