Comment 0 for bug 1897888

Mark Goddard (mgoddard) wrote :

Steps to reproduce
==================

Trigger a host failure of a node with instances running on it.

Cause the evacuation to fail. In my case the instances used encrypted volumes, and evacuation fails because the user that masakari uses to trigger evacuation does not have read access to the volume's encryption key in barbican [1].

Expected results
================

Masakari detects the evacuation failure and aborts the failover.

Actual results
==============

The periodic looping call that waits for evacuation (_wait_for_evacuation_confirmation) polls for 90 seconds, then times out. After this point the main thread carries on, but the periodic looping call keeps running forever. We see the following log:

Call get server command for instance <UUID>
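
The behaviour described above (the timeout unblocks the waiter, but the poller itself keeps running) can be illustrated with a minimal sketch. This is not masakari's code: it uses only the Python standard library, and the names PeriodicPoller, wait_for_evacuation and is_evacuated are hypothetical. It shows one way to make sure the poller is stopped whether the wait succeeds or times out.

```python
import threading
import time


class PeriodicPoller:
    """Hypothetical helper: calls poll_fn every `interval` seconds in a thread."""

    def __init__(self, poll_fn, interval):
        self._poll_fn = poll_fn
        self._interval = interval
        self._stop = threading.Event()   # set to ask the loop to exit
        self._done = threading.Event()   # set when poll_fn reports success
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _run(self):
        while not self._stop.is_set():
            if self._poll_fn():
                self._done.set()
                return
            # Sleep for one interval, but wake up early if stop() is called.
            self._stop.wait(self._interval)

    def wait(self, timeout):
        """Return True if poll_fn succeeded within `timeout` seconds."""
        return self._done.wait(timeout)

    def stop(self):
        """Ask the loop to exit and wait until the thread has finished."""
        self._stop.set()
        self._thread.join()


def wait_for_evacuation(is_evacuated, timeout, interval):
    """Poll is_evacuated() until it returns True or `timeout` expires.

    The poller is stopped in `finally`, so it cannot outlive this call
    even when the wait times out.
    """
    poller = PeriodicPoller(is_evacuated, interval)
    poller.start()
    try:
        if not poller.wait(timeout):
            raise TimeoutError("evacuation was not confirmed in time")
    finally:
        poller.stop()
    return True
```

Without the stop() call in the finally block, the polling thread in this sketch would keep calling is_evacuated() after the TimeoutError is raised, which mirrors the endlessly repeated log message above.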

[1] https://bugs.launchpad.net/nova/+bug/1895848