Comment 4 for bug 1856925

Matt Riedemann (mriedem) wrote :

I was unable to recreate the resize issue in a devstack created from master today.

I had two compute services and created a server. I then stopped the source compute service on which the instance was running and tried resizing the server to the other host.

The resize basically hung: the dest host's prep_resize routine does an asynchronous RPC cast to the source compute to power off the instance and start transferring disks, but the source service is down, so it never processes the message.

The server is still active but the task_state is stuck in resize_prep:

| OS-EXT-STS:task_state | resize_prep
| OS-EXT-STS:vm_state | active
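
To illustrate why nothing fails visibly, here is a toy model of a fire-and-forget cast. It is only a sketch and not nova or oslo.messaging code: `Server`, `cast`, `dest_prep_resize`, the `resize_instance` message name and the in-memory queue are all stand-ins I made up for the example.

```python
import queue
from dataclasses import dataclass
from typing import Optional


@dataclass
class Server:
    # Hypothetical stand-in for the instance record.
    vm_state: str = "active"
    task_state: Optional[str] = None


def cast(topic_queue: queue.Queue, method: str, **kwargs) -> None:
    """Fire-and-forget: enqueue the message and return without waiting."""
    topic_queue.put((method, kwargs))


def dest_prep_resize(server: Server, source_queue: queue.Queue) -> None:
    # The dest side hands the next step to the source compute and returns.
    # It never learns whether anyone picked the message up.
    cast(source_queue, "resize_instance", server=server)


# The source compute is down, so nothing ever reads from this queue.
source_queue: queue.Queue = queue.Queue()
server = Server(task_state="resize_prep")

dest_prep_resize(server, source_queue)

# No exception was raised, the message is still queued, and the
# server state never moves on.
print(server.vm_state, server.task_state, source_queue.qsize())
```

Because the cast returns immediately, the caller has no signal that the message is sitting unprocessed, which matches the stuck task_state above.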

So the server doesn't go to ERROR status (unless the RPC cast failure eventually results in an exception from oslo.messaging), but I agree there is obviously a problem.

Do you have details on what fails in your recreate, and logs for it? Otherwise, the simplest solution here seems to be for the API to check that the source compute service is up before initiating the resize/cold migrate operation.
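
Roughly what I have in mind for that check, as a standalone sketch rather than actual nova code. The `ComputeService` record, the `SourceComputeDown` exception, the function names and the 60 second threshold are all assumptions made for illustration; the real check would use whatever service liveness information the API already has.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Assumed threshold: how stale a heartbeat may be before the
# service is treated as down.
SERVICE_DOWN_TIME = timedelta(seconds=60)


@dataclass
class ComputeService:
    # Hypothetical record of a compute service's liveness.
    host: str
    last_heartbeat: datetime
    forced_down: bool = False


class SourceComputeDown(Exception):
    """Raised when the source compute service cannot take part in a resize."""


def is_up(service: ComputeService, now: datetime) -> bool:
    if service.forced_down:
        return False
    return now - service.last_heartbeat <= SERVICE_DOWN_TIME


def check_source_before_resize(service: ComputeService, now: datetime) -> None:
    # Fail fast in the API instead of casting to a service that
    # will never process the message.
    if not is_up(service, now):
        raise SourceComputeDown(
            f"compute service on {service.host} is down; refusing to resize"
        )


now = datetime.now(timezone.utc)
healthy = ComputeService("host1", now - timedelta(seconds=10))
stale = ComputeService("host2", now - timedelta(minutes=5))

check_source_before_resize(healthy, now)  # passes silently
try:
    check_source_before_resize(stale, now)
except SourceComputeDown as exc:
    print(exc)
```

With something like this the user would get an immediate error from the API, and the server would never enter resize_prep in the first place.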