Comment 1 for bug 1848343

Matt Riedemann (mriedem) wrote : Re: MigrationTask rollback can leak allocations for a deleted server

Note that we could have the same issue in the compute service, for example if the server is deleted during the resize claim and we hit this exception handler:

https://github.com/openstack/nova/blob/1a226aaa9e8c969ddfdfe198c36f7966b1f692f3/nova/compute/manager.py#L4724

That handler reverts the allocations here:

https://github.com/openstack/nova/blob/1a226aaa9e8c969ddfdfe198c36f7966b1f692f3/nova/compute/manager.py#L4574

That calls move_allocations, which has the same problem described above.

We could also have the same issue if the instance is gone during resize_instance on the source host:

https://github.com/openstack/nova/blob/1a226aaa9e8c969ddfdfe198c36f7966b1f692f3/nova/compute/manager.py#L4896

Or possibly during finish_revert_resize:

https://github.com/openstack/nova/blob/1a226aaa9e8c969ddfdfe198c36f7966b1f692f3/nova/compute/manager.py#L4454

I'm not sure about the fix yet, but we might want callers to optionally tell the move_allocations method whether it should require the generation of the target consumer (the instance in this case), so we don't use a .get() here:

https://github.com/openstack/nova/blob/1a226aaa9e8c969ddfdfe198c36f7966b1f692f3/nova/scheduler/client/report.py#L1886

Actually, if the target consumer no longer exists in placement, the 'consumer_generation' key won't exist in that allocations response, so we'd have to handle it earlier, like this:

https://review.opendev.org/#/c/688832/2/nova/scheduler/client/report.py
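
To make the idea concrete, here is a rough sketch of the kind of early check described above. It is not the code from the linked patch or from nova: the function name target_generation, the require_target flag, and the ConsumerGone exception are all hypothetical, and the input is assumed to be the decoded JSON body of the allocations response for the target consumer.

```python
class ConsumerGone(Exception):
    """Hypothetical error: the target consumer has no record in placement."""


def target_generation(target_allocs_response, require_target=False):
    """Pick the consumer generation to use when moving allocations.

    target_allocs_response is assumed to be the decoded JSON body of the
    allocations response for the target consumer (the instance).
    """
    if require_target and 'consumer_generation' not in target_allocs_response:
        # The key is missing, so the consumer is gone (e.g. the server was
        # deleted). Fail here instead of writing allocations back to it.
        raise ConsumerGone('target consumer no longer exists in placement')
    # Without the check above, a missing key quietly becomes None and the
    # caller can no longer tell a deleted consumer from a brand new one.
    return target_allocs_response.get('consumer_generation')


if __name__ == '__main__':
    deleted = {'allocations': {}}
    print(target_generation(deleted))  # lenient path, prints None
    try:
        target_generation(deleted, require_target=True)
    except ConsumerGone as exc:
        print('refusing to move allocations: %s' % exc)
```

With something like this, the rollback paths listed above would pass require_target=True and skip the revert when the instance is already gone, while callers that legitimately create a new consumer keep the lenient .get() behavior.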