Comment 2 for bug 1982497

Balazs Gibizer (balazs-gibizer) wrote :

This is caused by a race between the rollback_live_migration_at_destination and drop_move_claim_at_destination RPC methods, both of which run on the destination host during the rollback of a live migration. rollback_live_migration_at_destination is an RPC cast, so it can run _after_ drop_move_claim_at_destination, which is an RPC call, has finished. The rollback_live_migration_at_destination RPC temporarily applies the migration context [1] and calls instance.save() during libvirt/driver._cleanup() [2][3]. If that save is the last thing to happen during the rollback, the instance NUMA topology will point to the destination host even though the instance runs on, and points to, the source host.

[1] https://github.com/openstack/nova/blob/bcb96f362ab12e297f125daa5189fb66345b4976/nova/compute/manager.py#L9400-L9403
[2] https://github.com/openstack/nova/blob/bcb96f362ab12e297f125daa5189fb66345b4976/nova/virt/libvirt/driver.py#L10449
[3] https://github.com/openstack/nova/blob/bcb96f362ab12e297f125daa5189fb66345b4976/nova/virt/libvirt/driver.py#L1674
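
The ordering described above can be illustrated with a small, self-contained toy model. This is only a sketch and not Nova code: ToyInstance, applied_migration_context and the threading.Event used to force the unlucky interleaving are all hypothetical stand-ins, and the two handler functions merely borrow the RPC method names. The cast is modelled as a thread the sender does not wait for, and the call as a plain blocking function call.

```python
import threading
from contextlib import contextmanager


class ToyInstance:
    """Hypothetical stand-in for an instance record; not Nova's Instance object."""

    def __init__(self):
        self.numa_topology = "source"  # in-memory value
        self.persisted = "source"      # what save() last wrote out

    def save(self):
        # Persist whatever is currently in memory.
        self.persisted = self.numa_topology

    @contextmanager
    def applied_migration_context(self):
        # Temporarily switch the in-memory topology to the destination's,
        # then restore it. Anything saved inside the block stays "dest".
        old = self.numa_topology
        self.numa_topology = "dest"
        try:
            yield
        finally:
            self.numa_topology = old


def simulate_rollback():
    instance = ToyInstance()
    order = []
    call_returned = threading.Event()

    def rollback_live_migration_at_destination():
        # Assumption for the demo: the cast handler is delayed until the
        # call has already returned, which is the interleaving in question.
        call_returned.wait()
        with instance.applied_migration_context():
            instance.save()  # stands in for the save during driver cleanup
        order.append("rollback_live_migration_at_destination")

    def drop_move_claim_at_destination():
        order.append("drop_move_claim_at_destination")

    # Cast: fire and forget, the sender does not wait for the handler.
    worker = threading.Thread(target=rollback_live_migration_at_destination)
    worker.start()

    # Call: the sender blocks until the handler has returned.
    drop_move_claim_at_destination()
    call_returned.set()

    worker.join()
    return instance.persisted, order


if __name__ == "__main__":
    persisted, order = simulate_rollback()
    print(order)
    print("persisted numa_topology:", persisted)
```

In this toy run the in-memory value is restored to "source" when the context manager exits, but the value written by save() inside the block stays "dest", which mirrors the stale NUMA topology described in the comment.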