(12:43:57 PM) gyee_: I am curious why we are using NopClaim if host is specified https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2771
(12:44:22 PM) gyee_: does specifying a host imply force evacuate?
(12:49:08 PM) gyee_: meanwhile if SR-IOV is enabled, we are trying to get the PCI mapping via the migration context, which is only populated at rebuild_claim. https://github.com/openstack/nova/blob/master/nova/network/neutronv2/api.py#L2462
(12:51:32 PM) mriedem: https://github.com/openstack/nova/commit/dc0221d7240326a2d1b467e2a367bebb7e764e61 added that code in the compute manager about the nop claim, which implies resources were already claimed, but i'd have to dig into that
The rebuild_instance method in the nova compute manager handles both rebuilds on the same host and evacuations to another host. If you force the evacuate to a specific host, we bypass some code in the nova-api and nova-conductor services and call directly into the compute on the target host. There we hit this code, which skips the resource claim (the claim is what sets up the PCI mappings on instance.migration_context as part of the move claim):
https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2767-L2771
That ^ code is really there for rebuild operations, where we don't need a resource claim because we've already claimed resources on the original host. It doesn't distinguish that case from a forced evacuate to a target host, so in the forced case we never end up claiming resources on the target host.
I think all we have to do is modify that conditional to be:
if scheduled_node is not None or recreate:
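
To make the effect of that change concrete, here is a minimal standalone sketch of the decision. It is not the nova code itself: choose_claim and the string return values are hypothetical stand-ins for picking between the resource tracker's rebuild_claim and NopClaim, and the scenario values (scheduled_node=None with recreate=True for a forced evacuate) are inferred from the discussion above rather than taken from the source tree.

```python
def choose_claim(scheduled_node, recreate, patched=False):
    """Return the name of the claim type that would be used.

    Hypothetical stand-in for the conditional in rebuild_instance;
    the strings only label the two outcomes discussed above.
    """
    if patched:
        # Proposed conditional: also claim when this is an evacuate.
        needs_claim = scheduled_node is not None or recreate
    else:
        # Current conditional: claim only when a node was scheduled.
        needs_claim = scheduled_node is not None
    return "rebuild_claim" if needs_claim else "NopClaim"


# Assumed inputs for each scenario (inferred, not copied from nova).
scenarios = {
    "rebuild on same host": {"scheduled_node": None, "recreate": False},
    "scheduled evacuate": {"scheduled_node": "node-b", "recreate": True},
    "forced evacuate": {"scheduled_node": None, "recreate": True},
}

for name, kwargs in scenarios.items():
    current = choose_claim(**kwargs)
    proposed = choose_claim(patched=True, **kwargs)
    print(f"{name}: current={current}, proposed={proposed}")
```

Under those assumptions only the forced evacuate row changes: it moves from NopClaim to rebuild_claim, while a same-host rebuild keeps the NopClaim.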