Comment 1 for bug 1819216

Matt Riedemann (mriedem) wrote :

I was going to mark this as a duplicate of bug 1811235 since it's definitely related, but it's a somewhat different issue. In this case you have allow_resize_to_same_host=True (the default in devstack but not in nova) and two nodes. The scheduler picked the host that the instance is already on, for whatever reason, and then if you're not using the vcenter driver (you're using libvirt by default), you hit this in the compute service and the operation fails:

https://github.com/openstack/nova/blob/d3254af0fe2b15caff3990c965194133625b681d/nova/compute/manager.py#L4287

So there is a real mismatch here: the control plane services don't know about the compute configuration, but the API allows resize (and cold migrate) to the same host based on its own configuration.
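
To make the mismatch concrete, here is a minimal sketch in plain Python, not nova code. Only allow_resize_to_same_host comes from the discussion above; Driver, supports_same_host, SameHostResizeError, scheduler_candidates and prep_resize are hypothetical names made up for illustration.

```python
from dataclasses import dataclass


class SameHostResizeError(Exception):
    """Hypothetical error for a driver that cannot resize onto its own host."""


@dataclass(frozen=True)
class Driver:
    name: str
    supports_same_host: bool  # hypothetical per-driver capability flag


def scheduler_candidates(hosts, source_host, allow_resize_to_same_host):
    # Control plane side: only the config option is consulted here,
    # nothing about what the compute driver can actually do.
    if allow_resize_to_same_host:
        return list(hosts)
    return [h for h in hosts if h != source_host]


def prep_resize(dest_host, source_host, driver):
    # Compute side: the driver capability is checked only after the
    # scheduler has already made its choice.
    if dest_host == source_host and not driver.supports_same_host:
        raise SameHostResizeError(
            f"{driver.name} cannot resize onto its own host {dest_host}")
    return dest_host
```

With the option enabled, the source host stays in the candidate list, so a driver without the capability only finds out at the compute step.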

Options:

1. Always ignore the source host during cold migrate, similar to how live migrate and evacuate work. That would break the vcenter case, though, where cold migrating to the same compute service host is valid because that host is just managing a vcenter cluster of esxi hosts.

2. Somehow communicate to the scheduler that we're doing a cold migration and whether it can or can't pick the source host. Now that we report driver capabilities as traits to placement, we could potentially rely on that to pass something along to the scheduler and placement about this case. That's probably the more flexible way to fix this.
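
Option 2 could look roughly like the sketch below, again in plain Python rather than nova or placement code. The trait name HYPOTHETICAL_SAME_HOST_COLD_MIGRATE and the function filter_cold_migrate_hosts are assumptions for illustration only; the idea is just that the source host stays a candidate only when its driver advertises the capability as a trait.

```python
# Hypothetical trait name; a real one would have to be defined and reported
# by the compute driver.
SAME_HOST_TRAIT = "HYPOTHETICAL_SAME_HOST_COLD_MIGRATE"


def filter_cold_migrate_hosts(host_traits, source_host, allow_same_host):
    """Return candidate hosts for a cold migration.

    host_traits maps a host name to the set of traits it reports.
    The source host is kept only if the operator allows same-host
    migration AND the host advertises the (hypothetical) trait.
    """
    selected = []
    for host, traits in host_traits.items():
        if host != source_host:
            selected.append(host)
        elif allow_same_host and SAME_HOST_TRAIT in traits:
            selected.append(host)
    return selected


if __name__ == "__main__":
    libvirt_cloud = {"node1": set(), "node2": set()}
    vcenter_cloud = {"vc1": {SAME_HOST_TRAIT}}
    print(filter_cold_migrate_hosts(libvirt_cloud, "node1", True))
    print(filter_cold_migrate_hosts(vcenter_cloud, "vc1", True))
```

In this sketch a libvirt host never gets picked as its own destination, while a single vcenter-style host that reports the trait remains valid, which is the case option 1 would break.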