in devstack, "nova migrate <uuid>" will try to migrate to the same host (and then fail)
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Triaged | Medium | Unassigned |
Bug Description
In multinode devstack I had an instance running on one node and tried running "nova migrate <uuid>". The operation started, but then the instance went into an error state with the following fault:
{"message": "Unable to migrate instance (2bbdab8e-
Logically, I think that even if "resize to same host" is enabled, for a "migrate" operation we should remove the current host from consideration. We know it's going to fail, and it doesn't make sense anyways.
Also, it would probably make sense to make "migrate" work like "live migration" which removes the current host from consideration.
summary:
- in devstack, "nova migrate <uuid>" can try to migrate to the same host
+ in devstack, "nova migrate <uuid>" will try to migrate to the same host (and then fail)
description: updated
I was going to mark this as a duplicate of bug 1811235 since it's definitely related, but it's sort of a different issue. In this case you have allow_resize_to_same_host=True (the default in devstack but not in nova) and two nodes. The scheduler picked the host that the instance is already on for whatever reason, and then if you're not using the vcenter driver (you're using libvirt by default), you hit this in the compute service and it blows up:
https://github.com/openstack/nova/blob/d3254af0fe2b15caff3990c965194133625b681d/nova/compute/manager.py#L4287
So we definitely have some weirdness around this because the control plane services don't know about the compute configuration but the API is allowing resize (and cold migrate) to the same host based on configuration there.
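To make the mismatch concrete, here is a rough sketch of the kind of compute-side check being described. It is a simplified stand-in, not the nova code at the link above: the function name, the exception class, and the "supports_migrate_to_same_host" capability key are all assumptions made for illustration.

```python
class UnableToMigrateToSelf(Exception):
    """Hypothetical error for a cold migrate that lands on the source host."""


def check_cold_migrate_target(instance_host, dest_host, old_flavor_id,
                              new_flavor_id, driver_capabilities):
    """Reject a same-host cold migrate unless the driver claims support.

    All names here are illustrative; the capability key is an assumption.
    """
    same_host = instance_host == dest_host
    # Same flavor means this is a cold migrate rather than a resize.
    is_migrate = old_flavor_id == new_flavor_id
    supported = driver_capabilities.get("supports_migrate_to_same_host", False)
    if same_host and is_migrate and not supported:
        raise UnableToMigrateToSelf(
            "cannot cold migrate to source host %s" % instance_host)
```

The point is that this check only runs on the compute host, after the API and scheduler have already accepted the same-host destination based on their own configuration.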
Options:
1. Always ignore the source host during cold migrate, similar to how live migrate and evacuate work, but that would break the vcenter case for cold migrating to the same compute service host which is just managing a vcenter cluster of esxi hosts.
2. Somehow communicate to the scheduler that we're doing a cold migration and we can or can't pick the source host. Now that we report driver capabilities as traits to placement, we could potentially rely on that to pass something along to the scheduler and placement about this case. That's probably the more flexible way to fix this.
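
As a rough illustration of how the two options differ, the sketch below filters candidate hosts for a move operation. It is not scheduler code: the function, the operation names, and the idea of passing in a set of hosts whose driver can cold migrate to itself (which under option 2 might be derived from traits) are all assumptions for the sake of the example.

```python
def candidate_hosts(all_hosts, source_host, operation,
                    same_host_capable=frozenset()):
    """Return the hosts a (hypothetical) scheduler may consider.

    same_host_capable is an assumed set of hosts whose driver supports
    cold migrating to itself. Leaving it empty gives option 1 behaviour.
    """
    if operation in ("live_migrate", "evacuate"):
        # These always ignore the source host.
        return [h for h in all_hosts if h != source_host]
    if operation == "cold_migrate" and source_host not in same_host_capable:
        # Option 2: drop the source host unless its driver can handle it.
        return [h for h in all_hosts if h != source_host]
    # Anything else (e.g. resize) is left unfiltered in this sketch.
    return list(all_hosts)
```

With an empty same_host_capable set this behaves like option 1 and always drops the source host for cold migrate. Adding the vcenter-managed host to the set keeps it eligible, which is the flexibility option 2 is after.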