Scheduler finds new host using targeted live migration

Bug #1212478 reported by Joe Cropper on 2013-08-14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)

Bug Description

I'm using the master branch of Havana for the basis of this defect, although it's probably existed forever. :-)

1. Using the RetryFilter
2. In nova.conf, scheduler_max_attempts = 3 (i.e., the default)
3. Invoke a live-migration in a multi-host environment via [nova live-migration <vm> <host>], indicating the destination host name that the VM should move to.

The issue is that the specified host encountered an error for the live migration, but because RetryFilter was enabled, the "failed" host is added to the "exclusion list", so the VM winds up being migrated elsewhere. While this type of behavior is desirable for untargeted deployments and relocations (i.e., in which no host is specified), it's problematic when a very specific host is desired.

In this scenario, the migration ends up "succeeding", albeit on an "unexpected" host. From an end-user perspective, I think that when targeted operations such as this are invoked, the operation should fail regardless of whether RetryFilter is enabled.
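To make the behaviour described in this report concrete, here is a minimal Python sketch of retry-style host exclusion. It is not nova's code: `pick_host`, `migrate_with_retries`, `try_migrate`, and the host names are hypothetical, and only the idea of excluding already-attempted hosts, capped by a max-attempts value of 3, comes from the description above.

```python
def pick_host(candidates, attempted):
    """Return the first candidate not already tried (retry-style exclusion)."""
    for host in candidates:
        if host not in attempted:
            return host
    return None


def migrate_with_retries(candidates, requested, try_migrate, max_attempts=3):
    """Hypothetical model of the reported behaviour, not nova's implementation.

    The requested host is tried first. If it fails, it joins the exclusion
    list and the next attempt lands on a different host.
    """
    attempted = []
    ordered = [requested] + [h for h in candidates if h != requested]
    for _ in range(max_attempts):
        host = pick_host(ordered, attempted)
        if host is None:
            break
        attempted.append(host)
        if try_migrate(host):
            return host
    raise RuntimeError("no valid host after %d attempts" % len(attempted))


# The requested host "hostB" fails, so the VM ends up somewhere else.
landed = migrate_with_retries(
    candidates=["hostB", "hostC", "hostD"],
    requested="hostB",
    try_migrate=lambda host: host != "hostB",
)
print(landed)
```

In this model the caller asked for "hostB" and never learns that it failed, which is the outcome the report objects to.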

Joe Cropper (jwcroppe) wrote :

After a quick scan of the code in nova/conductor/tasks/, it seems like we could update the conductor LiveMigrationTask such that if the destination is passed to the constructor, we set a "targeted" flag or some such. The check_not_over_max_attempt() check could then take this flag into consideration as well.
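Roughly, the idea could look like the sketch below. This is only an illustration of the proposal, not the actual conductor code: the class name `LiveMigrationTaskSketch`, both exception types, the `max_attempts` argument, and the signature given to `check_not_over_max_attempt()` are assumptions made for the example.

```python
class TargetedHostFailed(Exception):
    """Hypothetical error: the explicitly requested host could not be used."""


class MaxAttemptsExceeded(Exception):
    """Hypothetical error: too many hosts were tried."""


class LiveMigrationTaskSketch:
    """Stand-in for the conductor task; names and signatures are assumed."""

    def __init__(self, instance, destination=None, max_attempts=3):
        self.instance = instance
        self.destination = destination
        # A destination passed to the constructor marks the task as targeted.
        self.targeted = destination is not None
        self.max_attempts = max_attempts

    def check_not_over_max_attempt(self, attempted_hosts):
        # A targeted migration gets exactly one host: once it has been
        # attempted and failed, stop instead of moving on to another host.
        if self.targeted and attempted_hosts:
            raise TargetedHostFailed(self.destination)
        if len(attempted_hosts) >= self.max_attempts:
            raise MaxAttemptsExceeded(len(attempted_hosts))


# Untargeted: a second attempt is still allowed after one failure.
LiveMigrationTaskSketch("vm1").check_not_over_max_attempt(["hostB"])

# Targeted: the same situation is treated as a hard failure.
try:
    LiveMigrationTaskSketch("vm1", destination="hostB").check_not_over_max_attempt(["hostB"])
except TargetedHostFailed as exc:
    print("targeted migration failed on", exc)
```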

Matt Riedemann (mriedem) wrote :

I posted something to the development mailing list about this bug to see what people think, i.e. is this working as designed, should it be fixed as proposed, etc.

tags: added: scheduler

Joe, in my understanding, live migration with a target host has nothing to do with the retry filter. If the target host is not valid, the live migration will fail. Thanks.

Joe Cropper (jwcroppe) wrote :

Jay, you beat me to the punch, by minutes no less. :-)

I was just further triaging the issue and examining the code, and concur that RetryFilter is not related to live migration; it's only applicable to the initial spawn/deploy operation. Let's hold off on this for now (I'll cancel). I'm going to continue digging to see what's going on, but from my latest findings the live migration code is working as it should: if the destination is passed into the conductor's live migration task, it doesn't even try to locate a new host and uses what's passed.
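In other words, the flow I'm seeing is roughly the following. This is a simplified sketch of my reading, not the actual conductor code: `choose_destination`, `is_valid`, `find_host`, and `InvalidDestination` are hypothetical names used only for illustration.

```python
class InvalidDestination(Exception):
    """Hypothetical error raised when the requested host cannot be used."""


def choose_destination(requested, is_valid, find_host):
    """Pick the host for a live migration (simplified, assumed flow).

    With no requested host, a host is located for the caller. With a
    requested host, that host is used as passed, or the operation fails;
    no other host is ever substituted.
    """
    if requested is None:
        return find_host()
    if not is_valid(requested):
        raise InvalidDestination(requested)
    return requested


# Targeted and valid: the requested host is used and no lookup happens.
print(choose_destination("hostB", lambda h: True, lambda: "hostC"))

# Targeted and invalid: the migration fails instead of moving elsewhere.
try:
    choose_destination("hostB", lambda h: False, lambda: "hostC")
except InvalidDestination as exc:
    print("migration failed, invalid destination:", exc)
```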


Changed in nova:
status: New → Invalid