Live migration doesn't retry on migration pre-check failure

Bug #1480441 reported by Chris St. Pierre
This bug affects 2 people
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Medium
Assigned to: Chris St. Pierre

Bug Description

When live migrating an instance, it is supposed to retry some (configurable) number of times. It only retries if the host compatibility and migration pre-checks raise nova.exception.Invalid, though:

https://github.com/openstack/nova/blob/master/nova/conductor/tasks/live_migrate.py#L167-L174

If, for instance, a destination hypervisor has run out of disk space, the pre-check raises MigrationPreCheckError, which is not a subclass of Invalid, so the retry loop short-circuits. Nova should instead retry as long as either Invalid or MigrationPreCheckError is raised.
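The difference between the two behaviours can be sketched in a few lines. This is a simplified, standalone illustration and not the actual code in live_migrate.py: the exception classes are local stand-ins for the nova ones, and find_destination, NoHostFound, the host names, and the max_retries handling are all hypothetical.

```python
class Invalid(Exception):
    """Stand-in for nova.exception.Invalid."""


class MigrationPreCheckError(Exception):
    """Stand-in for nova.exception.MigrationPreCheckError.

    As in the bug description, it is deliberately NOT a subclass of Invalid.
    """


class NoHostFound(Exception):
    """Hypothetical error raised when every candidate host was rejected."""


def find_destination(candidates, pre_check, retry_on, max_retries=3):
    """Return the first candidate host that passes pre_check.

    Exceptions listed in retry_on move the loop on to the next candidate;
    any other exception propagates and ends the search immediately.
    """
    # One initial attempt plus up to max_retries further attempts.
    for host in candidates[:max_retries + 1]:
        try:
            pre_check(host)
        except retry_on:
            continue  # this host is unsuitable, try the next one
        return host
    raise NoHostFound("no candidate host passed the pre-checks")


def pre_check(host):
    # Hypothetical pre-check: the first host has a full disk.
    if host == "hv-full":
        raise MigrationPreCheckError("Disk of instance is too large")


hosts = ["hv-full", "hv-ok"]

# Proposed behaviour: both exception types trigger a retry.
fixed_result = find_destination(
    hosts, pre_check, retry_on=(Invalid, MigrationPreCheckError))

# Reported behaviour: only Invalid triggers a retry, so the error escapes.
try:
    find_destination(hosts, pre_check, retry_on=(Invalid,))
    buggy_outcome = "migrated"
except MigrationPreCheckError:
    buggy_outcome = "short-circuited"

print(fixed_result, buggy_outcome)
```

With the full host listed first, the search succeeds only when MigrationPreCheckError is included in retry_on; if the healthy host came first, both variants would succeed, which matches the order dependence described in this report.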

This can be tricky to reproduce because it only occurs if a host raises MigrationPreCheckError before a valid host is found, so it's dependent upon the order in which the scheduler supplies possible destinations to the conductor. In theory, though, it can be reproduced by bringing up a number of hypervisors, exhausting the disk on one -- ideally the one that the scheduler will return first -- and then attempting a live migration. It will fail with something like:

$ nova live-migration --block-migrate stpierre-test-1
ERROR (BadRequest): Migration pre-check error: Unable to migrate f44296dd-ffa6-4ec0-8256-c311d025d46c: Disk of instance is too large(available on destination host:-38654705664 < need:1073741824) (HTTP 400) (Request-ID: req-9951691a-c63c-4888-bec5-30a072dfe727)

This happens even when there are valid hosts to migrate to.
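To make the numbers in that error message easier to read, here is a minimal sketch of the kind of comparison that produces it. The function name assert_enough_disk and the local exception class are hypothetical stand-ins, not the hypervisor driver's actual code; only the two byte counts and the message wording come from the output above.

```python
GIB = 1024 ** 3  # bytes per GiB


class MigrationPreCheckError(Exception):
    """Stand-in for nova.exception.MigrationPreCheckError."""


def assert_enough_disk(instance_uuid, available_bytes, needed_bytes):
    """Hypothetical disk pre-check: reject the host if space is short."""
    if available_bytes < needed_bytes:
        raise MigrationPreCheckError(
            "Unable to migrate %s: Disk of instance is too large"
            "(available on destination host:%d < need:%d)"
            % (instance_uuid, available_bytes, needed_bytes))


# Values copied from the error message in this report.
available = -38654705664
needed = 1073741824

available_gib = available / GIB  # -36.0, the host is already overcommitted
needed_gib = needed / GIB        # 1.0

try:
    assert_enough_disk("f44296dd-ffa6-4ec0-8256-c311d025d46c",
                       available, needed)
    rejected = False
except MigrationPreCheckError as exc:
    rejected = True
    print(exc)
```

In other words, the instance needs only 1 GiB, but the first destination offered reports 36 GiB less than zero available, so it can never pass the check and the search has to move on to another host.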

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/208042

Changed in nova:
assignee: nobody → Chris St. Pierre (stpierre)
status: New → In Progress
Changed in nova:
importance: Undecided → Medium
status: In Progress → Triaged
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/208042
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=72fb7fdddb675206eca3d2ed19b6901c0e1e6975
Submitter: Jenkins
Branch: master

commit 72fb7fdddb675206eca3d2ed19b6901c0e1e6975
Author: Chris St. Pierre <email address hidden>
Date: Fri Jul 31 14:51:00 2015 -0500

    Retry live migration on pre-check failure

    Make Nova continue trying to find a host to live migrate an instance
    to when a possible destination host has failed migration pre-checks
    with MigrationPreCheckError, which can be raised if a hypervisor disk
    is full.

    Closes-Bug: #1480441
    Change-Id: I4fc4141eafbf7665e020c67de9578ef12c1c5ac5

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → liberty-3
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: liberty-3 → 12.0.0