OpenStack Compute (nova)

Bug #1712008
Comment #1

Matt Riedemann (mriedem) wrote on 2017-08-21:

The problem is here:

https://github.com/openstack/nova/blob/16.0.0.0rc1/nova/conductor/tasks/live_migrate.py#L51-L56

When a host is forced, conductor bypasses the call to scheduler_client.select_destinations, which is the code that eventually creates the allocation on the destination host:

https://github.com/openstack/nova/blob/16.0.0.0rc1/nova/scheduler/client/report.py#L147
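To make the difference between the two paths concrete, here is a toy model of the control flow. This is only a sketch and not nova code: FakePlacement, FakeScheduler and run_live_migration_task are hypothetical stand-ins, and only the if/else shape is meant to resemble the linked conductor task.

```python
class FakePlacement:
    """Hypothetical in-memory stand-in for the placement service."""

    def __init__(self):
        # Maps instance UUID -> set of hosts holding an allocation for it.
        self.allocations = {}

    def claim(self, instance_uuid, host):
        self.allocations.setdefault(instance_uuid, set()).add(host)


class FakeScheduler:
    """Hypothetical scheduler: picks a host and claims resources on it."""

    def __init__(self, placement, hosts):
        self.placement = placement
        self.hosts = hosts

    def select_destinations(self, instance_uuid, source):
        dest = next(h for h in self.hosts if h != source)
        # The destination allocation is a side effect of scheduling.
        self.placement.claim(instance_uuid, dest)
        return dest


def run_live_migration_task(scheduler, instance_uuid, source, destination=None):
    """Simplified, assumed shape of the conductor task's branch."""
    if not destination:
        destination = scheduler.select_destinations(instance_uuid, source)
    else:
        # Forced host: only validation would run here; nothing is claimed.
        pass
    return destination


placement = FakePlacement()
scheduler = FakeScheduler(placement, hosts=["host1", "host2"])
# Both instances start with an allocation on the source host.
placement.claim("inst-a", "host1")
placement.claim("inst-b", "host1")

run_live_migration_task(scheduler, "inst-a", "host1")
run_live_migration_task(scheduler, "inst-b", "host1", destination="host2")

print(sorted(placement.allocations["inst-a"]))  # ['host1', 'host2']
print(sorted(placement.allocations["inst-b"]))  # ['host1']
```

In this model the forced migration of inst-b ends up with no allocation on host2, which is the symptom described in this bug.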

And due to this change:

https://review.openstack.org/#/c/491012/

If all of your computes are upgraded, the resource tracker isn't going to "heal" the allocations on the target host during its update_available_resource periodic task.

Thinking of solutions:

1. Both paths are going to eventually call check_can_live_migrate_destination on the destination compute host, so we could create the allocation there, although that gets tricky since it could overwrite any allocations that the scheduler created via select_destinations if a host isn't forced.

2. Just call placement from conductor when a host is forced, somewhere in this else block:

https://github.com/openstack/nova/blob/16.0.0.0rc1/nova/conductor/tasks/live_migrate.py#L56

That's probably the cleanest option: the scheduler isn't called on that path, so there are no scheduler-created allocations to overwrite, and it would make the destination host allocations correct before the resource tracker (RT) could heal them, assuming not all compute nodes are upgraded yet.
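A rough sketch of what option 2 could look like, again as a toy model rather than real nova code. FakePlacement, add_allocation and check_requested_destination are hypothetical names used only for illustration; the point is that the forced-host branch adds the destination allocation itself and leaves the source allocation alone.

```python
class FakePlacement:
    """Hypothetical in-memory stand-in for the placement service."""

    def __init__(self):
        # Maps instance UUID -> {host: resources allocated on that host}.
        self.allocations = {}

    def add_allocation(self, instance_uuid, host, resources):
        """Add an allocation for one host without touching the others."""
        self.allocations.setdefault(instance_uuid, {})[host] = dict(resources)


def check_requested_destination(placement, instance_uuid, destination, resources):
    """Hypothetical forced-host branch that also claims on the destination."""
    # Validation of the requested host would run here (omitted).
    # The scheduler was bypassed, so conductor creates the allocation itself.
    placement.add_allocation(instance_uuid, destination, resources)
    return destination


placement = FakePlacement()
flavor = {"VCPU": 1, "MEMORY_MB": 512, "DISK_GB": 1}

# The instance already consumes resources on the source host.
placement.add_allocation("inst-b", "host1", flavor)

# Forced live migration to host2 now records the destination allocation too.
check_requested_destination(placement, "inst-b", "host2", flavor)

print(sorted(placement.allocations["inst-b"]))  # ['host1', 'host2']
```

Because the allocation is only added on the forced path, there is nothing from select_destinations to clobber in this model.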