OpenStack Compute (nova)

task_state is not restored on live-migration failure

Bug #1101969 reported by Kei Masumoto on 2013-01-20

This bug report is a duplicate of: Bug #1051881: Failed Live Block Migration leaves with Inconsistent Instance Status.

This bug affects 1 person

Affects	Status	Importance	Assigned to	Milestone
OpenStack Compute (nova)	Invalid	Undecided	Unassigned

Bug Description

task_state is not restored when live migration fails, so users cannot try again.
During a live migration, the instance status changes active -> migrating -> active, as shown below.

c) after live migration is completed.
same as a)

The status changes the same way as described above in every failure case, except when the scheduler raises an exception.
In that case, users cannot try live migration again, because task_state == None is a prerequisite for an instance to be migrated.
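
To illustrate why the retry is rejected, here is a minimal standalone sketch. It is not nova code: the dict-based instance, begin_live_migration and UnexpectedTaskState are hypothetical names that only model the task_state == None prerequisite described above.

```python
class UnexpectedTaskState(Exception):
    """Hypothetical error for a failed task_state precondition."""


def begin_live_migration(instance):
    # Live migration may only start when no other task is in progress.
    if instance["task_state"] is not None:
        raise UnexpectedTaskState(instance["task_state"])
    instance["task_state"] = "migrating"


# State left behind after the scheduler raised: task_state was never reset.
stuck = {"vm_state": "active", "task_state": "migrating"}

try:
    begin_live_migration(stuck)
    retry_accepted = True
except UnexpectedTaskState:
    retry_accepted = False
```

In this model the second attempt never reaches the scheduler, because the instance still looks like it is migrating.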

For a detailed explanation, please look at nova/compute/api.py.
The excerpt below shows what happens if scheduler_rpcapi.live_migration raises an exception
(task_state is never rolled back).
There are several cases in which an exception is raised. One is that the destination compute host does not have enough disk,
another is that the destination compute host has a different CPU chipset, and so on.

> def live_migrate(self, context, instance, block_migration,
>                  disk_over_commit, host_name):
>
>     # Mark the instance as migrating; this requires task_state to be None.
>     instance = self.update(context, instance,
>                            task_state=task_states.MIGRATING,
>                            expected_task_state=None)
>
>     # If this call raises, task_state is left as MIGRATING.
>     self.scheduler_rpcapi.live_migration(context, block_migration,
>                                          disk_over_commit, instance, host_name)

The patch is attached.
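
The attached patch is not reproduced here. As a rough, standalone illustration of the kind of rollback this report asks for, the sketch below resets task_state when the scheduler call raises. FakeComputeApi, FakeSchedulerRpcApi and SchedulerError are hypothetical stand-ins, not nova classes, and the dict-based instance is an assumption made to keep the example runnable on its own.

```python
MIGRATING = "migrating"


class SchedulerError(Exception):
    """Hypothetical stand-in for any exception raised by the scheduler."""


class FakeSchedulerRpcApi:
    def live_migration(self, context, block_migration, disk_over_commit,
                       instance, host_name):
        # Simulate a failure such as a destination without enough disk.
        raise SchedulerError("destination %s rejected" % host_name)


class FakeComputeApi:
    def __init__(self):
        self.scheduler_rpcapi = FakeSchedulerRpcApi()

    def update(self, context, instance, task_state, expected_task_state):
        # Only change task_state if it currently has the expected value.
        if instance["task_state"] != expected_task_state:
            raise ValueError(
                "unexpected task_state %r" % instance["task_state"])
        instance["task_state"] = task_state
        return instance

    def live_migrate(self, context, instance, block_migration,
                     disk_over_commit, host_name):
        instance = self.update(context, instance, task_state=MIGRATING,
                               expected_task_state=None)
        try:
            self.scheduler_rpcapi.live_migration(
                context, block_migration, disk_over_commit,
                instance, host_name)
        except Exception:
            # Roll back so a later attempt passes the task_state check.
            self.update(context, instance, task_state=None,
                        expected_task_state=MIGRATING)
            raise
```

The exception is re-raised after the rollback, so the caller still sees the original failure while the instance becomes eligible for another attempt.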