OpenStack Compute (Nova)

Instance in state not running after live migration

Reported by Christian Wittwer on 2012-03-05
This bug affects 2 people
Affects: OpenStack Compute (nova)
Importance: Medium
Assigned to: Vish Ishaya

Bug Description

I'm testing the live migration feature in Essex milestone 4. I can successfully migrate an instance once, but after that migration the instance cannot be migrated again.

----------------------------------------------------------------------------------------------------------------------------------------
root@unic-prd-os-controller:~# nova-manage vm live_migration instance-00000007 unic-prd-os-compute6
2012-03-05 18:40:41 INFO nova.rpc.common [req-38c187eb-ca3e-4d4c-b576-0fed30a74e40 None None] Connected to AMQP server on 10.2.30.2:5672
Migration of instance-00000007 initiated.Check its progress using euca-describe-instances.
----------------------------------------------------------------------------------------------------------------------------------------

That was the initial migration. Afterwards the instance is running on unic-prd-os-compute6 (checked with ps aux) and I can ping it.

----------------------------------------------------------------------------------------------------------------------------------------
root@unic-prd-os-controller:~# nova show 41dedf0f-9a46-4593-9ae7-15e0f73973ce
+-------------------------------+----------------------------------------------------------+
| Property | Value |
+-------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-SRV-ATTR:host | unic-prd-os-compute6 |
| OS-EXT-SRV-ATTR:instance_name | instance-00000007 |
| OS-EXT-STS:power_state | 8 |
| OS-EXT-STS:task_state | None |
| OS-EXT-STS:vm_state | active |
| PrivateCloud network | 10.2.20.34 |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2012-03-05T16:37:46Z |
| flavor | m1.small |
| hostId | a24295a0fd51ae996ff8af9835bb03b83d10681885b54492fa3e2cf4 |
| id | 41dedf0f-9a46-4593-9ae7-15e0f73973ce |
| image | Centos5x64_v1 |
| key_name | |
| metadata | {} |
| name | testvm5 |
| progress | None |
| status | ACTIVE |
| tenant_id | e552d533c8fc43c1b759d79cb356c15d |
| updated | 2012-03-05T17:40:58Z |
| user_id | f400a2b6e60745f6b704d4aa51969d1b |
+-------------------------------+----------------------------------------------------------+
----------------------------------------------------------------------------------------------------------------------------------------

The state of the instance is active.

----------------------------------------------------------------------------------------------------------------------------------------
root@unic-prd-os-controller:~# nova-manage vm live_migration instance-00000007 unic-prd-os-compute5
2012-03-05 18:46:57 INFO nova.rpc.common [req-d6779fb1-bb74-4d16-a750-9658d207461e None None] Connected to AMQP server on 10.2.30.2:5672
Command failed, please check log for more info
2012-03-05 18:46:57 CRITICAL nova [req-d6779fb1-bb74-4d16-a750-9658d207461e None None] Remote error: InstanceNotRunning Instance i-00000007 is not running.
[u'Traceback (most recent call last):\n', u' File "/usr/lib/python2.7/dist-packages/nova/rpc/amqp.py", line 250, in _process_data\n rval = node_func(context=ctxt, **node_args)\n', u' File "/usr/lib/python2.7/dist-packages/nova/scheduler/manager.py", line 98, in _schedule\n self._set_instance_error(method, context, ex, *args, **kwargs)\n', u' File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__\n self.gen.next()\n', u' File "/usr/lib/python2.7/dist-packages/nova/scheduler/manager.py", line 93, in _schedule\n return real_meth(*args, **kwargs)\n', u' File "/usr/lib/python2.7/dist-packages/nova/scheduler/driver.py", line 202, in schedule_live_migration\n self._live_migration_src_check(context, instance_ref)\n', u' File "/usr/lib/python2.7/dist-packages/nova/scheduler/driver.py", line 241, in _live_migration_src_check\n raise exception.InstanceNotRunning(instance_id=instance_id)\n', u'InstanceNotRunning: Instance i-00000007 is not running.\n'].
----------------------------------------------------------------------------------------------------------------------------------------

The second migration fails because Nova reports the instance as not running, which is not true.

Vish Ishaya (vishvananda) wrote :

It is showing power_state=8 which means libvirt is reporting it as failed, so it doesn't look like the power state is being reported properly. Perhaps the old host is updating the power state in the db somewhere and overwriting the power state reported by the new host?
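
The check that rejects the second migration can be sketched roughly as follows. This is a minimal illustration and not Nova's actual code: the value RUNNING = 1, the dictionary-based instance record, and the function name live_migration_src_check are assumptions made for the example. Only FAILED = 8 and the exception name and message come from the report above.

```python
# Hypothetical power-state codes; only FAILED = 8 is confirmed by the comment above.
RUNNING = 1  # assumed value
FAILED = 8


class InstanceNotRunning(Exception):
    """Mirrors the exception name seen in the traceback."""

    def __init__(self, instance_id):
        super().__init__("Instance %s is not running." % instance_id)


def live_migration_src_check(instance):
    """Refuse to migrate unless the recorded power state is RUNNING."""
    if instance["power_state"] != RUNNING:
        raise InstanceNotRunning(instance["name"])


# The database record after the first migration, as shown by `nova show`.
stale_record = {"name": "i-00000007", "power_state": FAILED}

try:
    live_migration_src_check(stale_record)
except InstanceNotRunning as exc:
    print(exc)
```

With a stale power_state in the record, the check fails even though the guest itself is up on the new host.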

Changed in nova:
status: New → Triaged
importance: Undecided → Medium
milestone: none → essex-rc1
Kei Masumoto (masumotok) wrote :

I will handle this matter.

It seems that, in the existing implementation, you have to wait until the next periodic_tasks() run begins before the power state is refreshed.
That might be a problem in some cases, so let's change the power_state of the migrated VM right after live migration finishes.

> Vish
Can you assign this bug to me, if that is convenient for everyone?
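
The difference between waiting for the periodic task and updating right away can be sketched with a toy model. All names here (sync_power_state, post_live_migration, the update_now flag, the dict used as a database) are hypothetical stand-ins for Nova's real database and driver layers, and RUNNING = 1 is an assumed value. The sketch only shows why the record stays stale until something writes the hypervisor's state back.

```python
RUNNING, FAILED = 1, 8  # RUNNING = 1 is an assumed value


def sync_power_state(db, instance, hypervisor_state):
    """Write the state reported by the hypervisor into the database record."""
    db[instance] = hypervisor_state


def post_live_migration(db, instance, dest_state, update_now):
    """Finish a migration; optionally refresh the record immediately."""
    if update_now:
        sync_power_state(db, instance, dest_state)


# Existing behaviour: the record stays stale until the periodic task runs.
db = {"instance-00000007": FAILED}
post_live_migration(db, "instance-00000007", RUNNING, update_now=False)
stale = db["instance-00000007"]
sync_power_state(db, "instance-00000007", RUNNING)  # next periodic run
after_periodic = db["instance-00000007"]

# Proposed behaviour: refresh right after the migration finishes.
db2 = {"instance-00000007": FAILED}
post_live_migration(db2, "instance-00000007", RUNNING, update_now=True)
fresh = db2["instance-00000007"]

print(stale, after_periodic, fresh)
```

In the first case a second migration attempted before the periodic run would still see the failed state. In the second case it would see the running state immediately.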

Thierry Carrez (ttx) on 2012-03-06
Changed in nova:
assignee: nobody → Kei Masumoto (masumotok)
Changed in nova:
assignee: Kei Masumoto (masumotok) → Vish Ishaya (vishvananda)

Fix proposed to branch: master
Review: https://review.openstack.org/5038

Changed in nova:
status: Triaged → In Progress

Reviewed: https://review.openstack.org/5038
Committed: http://github.com/openstack/nova/commit/33def9e714fbd13a6dc4b755ade4841c971f7ae5
Submitter: Jenkins
Branch: master

commit 33def9e714fbd13a6dc4b755ade4841c971f7ae5
Author: Vishvananda Ishaya <email address hidden>
Date: Thu Mar 8 12:53:44 2012 -0800

    Fix live-migration in multi_host network

     * call teardown after live migration
     * call update a second time after migration for dhcp
     * moves the instance state update into post_live_migrate
     * completes the fix for bug 939060
     * fixes bug 947326

    Change-Id: I042567573b9bb46381c5447aa08e83cd1916b225

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx) on 2012-03-20
Changed in nova:
status: Fix Committed → Fix Released
Thierry Carrez (ttx) on 2012-04-05
Changed in nova:
milestone: essex-rc1 → 2012.1