"live migration fails due to instance record is erased"

Bug #751231 reported by Kei Masumoto
This bug affects 1 person
Affects                   Status        Importance  Assigned to   Milestone
OpenStack Compute (nova)  Fix Released  Medium      Kei Masumoto

Bug Description

Live migration proceeds as follows:

1. The destination compute node prepares to receive the instance.
2. The source compute node migrates the instance.
3. The source compute node updates the DB record
  (the host where the instance is running changes from source to dest, etc.).

If nova.compute.manager._poll_instance_states() runs on the source compute node between steps 2 and 3,
it sees "DB record exists, but no instance" and calls instance_destroy.
As a result, step 3 raises an exception.
Therefore, _poll_instance_states() must ignore 'migrating' instances.
This issue does not always occur, since it depends on timing.
The exception message is below.

2011-04-06 03:12:19,047 DEBUG nova.rpc [-] MSG_ID is b5f89aa2d22a445c9c004236ce9fcd62 from (pid=10429) call /opt/nova/nova/rpc.py:353
2011-04-06 03:12:30,627 INFO nova.compute.manager [-] Found instance 'instance-00000005' in DB but no VM. State=3, so setting state to shutoff.
2011-04-06 03:12:30,627 INFO nova.compute.manager [-] DB/VM state mismatch. Changing state from '3' to '5'
libvir: QEMU error : Domain not found: no domain with matching name 'instance-00000005'
2011-04-06 03:12:30,705 INFO nova.compute.manager [-] post_live_migration() is started..
2011-04-06 03:12:30,717 DEBUG nova.utils [-] Attempting to grab semaphore "iptables" for method "apply"... from (pid=10429) inner /opt/nova/nova/utils.py:594
2011-04-06 03:12:30,718 DEBUG nova.utils [-] Attempting to grab file lock "iptables" for method "apply"... from (pid=10429) inner /opt/nova/nova/utils.py:599
2011-04-06 03:12:31,190 DEBUG nova.utils [-] Running cmd (subprocess): sudo iptables-save -t filter from (pid=10429) execute /opt/nova/nova/utils.py:150
2011-04-06 03:12:31,203 DEBUG nova.utils [-] Running cmd (subprocess): sudo iptables-restore from (pid=10429) execute /opt/nova/nova/utils.py:150
2011-04-06 03:12:31,218 DEBUG nova.utils [-] Running cmd (subprocess): sudo iptables-save -t nat from (pid=10429) execute /opt/nova/nova/utils.py:150
2011-04-06 03:12:31,229 DEBUG nova.utils [-] Running cmd (subprocess): sudo iptables-restore from (pid=10429) execute /opt/nova/nova/utils.py:150
2011-04-06 03:12:31,282 INFO nova.compute.manager [-] No floating_ip is found for instance-00000005.
2011-04-06 03:12:31,321 ERROR nova [-] in looping call
(nova): TRACE: Traceback (most recent call last):
(nova): TRACE: File "/opt/nova/nova/utils.py", line 465, in _inner
(nova): TRACE: self.f(*self.args, **self.kw)
(nova): TRACE: File "/opt/nova/nova/virt/libvirt_conn.py", line 1485, in wait_for_live_migration
(nova): TRACE: post_method(ctxt, instance_ref, dest)
(nova): TRACE: File "/opt/nova/nova/compute/manager.py", line 1001, in post_live_migration
(nova): TRACE: self.recover_live_migration(ctxt, instance_ref, dest)
(nova): TRACE: File "/opt/nova/nova/compute/manager.py", line 1027, in recover_live_migration
(nova): TRACE: 'host': host})
(nova): TRACE: File "/opt/nova/nova/db/api.py", line 479, in instance_update
(nova): TRACE: return IMPL.instance_update(context, instance_id, values)
(nova): TRACE: File "/opt/nova/nova/db/sqlalchemy/api.py", line 109, in wrapper
(nova): TRACE: return f(*args, **kwargs)
(nova): TRACE: File "/opt/nova/nova/db/sqlalchemy/api.py", line 1043, in instance_update
(nova): TRACE: instance_ref = instance_get(context, instance_id, session=session)
(nova): TRACE: File "/opt/nova/nova/db/sqlalchemy/api.py", line 109, in wrapper
(nova): TRACE: return f(*args, **kwargs)
(nova): TRACE: File "/opt/nova/nova/db/sqlalchemy/api.py", line 897, in instance_get
(nova): TRACE: instance_id)
(nova): TRACE: InstanceNotFound: Instance 5 not found
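
The fix requested in the description (have the periodic poll ignore instances that are being migrated) can be illustrated with a small standalone model. This is only a sketch and not the actual nova code: the function poll_instance_states(), the dict-based records, the 'state_description' key, and the destroy callback are all assumptions made for illustration.

```python
# Hypothetical, simplified model of the periodic state poll; NOT nova's code.
MIGRATING = "migrating"  # assumed marker for an instance under live migration


def poll_instance_states(db_instances, vm_names, destroy):
    """Reconcile DB records with the VMs actually present on this host.

    db_instances: list of dicts with 'name' and 'state_description' keys.
    vm_names: set of domain names the hypervisor reports on this host.
    destroy: callback invoked with each record that has no matching VM.
    Returns the names of records that were skipped because they are migrating.
    """
    skipped = []
    for inst in db_instances:
        if inst["name"] in vm_names:
            continue  # the record and the hypervisor agree
        if inst["state_description"] == MIGRATING:
            # The VM has already left this host, but the source node has not
            # yet updated the record. Leave it alone so the post-migration
            # step can still find and update it.
            skipped.append(inst["name"])
            continue
        destroy(inst)  # genuinely orphaned record
    return skipped


# Usage: one instance mid-migration, one orphaned record, no local VMs.
destroyed = []
records = [
    {"name": "instance-00000005", "state_description": MIGRATING},
    {"name": "instance-00000006", "state_description": "running"},
]
skipped = poll_instance_states(records, set(), destroyed.append)
```

In this model only the orphaned record reaches the destroy callback; the migrating one is left in place for the step that rewrites its host.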

Kei Masumoto (masumotok)
Changed in nova:
status: New → Confirmed
assignee: nobody → Kei Masumoto (masumotok)
status: Confirmed → In Progress
Thierry Carrez (ttx)
Changed in nova:
importance: Undecided → Medium
Thierry Carrez (ttx)
Changed in nova:
milestone: none → cactus-rc
Thierry Carrez (ttx)
Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: cactus-rc → 2011.2
status: Fix Committed → Fix Released