handling EVENT_LIFECYCLE_STOPPED races with removing instance from database
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Fix Released | Low | Peter Feiner |
Bug Description
When an instance is deleted (ComputeManager
The error is benign since the lifecycle event handler just syncs the power state of instances and this instance already has the correct power state (i.e., not in the database at all). However, the stack trace is scary and should probably be fixed.
The more that happens in parallel, the easier the bug is to observe:
* It seems easiest to reproduce this bug when lots (10+) of instances are being created and deleted in parallel.
* I'm fairly certain this race exists regardless of libvirt version (i.e., asynchronous delivery of an event, handling of which queries a potentially deleted record); however, I haven't been able to reproduce it with an older version (libvirt-0.9.8). This is probably explained by libvirt-1.0.3+ having a bunch of bottlenecks removed (see Daniel Berrange's patch series on libvirt-devel).
* I'm working on concurrency bottlenecks in OpenStack. With a bunch of these fixed, the bug happens almost all of the time.
I haven't tested this on the master branch, but the faulting bits (ComputeManager
This problem could be fixed in a couple of ways:
1) Wait for the EVENT_LIFECYCLE
2) Ignore events in ComputeManager.
3) Maintain a deleted list (i.e., list of instance uuids for which events can be ignored).
I think (2) is the simplest to implement and the most robust. Moreover, (2) will handle the race between any event delivery and the deletion of a domain, not just this one. Although (3) has the advantage that it won't mask database deletion bugs, it introduces a garbage collection problem.
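To make approach (2) concrete, here is a minimal sketch of an event handler that ignores events for instances that are no longer in the database. It is not nova's actual code: `FakeDB`, `handle_lifecycle_event`, the local `InstanceNotFound` class, and the power-state strings are all hypothetical stand-ins chosen for illustration.

```python
# Illustrative sketch only: every name below is a hypothetical stand-in,
# not nova's real API.


class InstanceNotFound(Exception):
    """Stand-in for the exception raised when a record is missing."""


class FakeDB:
    """In-memory stand-in for the instance table."""

    def __init__(self):
        self._instances = {}

    def create(self, uuid, power_state):
        self._instances[uuid] = {"uuid": uuid, "power_state": power_state}

    def destroy(self, uuid):
        self._instances.pop(uuid, None)

    def get_by_uuid(self, uuid):
        try:
            return self._instances[uuid]
        except KeyError:
            raise InstanceNotFound("no such instance: %s" % uuid)


def handle_lifecycle_event(db, uuid, new_power_state):
    """Sync the power state, ignoring events for deleted instances.

    Returns True if the event was applied, False if it was ignored.
    """
    try:
        instance = db.get_by_uuid(uuid)
    except InstanceNotFound:
        # The instance was deleted before the event was handled,
        # so there is no power state left to sync.
        return False
    instance["power_state"] = new_power_state
    return True


db = FakeDB()
db.create("uuid-1", "running")
db.destroy("uuid-1")  # the delete wins the race
# The late "stopped" event arrives after the record is gone.
print(handle_lifecycle_event(db, "uuid-1", "shutdown"))
```

Because the check happens at lookup time, the handler needs no extra bookkeeping, which is the robustness argument for (2). The trade-off noted above still applies: a record that disappears because of a real database bug would be ignored in the same silent way.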
Relevant blueprint: https:/
InstanceNotFound: Instance 9b6045c5-
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver Traceback (most recent call last):
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver File "/opt/stack/
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver self._compute_
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver File "/opt/stack/
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver self.handle_
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver File "/opt/stack/
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver context, event.get_
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver File "/opt/stack/
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver instance_uuid)
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver File "/opt/stack/
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver return self.call(context, msg, version='1.2')
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver File "/opt/stack/
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver return rpc.call(context, self._get_
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver File "/opt/stack/
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver return fn(*args, **kwargs)
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver File "/opt/stack/
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver return _get_impl(
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver File "/opt/stack/
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver rpc_amqp.
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver File "/opt/stack/
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver rv = list(rv)
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver File "/opt/stack/
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver raise result
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver InstanceNotFoun
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver Traceback (most recent call last):
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver File "/opt/stack/
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver return func(*args, **kwargs)
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver File "/opt/stack/
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver self.db.
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver File "/opt/stack/
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver return IMPL.instance_
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver File "/opt/stack/
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver r = fn(*args, **kwargs)
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver File "/opt/stack/
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver return f(*args, **kwargs)
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver File "/opt/stack/
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver return _instance_
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver File "/opt/stack/
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver return f(*args, **kwargs)
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver File "/opt/stack/
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver raise exception.
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver InstanceNotFound: Instance 9b6045c5-
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver
2013-05-15 11:03:05.735 3270 TRACE nova.virt.driver
2013-05-15 11:03:05.742 3270 DEBUG nova.virt.driver [-] Emitting event <nova.virt.
Changed in nova:
status: New → Confirmed
importance: Undecided → Low
Changed in nova:
milestone: none → havana-2
status: Fix Committed → Fix Released
Changed in nova:
milestone: havana-2 → 2013.2
I will submit a patch in the next couple of days to implement approach (2).