Rapidly creating and rebooting VMs causes nova to fail

Bug #942008 reported by David Kranz
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (not set)

Bug Description

A Tempest stress test that randomly creates and reboots VMs eventually causes many errors in the compute log. The test was run on a Diablo cluster with two KVM compute nodes and one controller. The stress tests have not been merged into Tempest yet, but this case can be run by pulling https://review.openstack.org/#change,4393 into a Tempest sandbox and running

PYTHONPATH=. python stress/tests/hard_reboots.py

after setting up tempest.conf as usual and setting the config parameters described in stress/README.rst. I am attaching the full compute log, but the first set of errors looks like this:

2012-02-27 10:08:56,437 ERROR nova.rpc [00681ce2-e8ff-4292-becd-d92718cbefd4 tester testproject] AMQP server on 172.18.0.146:5672 is unreachable: Socket closed. Trying again in 1 seconds.

and then, a little later:

2012-02-27 10:09:02,653 ERROR nova.rpc [0a8f33fe-3ee9-4120-87eb-c1243099e87a tester testproject] AMQP server on 172.18.0.146:5672 is unreachable: Socket closed. Trying again in 1 seconds.
2012-02-27 10:09:02,658 INFO nova.rpc [03608f9b-9ad6-4d98-b5f8-0723c76c4855 tester testproject] Connected to AMQP server on 172.18.0.146:5672
2012-02-27 10:09:02,660 DEBUG nova.utils [21534b7c-3a5c-44f8-9008-f190f2961466 tester testproject] Running cmd (subprocess): sudo iptables-restore from (pid=1206) execute /usr/lib/python2.7/dist-packages/nova/utils.py:167
2012-02-27 10:09:02,673 INFO nova.rpc [12d73812-465f-4756-9d9b-b2b9e16ba5ab tester testproject] Connected to AMQP server on 172.18.0.146:5672
2012-02-27 10:09:02,674 INFO nova.rpc [5995bacf-3de8-4444-bbcb-5865947a1ad5 tester testproject] Connected to AMQP server on 172.18.0.146:5672
2012-02-27 10:09:02,675 INFO nova.rpc [e78db81e-bcba-47b9-bacd-3155265edcc8 tester testproject] Connected to AMQP server on 172.18.0.146:5672
2012-02-27 10:09:02,677 INFO nova.rpc [802f39d5-5d0b-425f-8448-d36a564131e0 tester testproject] Connected to AMQP server on 172.18.0.146:5672
2012-02-27 10:09:02,677 INFO nova.rpc [d053512a-f295-4906-b32b-0901edcae62f tester testproject] Connected to AMQP server on 172.18.0.146:5672
2012-02-27 10:09:02,678 INFO nova.rpc [e8261fd8-3e50-427f-a218-a13b63733e30 tester testproject] Connected to AMQP server on 172.18.0.146:5672
2012-02-27 10:09:02,678 INFO nova.rpc [311568bb-c4e3-434a-b058-f2a499ca4259 tester testproject] Connected to AMQP server on 172.18.0.146:5672
2012-02-27 10:09:02,681 INFO nova.rpc [5f9d72bc-b42d-47a9-a959-6f9f22743879 tester testproject] Connected to AMQP server on 172.18.0.146:5672
2012-02-27 10:09:02,681 INFO nova.rpc [82413920-93f6-4f66-9150-58d31b2f8795 tester testproject] Connected to AMQP server on 172.18.0.146:5672
2012-02-27 10:09:02,682 INFO nova.rpc [b027dfe8-6e16-495b-ae3d-51bc084bae7c tester testproject] Connected to AMQP server on 172.18.0.146:5672
2012-02-27 10:09:02,683 INFO nova.rpc [09047330-97b6-4f4c-bee9-35c8f4147b7d tester testproject] Connected to AMQP server on 172.18.0.146:5672
2012-02-27 10:09:03,464 ERROR nova.rpc [6a0f9545-70c3-4501-84e0-7c7cb16e4651 tester testproject] Exception during message handling
(nova.rpc): TRACE: Traceback (most recent call last):
(nova.rpc): TRACE: File "/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 620, in _process_data
(nova.rpc): TRACE: rval = node_func(context=ctxt, **node_args)
(nova.rpc): TRACE: File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 100, in wrapped
(nova.rpc): TRACE: return f(*args, **kw)
(nova.rpc): TRACE: File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 118, in decorated_function
(nova.rpc): TRACE: function(self, context, instance_id, *args, **kwargs)
(nova.rpc): TRACE: File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 636, in reboot_instance
(nova.rpc): TRACE: self.driver.reboot(instance_ref, network_info)
(nova.rpc): TRACE: File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 100, in wrapped
(nova.rpc): TRACE: return f(*args, **kw)
(nova.rpc): TRACE: File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/connection.py", line 501, in reboot
(nova.rpc): TRACE: virt_dom = self._conn.lookupByName(instance['name'])
(nova.rpc): TRACE: File "/usr/lib/python2.7/dist-packages/libvirt.py", line 1870, in lookupByName
(nova.rpc): TRACE: if ret is None:raise libvirtError('virDomainLookupByName() failed', conn=self)
(nova.rpc): TRACE: libvirtError: Domain not found: no domain with matching name 'instance-00000015'
(nova.rpc): TRACE:
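
For anyone triaging the attached compute log, a small script along these lines may help pull out the distinct ERROR messages and the final line of each traceback. It is only a sketch: the regular expressions are inferred from the excerpt above rather than taken from nova's logging configuration, and the names `unwrap`, `summarize`, and `SAMPLE` are hypothetical helpers introduced here. It also joins lines that a terminal copy has wrapped with a trailing backslash, in case the attached log contains any.

```python
import re
from collections import Counter

# Assumed header layout, inferred from the excerpt:
# "<date> <time>,<ms> LEVEL logger [request-id user project] message"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) (?P<logger>\S+) "
    r"\[(?P<context>[^\]]*)\] (?P<msg>.*)$"
)
# Assumed traceback layout: "(logger): TRACE: text"
TRACE_LINE = re.compile(r"^\((?P<logger>[^)]+)\): TRACE: ?(?P<text>.*)$")


def unwrap(lines):
    """Join lines that were wrapped with a trailing backslash."""
    joined, buf = [], ""
    for line in lines:
        line = line.rstrip("\n")
        if line.endswith("\\"):
            buf += line[:-1]
        else:
            joined.append(buf + line)
            buf = ""
    if buf:
        joined.append(buf)
    return joined


def summarize(lines):
    """Return (Counter of ERROR messages, last line of each traceback)."""
    errors = Counter()
    exceptions = []
    last_trace = None
    for line in unwrap(lines):
        trace = TRACE_LINE.match(line)
        if trace:
            # Remember the most recent non-empty TRACE line.
            if trace.group("text"):
                last_trace = trace.group("text")
            continue
        # Any non-TRACE line ends a traceback in progress.
        if last_trace is not None:
            exceptions.append(last_trace)
            last_trace = None
        entry = LOG_LINE.match(line)
        if entry and entry.group("level") == "ERROR":
            errors[entry.group("msg")] += 1
    if last_trace is not None:
        exceptions.append(last_trace)
    return errors, exceptions


# A few lines copied from the excerpt in this report; the first is wrapped.
SAMPLE = [
    "2012-02-27 10:08:56,437 ERROR nova.rpc [00681ce2-e8ff-4292-becd-d92718cbefd4 tester testproject] AMQP server on 172.18.0.1\\",
    "46:5672 is unreachable: Socket closed. Trying again in 1 seconds.",
    "2012-02-27 10:09:02,658 INFO nova.rpc [03608f9b-9ad6-4d98-b5f8-0723c76c4855 tester testproject] Connected to AMQP server on 172.18.0.146:5672",
    "2012-02-27 10:09:03,464 ERROR nova.rpc [6a0f9545-70c3-4501-84e0-7c7cb16e4651 tester testproject] Exception during message handling",
    "(nova.rpc): TRACE: Traceback (most recent call last):",
    "(nova.rpc): TRACE: libvirtError: Domain not found: no domain with matching name 'instance-00000015'",
    "(nova.rpc): TRACE:",
]

if __name__ == "__main__":
    errors, exceptions = summarize(SAMPLE)
    for msg, count in errors.most_common():
        print(count, msg)
    for exc in exceptions:
        print("exception:", exc)
```

To run it against the real log, pass the open file instead of `SAMPLE`, for example `summarize(open(path))`, where `path` is wherever the attachment was saved.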

David Kranz (david-kranz) wrote :

Diablo 2011.3.1.
