Yep, we call deallocate_for_instance before calling driver.destroy (see below). Perhaps this ordering needs to be reversed, although we will have to do some testing to verify that it doesn't break anything.
    def _shutdown_instance(self, context, instance, action_str):
        """Shutdown an instance on this host."""
        context = context.elevated()
        instance_id = instance['id']
        instance_uuid = instance['uuid']
        LOG.audit(_("%(action_str)s instance %(instance_uuid)s") %
                  {'action_str': action_str, 'instance_uuid': instance_uuid},
                  context=context)

        network_info = self._get_instance_nw_info(context, instance)
        if not FLAGS.stub_network:
            self.network_api.deallocate_for_instance(context, instance)

        if instance['power_state'] == power_state.SHUTOFF:
            self.db.instance_destroy(context, instance_id)
            raise exception.Error(_('trying to destroy already destroyed'
                                    ' instance: %s') % instance_uuid)
        # NOTE(vish) get bdms before destroying the instance
        bdms = self._get_instance_volume_bdms(context, instance_id)
        block_device_info = self._get_instance_volume_block_device_info(
            context, instance_id)
        self.driver.destroy(instance, network_info, block_device_info)
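For illustration, here is roughly what the reversed ordering would look like. This is only a sketch, untested, and it assumes nothing in driver.destroy depends on the network already having been deallocated:

    def _shutdown_instance(self, context, instance, action_str):
        """Shutdown an instance on this host."""
        context = context.elevated()
        instance_id = instance['id']
        instance_uuid = instance['uuid']
        LOG.audit(_("%(action_str)s instance %(instance_uuid)s") %
                  {'action_str': action_str, 'instance_uuid': instance_uuid},
                  context=context)

        # Grab the network info while the allocation still exists.
        network_info = self._get_instance_nw_info(context, instance)

        if instance['power_state'] == power_state.SHUTOFF:
            self.db.instance_destroy(context, instance_id)
            raise exception.Error(_('trying to destroy already destroyed'
                                    ' instance: %s') % instance_uuid)
        # NOTE(vish) get bdms before destroying the instance
        bdms = self._get_instance_volume_bdms(context, instance_id)
        block_device_info = self._get_instance_volume_block_device_info(
            context, instance_id)

        # Tear down the guest first, so it can no longer issue metadata
        # requests against its fixed IP...
        self.driver.destroy(instance, network_info, block_device_info)
        # ...and only then release the IP. This would close the window in
        # which a still-running guest's fixed_ip lookup 404s in the
        # metadata service.
        if not FLAGS.stub_network:
            self.network_api.deallocate_for_instance(context, instance)

The open question is whether any driver or network backend relies on deallocation happening before destroy, which is what the testing mentioned above would need to cover.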
Vish
On Jan 25, 2012, at 2:26 PM, David Kranz wrote:
> Public bug reported:
>
> I see this error in the nova-api.log (the one from the compute node) when running a stress test that starts/kills VMs rapidly. This is from a diablo-stable cluster with one controller and two compute nodes, using multi-host with nova-network and nova-api running on each compute node. It happens in maybe 20% of the runs. Is it possible there is a race condition between tearing down a VM and removing its fixed IP from the database?
> I am working on getting the stress tests checked into Tempest as soon as possible.
>
> 2012-01-25 17:10:56,282 DEBUG nova.compute.api [58fba9aa-f844-4edb-84f4-4d5877450762 None None] Searching by: {'fixed_ip': '10.0.0.6'} from (pid=1093) get_all /usr/lib/python2.7/dist-packages/nova/compute/api.py:863
> 2012-01-25 17:10:56,400 DEBUG nova.compute.api [58fba9aa-f844-4edb-84f4-4d5877450762 None None] Searching by: {'project_id': 'testproject'} from (pid=1093) get_all /usr/lib/python2.7/dist-packages/nova/compute/api.py:863
> 2012-01-25 17:10:56,553 INFO nova.api [-] 0.271808s 10.0.0.6 GET /2009-04-04/meta-data/local-hostname None:None 200 [Python-urllib/2.7] text/plain text/html
> 2012-01-25 17:10:56,556 DEBUG nova.compute.api [0fef399b-6d27-4e20-a745-badad830ac9c None None] Searching by: {'fixed_ip': '10.0.0.6'} from (pid=1093) get_all /usr/lib/python2.7/dist-packages/nova/compute/api.py:863
> 2012-01-25 17:10:56,634 ERROR nova.api.ec2.metadata [-] Failed to get metadata for ip: 10.0.0.6
> 2012-01-25 17:10:56,634 INFO nova.api [-] 0.78120s 10.0.0.6 GET /2009-04-04/meta-data/placement/ None:None 404 [Python-urllib/2.7] text/plain text/plain
>
> ** Affects: nova
> Importance: Undecided
> Status: New
>
> --
> You received this bug notification because you are subscribed to
> OpenStack Compute (nova).
> https://bugs.launchpad.net/bugs/921858
>
> Title:
> Sporadic 'Failed to get metadata for ip:'
>
> Status in OpenStack Compute (Nova):
> New