VMWare - Destroy fails when Claim is not successful

Bug #1307408 reported by Sagar Ratnakara Nikam
This bug affects 1 person
Affects                    Status        Importance  Assigned to            Milestone
OpenStack Compute (nova)   Fix Released  High        Sagar Ratnakara Nikam
Icehouse                   Fix Released  High        Sagar Ratnakara Nikam

Bug Description

If the claim is not successful, the compute manager triggers a call to destroy the instance.
Destroy fails since the compute node (cluster) is set on the instance only after the claim succeeds.

This issue occurs when multiple nova boot operations are triggered in parallel.
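
The following is a minimal, self-contained sketch of the failure sequence (not the actual nova code; the class, exception, and cluster names are illustrative stand-ins for the nova.virt.vmwareapi classes): the claim fails, so instance['node'] is never set, and the VMware driver's destroy path cannot resolve a cluster for None.

# Illustrative sketch only; names below are assumptions, not nova internals.

class NotFound(Exception):
    pass


class FakeVMwareDriver:
    """Resolves per-cluster resources by the instance's 'node' attribute,
    the way the multi-cluster VMware driver does."""

    def __init__(self, clusters):
        self._resources = dict(clusters)

    def destroy(self, instance, destroy_disks=True):
        node = instance.get('node')
        if node not in self._resources:
            # This is the failure seen in the log snippet below:
            # "NotFound: The resource None does not exist"
            raise NotFound("The resource %s does not exist" % node)
        # ... normal teardown of the VM on that cluster ...


driver = FakeVMwareDriver(clusters={"domain-c7(Cluster1)": object()})

# The claim failed (insufficient memory), so 'node' was never set on the
# instance; the compute manager's cleanup path then calls destroy().
instance = {"uuid": "1deeb6c0-ed7f-4f5a-bdcc-97765803d18b", "node": None}
try:
    driver.destroy(instance)
except NotFound as exc:
    print("destroy failed:", exc)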

Snippet from nova-compute.log:
2014-04-06 22:48:52.454 DEBUG nova.compute.utils [req-0663cdf1-9969-446a-af08-299f18366394 demo demo] [instance: b22186ec-9f05-4f7d-a0d6-2276baeb6572] Insufficient compute resources: Free memory 975.00 MB < requested 2000 MB. from (pid=9041) notify_about_instance_usage /opt/stack/nova/nova/compute/utils.py:336
2014-04-06 22:48:52.454 TRACE nova.compute.utils [instance: b22186ec-9f05-4f7d-a0d6-2276baeb6572] Traceback (most recent call last):
2014-04-06 22:48:52.454 TRACE nova.compute.utils [instance: b22186ec-9f05-4f7d-a0d6-2276baeb6572]  File "/opt/stack/nova/nova/compute/manager.py", line 1289, in _build_instance
2014-04-06 22:48:52.454 TRACE nova.compute.utils [instance: b22186ec-9f05-4f7d-a0d6-2276baeb6572]  with rt.instance_claim(context, instance, limits):
2014-04-06 22:48:52.454 TRACE nova.compute.utils [instance: b22186ec-9f05-4f7d-a0d6-2276baeb6572]  File "/opt/stack/nova/nova/openstack/common/lockutils.py", line 249, in inner
2014-04-06 22:48:52.454 TRACE nova.compute.utils [instance: b22186ec-9f05-4f7d-a0d6-2276baeb6572]  return f(*args, **kwargs)
2014-04-06 22:48:52.454 TRACE nova.compute.utils [instance: b22186ec-9f05-4f7d-a0d6-2276baeb6572]  File "/opt/stack/nova/nova/compute/resource_tracker.py", line 122, in instance_claim
2014-04-06 22:48:52.454 TRACE nova.compute.utils [instance: b22186ec-9f05-4f7d-a0d6-2276baeb6572]  overhead=overhead, limits=limits)
2014-04-06 22:48:52.454 TRACE nova.compute.utils [instance: b22186ec-9f05-4f7d-a0d6-2276baeb6572]  File "/opt/stack/nova/nova/compute/claims.py", line 95, in __init__
2014-04-06 22:48:52.454 TRACE nova.compute.utils [instance: b22186ec-9f05-4f7d-a0d6-2276baeb6572]  self._claim_test(resources, limits)
2014-04-06 22:48:52.454 TRACE nova.compute.utils [instance: b22186ec-9f05-4f7d-a0d6-2276baeb6572]  File "/opt/stack/nova/nova/compute/claims.py", line 148, in _claim_test
2014-04-06 22:48:52.454 TRACE nova.compute.utils [instance: b22186ec-9f05-4f7d-a0d6-2276baeb6572]  "; ".join(reasons))
2014-04-06 22:48:52.454 TRACE nova.compute.utils [instance: b22186ec-9f05-4f7d-a0d6-2276baeb6572] ComputeResourcesUnavailable: Insufficient compute resources: Free memory 975.00 MB < requested 2000 MB.
2014-04-06 22:48:52.454 TRACE nova.compute.utils [instance: b22186ec-9f05-4f7d-a0d6-2276baeb6572] 
2014-04-06 22:48:52.455 DEBUG nova.compute.manager [req-0663cdf1-9969-446a-af08-299f18366394 demo demo] [instance: b22186ec-9f05-4f7d-a0d6-2276baeb6572] Clean up resource before rescheduling. from (pid=9041) _reschedule_or_error /opt/stack/nova/nova/compute/manager.py:1401
2014-04-06 22:48:52.455 AUDIT nova.compute.manager [req-0663cdf1-9969-446a-af08-299f18366394 demo demo] [instance: b22186ec-9f05-4f7d-a0d6-2276baeb6572] Terminating instance
2014-04-06 22:48:52.544 DEBUG nova.network.api [req-8cf2f302-42af-46e2-b745-fa30902c3319 demo demo] Updating cache with info: [] from (pid=9041) update_instance_cache_with_nw_info /opt/stack/nova/nova/network/api.py:74
2014-04-06 22:48:52.555 DEBUG nova.objects.instance [req-0663cdf1-9969-446a-af08-299f18366394 demo demo] Lazy-loading `system_metadata' on Instance uuid b22186ec-9f05-4f7d-a0d6-2276baeb6572 from (pid=9041) obj_load_attr /opt/stack/nova/nova/objects/instance.py:519
2014-04-06 22:48:52.563 DEBUG nova.compute.manager [req-8cf2f302-42af-46e2-b745-fa30902c3319 demo demo] [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b] Deallocating network for instance from (pid=9041) _deallocate_network /opt/stack/nova/nova/compute/manager.py:1784
2014-04-06 22:48:52.593 ERROR nova.compute.manager [req-8cf2f302-42af-46e2-b745-fa30902c3319 demo demo] [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b] Error: Insufficient compute resources: Free memory 975.00 MB < requested 2000 MB.
2014-04-06 22:48:52.593 TRACE nova.compute.manager [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b] Traceback (most recent call last):
2014-04-06 22:48:52.593 TRACE nova.compute.manager [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  File "/opt/stack/nova/nova/compute/manager.py", line 1289, in _build_instance
2014-04-06 22:48:52.593 TRACE nova.compute.manager [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  with rt.instance_claim(context, instance, limits):
2014-04-06 22:48:52.593 TRACE nova.compute.manager [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  File "/opt/stack/nova/nova/openstack/common/lockutils.py", line 249, in inner
2014-04-06 22:48:52.593 TRACE nova.compute.manager [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  return f(*args, **kwargs)
2014-04-06 22:48:52.593 TRACE nova.compute.manager [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  File "/opt/stack/nova/nova/compute/resource_tracker.py", line 122, in instance_claim
2014-04-06 22:48:52.593 TRACE nova.compute.manager [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  overhead=overhead, limits=limits)
2014-04-06 22:48:52.593 TRACE nova.compute.manager [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  File "/opt/stack/nova/nova/compute/claims.py", line 95, in __init__
2014-04-06 22:48:52.593 TRACE nova.compute.manager [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  self._claim_test(resources, limits)
2014-04-06 22:48:52.593 TRACE nova.compute.manager [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  File "/opt/stack/nova/nova/compute/claims.py", line 148, in _claim_test
2014-04-06 22:48:52.593 TRACE nova.compute.manager [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  "; ".join(reasons))
2014-04-06 22:48:52.593 TRACE nova.compute.manager [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b] ComputeResourcesUnavailable: Insufficient compute resources: Free memory 975.00 MB < requested 2000 MB.
2014-04-06 22:48:52.593 TRACE nova.compute.manager [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b] 
2014-04-06 22:48:52.664 DEBUG nova.compute.utils [req-8cf2f302-42af-46e2-b745-fa30902c3319 demo demo] [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b] The resource None does not exist from (pid=9041) notify_about_instance_usage /opt/stack/nova/nova/compute/utils.py:336
2014-04-06 22:48:52.664 TRACE nova.compute.utils [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b] Traceback (most recent call last):
2014-04-06 22:48:52.664 TRACE nova.compute.utils [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  File "/opt/stack/nova/nova/compute/manager.py", line 1202, in _run_instance
2014-04-06 22:48:52.664 TRACE nova.compute.utils [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  instance, image_meta, legacy_bdm_in_spec)
2014-04-06 22:48:52.664 TRACE nova.compute.utils [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  File "/opt/stack/nova/nova/compute/manager.py", line 1366, in _build_instance
2014-04-06 22:48:52.664 TRACE nova.compute.utils [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  filter_properties, bdms, legacy_bdm_in_spec)
2014-04-06 22:48:52.664 TRACE nova.compute.utils [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  File "/opt/stack/nova/nova/compute/manager.py", line 1412, in _reschedule_or_error
2014-04-06 22:48:52.664 TRACE nova.compute.utils [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  self._log_original_error(exc_info, instance_uuid)
2014-04-06 22:48:52.664 TRACE nova.compute.utils [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  File "/opt/stack/nova/nova/openstack/common/excutils.py", line 68, in __exit__
2014-04-06 22:48:52.664 TRACE nova.compute.utils [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  six.reraise(self.type_, self.value, self.tb)
2014-04-06 22:48:52.664 TRACE nova.compute.utils [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  File "/opt/stack/nova/nova/compute/manager.py", line 1407, in _reschedule_or_error
2014-04-06 22:48:52.664 TRACE nova.compute.utils [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  bdms, requested_networks)
2014-04-06 22:48:52.664 TRACE nova.compute.utils [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  File "/opt/stack/nova/nova/compute/manager.py", line 2136, in _shutdown_instance
2014-04-06 22:48:52.664 TRACE nova.compute.utils [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  requested_networks)
2014-04-06 22:48:52.664 TRACE nova.compute.utils [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  File "/opt/stack/nova/nova/openstack/common/excutils.py", line 68, in __exit__
2014-04-06 22:48:52.664 TRACE nova.compute.utils [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  six.reraise(self.type_, self.value, self.tb)
2014-04-06 22:48:52.664 TRACE nova.compute.utils [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  File "/opt/stack/nova/nova/compute/manager.py", line 2126, in _shutdown_instance
2014-04-06 22:48:52.664 TRACE nova.compute.utils [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  block_device_info)
2014-04-06 22:48:52.664 TRACE nova.compute.utils [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 656, in destroy
2014-04-06 22:48:52.664 TRACE nova.compute.utils [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  _vmops = self._get_vmops_for_compute_node(instance['node'])
2014-04-06 22:48:52.664 TRACE nova.compute.utils [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 544, in _get_vmops_for_compute_node
2014-04-06 22:48:52.664 TRACE nova.compute.utils [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  resource = self._get_resource_for_node(nodename)
2014-04-06 22:48:52.664 TRACE nova.compute.utils [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 536, in _get_resource_for_node
2014-04-06 22:48:52.664 TRACE nova.compute.utils [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b]  raise exception.NotFound(msg)
2014-04-06 22:48:52.664 TRACE nova.compute.utils [instance: 1deeb6c0-ed7f-4f5a-bdcc-97765803d18b] NotFound: The resource None does not exist

Tags: vmware
tags: added: vmware
Changed in nova:
assignee: nobody → Sagar Ratnakara Nikam (sagar-r-nikam)
Revision history for this message
Openstack Gerrit (openstack-gerrit) wrote: Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/87894

Changed in nova:
status: New → In Progress
Gary Kotton (garyk)
Changed in nova:
importance: Undecided → High
milestone: none → juno-1
tags: added: icehouse-backport-potential
Revision history for this message
Openstack Gerrit (openstack-gerrit) wrote: Fix merged to nova (master)

Reviewed: https://review.openstack.org/87894
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=aadfffaaa69c6a8939f6c88cfbaeb026c7604163
Submitter: Jenkins
Branch: master

commit aadfffaaa69c6a8939f6c88cfbaeb026c7604163
Author: Sagar Ratnakara Nikam <email address hidden>
Date: Wed Apr 16 14:29:51 2014 +0530

    VMWare - Check for compute node before triggering destroy

    While booting an instance, if the claim on that compute node is not successful,
    the compute manager triggers a call to destroy the instance. Destroy fails since
    the compute node (Cluster/ESX host) is set only after the claim succeeds. Hence
    the node is checked before destroy, to ensure that the exception is not thrown.
    This ensures that the instance gets rescheduled.
    Closes-Bug: #1307408

    Change-Id: Iaaf931e9c1e6cf046497e2c64952f92e802ad4be
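
A rough sketch of the guard described in this commit (class, method, and cluster names are assumptions for illustration, not the exact nova implementation): the VMware driver checks whether the instance ever got a compute node before resolving the cluster, so a failed claim no longer raises NotFound during cleanup and the instance can be rescheduled.

# Illustrative sketch of the fix; names are assumptions, not nova internals.

class NotFound(Exception):
    pass


class SketchVMwareDriver:
    """Illustrative stand-in for the multi-cluster VMware driver."""

    def __init__(self, clusters):
        self._resources = dict(clusters)

    def destroy(self, instance, destroy_disks=True):
        node = instance.get('node')
        if not node:
            # The claim never succeeded, so nothing was created on any
            # cluster; returning here lets the compute manager finish its
            # _shutdown_instance-style cleanup and reschedule the instance
            # instead of failing with NotFound.
            print("instance %s has no node, skipping destroy" % instance['uuid'])
            return
        if node not in self._resources:
            raise NotFound("The resource %s does not exist" % node)
        # ... normal teardown of the VM on that cluster ...


driver = SketchVMwareDriver(clusters={"domain-c7(Cluster1)": object()})
# No exception now; cleanup completes and the scheduler can retry the boot
# on another node.
driver.destroy({"uuid": "b22186ec-9f05-4f7d-a0d6-2276baeb6572", "node": None})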

Changed in nova:
status: In Progress → Fix Committed
Revision history for this message
Openstack Gerrit (openstack-gerrit) wrote: Fix proposed to nova (stable/icehouse)

Fix proposed to branch: stable/icehouse
Review: https://review.openstack.org/92135

Alan Pevec (apevec)
tags: removed: icehouse-backport-potential
Revision history for this message
OpenStack Infra (hudson-openstack) wrote: Fix merged to nova (stable/icehouse)

Reviewed: https://review.openstack.org/92135
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=422decdc7eeb1a170ebff3197f9115e58f4f89b4
Submitter: Jenkins
Branch: stable/icehouse

commit 422decdc7eeb1a170ebff3197f9115e58f4f89b4
Author: Sagar Ratnakara Nikam <email address hidden>
Date: Wed Apr 16 14:29:51 2014 +0530

    VMWare - Check for compute node before triggering destroy

    While booting an instance, if the claim on that compute node is not successful,
    the compute manager triggers a call to destroy the instance. Destroy fails since
    the compute node (Cluster/ESX host) is set only after the claim succeeds. Hence
    the node is checked before destroy, to ensure that the exception is not thrown.
    This ensures that the instance gets rescheduled.
    Closes-Bug: #1307408

    Change-Id: Iaaf931e9c1e6cf046497e2c64952f92e802ad4be
    (cherry picked from commit aadfffaaa69c6a8939f6c88cfbaeb026c7604163)

Thierry Carrez (ttx)
Changed in nova:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: juno-1 → 2014.2