Resize on vCenter failed because of _VM_REFS_CACHE

Bug #1290807 reported by Feng Xi Yan
This bug affects 1 person
Affects                   Status       Importance  Assigned to  Milestone
OpenStack Compute (nova)  In Progress  High        Feng Xi Yan
Icehouse                  In Progress  High        Feng Xi Yan
VMwareAPI-Team            In Progress  Critical    Unassigned

Bug Description

This bug is for the nova master branch.

The resize action in a VMware environment always fails.

The reason is that nova resizes the renamed original VM (****-orig) rather than the newly cloned VM.

This is caused by an outdated vm_ref in _VM_REFS_CACHE.

In nova/virt/vmwareapi/vmops.py:

    def finish_migration(self, context, migration, instance, disk_info,
                         network_info, image_meta, resize_instance=False,
                         block_device_info=None, power_on=True):
        """Completes a resize, turning on the migrated instance."""
        if resize_instance:
            client_factory = self._session._get_vim().client.factory
            vm_ref = vm_util.get_vm_ref(self._session, instance)
            vm_resize_spec = vm_util.get_vm_resize_spec(client_factory,
                                                        instance)
            reconfig_task = self._session._call_method(
                                            self._session._get_vim(),
                                            "ReconfigVM_Task", vm_ref,
                                            spec=vm_resize_spec)
           .......

This code obtains vm_ref by calling vm_util.get_vm_ref.

In nova/virt/vmwareapi/vm_util.py:

@vm_ref_cache_from_instance
def get_vm_ref(session, instance):
    """Get reference to the VM through uuid or vm name."""
    uuid = instance['uuid']
    vm_ref = (_get_vm_ref_from_vm_uuid(session, uuid) or
                  _get_vm_ref_from_extraconfig(session, uuid) or
                  _get_vm_ref_from_uuid(session, uuid) or
                  _get_vm_ref_from_name(session, instance['name']))
    if vm_ref is None:
        raise exception.InstanceNotFound(instance_id=uuid)
    return vm_ref

The "get_vm_ref" function is decorated with "vm_ref_cache_from_instance", which first checks the cache variable _VM_REFS_CACHE. But _VM_REFS_CACHE contains an outdated vm_ref (the original one) keyed by our instance uuid, because the virtual machine's name has changed.
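
The stale lookup described above can be reproduced in isolation. The following is only a minimal sketch, not the nova code: FakeSession, lookup_by_name, simulate_resize, and the 'vm-1'/'vm-2' references are hypothetical stand-ins, and the decorator body is an assumption about how a uuid-keyed cache behaves.

```python
import functools

# Simplified stand-in for the module-level cache (assumed shape: uuid -> vm_ref).
_VM_REFS_CACHE = {}


def vm_ref_cache_from_instance(func):
    """Return the cached ref for instance['uuid'], else call func and cache it."""
    @functools.wraps(func)
    def wrapper(session, instance):
        uuid = instance['uuid']
        if uuid in _VM_REFS_CACHE:
            return _VM_REFS_CACHE[uuid]
        vm_ref = func(session, instance)
        _VM_REFS_CACHE[uuid] = vm_ref
        return vm_ref
    return wrapper


class FakeSession(object):
    """Hypothetical inventory that maps a VM name to its reference."""

    def __init__(self):
        self.vms_by_name = {}

    def lookup_by_name(self, name):
        return self.vms_by_name.get(name)


@vm_ref_cache_from_instance
def get_vm_ref(session, instance):
    # Reduced to a single name-based lookup for illustration.
    return session.lookup_by_name(instance['uuid'])


def simulate_resize():
    """Return (ref before resize, cached ref after resize, real ref after resize)."""
    _VM_REFS_CACHE.clear()
    session = FakeSession()
    instance = {'uuid': 'abc'}
    session.vms_by_name['abc'] = 'vm-1'
    first = get_vm_ref(session, instance)  # populates the cache with 'vm-1'

    # Resize: the original VM is renamed and a clone takes over its name.
    session.vms_by_name['abc-orig'] = session.vms_by_name.pop('abc')
    session.vms_by_name['abc'] = 'vm-2'

    stale = get_vm_ref(session, instance)  # served from the cache
    fresh = session.lookup_by_name('abc')  # what the inventory really holds
    return first, stale, fresh
```

In this sketch the second get_vm_ref call still returns the original VM's reference, so any reconfigure call made with it would target the renamed original rather than the clone.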

Feng Xi Yan (yanfengxi)
description: updated
description: updated
Feng Xi Yan (yanfengxi) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/#/c/79833/

Gary Kotton (garyk)
Changed in nova:
status: New → Confirmed
importance: Undecided → Critical
Feng Xi Yan (yanfengxi)
description: updated
Changed in nova:
assignee: nobody → Feng Xi Yan (yanfengxi)
assignee: Feng Xi Yan (yanfengxi) → nobody
Gary Kotton (garyk)
tags: added: vmware
Changed in nova:
milestone: none → icehouse-rc1
Changed in openstack-vmwareapi-team:
status: New → In Progress
importance: Undecided → Critical
Changed in nova:
status: Confirmed → In Progress
assignee: nobody → Feng Xi Yan (yanfengxi)
Changed in nova:
importance: Critical → High
Feng Xi Yan (yanfengxi) wrote :

Committed a new fix.
It abandons the earlier "get_vm_ref_for_resize" approach and instead updates the cache during resize.

Review: https://review.openstack.org/#/c/79833/

Feng Xi Yan (yanfengxi)
summary: - Resize on vCenter failed becausee of _VM_REFS_CACHE
+ Resize on vCenter failed because of _VM_REFS_CACHE
Feng Xi Yan (yanfengxi) wrote :
Shawn Hartsock (hartsock) wrote :

This bug arises from the changes for https://bugs.launchpad.net/nova/+bug/1258179; see https://review.openstack.org/#/c/60259/, which describes a caching solution to fix a Python IO problem.

Another solution would have been to reduce the page size (the number of VMs returned per result object) to a lower number (like 10) instead of introducing the cache.

The cache is manipulated through a set of decorators. Try to stay consistent with that: instead of updating the cache directly, add a decorator that manipulates it.

I suggest names like:

def vm_ref_cache_update_by_name(func):

def vm_ref_cache_update_by_instance(func):

def vm_ref_cache_invalidate_by_name(func):

def vm_ref_cache_invalidate_by_instance(func):

Tracy Jones (tjones-i) wrote :

This bug could be pushed to icehouse-rc-potential if it is not merged by 2/24 12pm UTC.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/82104

Shawn Hartsock (hartsock) wrote :
Russell Bryant (russellb) wrote :

Per Tracy, moving to icehouse-rc-potential.

tags: added: icehouse-rc-potential