xenapi: metadata updates causing rebuild and resize to fail

Bug #1207238 reported by John Garbutt
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Medium
Assigned to: John Garbutt

Bug Description

When instance metadata is updated during a task in which the VM is not available, such as resize or rebuild, it is possible to get NotFound exceptions. These put the instance into the Error state, which resets the task state, and that in turn causes the instance updates that are part of the long-running operation to fail.

This should not happen, because change_instance_metadata can only be called when task_state=None. However, there is clearly a window in which the metadata request has been made and the resize or rebuild has started in the meantime, so the NotFound error is raised.
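
The window described here is a check-then-act race. The following is a minimal Python sketch of the ordering that triggers the error. All names in it (`Instance`, `api_request_metadata_change`, `compute_apply_metadata`, `start_resize`, `InstanceNotFound`, the `resize_migrating` value) are hypothetical stand-ins for illustration, not nova's real classes or functions:

```python
class InstanceNotFound(Exception):
    """Hypothetical stand-in for the NotFound error raised by the driver."""


class Instance:
    def __init__(self):
        self.task_state = None   # None means no long-running task is active
        self.vm_present = True   # whether the hypervisor still has the VM
        self.metadata = {}


def api_request_metadata_change(instance, diff, queue):
    # API-side check: the request is only accepted while no task is running.
    if instance.task_state is not None:
        raise RuntimeError("instance is busy")
    # The request is queued and applied later on the compute node.
    queue.append(diff)


def compute_apply_metadata(instance, diff):
    # Compute-side: the VM may have gone away since the API-side check.
    if not instance.vm_present:
        raise InstanceNotFound("VM not found")
    instance.metadata.update(diff)


def start_resize(instance):
    # Assumed behaviour for this sketch: the task removes the source VM.
    instance.task_state = "resize_migrating"
    instance.vm_present = False


inst = Instance()
queue = []
api_request_metadata_change(inst, {"role": "web"}, queue)  # check passes
start_resize(inst)                                          # task starts in the window
try:
    compute_apply_metadata(inst, queue.pop(0))
    outcome = "applied"
except InstanceNotFound:
    outcome = "not_found"
```

In this sketch the API-side check succeeds, yet the compute-side call still fails, because the task started between the two steps.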

To avoid this, we can silently fail metadata updates where the VM cannot be found, as the latest metadata will be added back into the VM when it is being configured on the destination.
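
A minimal sketch of this handling, in Python. The names used (`VMNotFound`, `FakeSession`, `update_metadata`, and the signature given to `change_instance_metadata`) are hypothetical and chosen for illustration; this does not reproduce the actual nova xenapi driver code:

```python
import logging

LOG = logging.getLogger(__name__)


class VMNotFound(Exception):
    """Hypothetical error raised when the hypervisor has no such VM."""


class FakeSession:
    """Hypothetical hypervisor session holding per-VM metadata."""

    def __init__(self):
        self.vms = {}

    def update_metadata(self, name, diff):
        if name not in self.vms:
            raise VMNotFound(name)
        self.vms[name].update(diff)


def change_instance_metadata(session, name, diff):
    """Apply diff to the VM; return True if applied, False if skipped."""
    try:
        session.update_metadata(name, diff)
    except VMNotFound:
        # The VM is absent, e.g. mid-resize or mid-rebuild. Skip the update
        # instead of propagating the error; the latest metadata is expected
        # to be written when the VM is configured on the destination.
        LOG.debug("VM %s not found, skipping metadata update", name)
        return False
    return True


session = FakeSession()
session.vms["vm-1"] = {}
applied = change_instance_metadata(session, "vm-1", {"role": "web"})
skipped = change_instance_metadata(session, "vm-gone", {"role": "web"})
```

Catching only the not-found error keeps other failures visible, so a genuine hypervisor problem would still surface.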

Tags: xenserver
Changed in nova:
importance: Undecided → Medium
status: New → Triaged
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/39657

Changed in nova:
assignee: nobody → John Garbutt (johngarbutt)
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/39657
Committed: http://github.com/openstack/nova/commit/4d5c2f13e8ccbe2a2aaa7dac8fcd53d158a73a73
Submitter: Jenkins
Branch: master

commit 4d5c2f13e8ccbe2a2aaa7dac8fcd53d158a73a73
Author: John Garbutt <email address hidden>
Date: Thu Aug 1 10:35:31 2013 +0100

    xenapi: skip metadata updates when VM not found

    There is a race condition where change_instance_metadata requests still
    get to the compute nodes when the VM has a task in operation that means
    the VM is no longer present. This change ensures that we skip the
    metadata updates when this occurs.

    Currently, if this happens, the long running task, like rebuild or
    resize will fail because the VM task_state=None and is put into ERROR.
    After this change, it will silently fail, and the updated metadata will
    be written when the VM gets re-created as part of the long running
    operation.

    Fixes bug 1207238
    Change-Id: I75e1f93e34d3b3ab93a8e8104fd64224f72d7309

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → havana-3
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: havana-3 → 2013.2