move_vhds_into_sr - invalid cookie

Bug #1362595 reported by Bob Ball
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Medium
Assigned to: John Garbutt
Milestone: 2014.2

Bug Description

When VHDs are moved on the filesystem, a coalesce may still be in progress. The file being copied is therefore actively changing, so the copy is not a valid VHD and its VHD cookie is reported as invalid.

Seen in XenServer CI: http://dd6b71949550285df7dc-dda4e480e005aaa13ec303551d2d8155.r49.cf1.rackcdn.com/36/109836/4/23874/run_tests.log

2014-08-28 12:26:37.538 | Traceback (most recent call last):
2014-08-28 12:26:37.543 | File "tempest/api/compute/servers/test_server_actions.py", line 251, in test_resize_server_revert
2014-08-28 12:26:37.550 | self.client.wait_for_server_status(self.server_id, 'VERIFY_RESIZE')
2014-08-28 12:26:37.556 | File "tempest/services/compute/json/servers_client.py", line 179, in wait_for_server_status
2014-08-28 12:26:37.563 | raise_on_error=raise_on_error)
2014-08-28 12:26:37.570 | File "tempest/common/waiters.py", line 77, in wait_for_server_status
2014-08-28 12:26:37.577 | server_id=server_id)
2014-08-28 12:26:37.583 | BuildErrorException: Server e58677ac-dd72-4f10-9615-cb6763f34f50 failed to build and is in ERROR status
2014-08-28 12:26:37.589 | Details: {u'message': u'[\'XENAPI_PLUGIN_FAILURE\', \'move_vhds_into_sr\', \'Exception\', "VDI \'/var/run/sr-mount/16f5c980-eeb6-0fd3-e9b1-dec616309984/os-images/instancee58677ac-dd72-4f10-9615-cb6763f34f50/535cd7f2-80a5-463a-935c-9c4f52ba0ecf.vhd\' has an invalid footer: \' invalid cook', u'code': 500, u'created': u'2014-08-28T11:57:01Z'}

Tags: xenserver
Bob Ball (bob-ball) wrote :

I think the easiest fix here is to repair the VHDs on import.

My current theory is that, because 'wait_for_coalesce' assumes (and has always assumed) that only a single coalesce will happen (which is not necessarily correct), we may be copying the VHDs while a coalesce is still in progress.

Coalesce of the chain a-b-c (where c is the leaf) happens by:
1) Copying the blocks changed in b into a (to give a'-b-c)
2) Re-parenting c to a' (giving a'-c, with b now unreferenced)
3) Deleting b.
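
To make the ordering concrete, here is a minimal Python sketch of the three steps above, modelling each VHD as a parent pointer plus a block map (the data structure and names are illustrative only, not XenServer code):

    # Illustrative model only: each VHD node is a parent pointer plus its own blocks.
    chain = {
        "a": {"parent": None, "blocks": {0: "A0", 1: "A1"}},
        "b": {"parent": "a", "blocks": {1: "B1"}},   # b overrides block 1 of a
        "c": {"parent": "b", "blocks": {2: "C2"}},   # c is the leaf
    }

    # Step 1: copy the blocks changed in b into a (a becomes a').
    chain["a"]["blocks"].update(chain["b"]["blocks"])

    # Step 2: re-parent c to a'; b is now unreferenced but still intact on disk.
    chain["c"]["parent"] = "a"

    # Step 3: delete b.
    del chain["b"]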

During step 1 the VHD file is extended, the new blocks are written, and an updated footer is put at the end of the extended file. If a file-level copy of the VHD is made after the file has been extended but before the new footer is written, the footer in the copy will be invalid. For this reason Citrix XenServer is looking at moving towards ignoring the footer and only using a 'backup footer', which is actually at the head of the VHD. That change is likely to be too invasive to be considered for a hotfix.
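
As a rough illustration of what the 'invalid cookie' error is detecting, the sketch below assumes a dynamic VHD with the standard 512-byte footer at the end of the file and the backup copy of that footer at offset 0 (this is not the XenServer validation code; the helper name is made up):

    # Illustrative sketch: a VHD footer starts with the 8-byte cookie "conectix",
    # and dynamic VHDs keep a backup copy of the footer at the start of the file.
    def vhd_footer_looks_valid(path):
        with open(path, "rb") as f:
            backup_cookie = f.read(8)   # cookie of the backup footer at offset 0
            f.seek(-512, 2)             # the live footer is the last 512 bytes
            trailing_cookie = f.read(8)
        return trailing_cookie == b"conectix" and trailing_cookie == backup_cookie

A copy taken mid-coalesce can look fine at the head of the file but fail on the trailing footer, which matches the truncated 'invalid cook[ie]' message in the traceback above.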

It seems that this can be repaired with the (very cheap) vhd-util repair option.
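
A hedged sketch of that approach, in the spirit of the fix that was eventually merged ('run vhd-util repair if the VHD check fails'); the exact vhd-util flags can vary between XenServer versions, so treat the invocation as an assumption:

    import subprocess

    # Illustrative only: validate a VHD and, if the check fails, attempt the cheap
    # vhd-util repair, which rewrites the trailing footer from the backup copy.
    def check_and_repair_vhd(path):
        if subprocess.call(["vhd-util", "check", "-n", path]) == 0:
            return
        subprocess.check_call(["vhd-util", "repair", "-n", path])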

This may have been exacerbated by https://review.openstack.org/#/c/93827/ to fix https://launchpad.net/bugs/1317792.
The behaviour of bug 1317792 was as follows:
1) The chain a-b-c was imported
2) c was snapshotted giving a-b-c-d; we waited for 'd' to coalesce back into 'b'
3) b was coalesced into a, giving a-c-d
4) c was coalesced into a, giving a-d
5) 'wait_for_coalesce' failed with a timeout

The fix for that issue was to wait for 'd' to coalesce back into anything other than 'c'; in this case 'a' or 'b'. As such, we might stop waiting at step 3, meaning the copy could happen while 'c' was still being coalesced.
Even without this fix, the above scenario could have occurred if the GC decided to coalesce 'c' first, with the copy then happening while 'b' was being coalesced.

Copying the VHDs in this state and fixing them up afterwards is, in my view, preferable to reverting to the previous behaviour.

In terms of moving forward without breaking bug 1317792 again, I think the following are options:
1) Use vhd-util repair to fix up the VHDs after the fact. As described above, the VHDs will still be valid as b is not removed from the chain until c is re-parented to a'. As such, any 'incorrect' data in a' will not be read because it is guaranteed that b contains the correct data.
2) Change wait_for_coalesce to wait for _all_ coalescing to be complete, based on XenServer's understanding of whether the GC is still running (this would need a XAPI plugin to poll the GC and make sure it is not running at the point we copy).
3) Add a XAPI plugin to manually lock the SR or VDI (using /opt/xensource/sm/lock.py). We're nervous about this as there have been deadlocks in the past with multiple threads locking (e.g. process 1 locks A, process 2 locks B, process 1 wants the lock for B). If there are other processes trying to lock the same things as SM then we're likely to see more issues with deadlocks or timeouts for valid SR operations.
4) (least preferred) Add more logic to Nova to guess whether the GC will be able to coalesce or not. We currently have some logic that looks for siblings, but if we were to follow option 2 or 3 then we can probably delete ...


tags: added: xenserver
Changed in nova:
status: New → Confirmed
importance: Undecided → Medium
assignee: nobody → John Garbutt (johngarbutt)
milestone: none → juno-3
description: updated
description: updated
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/117498

Changed in nova:
status: Confirmed → In Progress
Changed in nova:
milestone: juno-3 → none
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/121177

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/121177
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=48b6af716994a8a60ddaacacd090df0ca528c4b1
Submitter: Jenkins
Branch: master

commit 48b6af716994a8a60ddaacacd090df0ca528c4b1
Author: John Garbutt <email address hidden>
Date: Fri Sep 12 18:16:08 2014 +0100

    XenAPI improve post snapshot coalesce detection

    The coalesce detection is not working well after this change:
    ae2a27ce19f3e24d4a8c713a73e617f4cd71d4b4

    The snapshot operation will introduce a new VHD file into the VDI chain,
    and in many cases, that would be coalesced during the next SR scan. So
    we need to walk the chain after the snapshot has been taken, not before.

    This fixes the cause of the mentioned bug, but it doesn't help with
    launching the partially corrupted snapshots created because of this bug,
    so this only partially fixes the bug.

    Change-Id: I0bf7535ae3f7d9e6820f8dc07075892953d80a78
    Partial-Bug: #1362595
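
For context, 'walk the chain after the snapshot has been taken' roughly means: rescan the SR and wait until the leaf's parent is once again one of the VHDs that existed before the snapshot. A loose sketch of that idea follows (not the actual Nova code; the helper and its parameters are hypothetical, though SR.scan and VDI.get_sm_config are standard XenAPI calls):

    import time

    # Hypothetical helper: wait until the VHD node inserted by the snapshot has
    # been coalesced away, i.e. the leaf's parent is a pre-snapshot VHD again.
    def wait_for_post_snapshot_coalesce(session, sr_ref, vdi_ref,
                                        pre_snapshot_uuids,
                                        max_attempts=20, interval=5):
        for _ in range(max_attempts):
            session.xenapi.SR.scan(sr_ref)  # refresh XenServer's view of the chain
            sm_config = session.xenapi.VDI.get_sm_config(vdi_ref)
            if sm_config.get("vhd-parent") in pre_snapshot_uuids:
                return
            time.sleep(interval)
        raise RuntimeError("Timed out waiting for post-snapshot coalesce")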

OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/117498
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=bfdae32efbeffcb74e7b2a0c48cb89cbf4c11329
Submitter: Jenkins
Branch: master

commit bfdae32efbeffcb74e7b2a0c48cb89cbf4c11329
Author: John Garbutt <email address hidden>
Date: Tue Aug 26 17:05:58 2014 +0100

    XenAPI: run vhd-util repair if VHD check fails

    We can hit issues with corrupted VHDs if we copy a VHD while XenServer
    is performing other operations. This happens because there are times
    when we copy the VHD chains while XenServer is still performing a
    coalesce of the VHD chain.

    In most cases, vhd-util should be able to safely repair any metadata
    corruption. It can copy the copy of the VHD footer at the front of the
    VHD file and add it at the bottom on the VHD file. There is no VM data
    loss, due to the way the coalesce happens, but the chain will be bigger
    than it would be both before and after the coalesce.

    This does not, however, ensure that snapshots are valid before uploading
    them to glance. But should you launch a corrupted snapshot, this change
    would fix up the snapshot, and allow it to boot correctly.

    Closes-Bug: #1362595

    Change-Id: I88b737d7e97964a9db5ccf2c39dea7fd0701ead4

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → juno-rc1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: juno-rc1 → 2014.2
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/icehouse)

Fix proposed to branch: stable/icehouse
Review: https://review.openstack.org/143110

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/icehouse)

Change abandoned by Bob Ball (<email address hidden>) on branch: stable/icehouse
Review: https://review.openstack.org/143110
