XenAPI: Race - import followed by upload / waiting for snapshot coalesce
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
OpenStack Compute (nova) | Fix Released | Medium | Bob Ball | | |
Bug Description
In a nutshell, so everyone doesn't have to read the rest of the bug description:
If a tree is imported with more than two VDIs, then the _snapshot_
The following tree is imported from glance:
May 8 13:41:10 localhost SMGC: [14336]........ *814ea7ed(
May 8 13:41:10 localhost SMGC: [14336]............ *863cc348(
May 8 13:41:10 localhost SMGC: [14336]
If we immediately try to perform an operation that uses a snapshot (e.g. resize), we get the 'original' parent as 863cc348 - which means that later we will expect 863cc348 to be the new parent:
2014-05-08 13:41:23.769 DEBUG nova.virt.
We then take a snapshot and wait for coalesce so we have as few VHDs in the chain as possible:
May 8 13:41:24 localhost SM: [24744] ['/usr/
2014-05-08 13:41:26.688 DEBUG nova.virt.
2014-05-08 13:41:26.688 DEBUG nova.virt.
The snapshot has the effect of creating two children (ecce is the one we're referencing and b554 keeps the snapshot from being deleted):
May 8 13:41:34 localhost SMGC: [14336]........ *814ea7ed(
May 8 13:41:34 localhost SMGC: [14336]............ *863cc348(
May 8 13:41:34 localhost SMGC: [14336]
May 8 13:41:34 localhost SMGC: [14336]
May 8 13:41:34 localhost SMGC: [14336]
Coalescing then happens, destroying first 863cc348 and then 151c921e, and merging their differences up the chain into 814ea7ed:
May 8 13:41:59 localhost SMGC: [14336]........ *814ea7ed(
May 8 13:41:59 localhost SMGC: [14336]............ *863cc348(
May 8 13:41:59 localhost SMGC: [14336]............ *151c921e(
May 8 13:41:59 localhost SMGC: [14336]
May 8 13:41:59 localhost SMGC: [14336]
May 8 13:42:11 localhost SMGC: [14336]........ *814ea7ed(
May 8 13:42:11 localhost SMGC: [14336]............ *151c921e(
May 8 13:42:11 localhost SMGC: [14336]
May 8 13:42:11 localhost SMGC: [14336]
May 8 13:42:36 localhost SMGC: [5186]........ *814ea7ed(
May 8 13:42:36 localhost SMGC: [5186]............ b55436a7(
May 8 13:42:36 localhost SMGC: [5186]............ eccebf0b(
During this time, we're waiting for coalesce and checking the parent. Eventually we reach a steady state in which Nova is waiting for the parent of eccebf0b to become 863cc348, a VDI that has already been deleted, so the wait can never succeed.
2014-05-08 13:42:41.981 DEBUG nova.virt.
2014-05-08 13:42:41.982 DEBUG nova.virt.
2014-05-08 13:44:00.678 ERROR nova.virt.
In a way we were lucky that 863cc348 was destroyed before 151c921e. Had the order been reversed, Nova could have checked at just the right moment, seen the expected parent, and claimed that all coalescing had finished. The race might then have gone unnoticed and caused further problems when uploading the VHDs.
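The sequence above can be replayed with a small toy model. This is only a sketch, not Nova or XenServer code: the `coalesce` helper and the child-to-parent dictionary are hypothetical, and the post-snapshot layout (b55436a7 and eccebf0b as children of 151c921e) is an assumption read from the truncated SMGC output and the description above.

```python
# Toy model of the VHD chain: maps each VDI to its parent (None marks the root).
# The layout is an assumption based on the truncated SMGC log lines.
parents = {
    "814ea7ed": None,
    "863cc348": "814ea7ed",
    "151c921e": "863cc348",
    "b55436a7": "151c921e",
    "eccebf0b": "151c921e",
}


def coalesce(parents, vdi):
    """Hypothetical helper: merge `vdi` into its parent and destroy it."""
    grandparent = parents.pop(vdi)
    # Re-parent every child of the destroyed VDI onto the VDI it merged into.
    for child, parent in list(parents.items()):
        if parent == vdi:
            parents[child] = grandparent


# The parent Nova recorded before taking the snapshot.
expected_parent = "863cc348"

# The garbage collector coalesces the grandparent first, then the parent.
coalesce(parents, "863cc348")
coalesce(parents, "151c921e")

final_parent = parents["eccebf0b"]
# The expected parent no longer exists, so a wait for it can never succeed.
stuck = final_parent != expected_parent and expected_parent not in parents
print(final_parent, stuck)
```

In this model the leaf ends up directly under 814ea7ed, which matches the final tree in the log, while the VDI Nova is waiting for has been destroyed.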
description: | updated |
Changed in nova: | |
importance: | Undecided → Medium |
tags: | added: xenserver |
description: | updated |
description: | updated |
Changed in nova: | |
status: | New → Confirmed |
assignee: | nobody → Bob Ball (bob-ball) |
Changed in nova: | |
status: | Confirmed → In Progress |
Changed in nova: | |
milestone: | none → juno-1 |
status: | Fix Committed → Fix Released |
Changed in nova: | |
milestone: | juno-1 → 2014.2 |
Reviewed: https://review.openstack.org/93827
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=ae2a27ce19f3e24d4a8c713a73e617f4cd71d4b4
Submitter: Jenkins
Branch: master
commit ae2a27ce19f3e24d4a8c713a73e617f4cd71d4b4
Author: Bob Ball <email address hidden>
Date: Fri May 16 00:22:14 2014 +0100
XenAPI: Tolerate multiple coalesces
VHD coalescing might coalesce more than one VDI while waiting
and might coalesce the 'grandparent' before the parent. Wait
for the parent to be any of the original tree instead of just
the original parent.
Closes bug: 1317792
Change-Id: I82c4db0e8a36c41a86adf4bd32304a4bfdabebbf
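The idea in the commit message above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the actual change: `ancestors`, `wait_for_coalesce`, and their parameters are hypothetical names, the chain is modelled as a plain child-to-parent dictionary, and 151c921e is assumed to be the VDI that was snapshotted.

```python
import time


def ancestors(parents, vdi):
    """Hypothetical helper: list every VDI above `vdi`, nearest first."""
    chain = []
    parent = parents.get(vdi)
    while parent is not None:
        chain.append(parent)
        parent = parents.get(parent)
    return chain


def wait_for_coalesce(get_parent, leaf, acceptable_parents, attempts=3, delay=0.0):
    """Poll until the parent of `leaf` is any VDI in `acceptable_parents`."""
    for _ in range(attempts):
        if get_parent(leaf) in acceptable_parents:
            return True
        time.sleep(delay)
    return False


# Chain recorded before the snapshot (assumed layout, child -> parent).
original = {"814ea7ed": None, "863cc348": "814ea7ed", "151c921e": "863cc348"}
original_tree = ancestors(original, "151c921e")

# Chain after both coalesces have finished, as in the final SMGC output.
final = {"814ea7ed": None, "b55436a7": "814ea7ed", "eccebf0b": "814ea7ed"}

# Waiting only for the original parent gives up without ever matching.
strict = wait_for_coalesce(final.get, "eccebf0b", ["863cc348"])
# Accepting any VDI from the original tree matches the surviving root.
tolerant = wait_for_coalesce(final.get, "eccebf0b", original_tree)
print(strict, tolerant)
```

Accepting any member of the original tree means the check no longer depends on the order in which the garbage collector coalesces the chain.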