Comment 31 for bug 1548450

Revision history for this message
Matthew Booth (mbooth-9) wrote : Re: Host data leak during resize/migrate for raw-backed instances

Sean, Dan B,

The first stage goal of my libvirt storage pools series is to promote disk.info to a per-instance storage policy. Critically, once I have finished the refactor of Image.cache() I will be able to pull the management of disk.info out of the storage classes and promote it to what is currently Backend. With that in place, we will have enough context to create disk.info *before* creating the disk. Also, as discussed with Dan S, I am expanding the data that it stores to include not just format, but also which backend it uses, and its size. With this metadata stored persistently independent of the disk itself we will never have to inspect a disk with qemu-img info, and this class of bug will go away.

There is a separate discussion to be had around where the best place to store this kind of information is. disk.info is what we have now, so I'm running with that. It would be ideal if we could take it straight out of libvirt. Unfortunately though, because we don't use persistent domains we lose all of this information every time the instance shuts down. We could potentially use instance system metadata, and I will look into that. For today, though, I'm working on a Mitaka patch which understands an expanded format of disk.info which I intend to use in Newton. If we ultimately choose to move the data elsewhere that's fine, but the data needs to exist, and the code needs to be restructured so that it can use it safely.

The problem with the current code is the Image.cache() function. This function does everything for all backends. It is passed a callback function which creates something by some means. The cache() function does not know what it creates, or where it comes from. This means that, as in the case of the poorly-named Raw backend[1], if cache() is creating an image-backed disk whose format is dependent on the format of the image it is downloading, it does not have enough context to know:

1. What the expected format is of the image that it is downloading.
2. That it is even downloading an image at all.

Its only option with the current structure is to call the callback and test what it got. My series splits this into create_from_func() for ephemeral and swap disks, and create_from_image() for images. This gives the backend enough context to know what it is creating before it creates it, so it can safely create and use the metadata described above. The first part of that, create_from_func() is here: https://review.openstack.org/#/c/282580/ . It needs updating and it's not going to make Mitaka, but I stress that the disk.info patch which I should hopefully post today is required in Mitaka in order to be able to make these changes in Newton.

[1] https://review.openstack.org/#/c/279626/