nova libvirt re-write broken with multiple ephemeral disks
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| OpenStack Compute (nova) | Fix Released | High | Thang Pham | |
| Icehouse | Fix Released | High | Martin Falatic | |
Bug Description
We seem to be experiencing a bug with libvirt.xml device formatting when the --ephemeral flag is used: after the initial boot, a subsequent nova stop/start or nova reboot --hard breaks the disk configuration. We are using the following libvirt options in nova.conf for storage:
libvirt_
libvirt_
When using nova boot normally with a flavor that has ephemeral storage defined, it creates two LVM volumes appropriately, e.g.:
instance-
instance-
The instance libvirt.xml contains disk device entries as follows:
<devices>
<disk type="block" device="disk">
<driver name="qemu" type="raw" cache="none"/>
<source dev="/dev/
<target bus="virtio" dev="vda"/>
</disk>
<disk type="block" device="disk">
<driver name="qemu" type="raw" cache="none"/>
<source dev="/dev/
<target bus="virtio" dev="vdb"/>
</disk>
If we use "nova boot --flavor 757c75fa-
instance-
instance-
instance-
The instance libvirt.xml after instance spawn has disk device entries like those below, and the instance boots happily.
<devices>
<disk type="block" device="disk">
<driver name="qemu" type="raw" cache="none"/>
<source dev="/dev/
<target bus="virtio" dev="vda"/>
</disk>
<disk type="block" device="disk">
<driver name="qemu" type="raw" cache="none"/>
<source dev="/dev/
<target bus="virtio" dev="vdb"/>
</disk>
<disk type="block" device="disk">
<driver name="qemu" type="raw" cache="none"/>
<source dev="/dev/
<target bus="virtio" dev="vdc"/>
</disk>
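The enumerated target names above (vda, vdb, vdc) follow the usual virtio ordering of one letter per disk index. A minimal sketch of that mapping (illustrative only, not nova's actual code):

```python
import string

def virtio_dev_name(index):
    """Map a zero-based disk index to a virtio target name: 0 -> vda, 1 -> vdb, ..."""
    # Single-letter suffixes only, which covers the three disks shown above.
    return 'vd' + string.ascii_lowercase[index]

print([virtio_dev_name(i) for i in range(3)])  # ['vda', 'vdb', 'vdc']
```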
If nova stop/start or nova reboot --hard is executed, the instance is destroyed and libvirt.xml is recreated. At this stage the values we passed with --ephemeral are not respected, and libvirt.xml reverts to the configuration that would have been generated without the --ephemeral option, as below, where we only have one extra disk and it does not use the enumerated naming.
<devices>
<disk type="block" device="disk">
<driver name="qemu" type="raw" cache="none"/>
<source dev="/dev/
<target bus="virtio" dev="vda"/>
</disk>
<disk type="block" device="disk">
<driver name="qemu" type="raw" cache="none"/>
<source dev="/dev/
<target bus="virtio" dev="vdb"/>
</disk>
This causes the instance to fail to boot at this stage. The nova block_device_
tags: added: libvirt
Changed in nova:
status: New → Confirmed
importance: Undecided → High
Changed in nova:
assignee: nobody → Thang Pham (thang-pham)
tags: added: havana-backport-potential
tags: added: icehouse-backport-potential
Changed in nova:
milestone: none → juno-1
status: Fix Committed → Fix Released
tags: removed: icehouse-backport-potential
Changed in nova:
milestone: juno-1 → 2014.2
In digging into this some more, it looks like the issue may be that block_device_info is set to None in compute/api.py:
compute/api.py:        self.compute_rpcapi.reboot_instance(context, instance=instance,
compute/api.py-                                            block_device_info=None,
compute/api.py-                                            reboot_type=reboot_type)
It's then dutifully passed along to the message queue:
compute/rpcapi.py:    def reboot_instance(self, ctxt, instance, block_device_info,
compute/rpcapi.py-                        reboot_type):
compute/rpcapi.py-        if not self.client.can_send_version('2.32'):
--
compute/rpcapi.py:        cctxt.cast(ctxt, 'reboot_instance',
compute/rpcapi.py-                   instance=instance,
compute/rpcapi.py-                   block_device_info=block_device_info,
compute/rpcapi.py-                   reboot_type=reboot_type)
I don't see any code in the reboot methods which calls get_instance_bdms, which would imply the block_device_info is never populated when reboot is called.
I believe it works when the instance is first booted because block_device_info is populated from the API call, but when reboot gets to has_default_ephemeral in libvirt/blockinfo.py, this conditional fails:
if (instance['ephemeral_gb'] <= 0) or ephemerals:
Because ephemerals is empty and ephemeral_gb is > 0.
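To make the failure mode concrete, here is a minimal sketch (not nova's actual implementation) of the guard described above: a default ephemeral disk is only generated when no explicit ephemerals were supplied, so an empty list combined with ephemeral_gb > 0 re-adds the single default disk on reboot.

```python
def wants_default_ephemeral(instance, ephemerals):
    """Return True when a default ephemeral disk would be generated.

    Sketch of the conditional quoted above: explicit ephemerals, or a
    flavor with no ephemeral space, suppress the default disk.
    """
    if (instance['ephemeral_gb'] <= 0) or ephemerals:
        return False
    return True

# First boot: the API supplies the explicit ephemerals, so no default disk.
print(wants_default_ephemeral({'ephemeral_gb': 20}, [{'size': 10}, {'size': 10}]))  # False
# Hard reboot: block_device_info is None, ephemerals is empty -> default disk.
print(wants_default_ephemeral({'ephemeral_gb': 20}, []))  # True
```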
I'm guessing we need to do a _get_bdm_image_metadata somewhere in the reboot method to make sure we populate "ephemerals", but I don't know enough about this code to know if I'm being completely crazy or not.
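The fix direction suggested above can be sketched as follows; the helper name and the BDM fields are illustrative, not nova's actual API. The idea is to rebuild block_device_info from the block device mappings stored in the database before the reboot cast, so the explicit ephemeral list survives a hard reboot instead of being passed as None.

```python
def build_block_device_info(bdms):
    """Split stored block device mappings into the structure the driver expects.

    Hypothetical helper: pulls the blank (ephemeral) disks out of the
    persisted mappings so blockinfo sees a non-empty 'ephemerals' list.
    """
    ephemerals = [b for b in bdms if b.get('source_type') == 'blank']
    return {'ephemerals': ephemerals,
            'block_device_mapping': bdms}

# Two persisted ephemeral mappings, as in the two --ephemeral example above.
bdms = [
    {'device_name': '/dev/vdb', 'source_type': 'blank', 'guest_format': None},
    {'device_name': '/dev/vdc', 'source_type': 'blank', 'guest_format': None},
]
info = build_block_device_info(bdms)
print(len(info['ephemerals']))  # 2
```

With a non-empty 'ephemerals' list, the has_default_ephemeral conditional would short-circuit and the single default disk would no longer replace the enumerated ones.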