EMC XtremIO volume create returns success before volume is ready

Bug #1483359 reported by Joe Antkowiak
This bug affects 2 people

Affects: Cinder
Status: Invalid
Importance: Medium
Assigned to: Shay Halsband

Bug Description

When creating more than one new volume (with an image source specified) at a time, the cinder driver for EMC XtremIO returns a success/complete back to cinder before the volume is actually ready to be written to.

This results in occasional (~1 in 10) errors when the volume is immediately mounted to lay down an image:

volume.log:2015-07-16 17:17:53.825 32582 ERROR oslo.messaging._drivers.common [req-7294e391-ceca-41d6-8b8f-eda0caeb4abb 6733469fa116471495f82942842e1460 5bc59b062aed4031a38ad6ef0f7b0b68 - - -] Returning exception Failed to copy image to volume: qemu-img: /dev/disk/by-path/ip-172.16.2.22:3260-iscsi-iqn.2008-05.com.xtremio:001e679efa9b-lun-5: error while converting raw: Device is too small

This can only be reproduced by initiating 10-20 cinder volume creates simultaneously. 1-2 of them will fail with the above error.

A workaround that has worked so far is to insert a random 2-8 second delay after the creation of the volume. This is probably not the best way to fix this.

Revision history for this message
Joe Antkowiak (joe-antkowiak) wrote :
Changed in cinder:
assignee: nobody → Joe Antkowiak (joe-antkowiak)
Xing Yang (xing-yang) wrote :

Adding a delay of a fixed amount of time is not a reliable way of fixing this. I'm assigning this to Shay Halsband to take a look as he wrote the XtremIO driver. Thanks.

Changed in cinder:
assignee: Joe Antkowiak (joe-antkowiak) → Shay Halsband (shay-halsband)
tags: added: drivers emc xtremio
Joe Antkowiak (joe-antkowiak) wrote :

Agreed.

I've reproduced this in a lab environment with both xtremio 3.x and 4.0. Let me know what I can do to help.

The only thing that worked so far has been a random delay. A fixed delay did not change anything.

If there's a way to query the gear again to verify it is ready to be written to, that would be more ideal.
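One way to "query the gear" would be to poll the kernel-reported size of the attached device until it matches the requested volume size, instead of sleeping a random interval. This is only a sketch of the idea; `wait_for_device`, the polling interval, and the use of `blockdev` are illustrative assumptions, not part of the driver.

```shell
# Hypothetical readiness check: poll the device size instead of
# sleeping a random 2-8 seconds. All names here are illustrative.
wait_for_device() {
    dev="$1"            # e.g. /dev/disk/by-path/ip-...-lun-5
    expected_bytes="$2" # requested volume size in bytes
    tries="${3:-30}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        actual=$(blockdev --getsize64 "$dev" 2>/dev/null || echo 0)
        # Done as soon as the kernel sees the full-size LUN.
        [ "$actual" -ge "$expected_bytes" ] && return 0
        sleep 1
        i=$((i + 1))
    done
    return 1
}
```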

Xing Yang (xing-yang) wrote :

I may have removed "ubuntu" by accident. Please add it back if you want to.

no longer affects: ubuntu
Changed in cinder:
importance: Undecided → Medium
status: New → Triaged
Xing Yang (xing-yang) wrote :

The error says "Device is too small".

Returning exception Failed to copy image to volume: qemu-img: /dev/disk/by-path/ip-172.16.2.22:3260-iscsi-iqn.2008-05.com.xtremio:001e679efa9b-lun-5: error while converting raw: Device is too small

Creating a boot volume takes a lot of space because it needs to download the image to a temp file and then convert it to raw using qemu-img. You need to make sure the volume size is sufficient and that you have enough space on your cinder volume node.

Check the virtual size of the image using the following command:
qemu-img info <image file>

You need to create a volume of size at least as big as the virtual size.

Also make sure you have enough space on your cinder volume node to hold the virtual size of all the boot volumes if you are creating them simultaneously.
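The sizing rule above can be checked mechanically: `qemu-img info` reports the image's virtual size, and the requested volume must be at least that large. A minimal sketch (the helper name is illustrative, not anything in cinder):

```shell
# Succeeds when a volume of vol_gib GiB can hold an image whose
# virtual size is virt_bytes (as reported by `qemu-img info`).
volume_fits() {
    virt_bytes="$1"
    vol_gib="$2"
    [ $((vol_gib * 1024 * 1024 * 1024)) -ge "$virt_bytes" ]
}
# The virtual size can be read with, e.g.:
#   qemu-img info --output=json image.qcow2   # look for "virtual-size"
```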

Xing Yang (xing-yang) wrote :

I wonder if, when you add a wait, some of the temp files get cleaned up so you have enough space to create more.

Yaniv Kaul (yaniv-kaul) wrote :

Joe, can you please share more about your configuration and scenario? Specifically, host OS, OpenStack version, iSCSI with or without CHAP, how do you create multiple parallel volumes, is it from a raw or qcow2 image (I assume qcow2), are you using multipathing (doesn't look like, from the above snippet), etc.?

Joe Antkowiak (joe-antkowiak) wrote :

The image this was reproduced with is cirros, which is only 12 megs in size, so the 1G volume size is appropriate. We also used a 5G volume size, with the same result. We also used a Fedora image with an 8G volume size, same result.

90% of these work just fine, about 10% of them fail with this issue, and only when there are simultaneous creates running.

Environment:
RH OSP5 (official RH) -- icehouse
host os is rhel7
iSCSI without chap
no multipathing
QCOW2 images

When we run the same exact scenario against other storage backends (we are using multiple cinder backends), it works fine; this only occurs with XtremIO.

This was initially discovered when we were teaching an end-user openstack class, where 25 people were all creating their new volumes at once. We reproduced the issue by just pasting this into the command line:
cinder create --volume-type EMC-XtremIO-1 --image-id bacd75e2-bbbc-4293-841b-8cb8b05ea626 5&
cinder create --volume-type EMC-XtremIO-1 --image-id bacd75e2-bbbc-4293-841b-8cb8b05ea626 5&
cinder create --volume-type EMC-XtremIO-1 --image-id bacd75e2-bbbc-4293-841b-8cb8b05ea626 5&
cinder create --volume-type EMC-XtremIO-1 --image-id bacd75e2-bbbc-4293-841b-8cb8b05ea626 5&
cinder create --volume-type EMC-XtremIO-1 --image-id bacd75e2-bbbc-4293-841b-8cb8b05ea626 5&
cinder create --volume-type EMC-XtremIO-1 --image-id bacd75e2-bbbc-4293-841b-8cb8b05ea626 5&
cinder create --volume-type EMC-XtremIO-1 --image-id bacd75e2-bbbc-4293-841b-8cb8b05ea626 5&
cinder create --volume-type EMC-XtremIO-1 --image-id bacd75e2-bbbc-4293-841b-8cb8b05ea626 5&
cinder create --volume-type EMC-XtremIO-1 --image-id bacd75e2-bbbc-4293-841b-8cb8b05ea626 5&
cinder create --volume-type EMC-XtremIO-1 --image-id bacd75e2-bbbc-4293-841b-8cb8b05ea626 5&
cinder create --volume-type EMC-XtremIO-1 --image-id bacd75e2-bbbc-4293-841b-8cb8b05ea626 5&
cinder create --volume-type EMC-XtremIO-1 --image-id bacd75e2-bbbc-4293-841b-8cb8b05ea626 5&
cinder create --volume-type EMC-XtremIO-1 --image-id bacd75e2-bbbc-4293-841b-8cb8b05ea626 5&
cinder create --volume-type EMC-XtremIO-1 --image-id bacd75e2-bbbc-4293-841b-8cb8b05ea626 5&
cinder create --volume-type EMC-XtremIO-1 --image-id bacd75e2-bbbc-4293-841b-8cb8b05ea626 5&
cinder create --volume-type EMC-XtremIO-1 --image-id bacd75e2-bbbc-4293-841b-8cb8b05ea626 5&
cinder create --volume-type EMC-XtremIO-1 --image-id bacd75e2-bbbc-4293-841b-8cb8b05ea626 5&
cinder create --volume-type EMC-XtremIO-1 --image-id bacd75e2-bbbc-4293-841b-8cb8b05ea626 5&
cinder create --volume-type EMC-XtremIO-1 --image-id bacd75e2-bbbc-4293-841b-8cb8b05ea626 5&
......
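The same burst can be generated with a loop instead of pasting the line repeatedly. A sketch (the function name is illustrative; the image id is the one from the commands above):

```shell
# Fire off n simultaneous volume creates, as in the pasted commands above.
create_burst() {
    n="$1"
    i=0
    while [ "$i" -lt "$n" ]; do
        cinder create --volume-type EMC-XtremIO-1 \
            --image-id bacd75e2-bbbc-4293-841b-8cb8b05ea626 5 &
        i=$((i + 1))
    done
    wait   # let all the background creates finish
}
# usage: create_burst 20
```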

Yaniv Kaul (yaniv-kaul) wrote :

- This does not reproduce when using multipathing; I assume it's because of the delay introduced while Cinder sets up multipathing.
- The initial assumption doesn't seem to be accurate. The volume is writable (or at least readable), as can be seen from the following reproduction snippet: right after the session establishment, cinder reads a block from the device using 'dd'.

2015-08-13 14:13:31.308 203965 DEBUG cinder.brick.initiator.connector [req-611cce66-30ad-42b3-9495-020a1e6ef910 d36baa20fd364f31b67d4f5df98512a6 e9d9c8eac5bb447487985f707ac5711a - - -] iscsiadm ['-m', 'session']: stdout=tcp: [10] 10.205.96.8:3260,1 iqn.2008-05.com.xtremio:xio00150202049-514f0c50400f4c05 (non-flash)
2015-08-13 14:13:31.309 203965 DEBUG cinder.openstack.common.processutils [req-611cce66-30ad-42b3-9495-020a1e6ef910 d36baa20fd364f31b67d4f5df98512a6 e9d9c8eac5bb447487985f707ac5711a - - -] Running cmd (subprocess): sudo cinder-rootwrap /etc/cinder/rootwrap.conf dd if=/dev/disk/by-path/ip-10.205.96.8:3260-iscsi-iqn.2008-05.com.xtremio:xio00150202049-514f0c50400f4c05-lun-2 of=/dev/null count=1 execute /usr/lib/python2.7/site-packages/cinder/openstack/common/processutils.py:147

I suspect cache issues with the call to qemu-img. I'd try to see if the '-t' and '-T' flags of qemu-img can be used.
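For reference, qemu-img's `-t` and `-T` flags set the destination and source cache modes. A hedged sketch of what the conversion might look like with host caching disabled (the wrapper name and paths are illustrative, and this is a guess at a fix, not a confirmed one):

```shell
# Convert with caching disabled on both source (-T) and destination (-t),
# so writes go to the device with O_DIRECT rather than the page cache.
convert_raw_nocache() {
    src="$1"   # e.g. the downloaded qcow2 temp file
    dst="$2"   # e.g. /dev/disk/by-path/ip-...-lun-5
    qemu-img convert -O raw -t none -T none "$src" "$dst"
}
```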

Yaniv Kaul (yaniv-kaul) wrote :

1. Reproduced in Kilo.
2. Only reproducible without multipathing.
3. Looks like the disk is being reset right after the 'dd' and before the conversion takes place!

Shay Halsband (shay-halsband) wrote :

XtremIO mandates multipath for performance and HA. As this only happens without multipath, I'm closing the issue.

Changed in cinder:
status: Triaged → Invalid