copy_image_to_volume calls fetch_to_raw for transfer even if the image is raw

Bug #1200800 reported by Jay Bryant on 2013-07-12
10
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Cinder
Undecided
Jay Bryant

Bug Description

We have been running into a problem on PowerPC systems using the storwize_svc driver where they fail to copy an image to volume with the following error:

"cinder create --display-name=fromimage2_hm12 --image-id 2c176cb3-5843-4c7f-9257-78a6d53ec761 1"

Traceback is:

['Traceback (most recent call last):\n', '
File "/usr/lib/python2.6/site-packages/cinder/volume/manager.py", line 654, in _copy_image_to_volume\n image_id)\n', '
File "/usr/lib/python2.6/site-packages/cinder/volume/drivers/storwize_svc.py", line 1315, in copy_image_to_volume\n context, volume, image_service, image_id)\n', '
File "/usr/lib/python2.6/site-packages/cinder/volume/driver.py", line 432, in copy_image_to_volume\n volume_path)\n', '
File "/usr/lib/python2.6/site-packages/cinder/image/image_utils.py", line 229, in fetch_to_raw\n data = qemu_img_info(tmp)\n', '
File "/usr/lib/python2.6/site-packages/cinder/image/image_utils.py", line 187, in qemu_img_info\n run_as_root=True)\n', '
File "/usr/lib/python2.6/site-packages/cinder/utils.py", line 107, in execute\n description=ex.description)\n', "ProcessExecutionError: Unexpected error while running command.\nCommand: sudo cinder-rootwrap /etc/cinder/rootwrap.conf env LC_ALL=C LANG=C qemu-img info /tmp/tmpU4LUlC\nExit code: 127\nStdout: ''\nStderr: '/bin/env: qemu-img: No such File or directory\\n'\n"]

PowerPC systems don't have qemu-img installed, so the error isn't surprising. For this reason the storwize_svc support is limited to RAW images.

Unfortunately, when storwize_svc calls out to the default iscsi copy_image_to_volume function it is always calling image_utils.fetch_to_raw even if the image is already in the RAW format. I don't believe there is a problem in copy_volume_to_image as it calls image_utils.upload_volume which checks if the format is RAW and avoids this problem.

It seems to me that copy_image_to_volume should check the image format and only use image_utils.fetch_to_raw if the image is not RAW, otherwise, can't the image just be fetched?

John Griffith (john-griffith) wrote :

Not sure, but I seem to recall having this discussion in the past and I think I actually removed all of the crazy checks that were in place on types. The reason being is that fortunately the qemu convert function is pretty smart, and if the image is already in the correct format it won't go through and change it, it's basically a noop at that point.

I'll give folks a chance to chime in and tell me I'm wrong etc, but I'm thinking this is probably ok as is and probably the right way to go. Perhaps I should add some comments to the effect of what I've stated above?

Jay Bryant (jsbryant) wrote :

John, thanks for the quick response. Given that it sounds like it may have been a conscious decision to remove the checks for the Image type it would seem I have two options: 1) See if we can get qemu-img added to our PowerPC systems in some way so that we don't hit this problem. A bit of searching suggests that it should be possible as SuSE has a package. It just appears that RHEL doesn't. 2) Change the way that the storewize_svc driver is doing copy_image_to_volume to avoid the problem without changing the default iscsi path.

Obviously I will try option 1 first. :-)

Jay Bryant (jsbryant) wrote :

John,

Option 1 is possible (you can download and compile qemu for ppc) so I have provided it to the consumer as a workaround for the time being. Unfortunately, getting this into our build process would be unlikely. So, I need a more permanent solution that will enable storwize-svc on ppc systems without qemu-img installed.

Given that the checks for the image type were removed from the iscsi path, I am assuming you would not want them added back in. Would an alternate solution be for the storwize driver to not use the iscsi code, but instead to implement copy_image_to_volume itself and do the necessary checks there? If so I can put together a patch to implement this.

Changed in cinder:
status: New → Invalid
status: Invalid → Confirmed
Avishay Traeger (avishay-il) wrote :

I would prefer that the Storwize/SVC driver not have its own version of copy_image_to_volume, especially now that Walt is doing great work on making it work for FC as well. Let's try to think of an alternate solution.

Jay Bryant (jsbryant) wrote :

Avishay, I am open to ideas. I would be fine with saying that qemu-img is required on the PPC systems, but I have gotten push back against that statement. If we aren't able to add a custom handler in Storwize/SVC or update the iSCSI layer to check the image type, how else can we avoid encountering the qemu-img problem?

Could we put a fallback in the iSCSI path that checks for the existence of qemu-img before trying to use it? In the case that it can't find the executable we then check the image type, if it is RAW we go on and avoid the failure? Would that possibly be an acceptable solution?

Avishay Traeger (avishay-il) wrote :

qemu-img is also used to determine the image type. How can we do that on PPC to ensure it's raw?

Jay Bryant (jsbryant) wrote :

I have spent some more time looking at this. First, looking back through the logs to when Avishay put the copy volume<->image code in, back in January, I don't see any check-ins that obviously remove additional checks that were being made in the iSCSI path for the image type. I also don't think that adding a check in for copy_image_to_volume is unprecedented as copy_volume_to_image calls image_utils.upload_volume and the first thing it does is to check the image type without qemu-img.

Also, a move away from being dependent upon qemu-img is also not unprecedented given that Glance has already made changes to move away from requiring qemu-img. https://bugs.launchpad.net/nova/+bug/1131033 .

It seems to me I should be able to do an self.image_service.show() to get the image_meta information. If the image_type is RAW we shouldn't call fetch_to_raw() but call fetch() instead.

If you would prefer to reduce the number of cases where this new path would be used, we could check for the existence of qemu-img and only use this path in the cases where qemu-img doesn't exist.

With regards to Avishay's question about how do we verify the image type. If we don't have qemu-img it doesn't seem that we can. If we don't have qemu-img, it seems we need to try the image_type we find. On PPC this shouldn't be a big issue given that only RAW is supported.

Why am I pushing on this? There are platforms that don't have qemu-img by default and due to the licensing of qemu-img it is difficult to make this work without some sort of customer intervention like installing qemu-img by hand which isn't desirable if we can work around the issue in code.

Jay Bryant (jsbryant) wrote :

Just an update to this issue ... Spent some time last night/this morning talking to Avishay about this bug and got agreement that there is something that needs to be fixed. His suggestion, to keep the impact to a minimum, is to add a configuration option like 'raw_images_only' to indicate systems like PowerPC where we need to skip qemu-img.

I am going to try putting together a patch based on this approach.

Avishay Traeger (avishay-il) wrote :

Oh man...I was going to blame you if it turned out to not be a good idea... :)

Jay Bryant (jsbryant) wrote :

Sorry sir. I needed some weight to throw around behind it. ;-)

Jay Bryant (jsbryant) wrote :

I was finally able to get on the system that was having issues and learned a lot.

First, they weren't even using the storwize_svc driver. They were just using LVM. This, however, wasn't all bad as it was easy to see that, eventually, both storwize_svc and LVM eventually end up in fetch_to_raw. I was able to apply the same idea proposed above (adding a raw_images_only boolean to the config file) in image_utils and then change the flow of fetch_to_raw based on that setting. That got things to work on systems without qemu-img.

To me this seems like it would be the broadest coverage while being the least intrusive. I have some error checking to work out and probably a unit test or two to work on before I can upload a patch, but I think I am getting close to something here. :-)

Changed in cinder:
assignee: nobody → Jay Bryant (jsbryant)

Fix proposed to branch: master
Review: https://review.openstack.org/41483

Changed in cinder:
status: Confirmed → In Progress

Reviewed: https://review.openstack.org/41483
Committed: http://github.com/openstack/cinder/commit/870e984274d9e8d452e7aa383b289955b6896f28
Submitter: Jenkins
Branch: master

commit 870e984274d9e8d452e7aa383b289955b6896f28
Author: Jay S. Bryant <email address hidden>
Date: Mon Aug 12 12:55:48 2013 -0500

    Add check for qemu-img to image_utils fetch_to_raw

    Some platforms, particularly PowerPC systems running RHEL,
    do not have qemu-img installed by default and do not
    support image formats other than RAW. For these systems,
    functions/drivers that use fetch_to_raw() currently fail
    because it is assumed that qemu-img will always be installed.

    This change updates fetch_to_raw() to function more like the
    upload_volume() function which checks the image format and skips
    any conversion if the image is already in the RAW format. To
    minimize the impact upon platforms that have qemu-img installed
    this is being implemented by checking to see if qemu-img is
    installed. If it is installed, no changes are made to the flow
    through fetch_to_raw. If qemu-img is not available and the image
    format is not already RAW, an exception is thrown. Otherwise,
    the image is downloaded and we do not progress to the portions of
    fetch_to_raw that require qemu-img to be installed.

    This commit also updates the test cases that touch this code.

    Closes-bug: 1200800

    Change-Id: I34540dfa238a8b8e61ff5c30c7e121130bad39b4

Changed in cinder:
status: In Progress → Fix Committed
Thierry Carrez (ttx) on 2013-10-04
Changed in cinder:
milestone: none → havana-rc1
status: Fix Committed → Fix Released
Thierry Carrez (ttx) on 2013-10-17
Changed in cinder:
milestone: havana-rc1 → 2013.2
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers