Create Volume from image bug (iscsi)

Bug #1531711 reported by Daniel Pryor
This bug affects 7 people
Affects: Cinder
Status: Invalid
Importance: Undecided
Assigned to: Unassigned

Bug Description

When attempting to create a volume from image you will get "qemu-img: error writing zeroes at sector 0: Input/output error" on the first attempt and a success on the second.

Steps to reproduce:
1) Nimble backend using 2.3.9.0
2) Ubuntu 14.04 with qemu-utils 2.3 installed
3) Attempt to create a volume from an image; you will receive the error
Or:
1) iscsiadm --mode node --targetname '<iqn.2007-11.com.nimblestorage:volume>' --portal <SAN_IP>:3260 --login
2) sudo cinder-rootwrap /etc/cinder/rootwrap.conf qemu-img convert -t none -O raw image /dev/disk/by-path/<volume>
ERROR: qemu-img: error writing zeroes at sector 0: Input/output error
3) Try it again and it will work

My workaround for the issue has been to comment out the code in image_utils.py that sets the qemu-img cache policy to none. I do not see a real reason to set the cache to none, since the default is write-through, which means it will not notify the OS that a write has completed until it actually has. Setting the cache to none does not prevent the underlying OS from terminating the connection, or packet loss from causing problems. On top of that, it will "possibly" prevent your SAN from caching the data, so creating instances will not be as fast as it could be.

Another option would be to make this behaviour configurable, for example isci_direct=true in cinder.conf.
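
A minimal sketch of what such a switch could look like, assuming a hypothetical boolean setting (called use_direct_io here; neither the name nor the helper below exists in Cinder) that decides whether `-t none` is passed to qemu-img convert:

```python
def build_convert_cmd(source, dest, out_format="raw", use_direct_io=True):
    """Build the argument list for 'qemu-img convert'.

    use_direct_io is a hypothetical setting used for illustration only;
    it is not an existing Cinder option.
    """
    cmd = ["qemu-img", "convert"]
    if use_direct_io:
        # '-t none' selects the 'none' cache mode for the destination
        cmd += ["-t", "none"]
    cmd += ["-O", out_format, source, dest]
    return cmd
```

With use_direct_io=False the command is the same as the one in the steps above, minus the `-t none` pair.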

Thanks
Daniel

Revision history for this message
Mitsuhiro Tanino (mitsuhiro-tanino) wrote :

This bug seems to be a duplicate of https://bugs.launchpad.net/cinder/+bug/1389728
Could you check that bug?

-------- snip ----------
This is probably a qemu-img bug fixed by this patch in qemu:
http://git.qemu-project.org/?p=qemu.git;a=commitdiff;h=f3a9cfddae
-------- snip ----------

Revision history for this message
Daniel Pryor (pryorda) wrote : Re: [Bug 1531711] Re: Create Volume from image bug (iscsi)

Hello,

I did test the patch and it would still fail.

Thanks

On Mon, Jan 11, 2016 at 9:42 AM, Mitsuhiro Tanino <email address hidden>
wrote:

> Seems this bug is duplicate of
> https://bugs.launchpad.net/cinder/+bug/1389728
> Could you check the bug?
>
> -------- snip ----------
> This is probably a qemu-img bug fixed by this patch in qemu:
> http://git.qemu-project.org/?p=qemu.git;a=commitdiff;h=f3a9cfddae
> -------- snip ----------

--
- Daniel Pryor

Revision history for this message
Daniel Pryor (pryorda) wrote :

Any luck on this?

Revision history for this message
Raunak Kumar (rkumar-b) wrote :

I am hitting the same issue too. Is there any resolution other than commenting out the caching code?

Revision history for this message
Daniel Pryor (pryorda) wrote :

Nope

On Tue, Apr 5, 2016 at 5:15 PM Raunak Kumar <email address hidden>
wrote:

> Even I am hitting the same issue. Is there any resolution to this other
> than commenting out the caching code ?

Revision history for this message
Mitsuhiro Tanino (mitsuhiro-tanino) wrote :

Does this problem happen only on Nimble storage?
I haven't seen this error with the LVM driver.

What kind of image do we need to reproduce this bug?
I suppose there are further conditions required to reproduce it.

Revision history for this message
Raunak Kumar (rkumar-b) wrote :

I tried it with the following image details

X-Image-Meta-Id: e124479a-95c9-4b1c-93f1-65c7ab3f971b
X-Image-Meta-Deleted: False
X-Image-Meta-Checksum: 89d768444e2f254e76555f8d3bfaed20
X-Image-Meta-Status: active
X-Image-Meta-Container_format: bare
X-Image-Meta-Protected: False
X-Image-Meta-Min_disk: 0
X-Image-Meta-Min_ram: 0
X-Image-Meta-Created_at: 2016-02-29T07:23:31.000000
X-Image-Meta-Size: 258474496
Connection: keep-alive
Etag: 89d768444e2f254e76555f8d3bfaed20
X-Image-Meta-Is_public: True
Date: Tue, 05 Apr 2016 06:00:59 GMT
X-Image-Meta-Owner: 1c8fe37a0fe7432d9b58ac5fda1eb50c
X-Image-Meta-Updated_at: 2016-02-29T07:23:32.000000
Content-Type: text/html; charset=UTF-8
X-Openstack-Request-Id: req-9907b5ae-4939-4515-bd36-0ec97d5b1de7
X-Image-Meta-Disk_format: qcow2
X-Image-Meta-Name: ubuntu14.04

Error :

packages/cinder/image/image_utils.py", line 322, in fetch_to_volume_format
2016-04-05 15:01:00.585 31087 ERROR cinder.volume.flows.manager.create_volume run_as_root=run_as_root)
2016-04-05 15:01:00.585 31087 ERROR cinder.volume.flows.manager.create_volume File "/usr/lib/python2.7/dist-packages/cinder/image/image_utils.py", line 147, in convert_image
2016-04-05 15:01:00.585 31087 ERROR cinder.volume.flows.manager.create_volume out_format, run_as_root=run_as_root)
2016-04-05 15:01:00.585 31087 ERROR cinder.volume.flows.manager.create_volume File "/usr/lib/python2.7/dist-packages/cinder/image/image_utils.py", line 121, in _convert_image
2016-04-05 15:01:00.585 31087 ERROR cinder.volume.flows.manager.create_volume utils.execute(*cmd, run_as_root=run_as_root)
2016-04-05 15:01:00.585 31087 ERROR cinder.volume.flows.manager.create_volume File "/usr/lib/python2.7/dist-packages/cinder/utils.py", line 155, in execute
2016-04-05 15:01:00.585 31087 ERROR cinder.volume.flows.manager.create_volume return processutils.execute(*cmd, **kwargs)
2016-04-05 15:01:00.585 31087 ERROR cinder.volume.flows.manager.create_volume File "/usr/lib/python2.7/dist-packages/oslo_concurrency/processutils.py", line 275, in execute
2016-04-05 15:01:00.585 31087 ERROR cinder.volume.flows.manager.create_volume cmd=sanitized_cmd)
2016-04-05 15:01:00.585 31087 ERROR cinder.volume.flows.manager.create_volume ProcessExecutionError: Unexpected error while running command.
2016-04-05 15:01:00.585 31087 ERROR cinder.volume.flows.manager.create_volume Command: sudo cinder-rootwrap /etc/cinder/rootwrap.conf qemu-img convert -t none -O raw /var/lib/cinder/conversion/tmpS3abmD /dev/disk/by-path/ip-1.1.21.100:3260-iscsi-iqn.2007-11.com.nimblestorage:volume-08d71f8d-5982-45cb-b8d2-79d012973bc7-v181eaf4b33f068e3.00000075.2beec7a8-lun-0
2016-04-05 15:01:00.585 31087 ERROR cinder.volume.flows.manager.create_volume Exit code: 1
2016-04-05 15:01:00.585 31087 ERROR cinder.volume.flows.manager.create_volume Stdout: u''
2016-04-05 15:01:00.585 31087 ERROR cinder.volume.flows.manager.create_volume Stderr: u'qemu-img: error writing zeroes at sector 0: Input/output error\n'
2016-04-05 15:01:00.585 31087 ERROR cinder.volume.flows.manager.create_volume

Revision history for this message
Mitsuhiro Tanino (mitsuhiro-tanino) wrote :

This is a qemu-img bug rather than a Cinder bug, so it should be fixed by the qemu community.
Please report this bug to that community.

IMO, adding a new option as a workaround for this problem is not a good idea.
But if you believe the option would improve image conversion in common cases as well, not only in this one, it will be accepted.

Please feel free to post a patch.
Thanks.

Revision history for this message
Daniel Pryor (pryorda) wrote :

I'm not sure I agree that this is a qemu bug. Could you show us your
reasoning?

On Wed, Apr 6, 2016 at 9:16 AM Mitsuhiro Tanino <email address hidden>
wrote:

> This is qemu-img bug rather than Cinder bug. Therefore this should be
> fixed at qemu community.
> Please report this bug to the community.
>
> IMO, adding new option for a workaround of this problem is not a good idea.
> But if you believe adding that option improves image conversion not only
> for this case but also common cases, that will be accepted.
>
> Please feel free to post a patch.
> Thanks.

Revision history for this message
Mitsuhiro Tanino (mitsuhiro-tanino) wrote :

Based on your report, the error happens during image conversion with qemu-img, even without Cinder,
e.g. #qemu-img convert -t none -O raw <image> /dev/disk/by-path/<volume>
Therefore I think the root cause is in qemu-img.

Are you using a special image file or a special file format?
Normally this doesn't happen with a Cirros image, a Fedora cloud image, etc.
Can I get the image file from the web?

By the way, '-t none' was introduced to solve the following bug.
https://bugs.launchpad.net/cinder/+bug/1363016

>>2)sudo cinder-rootwrap /etc/cinder/rootwrap.conf qemu-img convert -t none -O raw image /dev/disk/by-path/<volume
ERROR: qemu-img: error writing zeroes at sector 0: Input/output error
>>3)try it again and it will work

One question:
Does the second qemu-img attempt always succeed? That also looks like a qemu-img bug.

Thanks.

Revision history for this message
Mitsuhiro Tanino (mitsuhiro-tanino) wrote :

Please ping me (mtanino) if you are on the Cinder IRC channel.

Revision history for this message
Raunak Kumar (rkumar-b) wrote :

Please let me know too

Revision history for this message
Daniel Pryor (pryorda) wrote :

Which server is it? I will connect.

Thanks,
Dan

On Wed, Apr 6, 2016 at 11:01 AM Raunak Kumar <email address hidden>
wrote:

> Please let me know too

Revision history for this message
Mitsuhiro Tanino (mitsuhiro-tanino) wrote :

After chatting with Daniel, I now understand the problem in more detail.

- This problem happens only with the Nimble backend.
- Deploying the same glance image to another backend (e.g. NetApp) succeeded.

I agree this is a Cinder bug, specifically a driver-specific bug. Thanks.

Eric Harney (eharney)
Changed in cinder:
importance: Undecided → Medium
Revision history for this message
Mitsuhiro Tanino (mitsuhiro-tanino) wrote :

Hi Eric, Raunak,

As I understand it, _convert_image() checks O_DIRECT support (*1) for the target volume before writing the glance image with 'qemu-img convert'.

From the error log above, this bug occurs after the O_DIRECT support check. Therefore, I don't think O_DIRECT support is the root cause of this bug.

Based on the discussion with the bug reporter, the following two workarounds worked in his environment. So I suppose the root cause is that the target volume is not yet initialized when qemu-img tries to write the glance image. It looks like a timing issue.

  (1) Cached write (async write)
  (2) sleep 30 before 'qemu-img convert'

But I'd like to have comments from the Nimble and Huawei teams.

(*1) check_for_odirect_support()

Revision history for this message
Raunak Kumar (rkumar-b) wrote :

Yes, it passes the check_for_odirect_support() check but then fails when writing zeroes to the volume.
As Eric mentioned, we can try this approach: handle the error silently and fall back to retrying the base command without -t none.

I like your first suggestion as well; sleep is not a deterministic approach, but it could be useful for debugging.
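
The fallback idea could be sketched roughly as follows. This is only an illustration under assumptions: convert_with_fallback and its run parameter are hypothetical names, not Cinder code, and a real change would go through Cinder's own execute helpers and rootwrap rather than calling subprocess directly.

```python
import subprocess


def convert_with_fallback(source, dest, out_format="raw",
                          run=subprocess.check_call):
    """Try 'qemu-img convert -t none' first, then retry without '-t none'.

    Hypothetical helper for illustration; 'run' is injectable so the
    logic can be exercised without a real block device.
    """
    base = ["qemu-img", "convert", "-O", out_format, source, dest]
    direct = base[:2] + ["-t", "none"] + base[2:]
    try:
        run(direct)
        return direct
    except subprocess.CalledProcessError:
        # The direct-I/O attempt failed; retry the base command once.
        run(base)
        return base
```

The function returns the command that finally succeeded, so a caller can log which path was taken. If the retry also fails, the second CalledProcessError propagates to the caller.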

Revision history for this message
Mitsuhiro Tanino (mitsuhiro-tanino) wrote :

>> I like your 1st suggestions as well,

To be clear, those are not my suggestions but the workarounds the bug reporter mentioned.

Revision history for this message
Eric Harney (eharney) wrote :

> Hi Eric, Raunak,

> In my understanding, _convert_image() is checking o_direct support(*1) for a target volume before writing glance image using 'qemu-img convert'.

> From the above error log, this bug happens after checking o_direct support. Therefore, I think o_direct support is not root cause to this bug.

After looking at this further, I'm starting to think the same thing. We already attempt to do a check for this -- and if it works the second time, this may point more toward a race with device setup than qemu-img arguments.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/312797

Changed in cinder:
assignee: nobody → Mitsuhiro Tanino (mitsuhiro-tanino)
status: New → In Progress
Revision history for this message
Raunak Kumar (rkumar-b) wrote :

Hi Eric, Mitsuhiro,

On newer Linux kernels (3.7 and above), the Nimble array doesn't handle the WRITE SAME command correctly.
The problem is caused by the 'unmap' flag in the WRITE SAME(16) command. Older kernels do not set the 'unmap' flag in WRITE SAME(16); when the flag is set, the array treats the command as illegal.

Following error occurs :

localhost kernel: [158389.494672] sd 61:0:0:0: [sdb] Add. Sense: Invalid field in cdb
 localhost kernel: [158389.494679] sd 61:0:0:0: [sdb] CDB: Write same(16) 93 08 00 00 00 00 00 00 00 00 00 00 80 00 00 00
 localhost kernel: [158389.494685] blk_update_request: critical target error, dev sdb, sector 0

This is scheduled to be fixed in a future release of the array software. It is not a Cinder bug.

Huawei could check the same for their array.
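
For anyone checking their own array, the CDB bytes in the kernel log above can be decoded by hand. A minimal sketch, assuming the standard WRITE SAME(16) layout (opcode 0x93, UNMAP as bit 3 of byte 1, an 8-byte LBA followed by a 4-byte block count); decode_write_same16 is a hypothetical helper name:

```python
def decode_write_same16(cdb_hex):
    """Decode the main fields of a WRITE SAME(16) CDB given as hex text."""
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 16 or cdb[0] != 0x93:
        raise ValueError("not a WRITE SAME(16) CDB")
    return {
        # Bit 3 of byte 1 is the UNMAP flag.
        "unmap": bool(cdb[1] & 0x08),
        # Bytes 2-9: starting logical block address, big-endian.
        "lba": int.from_bytes(cdb[2:10], "big"),
        # Bytes 10-13: number of logical blocks, big-endian.
        "blocks": int.from_bytes(cdb[10:14], "big"),
    }


fields = decode_write_same16("93 08 00 00 00 00 00 00 00 00 00 00 80 00 00 00")
```

For the CDB in the log, this gives unmap=True, lba=0 and blocks=32768, which matches the "sector 0" reported by both the kernel and qemu-img.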

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on cinder (master)

Change abandoned by Mitsuhiro Tanino (<email address hidden>) on branch: master
Review: https://review.openstack.org/312797
Reason: This will be fixed storage side.

Eric Harney (eharney)
Changed in cinder:
status: In Progress → Invalid
Changed in cinder:
importance: Medium → Undecided
assignee: Mitsuhiro Tanino (mitsuhiro-tanino) → nobody
Revision history for this message
Daniel Pryor (pryorda) wrote : Re: [Bug 1531711] Re: Create Volume from image bug (iscsi)

So how do you fix this storage side?
On May 6, 2016 12:05 PM, "Mitsuhiro Tanino" <email address hidden>
wrote:

> ** Changed in: cinder
> Importance: Medium => Undecided
>
> ** Changed in: cinder
> Assignee: Mitsuhiro Tanino (mitsuhiro-tanino) => (unassigned)

Revision history for this message
Raunak Kumar (rkumar-b) wrote :

The next software release in this release train (2.3.16) will have the fix. Once it is out, I can update this bug.

Revision history for this message
Mitsuhiro Tanino (mitsuhiro-tanino) wrote :

Daniel,

I'm not sure of the details.
Raunak mentioned that the array software will be fixed.

>>This is scheduled to be fixed in future release of the array software. Not a cinder bug.

Revision history for this message
Mitsuhiro Tanino (mitsuhiro-tanino) wrote :

Thank you, Raunak.

Revision history for this message
Mayur Indalkar (mayurindalkar) wrote :

Hi Raunak,
I am facing this issue too.
The kernel version used by my backend is 3.10.
It works for a raw image but fails for a qcow2 image.
