VolumeNotCreated - Instance failed, cinder too slow with

Bug #1272076 reported by gadLinux
This bug affects 9 people
Affects                    Status      Importance   Assigned to   Milestone
Cinder                     Won't Fix   Undecided    Unassigned
OpenStack Compute (nova)   Won't Fix   Undecided    Unassigned

Bug Description

Hi,

I've found that under certain circumstances cinder does not create volumes fast enough.

I launch an instance from a new 4 GB volume that is created from an image; I use LVM to allocate the space. After a while I found that the instance didn't work.

Looking at logs I can find:

2014-01-23 21:44:15.337 2398 TRACE nova.compute.manager [instance: a0e35767-424e-434d-99b4-35e19422054f] attempts=attempts)
2014-01-23 21:44:15.337 2398 TRACE nova.compute.manager [instance: a0e35767-424e-434d-99b4-35e19422054f] VolumeNotCreated: Volume 137bc77b-c9e6-47ba-b2f5-c83f440a988b did not finish being created even after we waited 66 seconds or 60 attempts.

While it was stuck I looked around and saw that the cinder volume was in the "downloading" state; I think it was fetching the image from the image server and building the volume. I don't know why it took so long, since the installation is on gigabit Ethernet and, what's more, the image server runs in an instance on the same physical machine as cinder, so there isn't even any real networking involved; everything resolves locally.

The image is Ubuntu saucy (about 300 MB).

The problem is that the volume creation did eventually finish, but by then the instance had already failed. I then recreated the instance and got it working from the volume with no problems.

How should I track where the processing slows down?

I know that iSCSI attachment can be slow. One possible point of failure is an iSCSI target on a machine that is not reachable; that slows down the rest of the processing, but I'm not sure whether that is what's happening here.

Anyway, I'm sure the hardware is not the best, but it is pretty decent: a RAID1 array with WD Black drives, a good SATA controller, and Intel gigabit network cards. So the disks should not be the problem; I'm thinking it is something networking or configuration related.

But I'm lost on this.

Any help is appreciated.

Tags: volumes
Revision history for this message
gadLinux (gad-aguilardelgado) wrote :

These are the target stats.

 iscsiadm -m session -r 4 --stats
Stats for session [sid: 4, target: iqn.2010-10.org.openstack:volume-137bc77b-c9e6-47ba-b2f5-c83f440a988b, portal: 172.16.0.119,3260]
iSCSI SNMP:
 txdata_octets: 1137908
 rxdata_octets: 26062596
 noptx_pdus: 0
 scsicmd_pdus: 2601
 tmfcmd_pdus: 0
 login_pdus: 0
 text_pdus: 0
 dataout_pdus: 56
 logout_pdus: 0
 snack_pdus: 0
 noprx_pdus: 0
 scsirsp_pdus: 2601
 tmfrsp_pdus: 0
 textrsp_pdus: 0
 datain_pdus: 2381
 logoutrsp_pdus: 0
 r2t_pdus: 27
 async_pdus: 0
 rjt_pdus: 0
 digest_err: 0
 timeout_err: 0
iSCSI Extended:
 tx_sendpage_failures: 0
 rx_discontiguous_hdr: 0
 eh_abort_cnt: 0

ping 172.16.0.119
PING 172.16.0.119 (172.16.0.119) 56(84) bytes of data.
64 bytes from 172.16.0.119: icmp_seq=1 ttl=64 time=0.033 ms
64 bytes from 172.16.0.119: icmp_seq=2 ttl=64 time=0.033 ms

Revision history for this message
gadLinux (gad-aguilardelgado) wrote :

The machine doesn't even finish booting.

These are the volume stats. I removed the rest for brevity.

Every 2.0s: iostat -N          Thu Jan 23 22:21:32 2014

Linux 3.11.0-12-generic (blue-compute) 23/01/14 _x86_64_ (4 CPU)

avg-cpu: %user %nice %system %iowait %steal %idle
           5,01 0,00 1,46 6,22 0,00 87,31

Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
data_vg-volume--137bc77b--c9e6--47ba--b2f5--c83f440a988b 123,15 30,75 461,86 140304 2107208

Revision history for this message
gadLinux (gad-aguilardelgado) wrote :

Got it.

I see lots of:

Jan 23 22:28:01 blue-compute kernel: [ 4957.380009] connection2:0: detected conn error (1020)
Jan 23 22:28:01 blue-compute kernel: [ 4957.380062] connection1:0: detected conn error (1020)
Jan 23 22:28:01 blue-compute iscsid: semop down failed 22
Jan 23 22:28:01 blue-compute iscsid: semop down failed 22
Jan 23 22:28:02 blue-compute tgtd: conn_close(103) connection closed, 0xdb6350 1
Jan 23 22:28:02 blue-compute tgtd: conn_close(103) connection closed, 0xdbd400 1
Jan 23 22:28:04 blue-compute kernel: [ 4960.387587] connection2:0: detected conn error (1020)
Jan 23 22:28:04 blue-compute kernel: [ 4960.387637] connection1:0: detected conn error (1020)
Jan 23 22:28:04 blue-compute iscsid: semop down failed 22
Jan 23 22:28:04 blue-compute iscsid: semop down failed 22
Jan 23 22:28:05 blue-compute tgtd: conn_close(103) connection closed, 0xdb6350 1
Jan 23 22:28:05 blue-compute tgtd: conn_close(103) connection closed, 0xdbd400 1
Jan 23 22:28:07 blue-compute kernel: [ 4963.394790] connection2:0: detected conn error (1020)
Jan 23 22:28:07 blue-compute kernel: [ 4963.394836] connection1:0: detected conn error (1020)
Jan 23 22:28:07 blue-compute iscsid: semop down failed 22
Jan 23 22:28:07 blue-compute iscsid: semop down failed 22
Jan 23 22:28:08 blue-compute tgtd: conn_close(103) connection closed, 0xdb6350 1
Jan 23 22:28:08 blue-compute tgtd: conn_close(103) connection closed, 0xdbd400 1

Revision history for this message
John Griffith (john-griffith) wrote :

We need to look at how we do this in Nova. The 66 seconds isn't going to be sufficient in some cases depending on what backend you're using.

For example, here is what happens in your case with LVM:
1. The Cinder volume is created.
2. The image is downloaded from Glance to a temp file on the cinder node.
3. The image is converted to raw and written to the volume.

For the 256 MB image I'm kinda sad it's not fast enough. I'd be curious what you come up with if you time just the image download in Cinder: do a "cinder create --image-id xxxxx 5".

With some setups this process will be pretty quick, but depending on the cinder backend in use and the load on the system it can end up taking a while. We should probably look at ways to improve how we do this on the Nova side. Time-outs are always tricky, though. I would have to check whether this timeout is already configurable in Nova; if it's not, making it configurable might be a reasonable temporary solution.

One thing we're working on in cinder is getting rid of the intermediate temp file used for the conversion, but it's not simple because you can't do a qemu convert in place.
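
To time just the image-to-volume path mentioned above, a minimal sketch with python-cinderclient could look like this (the credentials, endpoint, and image UUID are placeholders, not values from this deployment):

    # Rough timing of "create volume from image" on its own, outside Nova's
    # retry loop. All credentials and IDs below are placeholders.
    import time
    from cinderclient import client

    cinder = client.Client('2', 'admin', 'secret', 'admin',
                           'http://controller:5000/v2.0')

    start = time.time()
    vol = cinder.volumes.create(5, imageRef='<glance-image-uuid>',
                                name='timing-test')
    while True:
        vol = cinder.volumes.get(vol.id)
        if vol.status in ('available', 'error'):
            break
        time.sleep(2)

    print('final status: %s, elapsed: %.1f s' % (vol.status, time.time() - start))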

Revision history for this message
gadLinux (gad-aguilardelgado) wrote :

Hi John,

I didn't know about the temp file. That can also cause the system to fail when the temp file doesn't fit on the cinder node's disk, which can be small. I suppose that getting rid of the temp file will also improve performance.

I propose reordering things: when the volume is created it should be attached, and the instance left ready while the image is still downloading, so that if the download fails we can retry. Right now if it fails you have to redo everything: terminate the instance and build a new one. From my point of view the instance is ready; the only problem is that the OS has not been downloaded and written yet. That step should be retryable.

I discovered that I'm suffering from a tgtd/iSCSI bug: the iSCSI session disconnects every 2 seconds. I'm looking into it; it is possibly what's responsible for the timeout.

If you tell me what you want to measure, I will do it. Maybe we can find the bottlenecks.

I must report the iscsi problem and will post here when done.

tags: added: volumes
Joe Gordon (jogo)
Changed in nova:
status: New → Confirmed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/80619
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=4111e3586de88d5d580e6f11bafb644ff9144400
Submitter: Jenkins
Branch: master

commit 4111e3586de88d5d580e6f11bafb644ff9144400
Author: JordanP <email address hidden>
Date: Fri Mar 14 16:21:47 2014 +0000

    Increase volume creation max waiting time

    In some cases waiting 60sec, is not enough for cinder volume to be created.
    For example, a 20Go LVM volume (hosting a Windows Image) would take 100sec
    to be copied assuming a 200Mo/s cinder volume backend.

    Issue has already been discussed in I6bcd395a where a config option was
    rejected.

    This patches raises the max timeout at 180sec.

    Change-Id: I7e739fc20af0ae963269459b5e224c4a8b7bb87d
    Partial-Bug: #1253612
    Partial-Bug: #1272076

Revision history for this message
QuentinHartman (qhartman) wrote : Re: VolumeNotCreated - Instance failed, cinder too slow

I see this same behavior in an OpenStack deployment where my cinder backend is Ceph, any time I create a "large" volume. Anything over about 500 GB usually fails because cinder can't produce the volume fast enough.

I'm using raw images (about 2.2 GB), also stored in Ceph, as my source, so no qemu convert should be needed in this case.

I poked through the code and config, and it seems like this behavior should be controlled by the instance_build_timeout variable, which defaults to 0, so it should wait forever, but that does not seem to be the case.

Here is a Red Hat bug that seems to be tracking the same problem: https://bugzilla.redhat.com/show_bug.cgi?id=1019401 Note, though, that the code seems to have changed since their link into GitHub was created; the line it points to no longer seems relevant.
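
For reference, instance_build_timeout governs how long an instance may sit in BUILD overall, not this specific volume wait. Later Nova releases expose the volume wait itself as configuration; assuming your release carries these options (worth verifying against its config reference), the relevant nova.conf settings on the compute nodes look roughly like this:

    [DEFAULT]
    # Number of polls, and seconds between polls, while waiting for the boot
    # volume to become available. The values here are illustrative only.
    block_device_allocate_retries = 300
    block_device_allocate_retries_interval = 3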

Revision history for this message
gadLinux (gad-aguilardelgado) wrote :

Great, this seems to be addressed. I upgraded to Icehouse and have had no problems so far.

Revision history for this message
Tony Link (tlink) wrote :

For Ceph-backed Glance and Cinder, would it not be more appropriate to have the copy occur between Ceph pools rather than via the temp folder on the cinder node?

Revision history for this message
Yaguang Tang (heut2008) wrote :

Tony,

This is going to be addressed by this blueprint https://blueprints.launchpad.net/cinder/+spec/enable-cloning-glance-rbd-image

tags: added: ceph
Yaguang Tang (heut2008)
Changed in nova:
status: Confirmed → Invalid
Revision history for this message
Jeff Silverman (jeffsilverm) wrote :

I looked at the patch discussed in https://git.openstack.org/cgit/openstack/nova/commit/?id=4111e3586de88d5d580e6f11bafb644ff9144400. 180 seconds is not enough for a 60 GB image. I changed the value of max_tries to 600 and that works better for me. I made some measurements and the longest it actually took was 7.5 minutes.
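
For context, max_tries feeds a plain poll-until-available loop with a fixed attempt budget. A simplified sketch of that pattern (not the actual Nova code) looks like:

    # Simplified sketch of a poll-until-available loop with a fixed attempt
    # budget, the pattern the max_tries value above tunes. Not Nova's code.
    import time

    def await_volume_created(cinder, volume_id, max_tries=600, wait_between=1):
        for attempt in range(1, max_tries + 1):
            status = cinder.volumes.get(volume_id).status
            if status == 'available':
                return attempt
            if status == 'error':
                raise RuntimeError('volume %s went to error' % volume_id)
            time.sleep(wait_between)
        raise RuntimeError('volume %s not created after %d attempts'
                           % (volume_id, max_tries))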

Revision history for this message
Josh Durgin (jdurgin) wrote :

If anyone is seeing this with ceph when they have copy-on-write cloning enabled, please post a cinder log.

Jon Bernard (jbernard)
Changed in cinder:
status: New → Confirmed
Revision history for this message
John Griffith (john-griffith) wrote :

Ok, there are a couple of things here:

First, if you ask nova to create a cinder volume from an image and boot, it unfortunately doesn't do any checks or make any compensation for the fact that the download/convert process from glance to the Cinder volume can take a bit of time.

It sounds like Josh and Jon are also looking at an issue specific to Ceph; I wonder if perhaps that should be posted as a separate bug?

In the general sense I think this issue is a problem with Nova and its expectation that a cinder create will always be as speedy as it is with a raw volume. FWIW, I believe someone was looking at this recently in Liberty.

Honestly, I've always thought that the combined create-from-image-and-boot call shouldn't be used, and that it's better to do the two steps explicitly, one after the other.
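
As an illustration of doing the steps separately, a hedged sketch with python-cinderclient and python-novaclient (credentials, flavor id, and names are placeholders; the exact client calls may differ between releases): create the volume from the image, wait for it to become "available", and only then boot from it.

    import time
    from cinderclient import client as cinder_client
    from novaclient import client as nova_client

    # Placeholder credentials/endpoint.
    cinder = cinder_client.Client('2', 'admin', 'secret', 'admin',
                                  'http://controller:5000/v2.0')
    nova = nova_client.Client('2', 'admin', 'secret', 'admin',
                              'http://controller:5000/v2.0')

    # Step 1: create the volume from the image and wait for it ourselves.
    vol = cinder.volumes.create(20, imageRef='<glance-image-uuid>',
                                name='prebuilt-boot-volume')
    while cinder.volumes.get(vol.id).status not in ('available', 'error'):
        time.sleep(5)

    # Step 2: boot from the already-available volume, so Nova does not have
    # to wait on Cinder at all.
    bdm = [{'source_type': 'volume', 'uuid': vol.id,
            'destination_type': 'volume', 'boot_index': 0,
            'delete_on_termination': False}]
    nova.servers.create('from-prebuilt-volume', image=None,
                        flavor='<flavor-id>', block_device_mapping_v2=bdm)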

summary: - VolumeNotCreated - Instance failed, cinder too slow
+ VolumeNotCreated - Instance failed, cinder too slow with Ceph
summary: - VolumeNotCreated - Instance failed, cinder too slow with Ceph
+ VolumeNotCreated - Instance failed, cinder too slow with
Revision history for this message
Sean McGinnis (sean-mcginnis) wrote :

I believe this is something that will need to be addressed on the nova side. The reality is some backends take longer than others to create a volume. This isn't something that's going to change, certainly not by the OpenStack community. I think we would need to have some adaptable timeouts for this. Perhaps some kind of async polling to make sure the request is still being worked on.
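
One way to read "still being worked on": rather than a fixed attempt count, keep polling while Cinder reports a transient state such as "creating" or "downloading", fail fast on "error", and give up only at a generous overall deadline. A minimal sketch of that idea (not an existing Nova or Cinder API):

    # Wait on the volume as long as Cinder still reports it as in progress;
    # fail fast on 'error', and give up otherwise only at a generous deadline.
    import time

    TRANSIENT_STATES = ('creating', 'downloading')

    def wait_for_volume(cinder, volume_id, deadline=3600, poll=5):
        start = time.time()
        while time.time() - start < deadline:
            status = cinder.volumes.get(volume_id).status
            if status == 'available':
                return
            if status not in TRANSIENT_STATES:
                raise RuntimeError('volume %s ended up in %s' % (volume_id, status))
            time.sleep(poll)
        raise RuntimeError('gave up on volume %s after %d s' % (volume_id, deadline))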

Changed in nova:
status: Invalid → Incomplete
Changed in cinder:
status: Confirmed → Won't Fix
Revision history for this message
Augustina Ragwitz (auggy) wrote :

Changing the status of this bug from "Incomplete" to "New". We use "Incomplete" in Nova when we need further information from the original reporter. In this case the issue has been evaluated, and the remaining question is whether there is anything on the Nova side that can be done to address it.

Changed in nova:
status: Incomplete → New
tags: removed: ceph
Revision history for this message
Sean Dague (sdague) wrote :

Just making Nova wait longer doesn't seem to be a good strategy, because then we just end up in a different deadlock scenario of waiting with no idea when things are finished.

I think that for long provisioning situations like this we just need to expect people to preprovision the volumes first from cinder. Nova will never be good enough at orchestration to solve this issue generically.

Marking as Won't Fix. This isn't a simple bug; it is going to require at least one spec to change the way the flows work.

Changed in nova:
status: New → Won't Fix