cinder is putting volume state to error while retrying cinder creates

Bug #1445601 reported by Tom Barron
This bug affects 2 people
Affects  Status        Importance  Assigned to  Milestone
Cinder   Fix Released  Medium      wanghao

Bug Description

When a cinder volume create attempt fails, for example when a backend driver throws an exception, the create is retried up to CONF.scheduler_max_attempts times. Currently, taskflow sets the volume state from creating to error at the end of each failed attempt, via cinder.volume.flows.common.error_out_volume().

Because of this volume state transition, other operations such as a cinder delete can start during the sequence of retries, so a delete and a (retried) create can run at the same time on the same volume. This is a problem, as each operation may find the volume in an unexpected state. I have, for instance, seen the volume delete fail because the volume has no 'host' set and the delete tries to split a nonexistent host string on '#'.
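
A toy model may make the window easier to see. The sketch below is not Cinder code: the retry loop, the volume dict, and the helper names (failing_driver_create, create_with_retries, pool_from_host) are hypothetical stand-ins, and real rescheduling goes through the scheduler rather than a simple loop. It only illustrates that the volume is visible in the error state after every failed attempt, and that splitting an unset host on '#' raises.

```python
# Toy model only; none of these names are Cinder APIs.
MAX_ATTEMPTS = 3  # stands in for CONF.scheduler_max_attempts


def failing_driver_create(volume):
    """Hypothetical backend call that always fails."""
    raise RuntimeError("backend driver failed")


def create_with_retries(volume, observed_statuses):
    """Retry the create, erroring out the volume after each failure."""
    for _attempt in range(MAX_ATTEMPTS):
        volume["status"] = "creating"
        try:
            failing_driver_create(volume)
            volume["status"] = "available"
            return
        except RuntimeError:
            # Behavior described in the report: error after every attempt.
            volume["status"] = "error"
            # Status a concurrent request would see before the next retry.
            observed_statuses.append(volume["status"])


def pool_from_host(host):
    """Split a host string on '#'; raises AttributeError if host is None."""
    return host.split("#")


volume = {"status": "creating", "host": None}
seen = []
create_with_retries(volume, seen)
```

In this model, every entry in seen is a point at which a delete would be accepted even though more create attempts are still to come, and calling pool_from_host(volume["host"]) at such a point fails because no host was ever set.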

Tom Barron (tpb)
tags: added: scheduler taskflow
Tom Barron (tpb) wrote :

13:53 harlowja: tbarron so what i'd try is to use the information @ https://github.com/openstack/cinder/blob/master/cinder/volume/flows/manager/create_volume.py#L170 and have that conditionally stop error_out_volume from being triggered
13:55 harlowja: and maybe log a warning instead of activating error_out_volume and just let it be (and then when rescheduling stops this will really enter error)
13:56 tbarron: harlowja: that looks like a good approach to me.
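
The approach discussed above could look roughly like the following. This is a minimal sketch under assumptions, not the committed fix: handle_failed_attempt, the attempt counter, and the volume dict are hypothetical, and the real change would hook into the create_volume flow rather than a standalone function. The idea shown is only that a failed attempt logs a warning while a reschedule is still pending, and the volume enters error once rescheduling stops.

```python
import logging

LOG = logging.getLogger(__name__)
MAX_ATTEMPTS = 3  # stands in for CONF.scheduler_max_attempts


def handle_failed_attempt(volume, attempt, max_attempts=MAX_ATTEMPTS):
    """Hypothetical failure hook: error out only when no retry will follow."""
    if attempt < max_attempts:
        # A reschedule is still pending, so leave the status as 'creating'.
        LOG.warning("Volume %s: attempt %d failed, rescheduling",
                    volume["id"], attempt)
        return
    # Rescheduling has stopped; now the volume really enters error.
    volume["status"] = "error"


volume = {"id": "vol-1", "status": "creating"}
statuses = []
for attempt in range(1, MAX_ATTEMPTS + 1):
    handle_failed_attempt(volume, attempt)
    statuses.append(volume["status"])
```

With this gating the volume stays in creating between attempts, so operations that require the error state cannot start until the final attempt has failed.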

Changed in cinder:
assignee: nobody → Vilobh Meshram (vilobhmm)
Changed in cinder:
status: New → Confirmed
importance: Undecided → Medium
wanghao (wanghao749)
Changed in cinder:
assignee: Vilobh Meshram (vilobhmm) → wanghao (wanghao749)
Changed in cinder:
status: Confirmed → In Progress
Changed in cinder:
status: In Progress → Fix Committed
Changed in cinder:
milestone: none → liberty-2
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in cinder:
milestone: liberty-2 → 7.0.0
Bin Zhou (binzhou) wrote :

The Gerrit record is missing here:
https://review.openstack.org/#/c/185545/
