When a cinder create volume attempt fails, for example because a backend driver raises an exception, the attempt is retried up to CONF.scheduler_max_attempts times. Currently, taskflow sets the volume state from 'creating' to 'error' at the end of each failed attempt, via cinder.volume.flows.common.error_out_volume().
Because of this volume state transition, other operations such as a cinder delete can start during the sequence of retries, with the result that a delete and a (retried) create can run at the same time on the same volume. This is a problem, as each operation may find the volume in an unexpected state. I have, for instance, seen a volume delete fail because the volume has no 'host' set and the delete path tries to split a nonexistent host string on '#'.
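For reference, error_out_volume() in cinder/volume/flows/common.py boils down to a DB status update. A simplified sketch (paraphrasing the helper, not quoting its exact upstream code):

    # Simplified paraphrase of cinder.volume.flows.common.error_out_volume(),
    # not the exact upstream code.
    def error_out_volume(context, db, volume_id, reason=None):
        # Flip the volume from 'creating' to 'error'. Once this lands
        # in the DB, other operations (e.g. 'cinder delete') see the
        # volume as actionable, even though the scheduler may still be
        # retrying the create on another backend.
        db.volume_update(context, volume_id, {'status': 'error'})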
13:53 harlowja: tbarron so what i'd try is to use the information @ https://github.com/openstack/cinder/blob/master/cinder/volume/flows/manager/create_volume.py#L170 and have that conditionally stop error_out_volume from being triggered
13:55 harlowja: and maybe log a warning instead of activating error_out_volume and just let it be (and then when rescheduling stops this will really enter error)
13:56 tbarron: harlowja: that looks like a good approach to me.
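A rough sketch of what that suggestion could look like; the function name, the 'rescheduling' flag, and where this hook would sit in the create-volume flow are my assumptions, not an actual patch:

    import logging

    from cinder.volume.flows import common

    LOG = logging.getLogger(__name__)

    # Hypothetical failure hook for the create-volume flow; not real
    # Cinder code.
    def on_create_failure(context, db, volume_id, rescheduling):
        if rescheduling:
            # More scheduler attempts remain: keep the volume in
            # 'creating' so a concurrent 'cinder delete' cannot race
            # the retried create; just record the failed attempt.
            LOG.warning("Volume %s create failed; rescheduling, so not "
                        "setting status to 'error' yet", volume_id)
            return
        # Rescheduling has stopped: it is now safe for the volume to
        # really enter 'error'.
        common.error_out_volume(context, db, volume_id)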