OpenStack Heat

Cinder create error reason not visible

Bug #1450861 reported by Joe D'Andrea on 2015-05-01

This bug affects 1 person

Affects		Status	Importance	Assigned to	Milestone
	Cinder	Invalid	Wishlist	Unassigned
	OpenStack Heat	Triaged	Medium	Joe D'Andrea	OpenStack Heat no-milestone-taged-bugs

Bug Description

1. Create a stack with Cinder volumes *and* a lack of enough disk space on the cluster.
2. Stack reaches CREATE_FAILED state (as expected).
3. Use 'heat stack-show' and look for stack_status_reason:

Resource CREATE failed: ResourceInError: Went to status error due to "Unknown"

4. Expected a reason other than "Unknown" (e.g., out of disk space). However, status_reason is never set in the args for ResourceInError.

5. Look at heat engine log and find:

Traceback (most recent call last):
  File "/opt/stack/heat/heat/engine/resource.py", line 466, in _action_recorder
    yield
  File "/opt/stack/heat/heat/engine/resource.py", line 536, in _do_action
    yield self.action_handler_task(action, args=handler_args)
  File "/opt/stack/heat/heat/engine/scheduler.py", line 312, in wrapper
    step = next(subtask)
  File "/opt/stack/heat/heat/engine/resource.py", line 510, in action_handler_task
    while not check(handler_data):
  File "/opt/stack/heat/heat/engine/resources/aws/volume.py", line 139, in check_create_complete
    resource_status=vol.status)
ResourceInError: Went to status error due to "Unknown"

Note: The above traceback does not reflect the most recent Kilo code (volume.py has since been moved, see below). However, the source for check_create_complete() doesn't appear to have changed since Juno.

https://github.com/openstack/heat/blob/master/heat/engine/resources/aws/ec2/volume.py

By comparison, Cinder backup objects set fail_reason in the event of an error. There is no fail_reason in Cinder volume objects, however, leaving folks to go on a proverbial wild goose chase to discover what went wrong.
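
A minimal sketch of the behavior described above, not Heat's actual code: ResourceInError below is a simplified stand-in for the Heat exception of the same name, FakeVolume is a hypothetical test object, and the fail_reason attribute on it is an assumption (per this report, Cinder volumes do not expose one). It shows why the message falls back to "Unknown" when no status_reason is passed, and what the message could look like if a reason were available.

```python
class ResourceInError(Exception):
    """Simplified stand-in for Heat's exception of the same name."""

    msg_fmt = 'Went to status %(resource_status)s due to "%(status_reason)s"'

    def __init__(self, status_reason="Unknown", **kwargs):
        # status_reason defaults to "Unknown" when the caller omits it.
        self.kwargs = dict(kwargs, status_reason=status_reason)
        super().__init__(self.msg_fmt % self.kwargs)


class FakeVolume:
    """Hypothetical volume; fail_reason is NOT a real Cinder volume field."""

    def __init__(self, status, fail_reason=None):
        self.status = status
        self.fail_reason = fail_reason


def check_create_complete(vol):
    """Return True when available, False while creating, raise otherwise."""
    if vol.status == "available":
        return True
    if vol.status == "creating":
        return False
    kwargs = {"resource_status": vol.status}
    # Only pass status_reason when the (hypothetical) field carries a value.
    reason = getattr(vol, "fail_reason", None)
    if reason:
        kwargs["status_reason"] = reason
    raise ResourceInError(**kwargs)
```

Without a reason on the volume, the raised message matches the one in the engine log above; with one, the reason would replace "Unknown".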

Joe D'Andrea (jdandrea) on 2015-05-01

description:	updated

Joe D'Andrea (jdandrea) on 2015-05-01

Changed in heat:
assignee:	nobody → Joe D'Andrea (joedandrea)

Joe D'Andrea (jdandrea) on 2015-05-01

Changed in heat:
assignee:	Joe D'Andrea (joedandrea) → nobody
description:	updated

Joe D'Andrea (jdandrea) on 2015-05-01

Changed in heat:
assignee:	nobody → Joe D'Andrea (joedandrea)

Joe D'Andrea (jdandrea) on 2015-05-01

description:	updated

Steve Baker (steve-stevebaker) on 2015-05-04

Changed in heat:
status:	New → Triaged
importance:	Undecided → Medium

Joe D'Andrea (jdandrea) wrote on 2015-05-04:

Question asked of Cinder, with x-ref back here:

https://answers.launchpad.net/ubuntu/+source/cinder/+question/266260

Joe D'Andrea (jdandrea) wrote on 2015-05-05:

Update: Per DuncanT the error reason is not visible at the moment. However, there will be a cross-project session at the Liberty Summit to discuss the best way to go about fixing this. I will look to attend that session.

Joe D'Andrea (jdandrea) on 2015-05-05

description:	updated

Joe D'Andrea (jdandrea) wrote on 2015-05-05:

More info: "Part of the problem is that different providers have different requirements as to what level of detail they want to pass back to tenants. Private cloud might be fine with lots of details, whereas a public cloud might not want to pass on more than 'something went wrong, try again later'."

Squaring this particular circle has proven elusive thus far.

Joe D'Andrea (jdandrea) on 2015-05-05

description:	updated

Joe D'Andrea (jdandrea) wrote on 2015-05-14:

Of possible interest:

https://etherpad.openstack.org/p/liberty-cross-project-user-notifications
https://launchpad.net/monasca

John Griffith (john-griffith) wrote on 2015-06-10:

@Joe D'Andrea
Comment #3 is a pretty good summary of things here IMO.

BUT, I also want to point out... rather than focus so much on reporting failure issues, what about just improving things so shit doesn't fail? Or when it does it is smart enough to dynamically go somewhere else and try again?

If certain service providers have a heavy work load due to support calls for failed items, maybe they need to look at how they've deployed things, or what they used to build their cloud, OR even better making the OpenStack code (particularly Cinder) better.

Changed in cinder:
importance:	Undecided → Wishlist
status:	New → Confirmed

John Griffith (john-griffith) wrote on 2015-06-10:

I'll mark as confirmed for now and see if anybody ever does anything with it, but it's been one of those things that gets complained about a fair amount but nobody has any good ideas or submissions to try and fix.

Joe D'Andrea (jdandrea) wrote on 2015-06-10:

@john-griffith:

Thanks! I agree, it would be great to improve things so there aren't failures. Alas, stuff will still "fall down go boom" from time to time.

To your comment about heavy work load (if that's even the issue here, I don't know if it is), indeed, it could be the user/admin's fault, but the only way for them to remedy their gaffe is to know what went wrong. "Unknown" doesn't help.

I don't dispute that Cinder, OpenStack could be better. I imagine that will always be the case, but it doesn't obviate the need for correct and informative error reporting. No matter how good we make it, we would do well to help the user/admin know what went wrong and empower them with info to help them fix it. User experience FTW!

Sean McGinnis (sean-mcginnis) wrote on 2016-09-29:

This is somewhat addressed now with the implementation of the user message API for getting the response messages from async operations.

Changed in cinder:
status:	Confirmed → Invalid
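
As a rough illustration of how the user message API mentioned above might be consumed, here is a small helper that picks out the messages recorded for one volume. The response layout (a top-level "messages" list whose entries carry "resource_uuid" and "user_message") is an assumption and should be verified against the Cinder API reference; messages_for_resource, the sample UUIDs, and the sample text are all hypothetical, and no real Cinder call is made.

```python
def messages_for_resource(response_body, resource_uuid):
    """Return the user-facing message texts recorded for one resource.

    response_body is assumed to be the decoded JSON body of a message
    listing, shaped like {"messages": [{...}, ...]}.
    """
    return [
        msg.get("user_message", "")
        for msg in response_body.get("messages", [])
        if msg.get("resource_uuid") == resource_uuid
    ]


# Made-up sample data; the text is illustrative, not real Cinder output.
sample_response = {
    "messages": [
        {
            "resource_type": "VOLUME",
            "resource_uuid": "11111111-1111-1111-1111-111111111111",
            "user_message": "example: no backend had enough free space",
        },
        {
            "resource_type": "VOLUME",
            "resource_uuid": "22222222-2222-2222-2222-222222222222",
            "user_message": "example: unrelated failure",
        },
    ]
}

reasons = messages_for_resource(
    sample_response, "11111111-1111-1111-1111-111111111111"
)
```

A caller such as Heat could, under these assumptions, use the first returned text as the status reason instead of "Unknown".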

Rico Lin (rico-lin) on 2018-05-07

Changed in heat:
milestone:	none → no-priority-tag-bugs
