tempest

rest_client.wait_for_resource_deletion timeout message should be more specific

Bug #1380712 reported by Matt Riedemann on 2014-10-13

This bug affects 1 person

Affects    Status          Importance    Assigned to       Milestone
tempest    Fix Released    Medium        Matt Riedemann

Bug Description

While debugging bug 1373513 we see a ton of these failed tempest runs:

http://logstash.openstack.org/#eyJzZWFyY2giOiJtZXNzYWdlOlwiRmFpbGVkIHRvIGRlbGV0ZSByZXNvdXJjZVwiIEFORCBtZXNzYWdlOlwid2l0aGluIHRoZSByZXF1aXJlZCB0aW1lXCIgQU5EIHRhZ3M6XCJjb25zb2xlXCIiLCJmaWVsZHMiOltdLCJvZmZzZXQiOjAsInRpbWVmcmFtZSI6IjYwNDgwMCIsImdyYXBobW9kZSI6ImNvdW50IiwidGltZSI6eyJ1c2VyX2ludGVydmFsIjowfSwic3RhbXAiOjE0MTMyMTkyOTUxMDgsIm1vZGUiOiIiLCJhbmFseXplX2ZpZWxkIjoiIn0=

We think most of those are volume/snapshot related, but the error message doesn't include the actual resource type. The test class and test case names are in there, but a volume delete timeout can impact all volume tests, so we don't want to write an elastic-recheck query on the test names.

We should update the clients to return the specific type of resource they are deleting to make the error message more specific.
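
A minimal sketch of that idea, not the actual tempest code: the class names, the `resource_type` property, the timeout attributes and the exact message wording below are all assumptions made for illustration. It shows a base client whose timeout message names the resource type that each concrete client reports.

```python
import time


class TimeoutException(Exception):
    """Hypothetical stand-in for the client's timeout error."""


class BaseClient:
    """Hypothetical base client; not the real tempest rest_client."""

    build_timeout = 1.0   # seconds to wait before giving up (assumed value)
    build_interval = 0.1  # seconds between polls (assumed value)

    @property
    def resource_type(self):
        # Concrete clients override this to name what they delete.
        return "resource"

    def is_resource_deleted(self, resource_id):
        raise NotImplementedError

    def wait_for_resource_deletion(self, resource_id):
        """Poll until the resource is gone or the timeout expires."""
        start = time.monotonic()
        while True:
            if self.is_resource_deleted(resource_id):
                return
            if time.monotonic() - start >= self.build_timeout:
                # The resource type makes the message specific enough
                # to match on, independent of the test name.
                raise TimeoutException(
                    "Failed to delete %s %s within the required time (%s s)."
                    % (self.resource_type, resource_id, self.build_timeout))
            time.sleep(self.build_interval)


class VolumesClient(BaseClient):
    """Hypothetical volume client that simulates a delete hang."""

    @property
    def resource_type(self):
        return "volume"

    def is_resource_deleted(self, resource_id):
        return False  # the volume never goes away in this sketch
```

With something like this, a hung volume delete would raise a message starting with "Failed to delete volume", which a query could match without listing individual test names.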

Matt Riedemann (mriedem) on 2014-10-13

Changed in tempest:
importance:	Undecided → Medium
status:	New → Triaged
assignee:	nobody → Matt Riedemann (mriedem)

OpenStack Infra (hudson-openstack) wrote on 2014-10-13: Fix proposed to tempest (master)

Fix proposed to branch: master
Review: https://review.openstack.org/128020

Changed in tempest:
status:	Triaged → In Progress

OpenStack Infra (hudson-openstack) wrote on 2014-10-15: Fix merged to tempest (master)

Reviewed: https://review.openstack.org/128020
Committed: https://git.openstack.org/cgit/openstack/tempest/commit/?id=d2b9651cb82910b2b86b0f94af1721f717e74024
Submitter: Jenkins
Branch: master

commit d2b9651cb82910b2b86b0f94af1721f717e74024
Author: Matt Riedemann <email address hidden>
Date: Mon Oct 13 10:18:16 2014 -0700

    Make rest_client.wait_for_resource_deletion timeout error more specific

    Currently the timeout error message just says 'Failed to delete resource
    within the required time' and includes the test class/test case, but
    when we're seeing all volume-related tests racing with a delete hang, we
    can't fingerprint an elastic-recheck query on the various test
    class/test case names, and there aren't errors in the cinder logs.

    The best thing we have to fingerprint is the timeout message, but we
    need it to be more specific to the type of resource rather than any
    resource type that might have timed out during a delete operation.

    This is similar to the is_resource_deleted method in the various rest
    clients in that the concrete implementations of the base class return
    their specific type so we can use that in the timeout message.

    Closes-Bug: #1380712

    Change-Id: I36b49c59daa95219c4377d1e207ea4a0c8cf8782