rest_client.wait_for_resource_deletion timeout message should be more specific

Bug #1380712 reported by Matt Riedemann
This bug affects 1 person
Affects   Status         Importance   Assigned to      Milestone
tempest   Fix Released   Medium       Matt Riedemann

Bug Description

While debugging bug 1373513 we are seeing a large number of failed tempest runs like these:

http://logstash.openstack.org/#eyJzZWFyY2giOiJtZXNzYWdlOlwiRmFpbGVkIHRvIGRlbGV0ZSByZXNvdXJjZVwiIEFORCBtZXNzYWdlOlwid2l0aGluIHRoZSByZXF1aXJlZCB0aW1lXCIgQU5EIHRhZ3M6XCJjb25zb2xlXCIiLCJmaWVsZHMiOltdLCJvZmZzZXQiOjAsInRpbWVmcmFtZSI6IjYwNDgwMCIsImdyYXBobW9kZSI6ImNvdW50IiwidGltZSI6eyJ1c2VyX2ludGVydmFsIjowfSwic3RhbXAiOjE0MTMyMTkyOTUxMDgsIm1vZGUiOiIiLCJhbmFseXplX2ZpZWxkIjoiIn0=

We think most of those are volume/snapshot related, but the error message doesn't include the actual resource type. The test class and test case names are in there, but a volume delete timeout can impact all volume tests, so we don't want to write an elastic-recheck query on the test names.

We should update the clients to return the specific type of resource they are deleting to make the error message more specific.
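
A minimal sketch of the idea, not the actual tempest code: the base client exposes a generic resource type, concrete clients override it, and the timeout message picks it up. The names DeletionTimeout, BaseClient, StuckVolumeClient and resource_type, and the timeout values, are assumptions made for illustration only.

```python
import time


class DeletionTimeout(Exception):
    """Hypothetical stand-in for the timeout exception a client would raise."""


class BaseClient:
    # Assumed polling settings, kept tiny so the sketch runs quickly.
    build_timeout = 0.05   # seconds to wait before giving up
    build_interval = 0.01  # seconds between polls

    @property
    def resource_type(self):
        """Generic fallback; concrete clients return their specific type."""
        return "resource"

    def is_resource_deleted(self, resource_id):
        """Concrete clients override this with their own deletion check."""
        raise NotImplementedError(
            '"%s" does not implement is_resource_deleted'
            % self.__class__.__name__)

    def wait_for_resource_deletion(self, resource_id):
        """Poll until the resource is gone or the timeout expires."""
        start = time.monotonic()
        while True:
            if self.is_resource_deleted(resource_id):
                return
            if time.monotonic() - start >= self.build_timeout:
                # The resource type makes the message specific enough
                # to fingerprint in a log query.
                raise DeletionTimeout(
                    "Failed to delete %s %s within the required time (%s s)."
                    % (self.resource_type, resource_id, self.build_timeout))
            time.sleep(self.build_interval)


class StuckVolumeClient(BaseClient):
    """Hypothetical client whose delete never completes."""

    @property
    def resource_type(self):
        return "volume"

    def is_resource_deleted(self, resource_id):
        return False  # simulate a hung delete
```

With something like this, a hung volume delete would report "Failed to delete volume ..." rather than "Failed to delete resource ...", so an elastic-recheck query could match on the resource type instead of on test names.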

Matt Riedemann (mriedem)
Changed in tempest:
importance: Undecided → Medium
status: New → Triaged
assignee: nobody → Matt Riedemann (mriedem)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tempest (master)

Fix proposed to branch: master
Review: https://review.openstack.org/128020

Changed in tempest:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to tempest (master)

Reviewed: https://review.openstack.org/128020
Committed: https://git.openstack.org/cgit/openstack/tempest/commit/?id=d2b9651cb82910b2b86b0f94af1721f717e74024
Submitter: Jenkins
Branch: master

commit d2b9651cb82910b2b86b0f94af1721f717e74024
Author: Matt Riedemann <email address hidden>
Date: Mon Oct 13 10:18:16 2014 -0700

    Make rest_client.wait_for_resource_deletion timeout error more specific

    Currently the timeout error message just says 'Failed to delete resource
    within the required time' and includes the test class/test case, but
    when we're seeing all volume-related tests racing with a delete hang, we
    can't fingerprint an elastic-recheck query on the various test
    class/test case names, and there aren't errors in the cinder logs.

    The best thing we have to fingerprint is the timeout message, but we
    need it to be more specific to the type of resource rather than any
    resource type that might have timed out during a delete operation.

    This is similar to the is_resource_deleted method in the various rest
    clients in that the concrete implementations of the base class return
    their specific type so we can use that in the timeout message.

    Closes-Bug: #1380712

    Change-Id: I36b49c59daa95219c4377d1e207ea4a0c8cf8782

Changed in tempest:
status: In Progress → Fix Released