[tempest] Waiting delete resource in error state does not stop immediately
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Shared File Systems Service (Manila) | In Progress | Undecided | Felipe Rodrigues |
Bug Description
During the cleanup phase, the Manila tempest cleanup should stop waiting for a delete to complete as soon as the resource is in "error_deleting" or "error" status. Otherwise, it keeps polling until the timeout, delaying the results of the failed test.
In the log example [1], the cleanup issues the snapshot delete at 18:06:28.950 and then polls the snapshot, waiting for a not-found error. From the second attempt, at 18:06:32.218, the resource is in "error_deleting" status. The wait should stop here, but it keeps fetching the resource and waiting for a not-found error until it times out at 18:11:31.063. As a result, the cleanup took 5 minutes to finish instead of about 5 seconds.
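The expected fail-fast behavior can be sketched as a small polling loop. This is only an illustration under stated assumptions, not the plugin's code: `wait_for_deletion`, `get_status`, `ResourceNotFound`, and `DeleteFailed` are hypothetical names, and the set of error statuses is taken from the description above.

```python
import time

# Statuses that mean the delete will never succeed (from the bug description).
ERROR_STATUSES = {"error", "error_deleting"}


class ResourceNotFound(Exception):
    """Hypothetical stand-in for the client's 'not found' error."""


class DeleteFailed(Exception):
    """Hypothetical error raised when deletion ends in an error status."""


def wait_for_deletion(get_status, timeout=300.0, interval=3.0,
                      clock=time.monotonic, sleep=time.sleep):
    """Poll until the resource is gone; return the number of polls made.

    get_status() returns the current status string, or raises
    ResourceNotFound once the resource no longer exists.
    """
    start = clock()
    polls = 0
    while True:
        polls += 1
        try:
            status = get_status()
        except ResourceNotFound:
            return polls  # the resource was deleted
        if status in ERROR_STATUSES:
            # Fail fast instead of polling until the timeout expires.
            raise DeleteFailed(f"resource entered status {status!r}")
        if clock() - start >= timeout:
            raise TimeoutError("resource was not deleted in time")
        sleep(interval)
```

With a check like this, the snapshot in the log would have been reported as failed on the second poll rather than after the full timeout.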
This has some negative impacts:
1. The failed job takes much longer to finish and consumes more resources.
2. The error log can lead to wrong assumptions.
tags: added: tempest
Possible solution:
I think this patch [1] introduced the wrong behavior. Before this patch, "_is_resource_deleted" [2] expected "func" to return the resource dict, and "_parse_body" returned that dict by extracting the value under the first key. However, [1] replaced it with "rest_client.ResponseBody", which does not take the first key into account. As a result, the "res.get('status')" call in "_is_resource_deleted" returns None instead of the resource's "status". The correct call would be:
res['resource'].get("status")
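The difference between the two lookups can be shown with a plain dict standing in for the response body. This is a minimal sketch: the "snapshot" key, the field values, and the helper names `status_top_level` and `status_nested` are assumptions for illustration only.

```python
# A plain dict stands in for the response body returned by the client.
# The "snapshot" wrapper key and its contents are illustrative assumptions.
body = {"snapshot": {"id": "abc", "status": "error_deleting"}}


def status_top_level(res):
    """Look up 'status' on the wrapper itself (the behavior described above)."""
    return res.get("status")


def status_nested(res, resource_key):
    """Look up 'status' inside the resource stored under resource_key."""
    return res[resource_key].get("status")


print(status_top_level(body))         # the wrapper has no 'status' key
print(status_nested(body, "snapshot"))
```

Because the top-level lookup finds no "status" key, a comparison against "error_deleting" can never match, which is consistent with the wait running until the timeout.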
[1] https://review.opendev.org/c/openstack/manila-tempest-plugin/+/788248
[2] https://github.com/openstack/manila-tempest-plugin/blob/master/manila_tempest_tests/services/share/json/shares_client.py#L330