GroupSnapshotsV319Test.test_reset_group_snapshot_status intermittently fails waiting to reset status to 'creating'

Bug #1770179 reported by Matt Riedemann on 2018-05-09
14
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Cinder
High
Unassigned

Bug Description

http://logs.openstack.org/31/566931/1/check/nova-cells-v1/8faceb3/job-output.txt.gz#_2018-05-08_17_19_56_621704

The reset status request is here:

http://logs.openstack.org/31/566931/1/check/nova-cells-v1/8faceb3/job-output.txt.gz#_2018-05-08_17_19_56_665430

2018-05-08 17:19:56.665430 | primary | 2018-05-08 17:16:37,767 3606 INFO [tempest.lib.common.rest_client] Request (GroupSnapshotsV319Test:test_reset_group_snapshot_status): 202 POST http://198.72.124.74/volume/v3/aafd6b059a2041fb87e4975b06c96d6d/group_snapshots/e2d2b2bf-dde6-4357-9af1-79b57017690a/action 0.071s
2018-05-08 17:19:56.665803 | primary | 2018-05-08 17:16:37,767 3606 DEBUG [tempest.lib.common.rest_client] Request - Headers: {'Openstack-Api-Version': 'volume 3.19', 'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
2018-05-08 17:19:56.665933 | primary | Body: {"reset_status": {"status": "creating"}}
2018-05-08 17:19:56.666751 | primary | Response - Headers: {u'openstack-api-version': 'volume 3.19', u'connection': 'close', u'date': 'Tue, 08 May 2018 17:16:37 GMT', u'content-type': 'text/html; charset=UTF-8', u'x-openstack-request-id': 'req-e73546a4-7365-4132-b289-0b1aee94e2ac', 'status': '202', u'content-length': '0', 'content-location': 'http://198.72.124.74/volume/v3/aafd6b059a2041fb87e4975b06c96d6d/group_snapshots/e2d2b2bf-dde6-4357-9af1-79b57017690a/action', u'vary': 'OpenStack-API-Version', u'server': 'Apache/2.4.18 (Ubuntu)'}

I'm not sure why this is a 202 because it's a synchronous DB update in cinder-api, and there are no errors in the c-api logs:

http://logs.openstack.org/31/566931/1/check/nova-cells-v1/8faceb3/logs/screen-c-api.txt.gz#_May_08_17_16_37_704529

http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22group_snapshot%5C%22%20AND%20message%3A%5C%22failed%20to%20reach%20creating%20status%20(current%20available)%20within%20the%20required%20time%5C%22%20AND%20tags%3A%5C%22console%5C%22&from=7d

Matt Riedemann (mriedem) on 2018-05-09
Changed in cinder:
status: New → Confirmed
importance: Undecided → High
description: updated
TommyLike (hu-husheng) wrote :

Most of cinder's reset status actions are returning 202 instead of 200, I think we can fix them all.

Matt Riedemann (mriedem) wrote :

@Tommy, to be clear, the bug isn't the 202 response, it's that the group snapshot status wasn't actually updated, or at least there is a race when updating it.

TommyLike (hu-husheng) wrote :

@Matt, oh, thanks :)

Sean McGinnis (sean-mcginnis) wrote :

Based on the bug that was just duped here that had a traceback, that looked like a test issue in that it was looking for the transient "creating" state, but it had already transitioned to the "available" state. The test should probably be updated to allow for either if it doesn't catch the transition in time.

Sean McGinnis (sean-mcginnis) wrote :

Doesn't appear to actually be that state transitions too fast. The test explicitly tries setting each status:

https://github.com/openstack/tempest/blob/92f97e2d2f75eadc6d78707c011fa94dab1ba280/tempest/api/volume/admin/test_group_snapshots.py#L249

Matt Riedemann (mriedem) wrote :

Is it possible there are multiple tests changing state on the same group at the same time somehow?

Matt Riedemann (mriedem) wrote :

Or is it possible that there is some kind of periodic task which is changing the status? I just don't understand how this could be a race when the API call is synchronous.

tags: added: test-coverage
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Duplicates of this bug

Other bug subscribers