test_attach_scsi_disk_with_config_drive intermittently fails at detaching volume

Bug #1907084 reported by Balazs Gibizer
This bug affects 1 person
Affects: tempest
Status: Fix Released
Importance: Undecided
Assigned to: Balazs Gibizer
Milestone: (none)

Bug Description

2020-12-01 07:23:50.118692 | controller | Captured traceback:
2020-12-01 07:23:50.118702 | controller | ~~~~~~~~~~~~~~~~~~~
2020-12-01 07:23:50.118712 | controller | Traceback (most recent call last):
2020-12-01 07:23:50.118730 | controller |
2020-12-01 07:23:50.118742 | controller | File "/opt/stack/tempest/tempest/api/compute/admin/test_volume.py", line 117, in test_attach_scsi_disk_with_config_drive
2020-12-01 07:23:50.118752 | controller | self.assertEqual(0, len(volume_after_detach),
2020-12-01 07:23:50.118779 | controller |
2020-12-01 07:23:50.118807 | controller | File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/testtools/testcase.py", line 415, in assertEqual
2020-12-01 07:23:50.118825 | controller | self.assertThat(observed, matcher, message)
2020-12-01 07:23:50.118844 | controller |
2020-12-01 07:23:50.118861 | controller | File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/testtools/testcase.py", line 502, in assertThat
2020-12-01 07:23:50.118871 | controller | raise mismatch_error
2020-12-01 07:23:50.118886 | controller |
2020-12-01 07:23:50.118895 | controller | testtools.matchers._impl.MismatchError: 0 != 1: Failed to detach volume

The volume reaches the available state, but nova still reports that the attachment exists[2]:

2020-12-01 07:23:50.122191 | controller | 2020-12-01 06:51:27,540 116889 INFO [tempest.common.waiters] volume 3a7cb8bc-1aec-4cce-92a6-cee218fce350 reached available after waiting for 1.540623 seconds
2020-12-01 07:23:50.122201 | controller | 2020-12-01 06:51:27,614 116889 INFO [tempest.lib.common.rest_client] Request (AttachSCSIVolumeTestJSON:test_attach_scsi_disk_with_config_drive): 200 GET https://158.69.73.253/compute/v2.1/servers/95a2acba-418c-4944-ada9-b7a7a27bf66b/os-volume_attachments 0.073s
2020-12-01 07:23:50.122211 | controller | 2020-12-01 06:51:27,614 116889 DEBUG [tempest.lib.common.rest_client] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
2020-12-01 07:23:50.122221 | controller | Body: None
2020-12-01 07:23:50.122230 | controller | Response - Headers: {'date': 'Tue, 01 Dec 2020 06:51:27 GMT', 'se
2020-12-01 07:23:50.122240 | controller | rver': 'Apache/2.4.41 (Ubuntu)', 'content-length': '197', 'content-type': 'application/json', 'openstack-api-version': 'compute 2.1', 'x-openstack-nova-api-version': '2.1', 'vary': 'OpenStack-API-Version,X-OpenStack-Nova-API-Version', 'x-openstack-request-id': 'req-1e8787dd-af52-4d62-98e4-be35966f1917', 'x-compute-request-id': 'req-1e8787dd-af52-4d62-98e4-be35966f1917', 'connection': 'close', 'status': '200', 'content-location': 'https://158.69.73.253/compute/v2.1/servers/95a2acba-418c-4944-ada9-b7a7a27bf66b/os-volume_attachments'}
2020-12-01 07:23:50.122250 | controller | Body: b'{"volumeAttachments": [{"id": "3a7cb8bc-1aec-4cce-92a6-cee218fce350", "volumeId": "3a7cb8bc-1aec-4cce-92a6-cee218fce350", "serverId": "95a2acba-418c-4944-ada9-b7a7a27bf66b", "device": "/dev/sdc"}]}'

The nova code first deletes the attachment in cinder and then deletes the BDM from the nova DB[1]. So I think what we see here is a race condition in the test.
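
To make that ordering concrete, here is a minimal, self-contained sketch of the sequence. It is not the actual nova code referenced in [1]; the class and method names below are illustrative stand-ins only.

class FakeCinderAPI:
    def attachment_delete(self, attachment_id):
        # After this step cinder reports the volume as "available".
        print("cinder: attachment %s deleted, volume is now available" % attachment_id)

class FakeBDM:
    def __init__(self, attachment_id):
        self.attachment_id = attachment_id

    def destroy(self):
        # Only now does GET /servers/{id}/os-volume_attachments stop listing
        # the attachment, because that API is built from the BDM records.
        print("nova: block_device_mapping record removed")

def detach_volume(cinder, bdm):
    cinder.attachment_delete(bdm.attachment_id)
    # <-- window: a test polling cinder already sees "available" here,
    #     while nova still lists the attachment until the next call returns.
    bdm.destroy()

detach_volume(FakeCinderAPI(), FakeBDM("3a7cb8bc-1aec-4cce-92a6-cee218fce350"))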

Here is a logstash signature[3].

[1] nova code: https://github.com/openstack/nova/blob/97729aabaa7768e1af2fa4a0bdda0df7ad598426/nova/compute/manager.py#L7071-L7086
[2] example failure: https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_0a3/764356/1/check/neutron-ovn-tempest-ovs-release/0a316c0/job-output.txt
[3] http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22in%20test_attach_scsi_disk_with_config_drive%5C%22%20AND%20build_status%3AFAILURE

Revision history for this message
Lee Yarwood (lyarwood) wrote :

As discussed, the test is using the servers_client to look up the active volume attachments of an instance via the n-api os-volume_attachments API. There's a small window for a race when using this API to determine whether a volume is attached, as we only delete the block_device_mapping record in Nova *after* we have marked the volume as detached (and thus `available`) in c-api. The test should be *waiting* until there are zero volume attachments listed for the instance via n-api.
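
A minimal sketch of such a wait loop, assuming a tempest-style servers_client whose list_volume_attachments() returns a body with a 'volumeAttachments' list; the helper name, timeout and interval here are illustrative and not the exact change that was merged into tempest:

import time

def wait_for_volume_attachment_removal(servers_client, server_id,
                                        timeout=300, interval=1):
    # Poll the n-api os-volume_attachments endpoint until it reports no
    # attachments for the server, instead of asserting immediately after
    # cinder marks the volume "available".
    start = time.time()
    while time.time() - start < timeout:
        body = servers_client.list_volume_attachments(server_id)
        if not body['volumeAttachments']:
            return
        time.sleep(interval)
    raise TimeoutError('server %s still has volume attachments after %ss'
                       % (server_id, timeout))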

Revision history for this message
Balazs Gibizer (balazs-gibizer) wrote :
Changed in tempest:
assignee: nobody → Balazs Gibizer (balazs-gibizer)
Revision history for this message
Balazs Gibizer (balazs-gibizer) wrote :
Changed in tempest:
status: New → In Progress
Revision history for this message
Martin Kopec (mkopec) wrote :
Changed in tempest:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tempest 26.0.0

This issue was fixed in the openstack/tempest 26.0.0 release.

Martin Kopec (mkopec)
Changed in tempest:
status: Fix Committed → Fix Released