test_create_second_image_when_first_image_is_being_saved intermittently fails

Bug #1834046 reported by Matt Riedemann
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
tempest
Fix Released
Undecided
Unassigned

Bug Description

The test_create_second_image_when_first_image_is_being_saved test creates a snapshot image of a server and then tries to immediately create another snapshot of the same server while the first snapshot is still being created and the server's task_state is not None, asserting that the 2nd snapshot attempt will result in a 409 conflict error.

This can race, however, as seen here:

First createImage request:

2019-06-22 14:18:20,586 25045 DEBUG [tempest.lib.common.rest_client] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
        Body: {"createImage": {"name": "tempest-ImagesOneServerNegativeTestJSON-image-1209078691"}}
    Response - Headers: {'date': 'Sat, 22 Jun 2019 14:18:20 GMT', 'server': 'Apache/2.4.29 (Ubuntu)', 'content-type': 'text/html; charset=UTF-8', 'content-length': '0', 'location': 'https://158.69.68.108/image/images/12221410-dba4-4f0c-9ec3-0c6da5ed8d0d', 'openstack-api-version': 'compute 2.1', 'x-openstack-nova-api-version': '2.1', 'vary': 'OpenStack-API-Version,X-OpenStack-Nova-API-Version', 'x-openstack-request-id': 'req-454f2e87-9ea1-4721-96e0-f15d0271a4ee', 'x-compute-request-id': 'req-454f2e87-9ea1-4721-96e0-f15d0271a4ee', 'connection': 'close', 'status': '202', 'content-location': 'https://158.69.68.108/compute/v2.1/servers/a1c17162-58e1-4c0c-9bb4-8d7265755840/action'}
        Body: b''

Second createImage request:

2019-06-22 14:18:23,626 25045 DEBUG [tempest.lib.common.rest_client] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
        Body: {"createImage": {"name": "tempest-ImagesOneServerNegativeTestJSON-image-1333717627"}}
    Response - Headers: {'date': 'Sat, 22 Jun 2019 14:18:20 GMT', 'server': 'Apache/2.4.29 (Ubuntu)', 'content-type': 'text/html; charset=UTF-8', 'content-length': '0', 'location': 'https://158.69.68.108/image/images/10e941c2-ecba-4c83-a01e-80131fc7ee3c', 'openstack-api-version': 'compute 2.1', 'x-openstack-nova-api-version': '2.1', 'vary': 'OpenStack-API-Version,X-OpenStack-Nova-API-Version', 'x-openstack-request-id': 'req-fb6b60d6-5d19-4c95-99a0-c118e0179af5', 'x-compute-request-id': 'req-fb6b60d6-5d19-4c95-99a0-c118e0179af5', 'connection': 'close', 'status': '202', 'content-location': 'https://158.69.68.108/compute/v2.1/servers/a1c17162-58e1-4c0c-9bb4-8d7265755840/action'}
        Body: b''

For whatever reason, the requests are about 3 seconds apart.

In this case we're using the libvirt compute driver and the host has direct I/O, meaning the guest will not be paused/stopped to perform the snapshot and since it's a small cirros image the live snapshot is fast and completes in about 2 seconds:

Jun 22 14:18:22.977604 ubuntu-bionic-ovh-bhs1-0008227296 nova-compute[17826]: INFO nova.compute.manager [None req-454f2e87-9ea1-4721-96e0-f15d0271a4ee tempest-ImagesOneServerNegativeTestJSON-507692939 tempest-ImagesOneServerNegativeTestJSON-507692939] [instance: a1c17162-58e1-4c0c-9bb4-8d7265755840] Took 2.11 seconds to snapshot the instance on the hypervisor.

Which will set the server task_state to None before the 2nd createImage request is made which means it doesn't fail with a 409 conflict error as the test expects.

Since this is a racy test it's hard to make this kind of assertion especially when live snapshots are involved. It looks like this is pretty rare - 4 hits in the last 7 days:

http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22testtools.matchers._impl.MismatchError%3A%20%3Cbound%20method%20BaseV2ComputeTest.create_image_from_server%20of%20%3Cclass%20'tempest.api.compute.images.test_images_oneserver_negative.ImagesOneServerNegativeTestJSON'%3E%3E%20returned%20%7B%7D%5C%22%20AND%20tags%3A%5C%22console%5C%22&from=7d

Tags: compute
Revision history for this message
Matt Riedemann (mriedem) wrote :

BTW the logs in the description are from this change:

http://logs.openstack.org/52/645352/5/check/tempest-full-py3/c974580/

Changed in tempest:
status: New → Triaged
tags: added: compute
Revision history for this message
Martin Kopec (mkopec) wrote :
Revision history for this message
Martin Kopec (mkopec) wrote :

Might be related to https://bugs.launchpad.net/tripleo/+bug/1881592

For the record, the test got skipped by this review on June 15th 2020:
https://review.opendev.org/c/openstack/tempest/+/733154

The test got probably fixed by:
https://review.opendev.org/c/openstack/tempest/+/735631

Revision history for this message
Martin Kopec (mkopec) wrote :

Based on the previous comment I'm marking this as Fixed ... if you still experience the issue, please don't hesitate to reopen by sharing new info, thanks

Changed in tempest:
status: Triaged → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.