Tempest test_minimum_basic_scenario fails randomly in volume info comparison

Bug #1797169 reported by Ilya Shakhat
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Cinder
Invalid
Undecided
Unassigned
tempest
New
Undecided
Unassigned

Bug Description

The bug report is based on investigation of the following Tempest job: http://logs.openstack.org/92/602992/2/check/tempest-full-py3-profiler-enabled/e04ae5c/

Test case
---------

Scenario test_minimum_basic_scenario contains the following code working with cinder volumes
(http://git.openstack.org/cgit/openstack/tempest/tree/tempest/scenario/test_minimum_basic.py#n28):

def test_minimum_basic_scenario(self):
    ....
    volume = self.create_volume()
    volumes = self.volumes_client.list_volumes()['volumes']
    self.assertIn(volume['id'], [x['id'] for x in volumes])

    self.cinder_show(volume)

def show_volume(self, volume):
    got_volume = self.volumes_client.show_volume(volume['id'])['volume']
    self.assertEqual(volume, got_volume)

The failure is reported in the last statement where queried volume is compared with the one returned by "create_volume()" function. The only difference is in field "update_time":

    reference: '2018-10-10T11:24:22.000000'
    actual: '2018-10-10T11:24:23.000000'

The difference is 1 second.

API requests
------------

According to logs of Cinder API:

    * 11:24:22.367325 - the last query from wait_for_volume_resource_status (called inside of 'create_volume')
    * 11:24:22.448961 - the query that gets volume inside of 'create_volume' function - this volume is used as reference in the test case
    * 11:24:22.525480 - call 'list_volumes'
    * 11:24:22.605738 - the query to get volume from 'show_volume' - this value is used in assertion to compare with the reference

Observation #1
--------------
Timestamp stored in volume object comes from the future (11:24:23.000000 vs 11:24:22.605738). The reason of this in datetime rounding that happens in MySQL (https://bugs.mysql.com/bug.php?id=68760). However this behavior does not explain why the update_time changes after the volume is considered to be active.

Cinder Volume logs
------------------

According to logs from c-vol:

    * 11:24:22.023005 - Task 'cinder.volume.flows.manager.create_volume.CreateVolumeOnFinishTask;volume:create, create.end' (6a73d49e-e72e-421d-a2c1-2334660deb02) transitioned into state 'RUNNING' from state 'PENDING' - started execution of task that updates the status of volume to 'available' and thus resumes execution of of Tempest test case

    * 11:24:22.566680 - Volume volume-cfcf5241-4088-4602-a385-8ada5df2648e: created successfully - the task is finished - now the volume object in database has status 'available' and 'updated_at' refers to time of change - as we see from API logs at 11:24:22.367325 the volume was in state 'available' and at 11:24:22.448961 it was queried by Tempest.

    * 11:24:22.570767 - Flow 'volume_create_manager' (46e4030d-51f0-4dbc-84b1-30a0bdbfa680) transitioned into state 'SUCCESS' from state 'RUNNING' - the whole flow is completed

    * 11:24:22.578602 - cinder.volume.manager reports "Created volume successfully."

Observation #2
--------------

Tempest test case assumes that the object does not change after it is reported as 'available' for the first time. However this is not the case. Function 'create_volume' from 'cinder.volume.manager' has the following lines that go _after_ execution of workflow:

        updates = {'service_uuid': self.service_uuid,
                   'shared_targets': shared_targets}

        volume.update(updates)
        volume.save()

        LOG.info("Created volume successfully.", resource=volume)

As result of update operation the value stored in 'updated_at' attribute may also change if the time crosses rounding boundaries. In other words, if the first call was at 0.4 second and the second is at 0.6 then (due to MySQL rounding) the first value would become 0 and the second would turn into 1.

Summary
-------

The following is timeline that explains test failure:

    * 11:24:22.448961 - Tempest requested 'reference' volume object with 'updated_at' time set to '2018-10-10T11:24:22.000000' - this happened while Cinder volume was still executing the flow
    * between 11:24:22.570767 and 11:24:22.578602 Cinder volume manager issued another update on volume object which turns 'updated_at' field to '2018-10-10T11:24:23.000000'
    * 11:24:22.605738 - Tempest queries 'actual' volume object which already has different 'updated_at' value.

Revision history for this message
Sean McGinnis (sean-mcginnis) wrote :

The tempest test should handle this, either by being less precise for this field or just ignoring it.

Changed in cinder:
status: New → Invalid
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.