Race in heat stack update

Bug #1308682 reported by Sean Dague
Affects         Status        Importance  Assigned to   Milestone
OpenStack Heat  Fix Released  High        Thomas Herve
tempest         Invalid       High        Unassigned

Bug Description

When updating a Heat stack, there is no indication of when the update is done, which causes races in test_update.py in Tempest.

The offending code is:

        # Add one resource via a stack update
        self.update_stack(stack_identifier, self.update_template)
        updated_resources = {'random1': 'OS::Heat::RandomString',
                             'random2': 'OS::Heat::RandomString'}
        self.assertEqual(updated_resources,
                         self.list_resources(stack_identifier))

In approx 25% of cases the update has not actually completed by the time list_resources is called (http://logstash.openstack.org/#eyJzZWFyY2giOiJtZXNzYWdlOlwidGVtcGVzdC5hcGkub3JjaGVzdHJhdGlvbi5zdGFja3MudGVzdF91cGRhdGUuVXBkYXRlU3RhY2tUZXN0SlNPTi50ZXN0X3N0YWNrX3VwZGF0ZV9hZGRfcmVtb3ZlXCIgQU5EIChtZXNzYWdlOlwiRkFJTFwiIE9SIG1lc3NhZ2U6XCJva1wiKSIsImZpZWxkcyI6W10sIm9mZnNldCI6MCwidGltZWZyYW1lIjoiNjA0ODAwIiwiZ3JhcGhtb2RlIjoiY291bnQiLCJ0aW1lIjp7InVzZXJfaW50ZXJ2YWwiOjB9LCJzdGFtcCI6MTM5NzY3MjYzMDgyOCwibW9kZSI6InNjb3JlIiwiYW5hbHl6ZV9maWVsZCI6ImJ1aWxkX3N0YXR1cyJ9)

This leads to a failure on the equality assertion.

An example failure: http://logs.openstack.org/54/87554/9/check/check-tempest-dsvm-postgres-full/7922180/console.html#_2014-04-16_13_04_22_366
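To illustrate what a caller is forced to do when an update gives no completion signal, here is a minimal polling sketch. It is only an illustration under stated assumptions: `wait_for` is a hypothetical helper, not part of Tempest or Heat, and a test-side wait like this would hide the underlying behavior rather than fix it.

```python
import time


def wait_for(predicate, timeout=60.0, interval=1.0,
             clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it returns True or `timeout` seconds elapse.

    Hypothetical helper for illustration only. Returns True if the
    predicate became true in time, False otherwise.
    """
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

In the test above, such a helper might be called as `wait_for(lambda: self.list_resources(stack_identifier) == updated_resources)` before the assertion. Whether that is appropriate depends on whether Heat is expected to be consistent once the update reports completion.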

Openstack Gerrit (openstack-gerrit) wrote : Related fix proposed to tempest (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/88056

Sean Dague (sdague)
Changed in heat:
importance: Undecided → Critical
Changed in tempest:
importance: Undecided → High
Changed in heat:
importance: Critical → High
Thomas Herve (therve) wrote :

The logstash query shows a different failure, but there is a race condition in the Heat update code: we call state_set near the end of update_task, but before setting the new template and calling store, so there is a window where the state is UPDATE_COMPLETE but the new resources aren't there. We shouldn't do that.
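The ordering problem described in this comment can be sketched with a toy model. This is not Heat's code: `ToyStack`, `update_racy`, `update_fixed` and `saw_stale_complete` are hypothetical names, and `_observe` stands in for a concurrent API reader looking at the stack between steps.

```python
class ToyStack:
    """Toy stand-in for a stack; not Heat's actual Stack class."""

    def __init__(self, template):
        self.template = dict(template)  # resources visible to readers
        self.status = 'CREATE_COMPLETE'
        self.observations = []          # (status, resource names) snapshots

    def _observe(self):
        # Simulates a concurrent reader taking a snapshot between steps.
        self.observations.append((self.status, sorted(self.template)))

    def update_racy(self, new_template):
        self.status = 'UPDATE_IN_PROGRESS'
        self._observe()
        self.status = 'UPDATE_COMPLETE'     # status is flipped first...
        self._observe()                     # ...so a reader sees old resources
        self.template = dict(new_template)  # new template stored afterwards
        self._observe()

    def update_fixed(self, new_template):
        self.status = 'UPDATE_IN_PROGRESS'
        self._observe()
        self.template = dict(new_template)  # store the new template first
        self._observe()
        self.status = 'UPDATE_COMPLETE'     # status change is the last step
        self._observe()


def saw_stale_complete(stack, expected):
    """True if any snapshot showed UPDATE_COMPLETE without the expected resources."""
    return any(status == 'UPDATE_COMPLETE' and names != sorted(expected)
               for status, names in stack.observations)
```

With the racy ordering, one snapshot reports UPDATE_COMPLETE while still listing only the old resources, which is the window the comment describes. Moving the status change to the last step closes that window in this toy model.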

Changed in heat:
milestone: none → juno-1
assignee: nobody → Thomas Herve (therve)
status: New → Confirmed
Openstack Gerrit (openstack-gerrit) wrote : Fix proposed to heat (master)

Fix proposed to branch: master
Review: https://review.openstack.org/88075

Changed in heat:
status: Confirmed → In Progress
Openstack Gerrit (openstack-gerrit) wrote : Related fix merged to tempest (master)

Reviewed: https://review.openstack.org/88056
Committed: https://git.openstack.org/cgit/openstack/tempest/commit/?id=0ef007a7cadd509fbb628bb39fb0d8f735ecde0a
Submitter: Jenkins
Branch: master

commit 0ef007a7cadd509fbb628bb39fb0d8f735ecde0a
Author: Sean Dague <email address hidden>
Date: Wed Apr 16 14:53:26 2014 -0400

    skip test_stack_update_add_remove because of race

    test_stack_update_add_remove races about 25% of the time in the
    gate because the update is eventually consistent, and doesn't
    happen before the list returns. It's unclear if this is
    expected or unexpected behavior for heat, but 25% failure rate
    is far too high.

    Change-Id: I0cfd9c10c1fd5b674c1a2211f98a6aba042227da
    Related-Bug: #1308682

Openstack Gerrit (openstack-gerrit) wrote : Fix merged to heat (master)

Reviewed: https://review.openstack.org/88075
Committed: https://git.openstack.org/cgit/openstack/heat/commit/?id=f89edeaf39698329ec00367721dd9e11ebbad006
Submitter: Jenkins
Branch: master

commit f89edeaf39698329ec00367721dd9e11ebbad006
Author: Thomas Herve <email address hidden>
Date: Wed Apr 16 21:55:29 2014 +0200

    Push COMPLETE status change at the end of update

    This patch moves the status change of stack update to the very end of
    the update task, to avoid a race condition where the status of the stack
    is UPDATE_COMPLETE but stack-show could return the previous resources.

    Change-Id: I04abcf2e876d1de36a36bcd1596a2fd510669744
    Closes-Bug: #1308682

Changed in heat:
status: In Progress → Fix Committed
Joe Gordon (jogo)
Changed in tempest:
status: New → Invalid
Thierry Carrez (ttx)
Changed in heat:
status: Fix Committed → Fix Released
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tempest (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/121573

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tempest (master)

Reviewed: https://review.openstack.org/121573
Committed: https://git.openstack.org/cgit/openstack/tempest/commit/?id=f68ed53243d14ef59039e6f2b7f8a156b5539900
Submitter: Jenkins
Branch: master

commit f68ed53243d14ef59039e6f2b7f8a156b5539900
Author: Matthew Treinish <email address hidden>
Date: Mon Sep 15 10:10:21 2014 -0400

    Unskip test_stack_update_add_remove()

    test_stack_update_add_remove was skipped because of bug 1308682 which
    is now in a state which should allow us to unskip it.

    Change-Id: I9b728abe3432f61f883d50b4a9d7d8d9c2b1f8b2
    Related-Bug: #1308682

Thierry Carrez (ttx)
Changed in heat:
milestone: juno-1 → 2014.2