While sending a signal to a WaitCondition is synchronous, the actual update
of the WaitConditionHandle metadata happens asynchronously since the fix
for bug 1394095. As a result, it's not guaranteed that even the first 6
signals (which are sent serially, as opposed to the later ones which are
deliberately sent in parallel) will be stored in the same order that they
are sent.
Crucially, that means that one or more of the signals explicitly sent with
id 5 may arrive when there have been only three previous signals stored.
This means that the next signal to arrive with an implicit ID will be the
fifth signal stored, and therefore also get id 5. Of course we have a log
message to indicate when an existing signal is overwritten by another with
the same ID, and we are not seeing it except in the intended case where we
explicitly send in the same ID twice. That's because the keys have
different types in the data dict - the explicitly specified ID is the
string "5", but the implicitly calculated one is the integer 5. But - get
this - when we serialise the data to JSON both keys are serialised to the
string "5", and upon deserialisation they collide and one is silently
dropped on the floor.
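The key collision described above can be reproduced with nothing but the standard library. This is a minimal sketch, not Heat's code: the dict contents and variable names are made up for illustration, and only the behaviour of Python's `json` module is being shown.

```python
import json

# A string key and an integer key are distinct in a Python dict, so an
# "overwrite" check on the in-memory data sees no conflict.
data = {"5": "explicit signal", 5: "implicit signal"}
in_memory_count = len(data)

# json.dumps converts the integer key to the string "5", so the encoded
# object now contains the key "5" twice.
encoded = json.dumps(data)

# json.loads keeps only the last value seen for a repeated key, so one
# entry disappears without any error or warning.
decoded = json.loads(encoded)

print(in_memory_count, len(decoded))
```

After the round trip only one entry remains under the key "5", which is why no overwrite is ever logged.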
So if the signal with the explicit ID "5" is stored just before the one
with reason "signal 4", then "signal 4" will effectively be silently
ignored as the 5th signal to arrive - a slot already filled. And since that
signal is ignored, the next signal will also be treated as the 5th to
arrive and ignored, and so on. This leads inexorably to the dreaded
"WaitConditionTimeout: resources.wait_condition: 4 of 25 received" error.
For this reason, it's a bad idea to mix explicit IDs that are also integers
with implicitly assigned IDs. Use an ID that won't collide instead.
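To see why the count gets stuck, here is a toy model of the failure. It is a sketch under stated assumptions, not Heat's implementation: `store` and `run` are hypothetical helpers, and the only behaviour modelled is that an implicit ID is the number of signals stored so far plus one, and that the data passes through JSON between signals. Which of the two colliding entries survives is not modelled; only the count matters here.

```python
import json


def store(metadata_json, reason, signal_id=None):
    """Hypothetical stand-in for storing one signal (not Heat's code)."""
    data = json.loads(metadata_json)
    if signal_id is None:
        # Assumed rule: implicit ID = position among the stored signals.
        signal_id = len(data) + 1
    data[signal_id] = reason
    # Persisting as JSON turns every key into a string.
    return json.dumps(data)


def run(explicit_id):
    """Store 3 implicit signals, 1 explicit one, then 6 more implicit."""
    meta = "{}"
    for n in range(1, 4):
        meta = store(meta, "signal %d" % n)
    # The explicitly numbered signal overtakes "signal 4" and lands fourth.
    meta = store(meta, "explicit", explicit_id)
    for n in range(4, 10):
        meta = store(meta, "signal %d" % n)
    return json.loads(meta)


stuck = run("5")      # explicit ID that is also an integer
fine = run("five")    # explicit ID that cannot collide

print(len(stuck), len(fine))
```

In this model ten signals are sent in both runs. With the explicit ID "5" the stored count never gets past 4, because every later implicit signal computes the ID 5 again; with "five" all ten are kept.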
Reviewed: https://review.openstack.org/558826
Committed: https://git.openstack.org/cgit/openstack/heat/commit/?id=3ac0bbedccb50e4e5b6bad663a872985d908e82a
Submitter: Zuul
Branch: stable/pike
commit 3ac0bbedccb50e4e5b6bad663a872985d908e82a
Author: Zane Bitter <email address hidden>
Date: Wed Apr 4 09:51:12 2018 -0400
Avoid race in OSWaitCondition test
While sending a signal to a WaitCondition is synchronous, the actual update
of the WaitConditionHandle metadata happens asynchronously since the fix
for bug 1394095. As a result, it's not guaranteed that even the first 6
signals (which are sent serially, as opposed to the later ones which are
deliberately sent in parallel) will be stored in the same order that they
are sent.
Crucially, that means that one or more of the signals explicitly sent with
id 5 may arrive when there have been only three previous signals stored.
This means that the next signal to arrive with an implicit ID will be the
fifth signal stored, and therefore also get id 5. Of course we have a log
message to indicate when an existing signal is overwritten by another with
the same ID, and we are not seeing it except in the intended case where we
explicitly send in the same ID twice. That's because the keys have
different types in the data dict - the explicitly specified ID is the
string "5", but the implicitly calculated one is the integer 5. But - get
this - when we serialise the data to JSON both keys are serialised to the
string "5", and upon deserialisation they collide and one is silently
dropped on the floor.
So if the signal with the explicit ID "5" is stored just before the one
with reason "signal 4", then "signal 4" will effectively be silently
ignored as the 5th signal to arrive - a slot already filled. And since that
signal is ignored, the next signal will also be treated as the 5th to
arrive and ignored, and so on. This leads inexorably to the dreaded
"WaitConditionTimeout: resources.wait_condition: 4 of 25 received" error.
For this reason, it's a bad idea to mix explicit IDs that are also integers
with implicitly assigned IDs. Use an ID that won't collide instead.
This patch is backported from
https://git.openstack.org/cgit/openstack/heat-tempest-plugin/commit/?id=2cff12bceb4b568cd8673c9ffa5668d37fcc9da9
Change-Id: Ie2608285ba9c0ec3f1e4a8bbf1a147ce35ccae00
Depends-On: https://review.openstack.org/550682
Closes-Bug: #1738653