Reviewed: https://review.openstack.org/103716
Committed: https://git.openstack.org/cgit/openstack/heat/commit/?id=cb966ff38a9829e1d7d86701cb73bcdd2537595a
Submitter: Jenkins
Branch: master
commit cb966ff38a9829e1d7d86701cb73bcdd2537595a
Author: Steve Baker <email address hidden>
Date: Wed Jul 2 15:16:06 2014 +1200
Sleep before stopping threads for delete
The logged error in bug #1328983 occurs in the following scenario:
* single heat-engine (so no multi-engine locking paths are invoked)
* multiple stack deletes on the same stack
* the threads from a previous delete call are stopped so that the
new delete call can acquire the lock and perform the delete
This change does an eventlet sleep before forcefully stopping
the thread when taking over a delete from a different
thread. This gives the previous delete thread an opportunity to
finish naturally.
This may not prevent the error in bug #1328983 from ever
being logged in a production environment, but it is an internal error
that does not propagate to the user. The change should, however,
prevent the error from being logged during a tempest run; the
original gate failure was caused by ERROR log entry detection rather
than by actual failing tests.
The timeout of 0.2s was chosen by experimentation: the error in
bug #1328983 started being logged again when the timeout was
reduced to 0.02s.
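The sleep-then-stop idea described above can be sketched roughly as
follows. This is not the Heat code: it uses standard-library threads
in place of eventlet green threads so that it runs on its own, and the
names Worker, take_over and GRACE_PERIOD are hypothetical. Only the
0.2s value comes from this commit message.

```python
import threading
import time

# Grace period in seconds; 0.2 is the value this commit message reports.
GRACE_PERIOD = 0.2


class Worker:
    """Hypothetical stand-in for a thread running a previous delete."""

    def __init__(self, duration):
        self.cancelled = threading.Event()
        self.finished_naturally = False
        self._thread = threading.Thread(target=self._run, args=(duration,))

    def start(self):
        self._thread.start()
        return self

    def _run(self, duration):
        # Event.wait() returns False when the timeout expires without the
        # event being set, i.e. the simulated work ran to completion.
        if not self.cancelled.wait(duration):
            self.finished_naturally = True

    def is_alive(self):
        return self._thread.is_alive()

    def stop(self):
        # Cooperative "force stop": harmless if the thread already exited.
        self.cancelled.set()
        self._thread.join()


def take_over(worker, grace=GRACE_PERIOD):
    """Give the worker a grace period, then stop it unconditionally.

    Returns True if the worker finished on its own before being stopped.
    """
    time.sleep(grace)  # the commit uses an eventlet sleep at this point
    worker.stop()
    return worker.finished_naturally
```

With these assumptions, take_over(Worker(0.05).start()) lets a nearly
finished worker complete on its own, while a worker that needs several
seconds is still cut off after the grace period.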
Closes-Bug: #1328983
Change-Id: I8f95f29bd238e097ed9f4b889afe12c88d193240