Conflict errors when snapshot-restore stacks in parallel
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Heat | Fix Released | High | Kairat Kushaev | 5.0.0
Bug Description
While testing Heat with Rally I found the following problem:
1. The Rally input task was the following (the scenario name is truncated in the original; it begins with HeatStacks.create_):

    HeatStacks.create_…:
      -
        args:
        runner:
          type: "constant"
          times: 8
        context:
          users:
            tenants: 3
            users_per_tenant: 2

So Rally creates stacks in 8 threads with 3 tenants and 2 users per tenant.
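For context, the workload shape can be sketched without Rally: 8 parallel iterations, each running a snapshot and then a restore on its own stack. The following is a self-contained toy with a stub client and hypothetical stack names (not python-heatclient and not the real Rally scenario):

```python
import itertools
import threading
from concurrent.futures import ThreadPoolExecutor

class FakeHeat:
    """Stub standing in for the Heat API: it only models the
    'already has an action in progress' guard seen in the logs."""

    def __init__(self):
        self._guard = threading.Lock()
        self._in_progress = {}

    def _run_action(self, stack, action):
        with self._guard:
            if stack in self._in_progress:
                raise RuntimeError(
                    "Stack %s already has an action (%s) in progress."
                    % (stack, self._in_progress[stack]))
            self._in_progress[stack] = action
        # ...the real engine would do the snapshot/restore work here...
        with self._guard:
            del self._in_progress[stack]

    def snapshot(self, stack):
        self._run_action(stack, "SNAPSHOT")

    def restore(self, stack):
        self._run_action(stack, "RESTORE")

def iteration(heat, stack):
    # One Rally iteration: snapshot the stack, then restore it.
    heat.snapshot(stack)
    heat.restore(stack)

heat = FakeHeat()
counter = itertools.count()
with ThreadPoolExecutor(max_workers=8) as pool:
    # 8 parallel iterations, each on its own stack, as in the Rally task
    futures = [pool.submit(iteration, heat, "rally_stack_%d" % next(counter))
               for _ in range(8)]
    for f in futures:
        f.result()  # re-raises if an iteration failed
print("all 8 iterations finished")
```

Each iteration here touches only its own stack, so the stub never raises; the point of the bug report is that against a real multi-engine Heat, a restore can still collide with a SNAPSHOT lock that should already have been released.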
2. When the test was launched, I got the following output:
[2015-05-19 16:48:34.521 through 16:50:03.542: sixteen rally.benchmark INFO log lines from pid 26343; the message bodies were truncated in the original.]
+---------------------------------------------------------------------------------------------+
|                                     Response Times (sec)                                    |
+---------------------+--------+--------+--------+--------+--------+--------+---------+-------+
| action              | min    | median | 90%ile | 95%ile | max    | avg    | success | count |
+---------------------+--------+--------+--------+--------+--------+--------+---------+-------+
| heat.create_stack   | 12.068 | 13.911 | 15.478 | 15.537 | 15.595 | 14.122 | 100.0%  | 8     |
| heat.delete_stack   | 4.732  | 10.61  | 14.012 | 14.048 | 14.084 | 9.544  | 87.5%   | 8     |
| heat.restore_stack  | 7.282  | 11.237 | 11.991 | 12.38  | 12.77  | 10.553 | 87.5%   | 8     |
| heat.snapshot_stack | 7.475  | 9.467  | 10.286 | 10.827 | 11.369 | 9.231  | 100.0%  | 8     |
| total               | 38.474 | 42.094 | 48.922 | 48.958 | 48.994 | 43.43  | 87.5%   | 8     |
+---------------------+--------+--------+--------+--------+--------+--------+---------+-------+
So one of the 8 iterations failed (87.5% = 7/8 success): the restore for stack rally_stack_Ax7mX0n did not finish successfully.
Please see the heat-engine logs in the attachment.
Env:
heat - master branch - liberty.1
rally - master branch - liberty.1
Changed in heat:
  milestone: none → liberty-1
  status: Fix Committed → Fix Released
Changed in heat:
  milestone: liberty-1 → 5.0.0
If you dig down into the logs you will see the following:

2015-05-19 16:48:59.849 DEBUG heat.engine.stack_lock [-] Engine 21ec8e3e-369e-4ed0-9e2b-e9220629b49e released lock on stack 5665946a-64fa-4ae7-a05f-04f198e30103 from (pid=11867) release /opt/stack/heat/heat/engine/stack_lock.py:140

and after that

2015-05-19 16:48:59.857 DEBUG heat.engine.stack_lock [req-a0e537a5-9784-4a55-bead-1fb82c01724f None ctx_rally_38c5e989-23bc-4d45-825a-b8ae19e2beb0_tenant_2] Lock on stack 5665946a-64fa-4ae7-a05f-04f198e30103 is owned by engine 21ec8e3e-369e-4ed0-9e2b-e9220629b49e from (pid=11867) acquire /opt/stack/heat/heat/engine/stack_lock.py:90

2015-05-19 16:48:59.857 DEBUG oslo_messaging.rpc.dispatcher [req-a0e537a5-9784-4a55-bead-1fb82c01724f None ctx_rally_38c5e989-23bc-4d45-825a-b8ae19e2beb0_tenant_2] Expected exception during message handling (Stack rally_stack_Ax7mX0n already has an action (SNAPSHOT) in progress.) from (pid=11867) _dispatch_and_reply /usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py:145
It seems that lock release does not work correctly in a multithreaded environment: the lock is released at 16:48:59.849, yet an acquire 8 ms later still sees the old owner.
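As a general illustration of why a non-atomic check-then-act on a shared lock record misbehaves under concurrency, here is a toy sketch (hypothetical code, not Heat's actual stack_lock implementation): the owner check and the owner write are two separate steps, and a barrier forces the bad interleaving deterministically.

```python
import threading

class RacyStackLock:
    """Toy check-then-act lock record (illustration only, not Heat's
    real code): reading the owner and writing the owner are two
    separate steps, so two engines can interleave between them."""

    def __init__(self, checkpoint):
        self.owner = None
        self._checkpoint = checkpoint  # forces the bad interleaving

    def acquire(self, engine_id):
        observed = self.owner       # step 1: read the current owner
        self._checkpoint.wait()     # both threads pause after reading
        if observed is not None:
            raise RuntimeError("stack already has an action in progress")
        self.owner = engine_id      # step 2: claim the lock

def demo():
    lock = RacyStackLock(checkpoint=threading.Barrier(2))
    winners = []

    def engine(engine_id):
        try:
            lock.acquire(engine_id)
            winners.append(engine_id)  # this engine believes it holds the lock
        except RuntimeError:
            pass

    threads = [threading.Thread(target=engine, args=(name,))
               for name in ("engine-A", "engine-B")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return winners

# Both engines read owner=None before either writes, so both
# "successfully" acquire the same lock.
print(sorted(demo()))  # → ['engine-A', 'engine-B']
```

The observed bug is the mirror image (an acquire that still sees a released lock rather than two acquires that both succeed), but both symptoms belong to the same class of problem: reads and writes of the lock record are not atomic with respect to concurrent engines.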