instance delete requests occasionally lost after nova-api says OK
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Invalid | Undecided | Unassigned |
Bug Description
While doing performance testing on the latest OpenStack code, I found that instance delete requests are occasionally "lost". By "lost", I mean that nova-api responds with HTTP 200, but nothing seems to happen to the instance, even after several minutes. Hence the request seems to be lost after nova-api casts the RPC to nova-compute. A subsequent delete request for the same instance usually works as expected.
In a run where I boot 10 instances in parallel and then delete each instance as soon as it goes ACTIVE, usually 2 of my delete requests are lost. To get a sense of timing, consider that a run of 5 instances currently takes ~8s on my system; I can't report on a run of 10 instances, because those runs never finish successfully.
I'm using MySQL and RabbitMQ.
I plan on digging into the nova logs to see what happens with these lost requests. I'll record the req-XXX tag returned in each HTTP response, then grep the logs for the req-XXX tags of the lost requests.
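For anyone wanting to repeat that log search, here is a minimal sketch in Python rather than plain grep. It assumes request IDs have the form "req-" followed by a UUID; the function name `group_by_request` is made up for illustration and is not part of nova.

```python
import re
from collections import defaultdict

# Assumption: request IDs look like "req-" followed by a UUID.
REQ_ID = re.compile(r"req-[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}")


def group_by_request(lines, wanted):
    """Return {request_id: [log lines mentioning it]} for the wanted IDs.

    `lines` is any iterable of log lines (for example an open file);
    `wanted` is the collection of req-XXX tags of the lost requests.
    IDs that never show up map to an empty list.
    """
    wanted = set(wanted)
    hits = defaultdict(list)
    for line in lines:
        # One log line may mention more than one request ID.
        for req_id in REQ_ID.findall(line):
            if req_id in wanted:
                hits[req_id].append(line.rstrip("\n"))
    return {req_id: hits.get(req_id, []) for req_id in wanted}
```

Feeding it the lines of the nova-api and nova-compute logs in turn should show which service last logged anything for a lost request. An empty list for the nova-compute log would suggest the cast never reached the compute side.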
Seems to be fixed by https://review.openstack.org/#/c/57509/. Applied locally and tested.