[stable/newton] Deleting heat stack failed due to error "QueuePool limit of size 50 overflow 50 reached, connection timed out, timeout 30"
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Won't Fix | Undecided | Unassigned |
neutron | Won't Fix | Undecided | Unassigned |
Bug Description
In my stable/newton setup running on a VMware NSX platform, I brought up 5 heat stacks each having 100 nova instances in the same /16 network.
Deleting those heat stacks failed due to the below error.
"
2016-11-03 17:27:34.146 2399 ERROR nova.api.
"
Because of this error, deletion failed for about 67 of the 500 instances.
With default parameters in neutron.conf, I'm getting the below neutron error.
2016-11-02 20:42:06.557 18058 ERROR neutron.
2016-11-02 20:42:06.557 18058 ERROR neutron.
.
.
.
2016-11-02 20:42:06.557 18058 ERROR neutron.
2016-11-02 20:42:06.557 18058 ERROR neutron.
2016-11-02 20:42:06.557 18058 ERROR neutron.
2016-11-02 20:42:06.557 18058 ERROR neutron.
2016-11-02 20:42:06.557 18058 ERROR neutron.
2016-11-02 20:42:06.557 18058 ERROR neutron.
2016-11-02 20:42:06.557 18058 ERROR neutron.
2016-11-02 20:42:06.557 18058 ERROR neutron.
2016-11-02 20:42:06.557 18058 ERROR neutron.
2016-11-02 20:42:06.557 18058 ERROR neutron.
2016-11-02 20:42:06.557 18058 ERROR neutron.
2016-11-02 20:42:06.557 18058 ERROR neutron.
2016-11-02 20:42:06.557 18058 ERROR neutron.
2016-11-02 20:42:06.557 18058 ERROR neutron.
After changing the below parameters in /etc/neutron/
max_pool_size = 50
retry_interval = 10
max_overflow = 50
pool_max_size = 50
pool_max_overflow = 50
pool_timeout = 30
and the below parameters in nova.conf, I restarted the services and re-executed the test case. Deleting the heat stacks still fails with the error shown in n-api.log below.
max_pool_size = 50
max_overflow = 50
n-api.log:
2016-11-03 17:27:34.146 2399 ERROR nova.api.
2016-11-03 17:27:34.146 2399 ERROR nova.api.
2016-11-03 17:27:34.146 2399 ERROR nova.api.
2016-11-03 17:27:34.146 2399 ERROR nova.api.
.
.
.
2016-11-03 17:27:34.146 2399 ERROR nova.api.
2016-11-03 17:27:34.146 2399 ERROR nova.api.
2016-11-03 17:27:34.146 2399 ERROR nova.api.
2016-11-03 17:27:34.146 2399 ERROR nova.api.
2016-11-03 17:27:34.146 2399 ERROR nova.api.
2016-11-03 17:27:34.146 2399 ERROR nova.api.
2016-11-03 17:27:34.148 2399 INFO nova.api.
<class 'sqlalchemy.
2016-11-03 17:27:34.148 2399 DEBUG nova.api.
<class 'sqlalchemy.
Please look into this.
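For readers unfamiliar with the message, the limit it names can be illustrated with a toy model. This is only a sketch under assumptions: `ToyPool` and `count_failed_checkouts` are hypothetical names, and the class is not SQLAlchemy's QueuePool code. It only mimics the documented behaviour that at most pool size plus overflow connections can be checked out at once, and that a further checkout waits up to the pool timeout before failing.

```python
import threading


class ToyPool:
    """Toy stand-in for a bounded connection pool (not SQLAlchemy's code)."""

    def __init__(self, pool_size, max_overflow, timeout):
        self.pool_size = pool_size
        self.max_overflow = max_overflow
        self.timeout = timeout
        # At most pool_size + max_overflow connections may be checked out.
        self._slots = threading.BoundedSemaphore(pool_size + max_overflow)

    def checkout(self):
        # Wait up to `timeout` seconds for a free slot, then give up.
        if not self._slots.acquire(timeout=self.timeout):
            raise TimeoutError(
                "QueuePool limit of size %s overflow %s reached, "
                "connection timed out, timeout %s"
                % (self.pool_size, self.max_overflow, self.timeout)
            )

    def checkin(self):
        # Return a connection so that a waiting checkout can proceed.
        self._slots.release()


def count_failed_checkouts(pool, requests):
    """Check out `requests` connections without returning any; count timeouts."""
    failed = 0
    for _ in range(requests):
        try:
            pool.checkout()
        except TimeoutError:
            failed += 1
    return failed
```

With the values from the error message (size 50, overflow 50, timeout 30), the 101st simultaneous checkout in one process would wait 30 seconds and then fail unless another connection is returned in the meantime.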
summary:
- [stable/newton] Deleting heat stack failed due to neutron error - "QueuePool limit of size 50 overflow 50 reached, connection timed out, timeout 30"
+ [stable/newton] Deleting heat stack failed due to error "QueuePool limit of size 50 overflow 50 reached, connection timed out, timeout 30"
description: updated
Changed in nova:
status: Incomplete → New
Changed in neutron:
status: Incomplete → New
Hi Sujai, thanks for reporting this. The issue is not yet clear to me.
Can you please provide:
- max_pool_size, retry_interval, max_overflow settings of your first run
- the nova error message of the first and the second run
- the neutron error message of the first and second run
So far you have provided a neutron message for the first run and a nova message for the second, so I can't see what changed after updating the settings. In general, it looks like deleting 500 instances in parallel requires a lot of SQL queries, and for such a use case you probably need to tune your parameters.
But let's first compare the error messages of the first and second runs. If they are the same, I tend to set this to Invalid.
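As a rough aid for that tuning, the sketch below estimates an upper bound on the database connections a service can open. It assumes each service process keeps its own independent pool; the helper name `worst_case_connections` and the worker count of 4 are hypothetical and not taken from this report.

```python
def worst_case_connections(processes, max_pool_size, max_overflow):
    """Upper bound on DB connections opened by `processes` service processes.

    Assumes each process owns one independent connection pool.
    """
    return processes * (max_pool_size + max_overflow)


# Values reported in this bug: max_pool_size = 50, max_overflow = 50.
per_process = worst_case_connections(1, 50, 50)

# Hypothetical deployment with 4 API worker processes.
four_workers = worst_case_connections(4, 50, 50)
```

The result is only useful when compared against the connection limit configured on the database server, which is not stated in this report.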