Error during the termination of large group of VMs

Bug #1321626 reported by Timur Nurlygayanov
This bug affects 1 person
Affects: neutron
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Environment:
OpenStack Havana, CentOS 6.4, Neutron with GRE, 1 controller node and 4 compute nodes.

Steps To Reproduce:
1. Create 200 VMs from the Cirros image.
2. Delete 100+ VMs with one request (in the Horizon dashboard); a scripted equivalent is sketched below.
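
For reference, a minimal reproduction sketch using python-novaclient (Havana-era API). The credentials, auth URL, image name, and flavor name below are placeholders, not values taken from this report:

    # Hedged reproduction sketch: boot 200 Cirros VMs, then bulk-delete 150 of them.
    # All credentials and endpoints are placeholders.
    from novaclient import client

    nova = client.Client('2', 'USER', 'PASSWORD', 'TENANT_NAME',
                         'http://KEYSTONE_HOST:5000/v2.0')

    image = nova.images.find(name='cirros')      # assumes a Cirros image is registered
    flavor = nova.flavors.find(name='m1.tiny')   # assumes the stock tiny flavor

    # Step 1: create 200 VMs.
    servers = [nova.servers.create('bulk-vm-%03d' % i, image, flavor)
               for i in range(200)]

    # Step 2: delete 100+ of them in quick succession (the Horizon bulk action).
    for server in servers[:150]:
        nova.servers.delete(server)

    # Re-listing the servers with details is the request that returned HTTP 500 below.
    print(nova.servers.list())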

Observed Result:
Some of the VMs are removed successfully, but when we select the large group of servers, an error message appears in Horizon and the following traceback appears in nova.log:

-288a-4c9f-96cb-4e01bebceb4d HTTP/1.1" status: 204 len: 179 time: 0.1963580
<182>May 21 07:36:20 node-4 nova-nova.osapi_compute.wsgi.server INFO: (17832) accepted ('198.11.197.103', 47982)
<0>May 21 07:36:50 node-4 <179>nova-nova.api.openstack ERROR: Caught error: Request Failed: internal server error while processing your request.
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/nova/api/openstack/__init__.py", line 119, in __call__
    return req.get_response(self.application)
  File "/usr/lib/python2.6/site-packages/webob/request.py", line 1296, in send
    application, catch_exc_info=False)
  File "/usr/lib/python2.6/site-packages/webob/request.py", line 1260, in call_application
    app_iter = application(self.environ, start_response)
  File "/usr/lib/python2.6/site-packages/webob/dec.py", line 144, in __call__
    return resp(environ, start_response)
  File "/usr/lib/python2.6/site-packages/keystoneclient/middleware/auth_token.py", line 545, in __call__
    return self.app(env, start_response)
  File "/usr/lib/python2.6/site-packages/webob/dec.py", line 144, in __call__
    return resp(environ, start_response)
  File "/usr/lib/python2.6/site-packages/webob/dec.py", line 144, in __call__
    return resp(environ, start_response)
  File "/usr/lib/python2.6/site-packages/Routes-1.12.3-py2.6.egg/routes/middleware.py", line 131, in __call__
    response = self.app(environ, start_response)
  File "/usr/lib/python2.6/site-packages/webob/dec.py", line 144, in __call__
    return resp(environ, start_response)
  File "/usr/lib/python2.6/site-packages/webob/dec.py", line 130, in __call__
    resp = self.call_func(req, *args, **self.kwargs)
  File "/usr/lib/python2.6/site-packages/webob/dec.py", line 195, in call_func
    return self.func(req, *args, **kwargs)
  File "/usr/lib/python2.6/site-packages/nova/api/openstack/wsgi.py", line 938, in __call__
    content_type, body, accept)
  File "/usr/lib/python2.6/site-packages/nova/api/openstack/wsgi.py", line 1023, in _process_stack
    request, action_args)
  File "/usr/lib/python2.6/site-packages/nova/api/openstack/wsgi.py", line 911, in post_process_extensions
    **action_args)
  File "/usr/lib/python2.6/site-packages/nova/api/openstack/compute/contrib/security_groups.py", line 578, in detail
    self._extend_servers(req, list(resp_obj.obj['servers']))
  File "/usr/lib/python2.6/site-packages/nova/api/openstack/compute/contrib/security_groups.py", line 527, in _extend_servers
    servers))
  File "/usr/lib/python2.6/site-packages/nova/network/security_group/neutron_driver.py", line 348, in get_instances_security_groups_bindings
    security_groups = self._get_secgroups_from_port_list(ports, neutron)
  File "/usr/lib/python2.6/site-packages/nova/network/security_group/neutron_driver.py", line 332, in _get_secgroups_from_port_list
    search_results = neutron.list_security_groups(**sg_search_opts)
  File "/usr/lib/python2.6/site-packages/neutronclient/v2_0/client.py", line 111, in with_params
    ret = self.function(instance, *args, **kwargs)
  File "/usr/lib/python2.6/site-packages/neutronclient/v2_0/client.py", line 474, in list_security_groups
    retrieve_all, **_params)
  File "/usr/lib/python2.6/site-packages/neutronclient/v2_0/client.py", line 1250, in list
    for r in self._pagination(collection, path, **params):
  File "/usr/lib/python2.6/site-packages/neutronclient/v2_0/client.py", line 1263, in _pagination
    res = self.get(path, params=params)
  File "/usr/lib/python2.6/site-packages/neutronclient/v2_0/client.py", line 1236, in get
    headers=headers, params=params)
  File "/usr/lib/python2.6/site-packages/neutronclient/v2_0/client.py", line 1221, in retry_request
    headers=headers, params=params)
  File "/usr/lib/python2.6/site-packages/neutronclient/v2_0/client.py", line 1164, in do_request
    self._handle_fault_response(status_code, replybody)
  File "/usr/lib/python2.6/site-packages/neutronclient/v2_0/client.py", line 1134, in _handle_fault_response
    exception_handler_v20(status_code, des_error_body)
  File "/usr/lib/python2.6/site-packages/neutronclient/v2_0/client.py", line 84, in exception_handler_v20
    message=error_dict)
NeutronClientException: Request Failed: internal server error while processing your request.
<0>May 21 07:36:50 node-4 <182>nova-nova.api.openstack INFO: http://198.11.197.103:8774/v2/cda5c4502688428d93e6570696716717/servers/detail?project_id=cda5c4502688428d93e6570696716717&limit=201 returned with HTTP 500
<182>May 21 07:36:50 node-4 nova-nova.osapi_compute.wsgi.server INFO: 198.11.197.103 "GET /v2/cda5c4502688428d93e6570696716717/servers/detail?project_id=cda5c4502688428d93e6570696716717&limit=201 HTTP/1.1" status: 500 len: 335 time: 29.3159540

Tags: db
Revision history for this message
Timur Nurlygayanov (tnurlygayanov) wrote :

In the Neutron logs we can see the following:

<167>May 21 07:37:40 node-4 neutron-server 2014-05-21 07:37:34.517 22445 ERROR neutron.openstack.common.rpc.common [-] Returning exception QueuePool limit of size 10 overflow 20 reached, connection timed out, timeout 10 to caller
<167>May 21 07:37:40 node-4 neutron-server 2014-05-21 07:37:34.517 22445 ERROR neutron.openstack.common.rpc.common [-] ['Traceback (most recent call last):\n', ' File "/usr/lib/python2.6/site-packages/neutron/openstack/common/rpc/amqp.py", line 438, in _process_data\n **args)\n', ' File "/usr/lib/python2.6/site-packages/neutron/common/rpc.py", line 45, in dispatch\n neutron_ctxt, version, method, namespace, **kwargs)\n', ' File "/usr/lib/python2.6/site-packages/neutron/openstack/common/rpc/dispatcher.py", line 172, in dispatch\n result = getattr(proxyobj, method)(ctxt, **kwargs)\n', ' File "/usr/lib/python2.6/site-packages/neutron/db/securitygroups_rpc_base.py", line 142, in security_group_rules_for_devices\n port = self.get_port_from_device(device)\n', ' File "/usr/lib/python2.6/site-packages/neutron/plugins/openvswitch/ovs_neutron_plugin.py", line 95, in get_port_from_device\n port = ovs_db_v2.get_port_from_device(device)\n', ' File "/usr/lib/python2.6/site-packages/neutron/plugins/openvswitch/ovs_db_v2.py", line 331, in get_port_from_device\n port_and_sgs = query.all()\n', ' File "/usr/lib64/python2.6/site-packages/sqlalchemy/orm/query.py", line 2115, in all\n return list(self)\n', ' File "/usr/lib64/python2.6/site-packages/sqlalchemy/orm/query.py", line 2227, in __iter__\n return self._execute_and_instances(context)\n', ' File "/usr/lib64/python2.6/site-packages/sqlalchemy/orm/query.py", line 2240, in _execute_and_instances\n close_with_result=True)\n', ' File "/usr/lib64/python2.6/site-packages/sqlalchemy/orm/query.py", line 2231, in _connection_from_session\n **kw)\n', ' File "/usr/lib64/python2.6/site-packages/sqlalchemy/orm/session.py", line 777, in connection\n close_with_result=close_with_result)\n', ' File "/usr/lib64/python2.6/site-packages/sqlalchemy/orm/session.py", line 783, in _connection_for_bind\n return engine.contextual_connect(**kwargs)\n', ' File "/usr/lib64/python2.6/site-packages/sqlalchemy/engine/base.py", line 2489, in contextual_connect\n self.pool.connect(),\n', ' File "/usr/lib64/python2.6/site-packages/sqlalchemy/pool.py", line 236, in connect\n return _ConnectionFairy(self).checkout()\n', ' File "/usr/lib64/python2.6/site-packages/sqlalchemy/pool.py", line 401, in __init__\n rec = self._connection_record = pool._do_get()\n', ' File "/usr/lib64/python2.6/site-packages/sqlalchemy/pool.py", line 738, in _do_get\n (self.size(), self.overflow(), self._timeout))\n', 'TimeoutError: QueuePool limit of size 10 overflow 20 reached, connection timed out, timeout 10\n']
<167>May 21 07:37:40 node-4 neutron-server 2014-05-21 07:37:34.518 22445 ERROR neutron.openstack.common.rpc.amqp [-] Exception during message handling
<167>May 21 07:37:40 node-4 neutron-server 2014-05-21 07:37:34.518 22445 TRACE neutron.openstack.common.rpc.amqp Traceback (most recent call last):
<167>May 21 07:37:40 node-4 ne...

summary: - Error during the group of VMs termination
+ Error during the termination of large group of VMs
Revision history for this message
Timur Nurlygayanov (tnurlygayanov) wrote :

The root of this problem is the limit on requests per second for the Neutron API, and we can fix it by simply increasing the number of Neutron API workers: set 'api_workers = 10' in /etc/neutron/neutron.conf.
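
For example, a minimal excerpt of the change (the [DEFAULT] section placement is an assumption based on a typical neutron.conf layout, not quoted from this report):

    # /etc/neutron/neutron.conf
    [DEFAULT]
    api_workers = 10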

Do we have another way to improve the performance of the Neutron API?

Revision history for this message
Jack McCann (jack-mccann) wrote :

You might also try https://review.openstack.org/#/c/58017/, which went into Icehouse.

description: updated
tags: added: db