Comment 14 for bug 2015065

Revision history for this message
yatin (yatinkarel) wrote :

While checking another issue https://bugs.launchpad.net/grenade/+bug/2020643 which is triggered by this stuck issue, i saw one request in neutron too timed out after 900 seconds[1].

Tempest triggered security group delete but it timed out and retried and succeeded in the second attempt:-

2023-05-22 04:28:48.568 69207 WARNING urllib3.connectionpool [-] Retrying (Retry(total=9, connect=None, read=None, redirect=5, status=None)) after connection broken by 'ReadTimeoutError("HTTPConnectionPool(host='173.231.255.73', port=80): Read timed out. (read timeout=60)")': /compute/v2.1/os-security-groups/76a08f39-c1e7-47e5-b52f-c4de53a46146
2023-05-22 04:28:48.707 69207 INFO tempest.lib.common.rest_client [req-1fd0ae50-5f36-40de-9d48-243f74c59335 req-1fd0ae50-5f36-40de-9d48-243f74c59335 ] Request (SecurityGroupRulesTestJSON:tearDownClass): 202 DELETE http://173.231.255.73/compute/v2.1/os-security-groups/76a08f39-c1e7-47e5-b52f-c4de53a46146 60.179s

The original request is stuck in nova/neutron and times out after 900s(client_socket_timeout).

nova:-
May 22 04:27:48.532171 np0034092361 <email address hidden>[51094]: DEBUG nova.api.openstack.wsgi [None req-fc94b238-88f2-4e4b-8a25-9ed456ac652a tempest-SecurityGroupRulesTestJSON-54591405 tempest-SecurityGroupRulesTestJSON-54591405-project-member] Calling method '<function Controller.__getattribute__.<locals>.version_select at 0x7f9129c5dea0>' {{(pid=51094) _process_stack /opt/stack/old/nova/nova/api/openstack/wsgi.py:513}}
May 22 04:42:48.780841 np0034092361 <email address hidden>[51094]: DEBUG neutronclient.v2_0.client [None req-fc94b238-88f2-4e4b-8a25-9ed456ac652a tempest-SecurityGroupRulesTestJSON-54591405 tempest-SecurityGroupRulesTestJSON-54591405-project-member] Error message: {"error": {"code": 401, "title": "Unauthorized", "message": "The request you have made requires authentication."}} {{(pid=51094) _handle_fault_response /usr/local/lib/python3.10/dist-packages/neutronclient/v2_0/client.py:262}}
May 22 04:42:48.781975 np0034092361 <email address hidden>[51094]: INFO nova.api.openstack.requestlog [None req-fc94b238-88f2-4e4b-8a25-9ed456ac652a tempest-SecurityGroupRulesTestJSON-54591405 tempest-SecurityGroupRulesTestJSON-54591405-project-member] 173.231.255.73 "DELETE /compute/v2.1/os-security-groups/76a08f39-c1e7-47e5-b52f-c4de53a46146" status: 500 len: 0 microversion: 2.1 time: 900.252509
May 22 04:42:48.792104 np0034092361 <email address hidden>[51094]: Mon May 22 04:42:48 2023 - SIGPIPE: writing to a closed pipe/socket/fd (probably the client disconnected) on request /compute/v2.1/os-security-groups/76a08f39-c1e7-47e5-b52f-c4de53a46146 (ip 173.231.255.73) !!!
May 22 04:42:48.792104 np0034092361 <email address hidden>[51094]: Mon May 22 04:42:48 2023 - uwsgi_response_writev_headers_and_body_do(): Broken pipe [core/writer.c line 306] during DELETE /compute/v2.1/os-security-groups/76a08f39-c1e7-47e5-b52f-c4de53a46146 (173.231.255.73)
May 22 04:42:48.792104 np0034092361 <email address hidden>[51094]: CRITICAL nova [None req-fc94b238-88f2-4e4b-8a25-9ed456ac652a tempest-SecurityGroupRulesTestJSON-54591405 tempest-SecurityGroupRulesTestJSON-54591405-project-member] Unhandled error: OSError: write error
May 22 04:42:48.792104 np0034092361 <email address hidden>[51094]: ERROR nova OSError: write error
May 22 04:42:48.792104 np0034092361 <email address hidden>[51094]: ERROR nova

nova worker 51094 not doing anything and stuck during those 900 seconds.

neutron, an another req(security-group-rule delete associated with the security group):-
May 22 04:27:48.525507 np0034092361 neutron-server[51851]: DEBUG neutron.api.rpc.handlers.resources_rpc [req-b7d96c61-9d75-4c14-8fd7-6f2dd459b68f req-40ff2231-f1d6-4785-9596-2ccd296618b6 tempest-SecurityGroupRulesTestJSON-54591405 tempest-SecurityGroupRulesTestJSON-54591405-project-member] Pushing event deleted for resources: {'SecurityGroupRule': ['ID=75e2d70f-0a9d-4bc3-958f-220455905b69,revision_number=None']} {{(pid=51851) push /opt/stack/old/neutron/neutron/api/rpc/handlers/resources_rpc.py:237}}
May 22 04:42:48.524257 np0034092361 neutron-server[51851]: DEBUG neutron.wsgi [req-b7d96c61-9d75-4c14-8fd7-6f2dd459b68f req-40ff2231-f1d6-4785-9596-2ccd296618b6 tempest-SecurityGroupRulesTestJSON-54591405 tempest-SecurityGroupRulesTestJSON-54591405-project-member] (51851) timed out ('173.231.255.73', 33990) {{(pid=51851) process_request /usr/local/lib/python3.10/dist-packages/eventlet/wsgi.py:842}}

neutron-api worker processing other requests during this time so where it's stuck?

[1] https://zuul.opendev.org/t/openstack/build/4a37d236576b41648ef16c25b6daaff0