The subcloud audit runs every 3 minutes by default. In each audit interval it loops over all subclouds, running its checks to decide whether each subcloud is online or offline. The checks must fail in two consecutive audit intervals before a subcloud is declared offline, so a subcloud should take 4.5 minutes on average, and 6 minutes at most, to go offline.

The problem, I believe, is that the initial retrieval of the keystone client for a subcloud takes a long time to time out (~4.5 minutes per subcloud according to the logs below). When all the subclouds are powered down at the same time, a single subcloud audit takes > 40 minutes and the next one takes > 50 minutes, because the second pass also hits timeouts trying to inform dcorch of the subcloud state changes when setting each subcloud offline.

First pass of audit logs (snipped):

2019-12-04 18:39:18.647 728803 INFO dcmanager.manager.subcloud_audit_manager [-] Triggered subcloud audit.
2019-12-04 18:39:59.172 729366 INFO eventlet.wsgi.server [req-6131930f-560b-448e-a1ea-5536d9e0de13 a7a1ad96b9a341e28cd1b92f3b836a6b - - default default] fd01:6::2 "GET /v1.0/subclouds/ HTTP/1.1" status: 200 len: 7380 time: 0.0553961
2019-12-04 18:41:26.310 728803 WARNING keystoneauth.identity.generic.base [-] Failed to discover available identity versions when contacting http://[fd01:7::2]:5000/v3. Attempting to parse version from URL.: ConnectFailure: Unable to establish connection to http://[fd01:7::2]:5000/v3: HTTPConnectionPool(host='fd01:7::2', port=5000): Max retries exceeded with url: /v3 (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 110] ETIMEDOUT',))
2019-12-04 18:42:45.896 728803 INFO dcmanager.manager.patch_audit_manager [-] Triggered patch audit.
2019-12-04 18:43:33.542 728803 ERROR dcmanager.manager.subcloud_audit_manager [-] Identity or Platform endpoint for online subcloud: subcloud1 not found.: ConnectFailure: Unable to establish connection to http://[fd01:7::2]:5000/v3/auth/tokens: HTTPConnectionPool(host='fd01:7::2', port=5000): Max retries exceeded with url: /v3/auth/tokens (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 110] ETIMEDOUT',))
2019-12-04 18:45:41.158 728803 WARNING keystoneauth.identity.generic.base [-] Failed to discover available identity versions when contacting http://[fd01:8::2]:5000/v3. Attempting to parse version from URL.: ConnectFailure: Unable to establish connection to http://[fd01:8::2]:5000/v3: HTTPConnectionPool(host='fd01:8::2', port=5000): Max retries exceeded with url: /v3 (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 110] ETIMEDOUT',))
2019-12-04 18:47:45.917 728803 INFO dcmanager.manager.patch_audit_manager [-] Triggered patch audit.
2019-12-04 18:47:48.390 728803 ERROR dcmanager.manager.subcloud_audit_manager [-] Identity or Platform endpoint for online subcloud: subcloud2 not found.: ConnectFailure: Unable to establish connection to http://[fd01:8::2]:5000/v3/auth/tokens: HTTPConnectionPool(host='fd01:8::2', port=5000): Max retries exceeded with url: /v3/auth/tokens (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 110] ETIMEDOUT',))

Second pass of audit logs (snipped):

2019-12-04 19:21:47.182 728803 INFO dcmanager.manager.subcloud_audit_manager [-] Triggered subcloud audit.
2019-12-04 19:22:46.055 728803 INFO dcmanager.manager.patch_audit_manager [-] Triggered patch audit.
2019-12-04 19:23:54.790 728803 WARNING keystoneauth.identity.generic.base [-] Failed to discover available identity versions when contacting http://[fd01:7::2]:5000/v3. Attempting to parse version from URL.: ConnectFailure: Unable to establish connection to http://[fd01:7::2]:5000/v3: HTTPConnectionPool(host='fd01:7::2', port=5000): Max retries exceeded with url: /v3 (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 110] ETIMEDOUT',))
2019-12-04 19:26:02.022 728803 ERROR dcmanager.manager.subcloud_audit_manager [-] Identity or Platform endpoint for online subcloud: subcloud1 not found.: ConnectFailure: Unable to establish connection to http://[fd01:7::2]:5000/v3/auth/tokens: HTTPConnectionPool(host='fd01:7::2', port=5000): Max retries exceeded with url: /v3/auth/tokens (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 110] ETIMEDOUT',))
2019-12-04 19:26:02.022 728803 INFO dcmanager.manager.subcloud_audit_manager [-] Setting new availability status: offline on subcloud: subcloud1
2019-12-04 19:26:19.431 729365 INFO eventlet.wsgi.server [req-232a7694-7382-4bf9-b479-e1745ac9456c a7a1ad96b9a341e28cd1b92f3b836a6b - - default default] fd01:6::2 "GET /v1.0/subclouds/ HTTP/1.1" status: 200 len: 7381 time: 0.0503249
2019-12-04 19:27:02.124 728803 ERROR dcmanager.manager.subcloud_audit_manager [-] Timed out waiting for a reply to message ID 6aa6c3ef31b8438ba5fbc4274de11069: MessagingTimeout: Timed out waiting for a reply to message ID 6aa6c3ef31b8438ba5fbc4274de11069
2019-12-04 19:27:02.124 728803 ERROR dcmanager.manager.subcloud_audit_manager Traceback (most recent call last):
2019-12-04 19:27:02.124 728803 ERROR dcmanager.manager.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/dcmanager/manager/subcloud_audit_manager.py", line 220, in _periodic_subcloud_audit_loop
2019-12-04 19:27:02.124 728803 ERROR dcmanager.manager.subcloud_audit_manager avail_to_set)
2019-12-04 19:27:02.124 728803 ERROR dcmanager.manager.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/dcorch/rpc/client.py", line 99, in update_subcloud_states
2019-12-04 19:27:02.124 728803 ERROR dcmanager.manager.subcloud_audit_manager availability_status=availability_status))
2019-12-04 19:27:02.124 728803 ERROR dcmanager.manager.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/dcorch/rpc/client.py", line 49, in call
2019-12-04 19:27:02.124 728803 ERROR dcmanager.manager.subcloud_audit_manager return client.call(ctxt, method, **kwargs)
2019-12-04 19:27:02.124 728803 ERROR dcmanager.manager.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 465, in call
2019-12-04 19:27:02.124 728803 ERROR dcmanager.manager.subcloud_audit_manager return self.prepare().call(ctxt, method, **kwargs)
2019-12-04 19:27:02.124 728803 ERROR dcmanager.manager.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 169, in call
2019-12-04 19:27:02.124 728803 ERROR dcmanager.manager.subcloud_audit_manager retry=self.retry)
2019-12-04 19:27:02.124 728803 ERROR dcmanager.manager.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 123, in _send
2019-12-04 19:27:02.124 728803 ERROR dcmanager.manager.subcloud_audit_manager timeout=timeout, retry=retry)
2019-12-04 19:27:02.124 728803 ERROR dcmanager.manager.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 566, in send
2019-12-04 19:27:02.124 728803 ERROR dcmanager.manager.subcloud_audit_manager retry=retry)
2019-12-04 19:27:02.124 728803 ERROR dcmanager.manager.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 555, in _send
2019-12-04 19:27:02.124 728803 ERROR dcmanager.manager.subcloud_audit_manager result = self._waiter.wait(msg_id, timeout)
2019-12-04 19:27:02.124 728803 ERROR dcmanager.manager.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 447, in wait
2019-12-04 19:27:02.124 728803 ERROR dcmanager.manager.subcloud_audit_manager message = self.waiters.get(msg_id, timeout=timeout)
2019-12-04 19:27:02.124 728803 ERROR dcmanager.manager.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 335, in get
2019-12-04 19:27:02.124 728803 ERROR dcmanager.manager.subcloud_audit_manager 'to message ID %s' % msg_id)
2019-12-04 19:27:02.124 728803 ERROR dcmanager.manager.subcloud_audit_manager MessagingTimeout: Timed out waiting for a reply to message ID 6aa6c3ef31b8438ba5fbc4274de11069
2019-12-04 19:27:02.124 728803 ERROR dcmanager.manager.subcloud_audit_manager
2019-12-04 19:27:02.125 728803 WARNING dcmanager.manager.subcloud_audit_manager [-] Problem informing dcorch of subcloud state change, subcloud: subcloud1: MessagingTimeout: Timed out waiting for a reply to message ID 6aa6c3ef31b8438ba5fbc4274de11069
2019-12-04 19:27:02.137 728803 INFO dcmanager.manager.subcloud_manager [-] Updating all subclouds, endpoint: None sync: unknown
2019-12-04 19:27:46.071 728803 INFO dcmanager.manager.patch_audit_manager [-] Triggered patch audit.
2019-12-04 19:29:10.053 728803 WARNING keystoneauth.identity.generic.base [-] Failed to discover available identity versions when contacting http://[fd01:8::2]:5000/v3. Attempting to parse version from URL.: ConnectFailure: Unable to establish connection to http://[fd01:8::2]:5000/v3: HTTPConnectionPool(host='fd01:8::2', port=5000): Max retries exceeded with url: /v3 (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 110] ETIMEDOUT',))
2019-12-04 19:30:17.000 728803 INFO oslo_messaging._drivers.amqpdriver [-] No calling threads waiting for msg_id : 6aa6c3ef31b8438ba5fbc4274de11069
2019-12-04 19:31:17.286 728803 ERROR dcmanager.manager.subcloud_audit_manager [-] Identity or Platform endpoint for online subcloud: subcloud2 not found.: ConnectFailure: Unable to establish connection to http://[fd01:8::2]:5000/v3/auth/tokens: HTTPConnectionPool(host='fd01:8::2', port=5000): Max retries exceeded with url: /v3/auth/tokens (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 110] ETIMEDOUT',))
2019-12-04 19:31:17.286 728803 INFO dcmanager.manager.subcloud_audit_manager [-] Setting new availability status: offline on subcloud: subcloud2
2019-12-04 19:32:17.398 728803 ERROR dcmanager.manager.subcloud_audit_manager [-] Timed out waiting for a reply to message ID 9df04986d7d849e8901ba3ecb67fd517: MessagingTimeout: Timed out waiting for a reply to message ID 9df04986d7d849e8901ba3ecb67fd517
2019-12-04 19:32:17.398 728803 ERROR dcmanager.manager.subcloud_audit_manager Traceback (most recent call last):
2019-12-04 19:32:17.398 728803 ERROR dcmanager.manager.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/dcmanager/manager/subcloud_audit_manager.py", line 220, in _periodic_subcloud_audit_loop
2019-12-04 19:32:17.398 728803 ERROR dcmanager.manager.subcloud_audit_manager avail_to_set)
2019-12-04 19:32:17.398 728803 ERROR dcmanager.manager.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/dcorch/rpc/client.py", line 99, in update_subcloud_states
2019-12-04 19:32:17.398 728803 ERROR dcmanager.manager.subcloud_audit_manager availability_status=availability_status))
2019-12-04 19:32:17.398 728803 ERROR dcmanager.manager.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/dcorch/rpc/client.py", line 49, in call
2019-12-04 19:32:17.398 728803 ERROR dcmanager.manager.subcloud_audit_manager return client.call(ctxt, method, **kwargs)
2019-12-04 19:32:17.398 728803 ERROR dcmanager.manager.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 465, in call
2019-12-04 19:32:17.398 728803 ERROR dcmanager.manager.subcloud_audit_manager return self.prepare().call(ctxt, method, **kwargs)
2019-12-04 19:32:17.398 728803 ERROR dcmanager.manager.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 169, in call
2019-12-04 19:32:17.398 728803 ERROR dcmanager.manager.subcloud_audit_manager retry=self.retry)
2019-12-04 19:32:17.398 728803 ERROR dcmanager.manager.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 123, in _send
2019-12-04 19:32:17.398 728803 ERROR dcmanager.manager.subcloud_audit_manager timeout=timeout, retry=retry)
2019-12-04 19:32:17.398 728803 ERROR dcmanager.manager.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 566, in send
2019-12-04 19:32:17.398 728803 ERROR dcmanager.manager.subcloud_audit_manager retry=retry)
2019-12-04 19:32:17.398 728803 ERROR dcmanager.manager.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 555, in _send
2019-12-04 19:32:17.398 728803 ERROR dcmanager.manager.subcloud_audit_manager result = self._waiter.wait(msg_id, timeout)
2019-12-04 19:32:17.398 728803 ERROR dcmanager.manager.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 447, in wait
2019-12-04 19:32:17.398 728803 ERROR dcmanager.manager.subcloud_audit_manager message = self.waiters.get(msg_id, timeout=timeout)
2019-12-04 19:32:17.398 728803 ERROR dcmanager.manager.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 335, in get
2019-12-04 19:32:17.398 728803 ERROR dcmanager.manager.subcloud_audit_manager 'to message ID %s' % msg_id)
2019-12-04 19:32:17.398 728803 ERROR dcmanager.manager.subcloud_audit_manager MessagingTimeout: Timed out waiting for a reply to message ID 9df04986d7d849e8901ba3ecb67fd517
2019-12-04 19:32:17.398 728803 ERROR dcmanager.manager.subcloud_audit_manager
2019-12-04 19:32:17.399 728803 WARNING dcmanager.manager.subcloud_audit_manager [-] Problem informing dcorch of subcloud state change, subcloud: subcloud2: MessagingTimeout: Timed out waiting for a reply to message ID 9df04986d7d849e8901ba3ecb67fd517
2019-12-04 19:32:17.410 728803 INFO dcmanager.manager.subcloud_manager [-] Updating all subclouds, endpoint: None sync: unknown
2019-12-04 19:32:46.084 728803 INFO dcmanager.manager.patch_audit_manager [-] Triggered patch audit.
2019-12-04 19:34:25.317 728803 WARNING keystoneauth.identity.generic.base [-] Failed to discover available identity versions when contacting http://[fd01:9::2]:5000/v3. Attempting to parse version from URL.: ConnectFailure: Unable to establish connection to http://[fd01:9::2]:5000/v3: HTTPConnectionPool(host='fd01:9::2', port=5000): Max retries exceeded with url: /v3 (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 110] ETIMEDOUT',))
2019-12-04 19:35:32.264 728803 INFO oslo_messaging._drivers.amqpdriver [-] No calling threads waiting for msg_id : 9df04986d7d849e8901ba3ecb67fd517
2019-12-04 19:36:32.550 728803 ERROR dcmanager.manager.subcloud_audit_manager [-] Identity or Platform endpoint for online subcloud: subcloud3 not found.: ConnectFailure: Unable to establish connection to http://[fd01:9::2]:5000/v3/auth/tokens: HTTPConnectionPool(host='fd01:9::2', port=5000): Max retries exceeded with url: /v3/auth/tokens (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 110] ETIMEDOUT',))
2019-12-04 19:36:32.550 728803 INFO dcmanager.manager.subcloud_audit_manager [-] Setting new availability status: offline on subcloud: subcloud3
2019-12-04 19:37:32.657 728803 ERROR dcmanager.manager.subcloud_audit_manager [-] Timed out waiting for a reply to message ID c470e5c93a794d119d1e333266e85f89: MessagingTimeout: Timed out waiting for a reply to message ID c470e5c93a794d119d1e333266e85f89
2019-12-04 19:37:32.657 728803 ERROR dcmanager.manager.subcloud_audit_manager Traceback (most recent call last):
2019-12-04 19:37:32.657 728803 ERROR dcmanager.manager.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/dcmanager/manager/subcloud_audit_manager.py", line 220, in _periodic_subcloud_audit_loop
2019-12-04 19:37:32.657 728803 ERROR dcmanager.manager.subcloud_audit_manager avail_to_set)
2019-12-04 19:37:32.657 728803 ERROR dcmanager.manager.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/dcorch/rpc/client.py", line 99, in update_subcloud_states
2019-12-04 19:37:32.657 728803 ERROR dcmanager.manager.subcloud_audit_manager availability_status=availability_status))
2019-12-04 19:37:32.657 728803 ERROR dcmanager.manager.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/dcorch/rpc/client.py", line 49, in call
2019-12-04 19:37:32.657 728803 ERROR dcmanager.manager.subcloud_audit_manager return client.call(ctxt, method, **kwargs)
2019-12-04 19:37:32.657 728803 ERROR dcmanager.manager.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 465, in call
2019-12-04 19:37:32.657 728803 ERROR dcmanager.manager.subcloud_audit_manager return self.prepare().call(ctxt, method, **kwargs)
2019-12-04 19:37:32.657 728803 ERROR dcmanager.manager.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 169, in call
2019-12-04 19:37:32.657 728803 ERROR dcmanager.manager.subcloud_audit_manager retry=self.retry)
2019-12-04 19:37:32.657 728803 ERROR dcmanager.manager.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 123, in _send
2019-12-04 19:37:32.657 728803 ERROR dcmanager.manager.subcloud_audit_manager timeout=timeout, retry=retry)
2019-12-04 19:37:32.657 728803 ERROR dcmanager.manager.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 566, in send
2019-12-04 19:37:32.657 728803 ERROR dcmanager.manager.subcloud_audit_manager retry=retry)
2019-12-04 19:37:32.657 728803 ERROR dcmanager.manager.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 555, in _send