Distributed Cloud: Many clients received http 500 errors in batch deployment
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | Medium | Jessica Castelino |
Bug Description
Brief Description
-----------------
The add subcloud REST api returns http 500 errors
Severity
--------
Major - the add subcloud requests are still being processed by the backend but these errors will give the users the impression that their requests have failed. This also breaks automated batch subcloud deployment.
Steps to Reproduce
------------------
Send 50 add subcloud requests via CLI or REST API simultaneously
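The reproduction step can be scripted roughly as follows. This is a minimal sketch, not the test that was actually run: `send_add_subcloud` is a hypothetical placeholder for whatever issues one add subcloud request (a CLI call or a REST POST) and returns the HTTP status code. It is stubbed here so the script runs on its own.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def send_add_subcloud(index):
    """Hypothetical stand-in for one add subcloud request.

    Replace the body with a real CLI or REST call that returns the
    HTTP status code. This stub always reports success.
    """
    return 200


def run_batch(send, count=50):
    """Issue `count` requests concurrently and tally the status codes."""
    with ThreadPoolExecutor(max_workers=count) as pool:
        statuses = list(pool.map(send, range(count)))
    return Counter(statuses)


if __name__ == "__main__":
    # With a real sender, a non-zero count under 500 reproduces the bug.
    print(run_batch(send_add_subcloud))
```

Using one worker thread per request keeps all 50 requests in flight at the same time, which is the condition the report describes.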
Expected Behavior
------------------
REST API request - http 200 response
CLI request - a confirmed response with subcloud UUID
Actual Behavior
----------------
REST API - http 500 response
CLI - ERROR Unable to add subcloud
A sample timeout traceback from dcmanager.log:
Traceback (most recent call last):
  File "/usr/
    return self.rpc_
  File "/usr/
    payload=
  File "/usr/
    return client.call(ctxt, method, **kwargs)
  File "/usr/
    return self.prepare(
  File "/usr/
    retry=
  File "/usr/
    timeout=
  File "/usr/
    retry=retry)
  File "/usr/
    result = self._waiter.
  File "/usr/
    message = self.waiters.
  File "/usr/
    'to message ID %s' % msg_id)
MessagingTimeout: Timed out waiting for a reply to message ID c654c39f48624cf
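One way to read this traceback together with the Severity note: the API side blocks in `client.call` waiting for a reply and gives up when none arrives in time, while the backend keeps working on the request. The sketch below is a toy model only. It assumes (this is not confirmed by the logs) that the backend handles requests one after another, and `service_time` and `timeout` are made-up values in seconds, not measurements.

```python
def timed_out_requests(count, service_time, timeout):
    """Return indices of requests whose reply arrives after `timeout`.

    Assumes all requests arrive at t=0 and are served serially, so
    request i (0-based) gets its reply at (i + 1) * service_time.
    Every request is still completed; only the caller stops waiting.
    """
    return [i for i in range(count) if (i + 1) * service_time > timeout]


if __name__ == "__main__":
    # Illustrative numbers only: 50 requests, 5 s each, 60 s timeout.
    late = timed_out_requests(count=50, service_time=5, timeout=60)
    print(f"{len(late)} of 50 callers would see a timeout")
```

Under these assumptions a single request never times out, which would be consistent with the failure depending on how many subclouds are added at once.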
Reproducibility
---------------
100% reproducible
System Configuration
-------
IPv6 distributed cloud
Branch/Pull Time/Commit
-------
Feb. 22 master
Last Pass
---------
Not certain when this test case was verified
Timestamp/Logs
--------------
dcmanager logs attached
Test Activity
-------------
Evaluation
Changed in starlingx:
assignee: Tao Liu (tliu88) → Jessica Castelino (jcasteli)
stx.4.0 / medium priority - the issue appears to be tied to the number of subclouds being added simultaneously. Should be investigated/addressed.