Distributed Cloud: failure during add subcloud prevents subcloud from being added again
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | Medium | Jessica Castelino |
Bug Description
Brief Description
-----------------
If the "dcmanager subcloud add" command fails at certain points, the data for the subcloud is not cleaned up properly, leaving the system in a state where any future attempt to add the same subcloud will fail.
Severity
--------
Major: failure is severe and will affect users
Steps to Reproduce
------------------
Attempt to add a subcloud and cause a failure during one of the steps, for example during the add_subcloud RPC sent from dcmanager to dcorch or during the creation of the endpoints.
Expected Behavior
------------------
All data for the subcloud should be cleaned up so a future attempt to add the subcloud can succeed.
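The expected behavior amounts to a roll-back-on-failure pattern. The sketch below is only an illustration of that pattern, not the dcmanager code: `FakeStore`, `SubcloudAddError`, `add_subcloud` and the store names are hypothetical stand-ins for the places that hold per-subcloud data (such as the dcorch data and the endpoints).

```python
class SubcloudAddError(Exception):
    """Raised when any step of the (hypothetical) add operation fails."""


class FakeStore:
    """Hypothetical stand-in for one location that holds per-subcloud data."""

    def __init__(self, name, fail_on_create=False):
        self.name = name
        self.fail_on_create = fail_on_create
        self.records = set()

    def create(self, subcloud):
        # A leftover record from an earlier failed attempt blocks a re-add.
        if subcloud in self.records:
            raise SubcloudAddError(f"{self.name}: {subcloud} already exists")
        if self.fail_on_create:
            raise SubcloudAddError(f"{self.name}: injected failure")
        self.records.add(subcloud)

    def delete(self, subcloud):
        # Idempotent: deleting a missing record is not an error.
        self.records.discard(subcloud)


def add_subcloud(subcloud, stores):
    """Run each step in order; on failure, undo completed steps in reverse."""
    completed = []
    try:
        for store in stores:
            store.create(subcloud)
            completed.append(store)
    except SubcloudAddError:
        for store in reversed(completed):
            store.delete(subcloud)
        raise
```

With this shape, a failure in the last step leaves no records behind in the earlier stores, so a second attempt with the same subcloud name is not rejected as a duplicate.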
Actual Behavior
----------------
Some data is not cleaned up (e.g. the dcorch data for the subcloud), which results in failures when the subcloud is added again. For example:
2020-02-11 13:38:13.491 860107 ERROR oslo_messaging.
Reproducibility
---------------
Reproducible (but only in failure cases)
System Configuration
-------
Distributed cloud
Branch/Pull Time/Commit
-------
Designer load built from a pull on February 4, 2020.
Last Pass
---------
Unknown
Timestamp/Logs
--------------
See above
Test Activity
-------------
Developer Testing
Workaround
----------
Use a different name for the subcloud when attempting to re-add it. However, this will leave data for the previous name in various locations.
Changed in starlingx:
assignee: nobody → Dariush Eslimi (deslimi)
Changed in starlingx:
assignee: Dariush Eslimi (deslimi) → Jessica Castelino (jcasteli)
stx.4.0 / medium priority - issue with recovering from previous failure condition