Internal server error in GUI when subcloud goes down
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | Medium | Tyler Smith |
Bug Description
Brief Description
-----------------
It is possible for a user to be redirected to an offline subcloud, which results in the Horizon error page.
Severity
--------
Major
Steps to Reproduce
------------------
Navigate to a subcloud that is managed but offline, or log in from a fresh browser and be sent directly there (semi-reproducible; depends on the ordering of the Keystone endpoints).
Expected Behavior
------------------
User is not sent to the offline subcloud
Actual Behavior
----------------
User sees the Horizon error screen because timeout exceptions are thrown after roughly 5 minutes
Reproducibility
---------------
Intermittent
System Configuration
-------
Distributed Cloud
Branch/Pull Time/Commit
-------
Master
Test Activity
-------------
Feature Testing
Workaround
----------
None, other than recovering the subcloud
Changed in starlingx:
assignee: nobody → Tyler Smith (tyler.smith)
Changed in starlingx:
importance: Undecided → Medium
tags: added: stx.5.0
There are 4 scenarios where a user can end up on an error page like this. The issue was noticed when the first scenario was hit:
1. The user logs into Horizon for the first time or with a new browser, and the endpoints happen to be ordered such that a subcloud is chosen as the primary login region while that subcloud is offline for whatever reason
2. The user jumps from the cloud overview page to a region that is managed but offline
3. The user is browsing a subcloud when it goes offline
4. The user is browsing an online subcloud and gets logged out; on logging in again, the user is automatically redirected to that subcloud, which has since gone offline
I will submit a fix that remedies scenarios 1 and 2. It looks like fixing scenario 4 could only be done by introducing a new setting in upstream Horizon to override the services_region cookie, and I don't see a practical way to fix scenario 3. The workaround for scenarios 3 and 4 is to clear your browser cookies.
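To make the idea behind scenarios 1 and 2 concrete, here is a rough sketch of the kind of check involved. This is not the submitted fix and does not use Horizon's real API: `Region`, `is_reachable`, `pick_login_region`, and `can_switch_to` are hypothetical names, and the managed/online flags are assumed to be available from the distributed cloud subcloud status.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Region:
    """Hypothetical view of a region as the dashboard might see it."""
    name: str
    is_subcloud: bool
    managed: bool = True
    online: bool = True


def is_reachable(region: Region) -> bool:
    # Assumption: a non-subcloud region (e.g. the system controller) is
    # always a valid target; a subcloud is valid only if managed and online.
    return (not region.is_subcloud) or (region.managed and region.online)


def pick_login_region(regions: Iterable[Region]) -> Optional[Region]:
    """Scenario 1: take the first region, in endpoint order, that is reachable."""
    for region in regions:
        if is_reachable(region):
            return region
    return None  # nothing usable; the caller decides how to report this


def can_switch_to(region: Region) -> bool:
    """Scenario 2: refuse to jump to a region that is not reachable."""
    return is_reachable(region)
```

With an endpoint order of an offline subcloud followed by the system controller, `pick_login_region` skips the subcloud instead of sending the user to it. Scenarios 3 and 4 are not covered by a check like this, since the region there comes from the existing session or the services_region cookie rather than from a fresh selection.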