Horizon dashboard dies periodically

Bug #1910300 reported by Tony Gray
This bug affects 3 people
Affects     Status   Importance  Assigned to  Milestone
MicroStack  Triaged  High        Unassigned

Bug Description

We're running MicroStack on a test system, and while it's generally working very well, the dashboard dies from time to time, and networking dies on some (but not all) instances at the same time. The instances themselves all still appear to be running, but we lose connectivity with some and not others.

Sometimes the system runs for weeks at a time before failing. Most recently, the dashboard died within 18 hours of the system being restarted.

System version is Ubuntu 20.04.
MicroStack version is ussuri (snap revision 218, tracking latest/beta), installed in devmode.

When the dashboard is dead, the traceback is displayed as follows (hostname obscured):

-------------

Environment:

Request Method: GET
Request URL: https://example.com/auth/login/

Django Version: 2.2.12
Python Version: 3.8.5
Installed Applications:
['openstack_dashboard.dashboards.project',
 'openstack_dashboard.dashboards.admin',
 'openstack_dashboard.dashboards.identity',
 'openstack_dashboard.dashboards.settings',
 'openstack_dashboard',
 'django.contrib.contenttypes',
 'django.contrib.auth',
 'django.contrib.sessions',
 'django.contrib.messages',
 'django.contrib.staticfiles',
 'django.contrib.humanize',
 'django_pyscss',
 'debreach',
 'openstack_dashboard.django_pyscss_fix',
 'compressor',
 'horizon',
 'openstack_auth']
Installed Middleware:
('openstack_auth.middleware.OpenstackAuthMonkeyPatchMiddleware',
 'debreach.middleware.RandomCommentMiddleware',
 'django.middleware.common.CommonMiddleware',
 'django.middleware.csrf.CsrfViewMiddleware',
 'django.contrib.sessions.middleware.SessionMiddleware',
 'django.contrib.auth.middleware.AuthenticationMiddleware',
 'horizon.middleware.OperationLogMiddleware',
 'django.contrib.messages.middleware.MessageMiddleware',
 'horizon.middleware.HorizonMiddleware',
 'horizon.themes.ThemeMiddleware',
 'django.middleware.locale.LocaleMiddleware',
 'django.middleware.clickjacking.XFrameOptionsMiddleware',
 'openstack_dashboard.contrib.developer.profiler.middleware.ProfilerClientMiddleware',
 'openstack_dashboard.contrib.developer.profiler.middleware.ProfilerMiddleware')

Traceback:

File "/snap/microstack/218/lib/python3.8/site-packages/django/core/handlers/exception.py" in inner
  34. response = get_response(request)

File "/snap/microstack/218/lib/python3.8/site-packages/django/core/handlers/base.py" in _get_response
  145. response = self.process_exception_by_middleware(e, request)

File "/snap/microstack/218/lib/python3.8/site-packages/django/core/handlers/base.py" in _get_response
  143. response = response.render()

File "/snap/microstack/218/lib/python3.8/site-packages/django/template/response.py" in render
  106. self.content = self.rendered_content

File "/snap/microstack/218/lib/python3.8/site-packages/django/template/response.py" in rendered_content
  81. template = self.resolve_template(self.template_name)

File "/snap/microstack/218/lib/python3.8/site-packages/django/template/response.py" in resolve_template
  63. return select_template(template, using=self.using)

File "/snap/microstack/218/lib/python3.8/site-packages/django/template/loader.py" in select_template
  42. return engine.get_template(template_name)

File "/snap/microstack/218/lib/python3.8/site-packages/django/template/backends/django.py" in get_template
  34. return Template(self.engine.get_template(template_name), self)

File "/snap/microstack/218/lib/python3.8/site-packages/django/template/engine.py" in get_template
  143. template, origin = self.find_template(template_name)

File "/snap/microstack/218/lib/python3.8/site-packages/django/template/engine.py" in find_template
  125. template = loader.get_template(name, skip=skip)

File "/snap/microstack/218/lib/python3.8/site-packages/django/template/loaders/base.py" in get_template
  18. for origin in self.get_template_sources(template_name):

File "/snap/microstack/218/lib/python3.8/site-packages/horizon/themes.py" in get_template_sources
  149. os.path.abspath('openstack_dashboard')

File "/usr/lib/python3.8/posixpath.py" in abspath
  379. cwd = os.getcwd()

Exception Type: FileNotFoundError at /auth/login/
Exception Value: [Errno 2] No such file or directory

-------------

The same issue appears to have been reported by someone else over at https://forum.snapcraft.io/t/microstack-dashboard-stops-working-after-a-while/21045

Corey Bryant (corey.bryant) wrote :

Hi Tony,

Thank you for reporting this. I can confirm that I'm able to recreate it as well, so I will triage this bug now. I just restarted the dashboard, though, so of course it is working again, which makes it harder to debug. I will dig in more the next time I hit it.

Thanks,
Corey

Corey Bryant (corey.bryant) wrote :

Tony, do you have any more specifics on how networking dies? I think we may want to open a separate bug to track that, as it is likely a different issue.

Changed in microstack:
status: New → Triaged
importance: Undecided → High

Corey Bryant (corey.bryant) wrote :

While there is a workaround [1], this failure really does not present itself well to users, so I'm triaging it as high.
[1] sudo systemctl restart snap.microstack.horizon-uwsgi

Billy Olsen (billy-olsen) wrote :

I have encountered this as well. Horizon is erroring out because the os.getcwd() call raises FileNotFoundError, which essentially means that the directory the process was running from is no longer available.
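The failure mode can be reproduced outside Horizon. The following is a minimal sketch, not MicroStack code: it assumes a POSIX system, and the helper name abspath_after_cwd_removed is hypothetical. It removes the process's working directory and then resolves a relative path, which is what the horizon/themes.py frame in the traceback does.

```python
import errno
import os
import tempfile


def abspath_after_cwd_removed(relative_name="openstack_dashboard"):
    """Return whatever os.path.abspath() yields once the cwd has been removed.

    Hypothetical helper for illustration; returns the caught exception
    instead of raising it so the caller can inspect it.
    """
    original = os.getcwd()
    doomed = tempfile.mkdtemp()  # throwaway directory standing in for the lost cwd
    os.chdir(doomed)
    os.rmdir(doomed)  # the process now sits in a directory that no longer exists
    try:
        # For a relative name, abspath() calls os.getcwd() internally.
        return os.path.abspath(relative_name)
    except FileNotFoundError as exc:
        return exc
    finally:
        os.chdir(original)  # restore a valid working directory


result = abspath_after_cwd_removed()
print(type(result).__name__, getattr(result, "errno", None) == errno.ENOENT)
```

Because the path passed to os.path.abspath() is relative, every template lookup depends on the working directory still being resolvable, which would explain why each request fails until the service is restarted.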

I began to wonder why this would happen as the current working directory shouldn't be removed (or changed to a tempdir that was cleaned up).
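One way to check whether a running process is sitting in a removed directory is to read its /proc/<pid>/cwd link. This is a Linux-only sketch with hypothetical helper names (cwd_is_deleted, demo). It covers the case where the directory was deleted; whether a snapd refresh leaves horizon-uwsgi in exactly this state is an assumption this report does not confirm, and an unmounted directory may show up differently. Reading another process's link generally requires root or the same user.

```python
import os
import tempfile


def cwd_is_deleted(pid="self"):
    """Return True if /proc/<pid>/cwd points at a directory that has been removed."""
    # On Linux the kernel appends " (deleted)" to the link target
    # when the directory has been unlinked.
    return os.readlink(f"/proc/{pid}/cwd").endswith(" (deleted)")


def demo():
    """Check this process before and after its own cwd is removed."""
    original = os.getcwd()
    before = cwd_is_deleted()
    doomed = tempfile.mkdtemp()  # throwaway directory to remove
    os.chdir(doomed)
    os.rmdir(doomed)
    try:
        after = cwd_is_deleted()
    finally:
        os.chdir(original)  # restore a valid working directory
    return before, after


print(demo())
```

For a real service, the same check could be pointed at the PID of the uwsgi worker instead of "self".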

Checking logs for snapd, I can see it was refreshed. Indeed, I can force this error by reverting snapd to a previous revision, e.g.:

$ snap list --all snapd
Name   Version  Rev    Tracking       Publisher   Notes
snapd  2.49     11115  latest/stable  canonical✓  snapd,disabled
snapd  2.49.1   11408  latest/stable  canonical✓  snapd

$ sudo snap revert snapd
2021-03-26T14:22:48-07:00 INFO Waiting for automatic snapd restart...
snapd reverted to 2.49

and voila, the horizon dashboard now fails :-( with the traceback in this bug.

(p.s. revert back to the latest snapd with sudo snap revert --revision <foo> snapd - which will break horizon-uwsgi once again).

This may actually be related to losing networking as well.

Billy Olsen (billy-olsen) wrote :

Scratch the part about being related to losing networking as well. I can't recreate any network blips with my above steps, but I can 100% recreate the horizon issue.
