Subcloud audit error after upgrading the system controller

Bug #1883589 reported by Jessica Castelino
Affects: StarlingX
Status: Fix Released
Importance: High
Assigned to: Jessica Castelino
Milestone: (none)

Bug Description

Brief Description
-----------------
After the system controllers were upgraded to 20.06, the dcmanager subcloud audit was not operational because of an exception caused by the absence of the subcloud_alarms table in 20.06. The migration of this table is skipped due to a bug in the upgrades code (get_upgrade_databases in controllerconfig/upgrades/management.py).

Severity
--------
Major: Affects upgrade testing

Steps to Reproduce
------------------
Exception arises after the system controller is upgraded to 20.06

Expected Behavior
------------------
Subcloud_alarms table must be migrated during upgrade

Actual Behavior
----------------
Subcloud_alarms table is not getting migrated during upgrade
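
The expected and actual behavior above amount to a check that a known table is present after migration. A minimal sketch of such a check follows. It uses an in-memory SQLite database purely as a stand-in so that it runs anywhere; the real system's database, the `subclouds` table, and the column layouts shown here are assumptions for illustration, and `missing_tables` is a hypothetical helper, not part of the upgrades code.

```python
import sqlite3


def missing_tables(conn, expected):
    """Return the expected table names that are absent from the database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    present = {name for (name,) in rows}
    return sorted(set(expected) - present)


# Stand-in for a database whose migration skipped one table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE subclouds (id INTEGER PRIMARY KEY, name TEXT)")

expected = ["subclouds", "subcloud_alarms"]
before = missing_tables(conn, expected)

# Stand-in for the migration that should have happened during the upgrade.
conn.execute("CREATE TABLE subcloud_alarms (id INTEGER PRIMARY KEY, name TEXT)")
after = missing_tables(conn, expected)

print("missing before:", before)
print("missing after:", after)
```

A check of this kind, run right after the upgrade, would flag the missing table before the audit loop hits it.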

Reproducibility
---------------
100% reproducible

System Configuration
--------------------
Distributed Cloud

Branch/Pull Time/Commit
-----------------------
6th June, 2020

Last Pass
---------
Unknown

Timestamp/Logs
--------------
2020-06-12 01:02:29.864 216032 INFO dcmanager.audit.subcloud_audit_manager [-] Triggered subcloud audit.
2020-06-12 01:02:29.872 216032 INFO dcmanager.audit.subcloud_audit_manager [-] Waiting for subcloud audits to complete.
2020-06-12 01:02:32.140 216032 ERROR dcmanager.audit.subcloud_audit_manager [-] Error in periodic subcloud audit loop: AttributeError: 'SubcloudNotFound' object has no attribute 'msg'
2020-06-12 01:02:32.140 216032 ERROR dcmanager.audit.subcloud_audit_manager Traceback (most recent call last):
2020-06-12 01:02:32.140 216032 ERROR dcmanager.audit.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/dcmanager/audit/subcloud_audit_manager.py", line 107, in periodic_subcloud_audit
2020-06-12 01:02:32.140 216032 ERROR dcmanager.audit.subcloud_audit_manager self._periodic_subcloud_audit_loop()
2020-06-12 01:02:32.140 216032 ERROR dcmanager.audit.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/dcmanager/audit/subcloud_audit_manager.py", line 199, in _periodic_subcloud_audit_loop
2020-06-12 01:02:32.140 216032 ERROR dcmanager.audit.subcloud_audit_manager thread.wait()
2020-06-12 01:02:32.140 216032 ERROR dcmanager.audit.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/oslo_service/threadgroup.py", line 61, in wait
2020-06-12 01:02:32.140 216032 ERROR dcmanager.audit.subcloud_audit_manager return self.thread.wait()
2020-06-12 01:02:32.140 216032 ERROR dcmanager.audit.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/eventlet/greenthread.py", line 180, in wait
2020-06-12 01:02:32.140 216032 ERROR dcmanager.audit.subcloud_audit_manager return self._exit_event.wait()
2020-06-12 01:02:32.140 216032 ERROR dcmanager.audit.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/eventlet/event.py", line 125, in wait
2020-06-12 01:02:32.140 216032 ERROR dcmanager.audit.subcloud_audit_manager result = hub.switch()
2020-06-12 01:02:32.140 216032 ERROR dcmanager.audit.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 297, in switch
2020-06-12 01:02:32.140 216032 ERROR dcmanager.audit.subcloud_audit_manager return self.greenlet.switch()
2020-06-12 01:02:32.140 216032 ERROR dcmanager.audit.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/eventlet/greenthread.py", line 219, in main
2020-06-12 01:02:32.140 216032 ERROR dcmanager.audit.subcloud_audit_manager result = function(*args, **kwargs)
2020-06-12 01:02:32.140 216032 ERROR dcmanager.audit.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/dcmanager/audit/subcloud_audit_manager.py", line 395, in _audit_subcloud
2020-06-12 01:02:32.140 216032 ERROR dcmanager.audit.subcloud_audit_manager self.alarm_aggr.update_alarm_summary(subcloud_name, fm_client)
2020-06-12 01:02:32.140 216032 ERROR dcmanager.audit.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/dcmanager/audit/alarm_aggregation.py", line 45, in update_alarm_summary
2020-06-12 01:02:32.140 216032 ERROR dcmanager.audit.subcloud_audit_manager LOG.error('Failed to update alarms for %s error: %s' % (name, e))
2020-06-12 01:02:32.140 216032 ERROR dcmanager.audit.subcloud_audit_manager File "/usr/lib/python2.7/site-packages/dcmanager/common/exceptions.py", line 58, in __unicode__
2020-06-12 01:02:32.140 216032 ERROR dcmanager.audit.subcloud_audit_manager return encodeutils.exception_to_unicode(self.msg)
2020-06-12 01:02:32.140 216032 ERROR dcmanager.audit.subcloud_audit_manager AttributeError: 'SubcloudNotFound' object has no attribute 'msg'
2020-06-12 01:02:32.140 216032 ERROR dcmanager.audit.subcloud_audit_manage

Test Activity
-------------
Developer Testing

Ghada Khalil (gkhalil)
summary: - Subcloud audit error after upgrading the system controller to 20.06
+ Subcloud audit error after upgrading the system controller
tags: added: stx.distcloud stx.update
Ghada Khalil (gkhalil) wrote :

stx.4.0 / high priority - this impacts the stx.4.0 system upgrades feature

Changed in starlingx:
importance: Undecided → High
tags: added: stx.4.0
Changed in starlingx:
assignee: nobody → Jessica Castelino (jcasteli)
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/735651
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=2a68d4c4c0c6e410c983e8bb8d176272bbf24c6a
Submitter: Zuul
Branch: master

commit 2a68d4c4c0c6e410c983e8bb8d176272bbf24c6a
Author: Jessica Castelino <email address hidden>
Date: Mon Jun 15 13:08:37 2020 -0400

    Subcloud audit error after upgrading the system controller

    After the system controllers have been upgraded to 20.06, the dcmanager
    subcloud audit was not operational due to an exception caused by the
    absence of subcloud_alarms table in 20.06. Fixes have been made to allow
    the migration of this table to the newer software version.

    Change-Id: I1e8e6578f8c9b81c852889eedd2bab7f3e467af4
    Partial-Bug: 1883589
    Signed-off-by: Jessica Castelino <email address hidden>

Bart Wensley (bartwensley) wrote :

The stx.4.0 upgrades issue has been fixed. The traceback revealed two other issues that should be fixed in the stx.5.0 release. From the review comments:

If you decide to clean up the logs associated with this failure, there are probably 2 places:
1) Where the SubcloudNotFound was raised with improper arguments
https://github.com/starlingx/distcloud/blob/master/distributedcloud/dcmanager/db/sqlalchemy/api.py#L804
2) Where self.msg does not get set if improper arguments are passed
https://github.com/starlingx/distcloud/blob/master/distributedcloud/dcmanager/common/exceptions.py#L46
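
The two points above describe how the original error got masked by an AttributeError while it was being logged. The following is a minimal sketch of that failure pattern and one possible fix. The class names, the message template, and `describe` are hypothetical stand-ins written for this illustration (in Python 3); they are not the dcmanager code.

```python
class SketchNotFound(Exception):
    """Hypothetical exception with a templated message."""

    message = "Subcloud with id %(subcloud_id)s doesn't exist."

    def __init__(self, **kwargs):
        try:
            self.msg = self.message % kwargs
        except KeyError:
            # Bug pattern: with the wrong keyword arguments, formatting
            # fails and self.msg is never assigned.
            pass
        super().__init__(self.message)

    def __str__(self):
        # Raises AttributeError if self.msg was never set.
        return self.msg


class FixedNotFound(SketchNotFound):
    """Same exception, but the fallback path still sets self.msg."""

    def __init__(self, **kwargs):
        try:
            self.msg = self.message % kwargs
        except KeyError:
            # Fall back to the unformatted template instead of nothing.
            self.msg = self.message
        Exception.__init__(self, self.msg)


def describe(exc):
    """Format an exception the way a log call would."""
    try:
        return "Failed to update alarms: %s" % exc
    except AttributeError as err:
        return "logging itself failed: %s" % err


# Raised with a keyword the template does not know about.
print(describe(SketchNotFound(name="subcloud1")))
print(describe(FixedNotFound(name="subcloud1")))
print(describe(FixedNotFound(subcloud_id=3)))
```

In the sketch, fixing the call site (passing `subcloud_id`) addresses point 1, and setting `self.msg` on the fallback path addresses point 2, so the log line survives even when the arguments are wrong.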

Ghada Khalil (gkhalil) wrote :

I suggest you open a separate LP for the cleanup required for stx.5.0 and mark this LP as Fix Released.

tags: added: stx.5.0
removed: stx.4.0
tags: added: stx.4.0
removed: stx.5.0
Bart Wensley (bartwensley) wrote :

Raised bug 1884575 for the additional cleanup work.

Changed in starlingx:
status: In Progress → Fix Released