cert-mon: uncaught exceptions within audit greenpool are not logged

Bug #2006136 reported by Manoel Benedito Neto
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: Low
Assigned to: Manoel Benedito Neto

Bug Description

Brief Description
-----------------
An uncaught exception within the subcloud audit greenpool thread was not logged, which made the failure difficult to catch.

We need to ensure the greenpool is properly hooked into the logging subsystem.
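
As a rough illustration of what "hooked into the logging subsystem" could mean, the sketch below attaches a logging callback to every task submitted to a worker pool. It uses the standard library's `ThreadPoolExecutor` as a stand-in for the greenpool, which is an assumption made so the example is self-contained; cert-mon itself is not structured this way. The names `log_uncaught`, `submit_logged`, `failing_audit`, and the logger name are hypothetical.

```python
import logging
from concurrent.futures import ThreadPoolExecutor

# Hypothetical logger name, used only for this sketch.
LOG = logging.getLogger("cert_mon_sketch")


def log_uncaught(future):
    """Log the traceback of any exception a pool worker raised."""
    exc = future.exception()
    if exc is not None:
        # Passing the exception instance as exc_info attaches its traceback.
        LOG.error("Uncaught exception in pool worker", exc_info=exc)


def submit_logged(pool, func, *args):
    """Submit func to the pool and make sure its failures reach the log."""
    future = pool.submit(func, *args)
    future.add_done_callback(log_uncaught)
    return future


def failing_audit(name):
    """Hypothetical audit task that always fails."""
    raise RuntimeError("audit failed for %s" % name)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    with ThreadPoolExecutor(max_workers=1) as pool:
        submit_logged(pool, failing_audit, "subcloud845")
```

Without the callback, the exception would stay inside the returned future and nothing would be logged unless a caller asked for the result.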

Severity
--------
Minor

Steps to Reproduce
------------------
Create an uncaught exception in certificate_mon_manager.py:do_subcloud_audit()

Expected Behavior
-----------------
We should see a traceback in the log file.

Actual Behavior
---------------
Silence: no traceback appears in the log file.

Reproducibility
---------------
Reproducible

System Configuration
--------------------
N/A

Branch/Pull Time/Commit
-----------------------
unknown

Last Pass
---------
unknown

Timestamp/Logs
--------------
2021-12-08T15:08:09.000 controller-0 cert-mon: info 2021-12-08 15:08:09.948 175173 INFO sysinv.cert_mon.service [req-6aa5fce8-6d44-4778-8a62-9912460797c9 - - - - -] subcloud845 is online. An online audit is queued
2021-12-08T15:08:09.000 controller-0 cert-mon: info 2021-12-08 15:08:09.950 175173 INFO sysinv.cert_mon.subcloud_audit_queue [req-6aa5fce8-6d44-4778-8a62-9912460797c9 - - - - -] Enqueued: SubcloudAuditData: {name: subcloud845, audit_count: 1}
2021-12-08T15:08:12.000 controller-0 cert-mon: info 175173 INFO sysinv.cert_mon.certificate_mon_manager [-] Auditing subcloud subcloud845, attempt #1 [qsize: 0]
2021-12-08T15:08:12.000 controller-0 cert-mon: err 175173 ERROR sysinv.cert_mon.utils [-] Cannot find sysinv endpoint for subcloud845
2021-12-08T15:08:12.000 controller-0 cert-mon: info 175173 INFO sysinv.cert_mon.utils [-] api_cmd http://[2620:10a:a001:a114::d00]:8119/v1.0/subclouds/subcloud845
2021-12-08T15:17:21.000 controller-0 cert-mon: info 175173 INFO sysinv.cert_mon.utils [-] api_cmd http://[2620:10a:a001:a114::d00]:8119/v1.0/subclouds/subcloud10

Alarms
------
N/A

Test Activity
-------------
System testing

Workaround
----------
N/A

Changed in starlingx:
assignee: nobody → Manoel Benedito Neto (mbenedit)
Changed in starlingx:
status: New → In Progress
description: updated
description: updated
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/c/starlingx/config/+/872700
Committed: https://opendev.org/starlingx/config/commit/51afe1ee79463ea9879380819f901520e1cf9bdd
Submitter: "Zuul (22348)"
Branch: master

commit 51afe1ee79463ea9879380819f901520e1cf9bdd
Author: Manoel Benedito Neto <email address hidden>
Date: Fri Feb 3 15:59:40 2023 -0300

    Add exception clause and logs for subcloud audit

    Added an except clause to log uncaught exceptions that may happen when
    _subcloud_audit function is called.

    Test Plan:
    PASS: In a DX DC system with AIO-SX subcloud offline, observe the log
          exception message and callstack traceback to be thrown on the
          /var/log/cert-mon.log file, as code execution proceeds to
          finally clause in do_subcloud_audit function.
    PASS: In a DX DC system with simplex subcloud online, force an invalid
          subcloud audit item to assume None value. Observe the log
          exception message and callstack traceback to be thrown in the
          /var/log/cert-mon.log file, as code execution proceeds to finally
          clause in do_subcloud_audit function.

    Closes-Bug: 2006136
    Signed-off-by: Manoel Benedito Neto <email address hidden>
    Change-Id: I3e92db11a9339a9637ea036b2c6522a47b4a803f
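
The commit message above describes an except clause around the `_subcloud_audit` call, with execution then proceeding to the finally clause of `do_subcloud_audit`. A minimal sketch of that shape follows. It is not the merged code: the function signatures, the `AuditQueueStub` class, the log message, and the logger name are all assumptions made for illustration.

```python
import logging

# Hypothetical logger name, used only for this sketch.
LOG = logging.getLogger("cert_mon_sketch")


class AuditQueueStub:
    """Hypothetical stand-in for the subcloud audit queue."""

    def __init__(self):
        self.done_count = 0

    def task_done(self):
        self.done_count += 1


def _subcloud_audit(name):
    """Hypothetical audit body that fails on an invalid (None) item."""
    if name is None:
        raise ValueError("invalid audit item")
    return "audited %s" % name


def do_subcloud_audit(queue, name):
    """Run one audit, logging any uncaught exception with its traceback."""
    try:
        return _subcloud_audit(name)
    except Exception:
        # LOG.exception records the message at ERROR level plus the traceback.
        LOG.exception("An error occurred during the subcloud audit")
        return None
    finally:
        # Runs whether or not the audit raised, so the queue is always updated.
        queue.task_done()
```

With this shape, forcing an audit item to `None` (as in the second test-plan case) would produce a logged traceback and the cleanup in the finally clause would still run.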

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Low
tags: added: stx.9.0 stx.security
Ghada Khalil (gkhalil)
tags: added: stx.config