"280.003 Subcloud Backup Failure" alarm is not removed after unmanaging/deleting subcloud

Bug #2004638 reported by Christopher de Oliveira Souza
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: Low
Assigned to: Christopher de Oliveira Souza
Milestone: (none listed)

Bug Description

Brief Description
---------------------
When the subcloud backup fails, the 280.003 alarm is not removed after unmanaging/deleting the subcloud.

Severity
---------------------
Minor.

Steps to Reproduce
---------------------
    1. Create a subcloud backup:
       dcmanager subcloud-backup create --subcloud subcloud1 --sysadmin-password Li69nux*
    2. Watch the Ansible logs and power off (or sudo reboot) the subcloud once the following task is reached: "Run subcloud1 backup playbook"
    3. The backup fails because it cannot connect to the subcloud (this takes a few minutes):
       'Failed to connect to the host via ssh: ssh: connect to host 2620:10a:a001:a103::1016 port 22: No route to host'
    4. Check that the backup status goes to the 'failed' state.
    5. Check for the 280.003 alarm using 'fm alarm-list'.
    6. Unmanage and delete the subcloud.
    7. Wait some time and re-check 'fm alarm-list'. The alarm is still present even after the subcloud has been deleted.
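
The two 'fm alarm-list' checks in the steps above could be scripted roughly as follows. This is only a sketch: the function name has_backup_failure_alarm is hypothetical, and the column order (alarm ID, reason text, entity instance ID, severity, timestamp) is assumed from the alarm row quoted under Timestamp/Logs. The function takes already-captured 'fm alarm-list' text and does not run the command itself.

```python
ALARM_ID = "280.003"  # Subcloud Backup Failure, as named in this report


def has_backup_failure_alarm(alarm_list_output: str, subcloud: str) -> bool:
    """Return True if a 280.003 alarm for the given subcloud appears in the text.

    Assumes pipe-separated rows with the alarm ID in the first cell and the
    entity instance ID (e.g. "subcloud=subcloud1") in the third cell.
    """
    entity = f"subcloud={subcloud}"
    for line in alarm_list_output.splitlines():
        # Split on the column separator and drop the empty cells produced
        # by leading/trailing table borders, if any.
        cells = [cell.strip() for cell in line.split("|")]
        cells = [cell for cell in cells if cell]
        if len(cells) >= 3 and cells[0] == ALARM_ID and cells[2] == entity:
            return True
    return False


if __name__ == "__main__":
    sample = (
        "280.003 | Subcloud Backup Failure (subcloud=subcloud1) | "
        "subcloud=subcloud1 | minor | 2022-12-28T14:17:12.924253"
    )
    print(has_backup_failure_alarm(sample, "subcloud1"))
```

Run against output captured before and after the delete, a True result in the second check would correspond to the behavior reported here.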

Expected Behavior
----------------------
The alarm is cleared after the subcloud is deleted.

Actual Behavior
----------------------
The alarm is still present after deleting the subcloud.

Reproducibility
----------------------
1 out of 1.

System Configuration
----------------------
Distributed Cloud

Load info (eg: 2022-03-10_20-00-07)
---------------------
22.12 - 2022-12-18_09-30-35

Last Pass
---------------------
NA

Timestamp/Logs
----------------------
// Alarm
280.003 | Subcloud Backup Failure (subcloud=subcloud1) | subcloud=subcloud1 | minor | 2022-12-28T14:17:12.924253

Alarms
---------------------
NA

Test Activity
---------------------
Feature testing

Workaround
---------------------
NA.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to distcloud (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/distcloud/+/872643

Changed in starlingx:
status: New → In Progress
Changed in starlingx:
assignee: nobody → Christopher de Oliveira Souza (cdeolive)
OpenStack Infra (hudson-openstack) wrote : Fix merged to distcloud (master)

Reviewed: https://review.opendev.org/c/starlingx/distcloud/+/872643
Committed: https://opendev.org/starlingx/distcloud/commit/591a06e858526f23ccd85e5a684903d0b0e6d156
Submitter: "Zuul (22348)"
Branch: master

commit 591a06e858526f23ccd85e5a684903d0b0e6d156
Author: Christopher Souza <email address hidden>
Date: Fri Feb 3 07:57:07 2023 -0300

    Clear 280.003 alarm on subcloud deletion

    When a subcloud was deleted, the 280.003 alarm was not cleared.
    The change was to add the 280.003 alarm to the list of alarms that are checked.

    Test Plan:

    PASS: Install and bootstrap DC.
    PASS: create a subcloud backup and power off the subcloud after reaching:
    "Run subcloud1 backup playbook".
    PASS: Wait for the alarm 280.003 to be set, then unmanage and
    delete the subcloud and check if the 280.003 alarm was cleared.

    Closes-bug: 2004638

    Signed-off-by: Christopher Souza <email address hidden>
    Change-Id: I70458b1fe6da4b9c0bf458ee57720a4ffaeafbfb
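
The idea described in the commit message, clearing every alarm in a fixed list when a subcloud is deleted, can be illustrated with a small self-contained sketch. This is not the distcloud code: FakeAlarmStore, clear_subcloud_alarms and SUBCLOUD_ALARM_IDS are hypothetical names, the store is an in-memory stand-in for the fault manager, and the IDs other than 280.003 are assumed purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FakeAlarmStore:
    """Hypothetical in-memory stand-in for the fault manager."""

    active: set = field(default_factory=set)

    def raise_alarm(self, alarm_id: str, entity: str) -> None:
        self.active.add((alarm_id, entity))

    def clear_alarm(self, alarm_id: str, entity: str) -> None:
        # discard() is a no-op when the alarm is not active.
        self.active.discard((alarm_id, entity))


# Alarm IDs cleared on subcloud deletion. Only 280.003 comes from this
# report; the other two entries are assumptions for the example.
SUBCLOUD_ALARM_IDS = ("280.001", "280.002", "280.003")


def clear_subcloud_alarms(store, subcloud_name, alarm_ids=SUBCLOUD_ALARM_IDS):
    """Clear each listed alarm raised against the given subcloud."""
    entity = f"subcloud={subcloud_name}"
    for alarm_id in alarm_ids:
        store.clear_alarm(alarm_id, entity)


if __name__ == "__main__":
    store = FakeAlarmStore()
    store.raise_alarm("280.003", "subcloud=subcloud1")

    # A list without 280.003 leaves the backup failure alarm active.
    clear_subcloud_alarms(store, "subcloud1", alarm_ids=("280.001", "280.002"))
    print(("280.003", "subcloud=subcloud1") in store.active)

    # With 280.003 in the list, the alarm is cleared.
    clear_subcloud_alarms(store, "subcloud1")
    print(("280.003", "subcloud=subcloud1") in store.active)
```

In this sketch, an alarm ID missing from the list is silently left active, which matches the symptom in this report.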

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Low
tags: added: stx.8.0 stx.distcloud