Active certificate alarms are not cleared on the system

Bug #1978730 reported by Reinildes Oliveira
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: Reinildes Oliveira

Bug Description

*+Brief Description+*

Alarm 500.210 (oidc-auth-apps-certificate expired) is not cleared automatically when the certificate is deleted.

*+Severity+*

Major

*+Steps to Reproduce+*

1) Create the following issuers and root CA certificate

{code:yaml}
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: system-selfsigning-issuer
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: cloudplatform-rootca-certificate
  namespace: kube-system
spec:
  secretName: cloudplatform-rootca-secret
  commonName: "cloudplatform-rootca"
  isCA: true
  duration: 43800h0m0s
  renewBefore: 720h0m0s
  issuerRef:
    name: system-selfsigning-issuer
    kind: ClusterIssuer
---
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: cloudplatform-rootca-issuer
  namespace: kube-system
spec:
  ca:
    secretName: cloudplatform-rootca-secret
---
{code}
2) Request the following certificate
{code:yaml}
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: oidc-auth-apps-certificate
  namespace: kube-system
spec:
  duration: 1h
  renewBefore: 55m
  secretName: oidc-auth-apps-certificate
  dnsNames:
  - cgcs-subcloud1.eng.com
  ipAddresses:
  - 2620:10a:a001:ac00::a
  organization:
  - MY-System
  issuerRef:
    name: cloudplatform-rootca-issuer
    kind: Issuer
{code}

3) Verify that the certificate is issued
{code:java}
[sysadmin@controller-0 ~(keystone_admin)]$ kubectl get cert -n kube-system
NAME                               READY   SECRET                        AGE
cloudplatform-rootca-certificate   True    cloudplatform-rootca-secret   12s
oidc-auth-apps-certificate         True    oidc-auth-apps-certificate    22h

{code}

4) Now delete the issuer
{code:java}
kubectl delete -f issuer.yaml
{code}

5) Also delete the certificate

6) The system raises the expiring alarm (500.200), and after some time the alarm changes to the expired alarm (500.210).

I expect the active alarm audit to run every hour, only on alarms that are active in the FM system, and to clear this alarm.
I waited more than an hour, but the alarm was not cleared.

{code:java}
controller-0:~$ source /etc/platform/openrc
[sysadmin@controller-0 ~(keystone_admin)]$ fm alarm-list
+----------+-------------------------------------------------------------------------------------------------------+-----------------------------+----------+-------------------+
| Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+----------+-------------------------------------------------------------------------------------------------------+-----------------------------+----------+-------------------+
| 500.200 | Certificate namespace=kube-system, certificate=oidc-auth-apps-certificate is expiring soon on | namespace=kube-system. | major | 2022-05-25T14:00: |
| | 2022-05-25, 14:58:17 | certificate=oidc-auth-apps- | | 00.925098 |
| | | certificate | | |
| | | | | |
+----------+-------------------------------------------------------------------------------------------------------+-----------------------------+----------+-------------------+
[sysadmin@controller-0 ~(keystone_admin)]$ date
Wed May 25 14:50:43 UTC 2022
[sysadmin@controller-0 ~(keystone_admin)]$ fm alarm-list
+----------+-------------------------------------------------------------------------------------------------------+-----------------------------+----------+-------------------+
| Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+----------+-------------------------------------------------------------------------------------------------------+-----------------------------+----------+-------------------+
| 500.200 | Certificate namespace=kube-system, certificate=oidc-auth-apps-certificate is expiring soon on | namespace=kube-system. | major | 2022-05-25T14:00: |
| | 2022-05-25, 14:58:17 | certificate=oidc-auth-apps- | | 00.925098 |
| | | certificate | | |
| | | | | |
+----------+-------------------------------------------------------------------------------------------------------+-----------------------------+----------+-------------------+
[sysadmin@controller-0 ~(keystone_admin)]$ watch fm alarm-list
[sysadmin@controller-0 ~(keystone_admin)]$

{code}

*+Expected Behavior+*

The active alarm audit, which is expected to run every hour on the alarms that are active in the FM system, clears the certificate alarm after the certificate is deleted.

*+Actual Behavior+*

The alarm stays active indefinitely; it was still not cleared after waiting more than an hour.

*+Reproducibility+*

100%

*+System Configuration+*

IPv6 standard system

*+Alarms+*
{code:java}
Every 2.0s: fm alarm-list Wed May 25 15:12:18 2022

+----------+------------------------------------------------------------------------------------+-----------------------------+----------+-------------------+
| Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+----------+------------------------------------------------------------------------------------+-----------------------------+----------+-------------------+
| 500.210 | Certificate namespace=kube-system, certificate=oidc-auth-apps-certificate expired. | namespace=kube-system. | critical | 2022-05-25T14:59: |
| | | certificate=oidc-auth-apps- | | 59.536261 |
| | | certificate | | |
| | | | | |
+----------+------------------------------------------------------------------------------------+-----------------------------+----------+-------------------+
{code}

*+Test Activity+*

Regression

*+Workaround+*

Manually delete the alarms.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/config/+/845817

Changed in starlingx:
status: New → In Progress
Changed in starlingx:
assignee: nobody → Reinildes Oliveira (rjosemat)
description: updated
Ghada Khalil (gkhalil)
tags: added: stx.7.0 stx.config stx.security
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/c/starlingx/config/+/845817
Committed: https://opendev.org/starlingx/config/commit/66ac141a28c6a5a802123b7d2ed58b35e2c95ced
Submitter: "Zuul (22348)"
Branch: master

commit 66ac141a28c6a5a802123b7d2ed58b35e2c95ced
Author: Rei Oliveira <email address hidden>
Date: Tue Jun 14 17:34:02 2022 -0300

    Delete certificate alarm when secret is deleted

    For certificates stored as kubernetes tls secrets, the alarm should be
    cleared when the secret is deleted.

    This changes the audit_for_deleted_certificates function to also check
    for deleted secrets and subsequently clear the alarm and delete the
    certificate snapshot information.

    Test plan:

    PASS: Verify that when a certificate is deleted the alarm is cleared
          from the system
    PASS: Verify that deploying a soon-to-expire certificates results in
          an alarm in 'fm alarm-list'
    PASS: Verify that an existing certificate alarm is cleared up by
          renewing the certificate to get a valid certificate

    Closes-Bug: 1978730

    Signed-off-by: Rei Oliveira <email address hidden>
    Change-Id: I6ed9248766b2abbbcc616e10d4575b4ae0471c9d
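
The behaviour described in the commit message can be illustrated with a minimal sketch. This is not the code from the starlingx/config change: `AuditState`, `entity_id`, and `audit_for_deleted_secrets` are hypothetical names, and the Kubernetes and FM calls are replaced by plain in-memory sets so that the example runs on its own. It assumes each tracked certificate is keyed by an entity ID in the format seen in the `fm alarm-list` output of this report.

```python
from dataclasses import dataclass, field


def entity_id(namespace, name):
    """Build an entity ID in the style shown by `fm alarm-list` in this report."""
    return f"namespace={namespace}.certificate={name}"


@dataclass
class AuditState:
    """Hypothetical stand-in for the monitor's bookkeeping."""
    # entity ID -> (namespace, secret name) of the TLS secret being tracked
    snapshots: dict = field(default_factory=dict)
    # entity IDs that currently have an active certificate alarm
    active_alarms: set = field(default_factory=set)


def audit_for_deleted_secrets(existing_secrets, state):
    """Clear alarms and snapshots whose backing TLS secret no longer exists.

    existing_secrets: set of (namespace, secret_name) pairs still present
    in the cluster (an assumption; the real audit would query Kubernetes).
    Returns the list of entity IDs that were cleared.
    """
    cleared = []
    # Iterate over a copy because entries are deleted inside the loop.
    for eid, secret_key in list(state.snapshots.items()):
        if secret_key not in existing_secrets:
            state.active_alarms.discard(eid)  # clear the alarm
            del state.snapshots[eid]          # drop the snapshot information
            cleared.append(eid)
    return cleared


if __name__ == "__main__":
    state = AuditState()
    eid = entity_id("kube-system", "oidc-auth-apps-certificate")
    state.snapshots[eid] = ("kube-system", "oidc-auth-apps-certificate")
    state.active_alarms.add(eid)

    # The secret is gone, so the audit pass should clear the alarm.
    print(audit_for_deleted_secrets(set(), state))
    print(state.active_alarms)
```

The point of the sketch is that the audit walks the tracked snapshots rather than the objects still present in the cluster, so an alarm whose secret has disappeared is still visited and cleared on the next pass.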

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Medium