applying cert-manager failed during installation

Bug #1885295 reported by Difu Hu
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: High
Assigned to: Bob Church

Bug Description

Brief Description
-----------------
applying cert-manager failed with "Failed to copy Secret ceph-pool-kube-rbd from Namespace kube-system to Namespace cert-manager: (404)"

Severity
--------
Major

Steps to Reproduce
------------------
install controller-0
wait for cert-manager to be applied (1)
install other nodes (controller-1, compute-0, compute-1, compute-2)
wait for cert-manager to be applied (2)

Expected Behavior
------------------
Both cert-manager applies, (1) and (2), succeed

Actual Behavior
----------------
Apply (1) succeeded, but apply (2) failed

Reproducibility
---------------
Intermittent - happened 2/2 times on one specific system

System Configuration
--------------------
Lab-name: wp_3_7

Branch/Pull Time/Commit
-----------------------
2020-06-24_22-16-59

Last Pass
---------
Not sure

Timestamp/Logs
--------------
+----------+---------------------------+------------------------------+----------+----------------------------+
| Alarm ID | Reason Text               | Entity ID                    | Severity | Time Stamp                 |
+----------+---------------------------+------------------------------+----------+----------------------------+
| 750.002  | Application Apply Failure | k8s_application=cert-manager | major    | 2020-06-26T07:00:05.995388 |
+----------+---------------------------+------------------------------+----------+----------------------------+

sysinv 2020-06-26 06:59:49.938 99483 INFO sysinv.conductor.manager [-] There has been an overrides change, setting up reapply of cert-manager
sysinv 2020-06-26 06:59:55.467 99483 INFO sysinv.conductor.manager [-] Reapplying cert-manager app
sysinv 2020-06-26 06:59:55.989 99483 INFO sysinv.conductor.kube_app [-] Register the initial abort status of app cert-manager
sysinv 2020-06-26 06:59:59.150 99483 INFO sysinv.conductor.kube_app [-] Application cert-manager (20.06-4) apply started.
sysinv 2020-06-26 07:00:05.473 99483 ERROR sysinv.common.kubernetes [-] Failed to copy Secret ceph-pool-kube-rbd from Namespace kube-system to Namespace cert-manager: (404)
sysinv 2020-06-26 07:00:06.567 99483 INFO sysinv.conductor.kube_app [-] Deregister the abort status of app cert-manager

Test Activity
-------------
Regression Testing

Difu Hu (difuhu) wrote :

Manually reapplying the application succeeded:
system application-apply cert-manager

Ghada Khalil (gkhalil) wrote :

stx.4.0 / medium - intermittent issue, started happening about 2 days ago, so it could be related to recent commits in the cert-mgr helm application.

tags: added: stx.4.0 stx.containers
Changed in starlingx:
assignee: nobody → Bob Church (rchurch)
importance: Undecided → High
status: New → Triaged
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/738288

Changed in starlingx:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/738288
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=b827e856a7d9f5412e96b9cb24fc2242cd3c5add
Submitter: Zuul
Branch: master

commit b827e856a7d9f5412e96b9cb24fc2242cd3c5add
Author: Robert Church <email address hidden>
Date: Fri Jun 26 17:54:41 2020 -0400

    Skip copying the rbd provisioner secret for platform apps

    Do not attempt to copy/delete the rbd provisioner secret for the
    platform applications that do not require a persistent volume.

    Change-Id: Ic4adcae53c69703233a9c523941715166da19e3d
    Closes-Bug: #1885295
    Signed-off-by: Robert Church <email address hidden>
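
For readers who want the idea behind the commit message above in concrete form, here is a minimal sketch of "skip the secret copy for apps that do not need a persistent volume". It is not the actual sysinv change: `APPS_WITHOUT_PERSISTENT_VOLUME`, `FakeCluster`, `SecretNotFound`, `needs_rbd_secret` and `prepare_app_namespace` are hypothetical names invented for illustration, and the in-memory cluster only imitates the (404) seen in the log, assuming it means the source secret was not found.

```python
# Illustrative sketch only; none of these names come from the sysinv code base.
from dataclasses import dataclass, field

# Assumption: the set of platform apps that do not need a persistent volume.
APPS_WITHOUT_PERSISTENT_VOLUME = frozenset({"cert-manager"})

RBD_SECRET = "ceph-pool-kube-rbd"   # secret name taken from the log above
SOURCE_NAMESPACE = "kube-system"    # source namespace taken from the log above


class SecretNotFound(Exception):
    """Stands in for the (404) error reported in the log."""


@dataclass
class FakeCluster:
    """In-memory stand-in for a cluster: namespace -> set of secret names."""
    secrets: dict = field(default_factory=dict)

    def copy_secret(self, name, src_ns, dst_ns):
        # Fail the same way the log does when the source secret is absent.
        if name not in self.secrets.get(src_ns, set()):
            raise SecretNotFound(
                f"Failed to copy Secret {name} from Namespace {src_ns} "
                f"to Namespace {dst_ns}: (404)"
            )
        self.secrets.setdefault(dst_ns, set()).add(name)


def needs_rbd_secret(app_name):
    """Return True if the app is assumed to require a persistent volume."""
    return app_name not in APPS_WITHOUT_PERSISTENT_VOLUME


def prepare_app_namespace(cluster, app_name, namespace):
    """Copy the rbd provisioner secret only for apps that need it.

    Returns True if the secret was copied, False if the copy was skipped.
    """
    if not needs_rbd_secret(app_name):
        return False
    cluster.copy_secret(RBD_SECRET, SOURCE_NAMESPACE, namespace)
    return True
```

With this shape, an apply of cert-manager never reaches the copy step, so a missing source secret can no longer fail it; apps that do need the secret still surface the error.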

Changed in starlingx:
status: In Progress → Fix Released
Difu Hu (difuhu) wrote :

Verified on wp_3_7 with build 2020-06-27_00-41-42.

tags: removed: stx.retestneeded