cert-manager failed to override,apply on post-install
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | Medium | Sabeel Ansari | |
Bug Description
Brief Description
-----------------
After installation, applying a cert-manager (cm) override with the following values fails on standard and duplex systems:
replicaCount: 2
podLabels:
  test: pv
Overriding with only podLabels also fails, on simplex, duplex, and standard systems.
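The nesting of the override values matters: `test: pv` is a child of `podLabels`, while `replicaCount` is a top-level key. A minimal sketch that renders these values as YAML text follows. The helper name `render_overrides` is hypothetical and not part of any StarlingX tooling; it only handles nested dicts of scalars.

```python
def render_overrides(values, indent=0):
    """Render a nested dict of scalars as block-style YAML text.

    Hypothetical helper for illustration only; it does not quote
    strings or handle lists.
    """
    pad = "  " * indent
    lines = []
    for key, value in values.items():
        if isinstance(value, dict):
            # A mapping becomes "key:" followed by its indented children.
            lines.append(f"{pad}{key}:")
            lines.append(render_overrides(value, indent + 1))
        else:
            lines.append(f"{pad}{key}: {value}")
    return "\n".join(lines)


# The override values quoted in this report.
overrides = {"replicaCount": 2, "podLabels": {"test": "pv"}}
text = render_overrides(overrides)
print(text)
```

The printed text can be saved to a values file and passed to the override command used in the reproduction steps.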
Severity
--------
Major
Steps to Reproduce
------------------
1) After installation, override the cm with the following values:
[sysadmin@
+------
| Property | Value |
+------
| name | cert-manager |
| namespace | cert-manager |
| user_overrides | podLabels: |
| | test: pv |
| | replicaCount: 2 |
| | |
+------
2) Apply the application:
[sysadmin@
+------
| Property | Value |
+------
| active | True |
| app_version | 1.0-0 |
| created_at | 2020-05-
| manifest_file | certmanager-
| manifest_name | cert-manager-
| name | cert-manager |
| progress | None |
| status | applying |
| updated_at | 2020-05-
+------
Please use 'system application-list' or 'system application-show cert-manager' to view the current progress.
3) The controller pod is scaled onto the standby controller, but an extra pod remains in the Pending state with the following error, which eventually causes the application apply to fail:
[sysadmin@
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
cm-cert-
cm-cert-
cm-cert-
cm-cert-
cm-cert-
[sysadmin@
Name: cm-cert-
Namespace: cert-manager
Priority: 0
Node: <none>
Labels: app=cert-manager
Annotations: prometheus.io/path: /metrics
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/
Containers:
cert-manager:
Image: registry.
Port: 9402/TCP
Host Port: 0/TCP
Args:
--v=2
-
-
-
Environment:
POD_
Mounts:
/
Conditions:
Type Status
PodScheduled False
Volumes:
cm-cert-
Type: Secret (a volume populated by a Secret)
SecretName: cm-cert-
Optional: false
QoS Class: BestEffort
Node-Selectors: node-role.
Tolerations: node.kubernetes
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 27s (x10 over 8m3s) default-scheduler 0/2 nodes are available: 2 node(s) didn't match pod affinity/
[sysadmin@
4) Overriding with only podLabels (no replicaCount) was also tried on a simplex subcloud in a DC (Distributed Cloud) system. The new pods get stuck in the Pending state and the override apply eventually fails:
cat cm_values_
podLabels:
  test: pv
[sysadmin@
Name: cm-cert-
Namespace: cert-manager
Priority: 0
Node: <none>
Labels: app=cert-manager
Annotations: prometheus.io/path: /metrics
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/
Containers:
cert-manager:
Image: registry.
Port: 9402/TCP
Host Port: 0/TCP
Args:
--v=2
-
-
-
Environment:
POD_
Mounts:
/
Conditions:
Type Status
PodScheduled False
Volumes:
cm-cert-
Type: Secret (a volume populated by a Secret)
SecretName: cm-cert-
Optional: false
QoS Class: BestEffort
Node-Selectors: node-role.
Tolerations: node.kubernetes
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling <unknown> default-scheduler 0/1 nodes are available: 1 node(s) didn't match pod affinity/
Warning FailedScheduling <unknown> default-scheduler 0/1 nodes are available: 1 node(s) didn't match pod affinity/
[sysadmin@
NAME STATUS ROLES AGE VERSION
controller-0 Ready master 149m v1.18.1
[sysadmin@
NAME READY STATUS RESTARTS AGE
cm-cert-
cm-cert-
cm-cert-
cm-cert-
[sysadmin@
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
cm-cert-
cm-cert-
cm-cert-
cm-cert-
[sysadmin@
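The FailedScheduling message in the steps above is truncated in this report, so the exact predicate that failed is not known. One possible reading, stated here purely as an assumption, is that the pods carry a required anti-affinity rule keyed on the `app=cert-manager` label, so that a new pod cannot be placed on a node that still hosts a pod from the previous ReplicaSet. The toy model below illustrates that reading only; it is not the Kubernetes scheduler, and the pod names and `controller-1` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pod:
    name: str
    labels: dict
    node: Optional[str] = None  # None models a Pending pod


def eligible_nodes(nodes, pods, anti_affinity_labels):
    """Return nodes hosting no placed pod that matches all the given labels.

    Assumption: a required, per-node anti-affinity rule. This is a toy
    model, not the real scheduler logic.
    """
    blocked = {
        p.node
        for p in pods
        if p.node is not None
        and all(p.labels.get(k) == v for k, v in anti_affinity_labels.items())
    }
    return [n for n in nodes if n not in blocked]


# Duplex case: both controllers already host a pod from the old ReplicaSet.
nodes = ["controller-0", "controller-1"]
old_pods = [
    Pod("old-pod-a", {"app": "cert-manager"}, "controller-0"),
    Pod("old-pod-b", {"app": "cert-manager"}, "controller-1"),
]
print(eligible_nodes(nodes, old_pods, {"app": "cert-manager"}))  # []
```

Under this assumption a pod from the new ReplicaSet has no eligible node until an old pod is removed, which would be consistent with the "0/2 nodes are available" and "0/1 nodes are available" events, but the report itself does not confirm the cause.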
Expected Behavior
------------------
The cm controller pods should be scaled across both controller nodes without any errors.
Actual Behavior
----------------
An extra pod stays in the Pending state with a FailedScheduling error, and the application apply eventually fails.
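To spot the stuck pod without reading the full listing, the JSON form of the pod list can be filtered by phase. The sketch below assumes the standard `kubectl get pods -n cert-manager -o json` output shape (`items[].metadata.name`, `items[].status.phase`). The sample pod names are hypothetical placeholders, since the real names are truncated in this report.

```python
import json


def pending_pods(pod_list):
    """Return the names of pods whose status.phase is 'Pending'."""
    return [
        item["metadata"]["name"]
        for item in pod_list.get("items", [])
        if item.get("status", {}).get("phase") == "Pending"
    ]


# Hypothetical sample in the shape produced by `kubectl get pods -o json`.
sample = json.loads("""
{
  "items": [
    {"metadata": {"name": "cm-example-a"}, "status": {"phase": "Running"}},
    {"metadata": {"name": "cm-example-b"}, "status": {"phase": "Pending"}}
  ]
}
""")
print(pending_pods(sample))  # ['cm-example-b']
```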
Reproducibility
---------------
100%
System Configuration
-------
Duplex system
Branch/Pull Time/Commit
-------
2020-04-28
Last Pass
---------
NA
Timestamp/Logs
--------------
2020-05-
Test Activity
-------------
Feature testing
Workaround
----------
Remove and delete the application, then apply it again with the chart's default values.
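The workaround sequence can be sketched as a list of `system` CLI invocations. The report does not show the exact commands, so this is an assumption based on the `system application-*` subcommands. The optional upload step and its `tarball` argument are also assumptions: the report does not say whether a re-upload is needed between delete and apply, and no tarball path is given.

```python
import shlex


def workaround_commands(app="cert-manager", tarball=None):
    """Build the remove / delete / apply command sequence as argv lists.

    `tarball` is a hypothetical, caller-supplied path; when given, an
    upload step is inserted before the apply.
    """
    commands = [
        ["system", "application-remove", app],
        ["system", "application-delete", app],
    ]
    if tarball is not None:
        commands.append(["system", "application-upload", tarball])
    commands.append(["system", "application-apply", app])
    return commands


# Dry run: print the commands instead of executing them.
for argv in workaround_commands():
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Printing rather than executing keeps the sketch safe to run anywhere; on a real controller each command would need to finish (see 'system application-list') before the next one is issued.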
description: | updated |
summary:
- cert-manager failed to override,apply on duplex,standard system with replicaCount 2
+ cert-manager failed to override,apply on post-install
Changed in starlingx:
assignee: Ghada Khalil (gkhalil) → Sabeel Ansari (sansariwr)
Marking as stx.4.0 for now, pending further investigation. The issue is related to the stx.4.0 cert-mgr feature.
Need clarification from Greg Waines on which user scenario would require this override. I had assumed that replicaCount would be adjusted by the system based on the deployment config (SX vs. DX).