Comment 0 for bug 1876328

ayyappa (mantri425) wrote: cert-manager fails to apply overrides on duplex and standard systems with replicaCount 2

Brief Description
-----------------
After installation, applying cert-manager (cm) with the following override values fails on standard and duplex systems:

replicaCount: 2
podLabels:
 test: pv

Severity
--------
Major

Steps to Reproduce
------------------
1) After installation, override cm with the following values
[sysadmin@controller-0 ~(keystone_admin)]$ system helm-override-update --values cm_values.yaml cert-manager cert-manager cert-manager
+----------------+-----------------+
| Property       | Value           |
+----------------+-----------------+
| name           | cert-manager    |
| namespace      | cert-manager    |
| user_overrides | podLabels:      |
|                |   test: pv      |
|                | replicaCount: 2 |
|                |                 |
+----------------+-----------------+

2) Apply the application
[sysadmin@controller-0 ~(keystone_admin)]$ system application-apply cert-manager
+---------------+----------------------------------+
| Property      | Value                            |
+---------------+----------------------------------+
| active        | True                             |
| app_version   | 1.0-0                            |
| created_at    | 2020-05-01T14:17:40.817460+00:00 |
| manifest_file | certmanager-manifest.yaml        |
| manifest_name | cert-manager-manifest            |
| name          | cert-manager                     |
| progress      | None                             |
| status        | applying                         |
| updated_at    | 2020-05-01T14:18:31.900064+00:00 |
+---------------+----------------------------------+
Please use 'system application-list' or 'system application-show cert-manager' to view the current progress.

3) A second controller pod is scheduled on the standby controller, but an extra pod stays in the Pending state with the following error, which eventually causes the application apply to fail

[sysadmin@controller-0 ~(keystone_admin)]$ kubectl get pods -n cert-manager -o wide
NAME                                          READY   STATUS    RESTARTS   AGE   IP               NODE           NOMINATED NODE   READINESS GATES
cm-cert-manager-5ff78759f-5qm4j               0/1     Pending   0          84s   <none>           <none>         <none>           <none>
cm-cert-manager-7b8b94bf9f-27425              1/1     Running   0          84s   172.16.166.132   controller-1   <none>           <none>
cm-cert-manager-7b8b94bf9f-cfwr8              1/1     Running   1          57m   172.16.192.75    controller-0   <none>           <none>
cm-cert-manager-cainjector-56b68989b5-xx4ln   1/1     Running   1          57m   172.16.192.77    controller-0   <none>           <none>
cm-cert-manager-webhook-7d5c897795-p6d64      1/1     Running   1          57m   172.16.192.76    controller-0   <none>           <none>

[sysadmin@controller-0 ~(keystone_admin)]$ kubectl describe pod cm-cert-manager-5ff78759f-5qm4j -n cert-manager
Name:           cm-cert-manager-5ff78759f-5qm4j
Namespace:      cert-manager
Priority:       0
Node:           <none>
Labels:         app=cert-manager
                app.kubernetes.io/component=controller
                app.kubernetes.io/instance=cm-cert-manager
                app.kubernetes.io/managed-by=Tiller
                app.kubernetes.io/name=cert-manager
                helm.sh/chart=cert-manager-v0.1.0
                pod-template-hash=5ff78759f
                test=pv
Annotations:    prometheus.io/path: /metrics
                prometheus.io/port: 9402
                prometheus.io/scrape: true
Status:         Pending
IP:
IPs:            <none>
Controlled By:  ReplicaSet/cm-cert-manager-5ff78759f
Containers:
  cert-manager:
    Image:      registry.local:9001/quay.io/jetstack/cert-manager-controller:v0.15.0-alpha.1
    Port:       9402/TCP
    Host Port:  0/TCP
    Args:
      --v=2
      --cluster-resource-namespace=$(POD_NAMESPACE)
      --leader-election-namespace=kube-system
      --acme-http01-solver-image=registry.local:9001/quay.io/jetstack/cert-manager-acmesolver:v0.15.0-alpha.1
    Environment:
      POD_NAMESPACE:  cert-manager (v1:metadata.namespace)
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from cm-cert-manager-token-hfhnn (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  cm-cert-manager-token-hfhnn:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cm-cert-manager-token-hfhnn
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  node-role.kubernetes.io/master=
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 30s
                 node.kubernetes.io/unreachable:NoExecute for 30s
Events:
  Type     Reason            Age                  From               Message
  ----     ------            ----                 ----               -------
  Warning  FailedScheduling  27s (x10 over 8m3s)  default-scheduler  0/2 nodes are available: 2 node(s) didn't match pod affinity/anti-affinity, 2 node(s) didn't satisfy existing pods anti-affinity rules.
[sysadmin@controller-0 ~(keystone_admin)]$

Expected Behavior
------------------
The cm controller pods should scale across the controller nodes without any errors

Actual Behavior
----------------
An extra controller pod stays in the Pending state with a FailedScheduling (anti-affinity) error, and the application apply eventually fails
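
The scheduling failure can be illustrated with a small model. This is only a sketch under assumptions, not the chart's actual rule: it assumes a required pod anti-affinity keyed on the `app` label (the report shows the scheduler message but not the affinity spec), and it assumes the Pending pod is a surge pod from a rolling update, inferred from the two ReplicaSet hashes in the pod list. `Node` and `schedulable_nodes` are hypothetical names used for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    # Label dicts of the pods already running on this node.
    pods: list = field(default_factory=list)


def schedulable_nodes(nodes, pod_labels, anti_affinity_key="app"):
    """Return the names of nodes that hold no pod with the same value
    for the anti-affinity label (assumed key: "app")."""
    wanted = pod_labels[anti_affinity_key]
    return [
        n.name
        for n in nodes
        if all(p.get(anti_affinity_key) != wanted for p in n.pods)
    ]


# State from the `kubectl get pods` output: one running controller pod
# from the old ReplicaSet on each controller node.
nodes = [
    Node("controller-0", [{"app": "cert-manager"}]),
    Node("controller-1", [{"app": "cert-manager"}]),
]

# Pod from the new ReplicaSet, carrying the overridden test=pv label.
new_pod = {"app": "cert-manager", "test": "pv"}

candidates = schedulable_nodes(nodes, new_pod)
print(candidates or "0/2 nodes are available")
```

With two nodes and two replicas already running, every node holds a matching pod, so the model finds no candidate for a third pod. If this is what happens on the system, any rollout that creates the new pod before removing an old one would stall in the same way.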

Reproducibility
---------------
100%

System Configuration
--------------------

Duplex system, wc_61_62_ipv4

Branch/Pull Time/Commit
-----------------------
2020-04-25

Last Pass
---------
2020-02-24
On this load the override values did not take effect, but the apply was not rejected

Timestamp/Logs
--------------
2020-04-28 10:03:51,363

Test Activity
-------------
Feature testing

Workaround
----------
Remove and delete the application, then apply it again with the default chart values