*+Brief Description+*
Kubernetes root CA update orchestration failed to apply on the subclouds.
{code:java}
Logs from subcloud1 nfv-vim.log
2022-05-17T20:46:55.932 controller-0 VIM_Thread[1371003] INFO nfvi_infrastructure_api.py.1086 Existing Host state for controller-0 is updated-host-update-certs
2022-05-17T20:46:56.060 controller-0 VIM_Thread[1371003] ERROR Caught API exception while trying kube-rootca-update-host. error=[OpenStack Rest-API Exception: method=POST, url=https://[2620:10a:a001:ac12::42]:6386/v1/ihosts/ca8cc06a-7491-4cc0-b65c-e3f211644489/kube_update_ca , headers={'Content-Type': 'application/json', 'User-Agent': 'vim/1.0'}, body={"phase": "trust-new-ca"}, status_code=400, reason=HTTP Error 400: Bad Request, response_headers=[('Date', 'Tue, 17 May 2022 20:46:56 GMT'), ('Content-Length', '199'), ('Strict-Transport-Security', 'max-age=63072000; includeSubDomains'), ('Content-Type', 'application/json')], response_body={"error_message": "{\"debuginfo\": null, \"faultcode\": \"Client\", \"faultstring\": \"kube-rootca-host-update phase trust-new-ca rejected: failed to get new root CA cert secret
from kubernetes.\"}"}]
Traceback (most recent call last):
dcmanager kube-rootca-update-strategy show
+------------------------+----------------------------+
| Field | Value |
+------------------------+----------------------------+
| strategy type | kube-rootca-update |
| subcloud apply type | None |
| max parallel subclouds | None |
| stop on failure | False |
| state | failed |
| created_at | 2022-05-17 21:06:04.824714 |
| updated_at | 2022-05-17 21:08:37.273878 |
+------------------------+----------------------------+
[sysadmin@controller-0 ~(keystone_admin)]$
[sysadmin@controller-0 ~(keystone_admin)]$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
compute-0 Ready <none> 6d20h v1.21.8
compute-1 Ready <none> 6d20h v1.21.8
compute-2 Ready <none> 6d20h v1.21.8
compute-3 Ready <none> 6d20h v1.21.8
compute-4 Ready <none> 6d20h v1.21.8
compute-5 Ready <none> 6d20h v1.21.8
controller-0 Ready control-plane,master 6d21h v1.21.8
controller-1 Ready control-plane,master 6d21h v1.21.8
[sysadmin@controller-0 ~(keystone_admin)]$ kubectl get clusterrole
NAME CREATED AT
admin 2022-05-11T18:41:52Z
armada-api-runner 2022-05-11T18:42:12Z
calico-kube-controllers 2022-05-11T18:41:56Z
calico-node 2022-05-11T18:41:56Z
cephfs-provisioner 2022-05-11T19:36:42Z
cluster-admin 2022-05-11T18:41:52Z
cm-cert-manager-cainjector 2022-05-11T18:46:03Z
cm-cert-manager-controller-approve:cert-manager-io 2022-05-11T18:46:03Z
cm-cert-manager-controller-certificates 2022-05-11T18:46:03Z
cm-cert-manager-controller-certificatesigningrequests 2022-05-11T18:46:03Z
cm-cert-manager-controller-challenges 2022-05-11T18:46:03Z
cm-cert-manager-controller-clusterissuers 2022-05-11T18:46:03Z
cm-cert-manager-controller-ingress-shim 2022-05-11T18:46:03Z
cm-cert-manager-controller-issuers 2022-05-11T18:46:03Z
cm-cert-manager-controller-orders 2022-05-11T18:46:03Z
cm-cert-manager-edit 2022-05-11T18:46:03Z
cm-cert-manager-view 2022-05-11T18:46:03Z
cm-cert-manager-webhook:subjectaccessreviews 2022-05-11T18:46:03Z
edit 2022-05-11T18:41:52Z
ic-nginx-ingress-ingress-nginx 2022-05-11T18:45:25Z
kubeadm:get-nodes 2022-05-11T18:41:54Z
manager-role 2022-05-11T18:46:36Z
mon-elastic-services 2022-05-13T15:18:15Z
mon-filebeat-cluster-role 2022-05-13T15:38:46Z
mon-ingress-nginx 2022-05-13T15:15:28Z
mon-kube-state-metrics 2022-05-13T15:16:00Z
mon-metricbeat-cluster-role 2022-05-13T15:40:14Z
multus 2022-05-11T18:41:59Z
platform-deployment-manager-proxy-role 2022-05-11T18:46:36Z
privileged-psp-user 2022-05-11T18:42:02Z
rbd-provisioner 2022-05-11T19:36:16Z
restricted-psp-user 2022-05-11T18:42:02Z
system:aggregate-to-admin 2022-05-11T18:41:52Z
system:aggregate-to-edit 2022-05-11T18:41:52Z
system:aggregate-to-view 2022-05-11T18:41:52Z
system:auth-delegator 2022-05-11T18:41:52Z
system:basic-user 2022-05-11T18:41:52Z
system:certificates.k8s.io:certificatesigningrequests:nodeclient 2022-05-11T18:41:52Z
system:certificates.k8s.io:certificatesigningrequests:selfnodeclient 2022-05-11T18:41:52Z
system:certificates.k8s.io:kube-apiserver-client-approver 2022-05-11T18:41:52Z
system:certificates.k8s.io:kube-apiserver-client-kubelet-approver 2022-05-11T18:41:52Z
system:certificates.k8s.io:kubelet-serving-approver 2022-05-11T18:41:52Z
system:certificates.k8s.io:legacy-unknown-approver 2022-05-11T18:41:52Z
system:controller:attachdetach-controller 2022-05-11T18:41:52Z
system:controller:certificate-controller 2022-05-11T18:41:52Z
system:controller:clusterrole-aggregation-controller 2022-05-11T18:41:52Z
system:controller:cronjob-controller 2022-05-11T18:41:52Z
system:controller:daemon-set-controller 2022-05-11T18:41:52Z
system:controller:deployment-controller 2022-05-11T18:41:52Z
system:controller:disruption-controller 2022-05-11T18:41:52Z
system:controller:endpoint-controller 2022-05-11T18:41:52Z
system:controller:endpointslice-controller 2022-05-11T18:41:52Z
system:controller:endpointslicemirroring-controller 2022-05-11T18:41:52Z
system:controller:ephemeral-volume-controller 2022-05-11T18:41:52Z
system:controller:expand-controller 2022-05-11T18:41:52Z
system:controller:generic-garbage-collector 2022-05-11T18:41:52Z
system:controller:horizontal-pod-autoscaler 2022-05-11T18:41:52Z
system:controller:job-controller 2022-05-11T18:41:52Z
system:controller:namespace-controller 2022-05-11T18:41:52Z
system:controller:node-controller 2022-05-11T18:41:52Z
system:controller:persistent-volume-binder 2022-05-11T18:41:52Z
system:controller:pod-garbage-collector 2022-05-11T18:41:52Z
system:controller:pv-protection-controller 2022-05-11T18:41:52Z
system:controller:pvc-protection-controller 2022-05-11T18:41:52Z
system:controller:replicaset-controller 2022-05-11T18:41:52Z
system:controller:replication-controller 2022-05-11T18:41:52Z
system:controller:resourcequota-controller 2022-05-11T18:41:52Z
system:controller:root-ca-cert-publisher 2022-05-11T18:41:52Z
system:controller:route-controller 2022-05-11T18:41:52Z
system:controller:service-account-controller 2022-05-11T18:41:52Z
system:controller:service-controller 2022-05-11T18:41:52Z
system:controller:statefulset-controller 2022-05-11T18:41:52Z
system:controller:ttl-after-finished-controller 2022-05-11T18:41:52Z
system:controller:ttl-controller 2022-05-11T18:41:52Z
system:coredns 2022-05-11T18:41:54Z
system:discovery 2022-05-11T18:41:52Z
system:heapster 2022-05-11T18:41:52Z
system:kube-aggregator 2022-05-11T18:41:52Z
system:kube-controller-manager 2022-05-11T18:41:52Z
system:kube-dns 2022-05-11T18:41:52Z
system:kube-scheduler 2022-05-11T18:41:52Z
system:kubelet-api-admin 2022-05-11T18:41:52Z
system:monitoring 2022-05-11T18:41:52Z
system:node 2022-05-11T18:41:52Z
system:node-bootstrapper 2022-05-11T18:41:52Z
system:node-problem-detector 2022-05-11T18:41:52Z
system:node-proxier 2022-05-11T18:41:52Z
system:persistent-volume-provisioner 2022-05-11T18:41:52Z
system:public-info-viewer 2022-05-11T18:41:52Z
system:service-account-issuer-discovery 2022-05-11T18:41:52Z
system:volume-scheduler 2022-05-11T18:41:52Z
view 2022-05-11T18:41:52Z
{code}
Output from subcloud1: the kubectl get clusterrole and kubectl get role commands fail on the subclouds with an authorization error.
{code:java}
sw-manager kube-rootca-update-strategy show
Strategy Kubernetes RootCA Update Strategy:
strategy-uuid: 175e1f58-66c5-4b1a-bb89-47eabac74231
controller-apply-type: serial
storage-apply-type: parallel
worker-apply-type: parallel
max-parallel-worker-hosts: 10
default-instance-action: migrate
alarm-restrictions: relaxed
current-phase: abort
current-phase-completion: 100%
state: aborted
apply-result: failed
apply-reason: remote error: ApiException (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'Content-Length': '256', 'X-Content-Type-Options': 'nosniff', 'X-Kubernetes-Pf-Prioritylevel-Uid': 'c199c7aa-c5a4-48be-a84a-43fbc163f6fc', 'Cache-Control': 'no-cache, private', 'Date': 'Tue, 17 May 2022 21:07:23 GMT', 'X-Kubernetes-Pf-Flowschema-Uid': 'ef9c6975-7b47-4e6e-adef-c93f7b7de4b7', 'Content-Type': 'application/json'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"nodes is forbidden: User \"kubernetes-admin\" cannot list resource \"nodes\" in API group \"\" at the cluster scope","reason":"Forbidden","details":{"kind":"nodes"},"code":403}
[u'Traceback (most recent call last):\n', u'  File "/usr/lib64/python2.7/site-packages/sysinv/openstack/common/rpc/amqp.py", line 436, in _process_data\n    **args)\n', u'  File "/usr/lib64/python2.7/site-packages/sysinv/openstack/common/rpc/dispatcher.py", line 172, in dispatch\n    result = getattr(proxyobj, method)(ctxt, **kwargs)\n', u'  File "/usr/lib64/python2.7/site-packages/sysinv/conductor/manager.py", line 12088, in get_system_health\n    alarm_ignore_list=alarm_ignore_list)\n', u'  File "/usr/lib64/python2.7/site-packages/sysinv/common/health.py", line 525, in get_system_health_kube_upgrade\n    alarm_ignore_list=alarm_ignore_list)\n', u'  File "/usr/lib64/python2.7/site-packages/sysinv/common/health.py", line 385, in get_system_health\n    success, error_nodes = self._check_kube_nodes_ready()\n', u'  File "/usr/lib64/python2.7/site-packages/sysinv/common/health.py", line 217, in _check_kube_nodes_ready\n    nodes = self._kube_operator.kube_get_nodes()\n', u'  File "/usr/lib64/python2.7/site-packages/sysinv/common/kubernetes.py", line 296, in kube_get_nodes\n    api_response = self._get_kubernetesclient_core().list_node()\n', u'  File "/usr/lib/python2.7/site-packages/kubernetes/client/apis/core_v1_api.py", line 13437, in list_node\n    (data) = self.list_node_with_http_info(**kwargs)\n', u'  File "/usr/lib/python2.7/site-packages/kubernetes/client/apis/core_v1_api.py", line 13534, in list_node_with_http_info\n    collection_formats=collection_formats)\n', u'  File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 321, in call_api\n    _return_http_data_only, collection_formats, _preload_content, _request_timeout)\n', u'  File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 155, in __call_api\n    _request_timeout=_request_timeout)\n', u'  File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 342, in request\n    headers=headers)\n', u'  File "/usr/lib/python2.7/site-packages/kubernetes/client/rest.py", line 231, in GET\n    query_params=query_params)\n', u'  File "/usr/lib/python2.7/site-packages/kubernetes/client/rest.py", line 222, in request\n    raise ApiException(http_resp=r)\n', u'ApiException: (403)\nReason: Forbidden\nHTTP response headers: HTTPHeaderDict({\'Content-Length\': \'256\', \'X-Content-Type-Options\': \'nosniff\', \'X-Kubernetes-Pf-Prioritylevel-Uid\': \'c199c7aa-c5a4-48be-a84a-43fbc163f6fc\', \'Cache-Control\': \'no-cache, private\', \'Date\': \'Tue, 17 May 2022 21:07:23 GMT\', \'X-Kubernetes-Pf-Flowschema-Uid\': \'ef9c6975-7b47-4e6e-adef-c93f7b7de4b7\', \'Content-Type\': \'application/json\'})\nHTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"nodes is forbidden: User \\"kubernetes-admin\\" cannot list resource \\"nodes\\" in API group \\"\\" at the cluster scope","reason":"Forbidden","details":{"kind":"nodes"},"code":403}\n\n\n']
abort-result: success
abort-reason:
kubectl get nodes
Error from server (Forbidden): nodes is forbidden: User "kubernetes-admin" cannot list resource "nodes" in API group "" at the cluster scope
[sysadmin@controller-0 ~(keystone_admin)]$ kubectl get clusterrole
Error from server (Forbidden): clusterroles.rbac.authorization.k8s.io is forbidden: User "kubernetes-admin" cannot list resource "clusterroles" in API group "rbac.authorization.k8s.io" at the cluster scope
{code}
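When kubectl itself is locked out with a 403 as above, the client certificate embedded in the kubeconfig can still be inspected offline to see whether the kubernetes-admin identity carries the system:masters organization that the cluster-admin clusterrolebinding grants access to. The sketch below is illustrative, not the StarlingX procedure: it generates a stand-in certificate so the pipeline is self-contained, and the admin.conf path in the trailing comment is an assumption about a standard kubeadm layout.

```shell
# Generate a stand-in client certificate with the subject a healthy
# kubernetes-admin credential carries (O = system:masters).
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -keyout /tmp/admin.key -out /tmp/admin.crt \
  -subj "/O=system:masters/CN=kubernetes-admin"

# Inspect the subject. A healthy credential shows the system:masters
# organization; on a subcloud hit by this bug the organization entry is
# missing, the cluster-admin binding no longer matches, and kubectl gets 403.
openssl x509 -in /tmp/admin.crt -noout -subject

# Against a real kubeconfig the equivalent check would be (path assumed):
#   grep client-certificate-data /etc/kubernetes/admin.conf | awk '{print $2}' \
#     | base64 -d | openssl x509 -noout -subject
```

Kubernetes maps the certificate's CN to the user name and each O entry to a group, which is why a dropped organization silently strips cluster-admin rights.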
*+Severity+*
Minor: the kube-rootca update strategy failed to apply.
*+Steps to Reproduce+*
1. Install 250 AWS subclouds
2. dcmanager kube-rootca-update-strategy create --expiry-date 2030-01-01 --max-parallel-subclouds 250 --force
3. dcmanager kube-rootca-update-strategy apply
*+Expected Behavior+*
Kube rootca update strategy successfully applied
*+Actual Behavior+*
Kube rootca update strategy failed to apply
*+Reproducibility+*
Reproducible
Reviewed: https://review.opendev.org/c/starlingx/config/+/845490
Committed: https://opendev.org/starlingx/config/commit/144f6fc9c5d81217ae4887711ef36236215e9426
Submitter: "Zuul (22348)"
Branch: master

commit 144f6fc9c5d81217ae4887711ef36236215e9426
Author: Kaustubh Dhokte <email address hidden>
Date: Fri Jun 10 20:30:49 2022 -0400

Update certs spec to work with version v1

The change https://review.opendev.org/c/starlingx/config/+/838594
updated the certificate api-version from cert-manager.io/v1alpha2 to
cert-manager.io/v1, but did not make the necessary changes to
certificate specs to work with the new version.

This change makes only the required changes to certificate specs to
work with the new version: cert-manager.io/v1

The spec organization[] should now be subject: organizations[].
See the difference here,
https://cert-manager.io/v0.13-docs/reference/api-docs/#cert-manager.io/v1alpha2.Certificate
and
https://cert-manager.io/docs/reference/api-docs/#cert-manager.io/v1.CertificateSpec

The organization 'system:masters' in the admin.conf certificate is
required to authorize the access for kubernetes-admin to cluster
objects. This authorization is specified in the 'cluster-admin'
clusterrole binding. Without this change, all kubectl commands fail.

In v1, unlike in v1alpha2, CN is ignored by TLS clients during
authorization
(https://cert-manager.io/docs/reference/api-docs/#cert-manager.io/v1.CertificateSpec)
if any subject alt name is set. My initial understanding here was that
the CN field value is being ignored due to
subject: organizations: ['system:masters'] (in v1), as all the deployment
and daemonset pods were failing after "system kube-rootca-pods-update
--phase=trust-new-ca" (during rootCA update) with an authorization error
for the user 'kube-apiserver-kubelet-client'.
This forces the removal of organizations from the apiserver kubelet
client certificate, as all deployments and daemonset pods authenticate
and authorize with the 'kube-apiserver-kubelet-client' user.
Without 'system:nodes' in the kubelet client certificate,
kube-scheduler and kube-controller-manager fail to authorize.
More Info: https://kubernetes.io/docs/reference/access-authn-authz/node/

Test Plan:
On CentOS AIO-SX:
PASS: Manual kubernetes RootCA update successful
PASS: Orchestrated kubernetes RootCA update successful.
PASS: All deployments, daemonsets and pods running as expected after
RootCA update.

Closes-Bug: 1978365
Signed-off-by: Kaustubh Dhokte <email address hidden>
Change-Id: I767a70a07ab540510e4eb734cb4e282c9918840c
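The spec change the commit describes (organization[] moving under subject: organizations[]) can be sketched with a minimal Certificate fragment. The names and secretName below are illustrative, not the actual StarlingX manifests; only the field layout follows the cert-manager api-docs for the two versions.

```yaml
# cert-manager.io/v1alpha2: organizations sat directly under spec
apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: kubernetes-admin          # illustrative name
spec:
  commonName: kubernetes-admin
  organization:
    - system:masters
  secretName: kubernetes-admin-cert   # illustrative secret name
---
# cert-manager.io/v1: the same data moves under spec.subject.organizations
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: kubernetes-admin
spec:
  commonName: kubernetes-admin
  subject:
    organizations:
      - system:masters
  secretName: kubernetes-admin-cert
```

With the v1alpha2 layout applied against a v1 API, the unrecognized organization field is dropped, so the issued certificate loses its O entries, which is exactly the failure mode the commit fixes.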