kubernetes apiserver certificate needs rotation

Bug #1838659 reported by David Sullivan
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: High
Assigned to: Mingyuan Qi

Bug Description

Brief Description
-----------------
When the apiserver/apiserver-kubelet-client certificates expire, access to the Kubernetes API server is lost.

Severity
--------
Critical

Steps to Reproduce
------------------
Install and configure an AIO-SX system
Verify the expiry of the apiserver certificate with `openssl x509 -noout -text -in /etc/kubernetes/pki/apiserver.crt`
Set the system date to a time later than the certificate expiry

Expected Behavior
-----------------
A new certificate should be generated some time before the current one expires.

Actual Behavior
---------------
The kubelet cannot connect to the apiserver because the certificate is no longer valid.

Reproducibility
---------------
100%

System Configuration
--------------------
All systems

Branch/Pull Time/Commit
-----------------------
20190728T233000Z

Last Pass
---------
NA

Timestamp/Logs
--------------
controller-0:/home/sysadmin# openssl x509 -text -noout -in /etc/kubernetes/pki/apiserver.crt
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 5878483830693726211 (0x519491e602608803)
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: CN=kubernetes
        Validity
            Not Before: Jul 25 21:06:05 2019 GMT
            Not After : Jul 24 21:06:05 2020 GMT
        Subject: CN=kube-apiserver
...

controller-0:/home/sysadmin# date
Fri Jul 24 21:00:02 UTC 2020
controller-0:/home/sysadmin# date
Fri Jul 24 21:07:18 UTC 2020
controller-0:/home/sysadmin# kubectl get pods -n kube-system
Unable to connect to the server: x509: certificate has expired or is not yet valid
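
The log above can be checked mechanically. The following is a minimal sketch, not part of the original report, that parses the `Not After` line from `openssl x509 -text` output and compares it with a given time. The names `parse_not_after` and `is_expired` are hypothetical, and the sample text is copied from the log above.

```python
from datetime import datetime, timezone

# Excerpt of the `openssl x509 -text -noout` output shown in the log above.
SAMPLE_OUTPUT = """\
        Validity
            Not Before: Jul 25 21:06:05 2019 GMT
            Not After : Jul 24 21:06:05 2020 GMT
        Subject: CN=kube-apiserver
"""


def parse_not_after(openssl_text: str) -> datetime:
    """Return the 'Not After' timestamp as a timezone-aware UTC datetime."""
    for line in openssl_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Not After"):
            # Line format: "Not After : Jul 24 21:06:05 2020 GMT"
            value = stripped.split(":", 1)[1].strip()
            parsed = datetime.strptime(value, "%b %d %H:%M:%S %Y GMT")
            return parsed.replace(tzinfo=timezone.utc)
    raise ValueError("no 'Not After' line found")


def is_expired(not_after: datetime, now: datetime) -> bool:
    """A certificate is no longer valid once 'now' is past 'Not After'."""
    return now > not_after


if __name__ == "__main__":
    expiry = parse_not_after(SAMPLE_OUTPUT)
    # The two `date` readings from the log above.
    before = datetime(2020, 7, 24, 21, 0, 2, tzinfo=timezone.utc)
    after = datetime(2020, 7, 24, 21, 7, 18, tzinfo=timezone.utc)
    print(is_expired(expiry, before), is_expired(expiry, after))
```

With the two `date` readings from the log, the first falls before the expiry and the second after it, which matches the point at which `kubectl` starts failing.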

Test Activity
-------------
Developer Testing

Frank Miller (sensfan22)
tags: added: stx.2.0 stx.config stx.containers
Changed in starlingx:
status: New → Triaged
importance: Undecided → High
Cindy Xie (xxie1)
Changed in starlingx:
assignee: nobody → Mingyuan Qi (myqi)
Mingyuan Qi (myqi) wrote :

Successfully updated the apiserver, controller-manager, scheduler, kubelet, and kubectl certificates manually. Now working out an approach to detect certificate expiration automatically.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-puppet (master)

Fix proposed to branch: master
Review: https://review.opendev.org/692276

Changed in starlingx:
status: Triaged → In Progress
Ghada Khalil (gkhalil) wrote :

Adding the stx.3.0 release tag explicitly, as this fix is needed for that release as well.

tags: added: stx.3.0
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fault (master)

Fix proposed to branch: master
Review: https://review.opendev.org/696224

OpenStack Infra (hudson-openstack) wrote : Fix merged to fault (master)

Reviewed: https://review.opendev.org/696224
Committed: https://git.openstack.org/cgit/starlingx/fault/commit/?id=bc6796cdc7995575409fd47b4d4f9a8b31f91ebf
Submitter: Zuul
Branch: master

commit bc6796cdc7995575409fd47b4d4f9a8b31f91ebf
Author: Mingyuan Qi <email address hidden>
Date: Wed Nov 27 02:41:40 2019 +0000

    Add an alarm for k8s certificate rotation

    This alarm will be raised if the automatic k8s cert rotation failed
    The k8s cert automatic rotation is implemented in commit:
    https://review.opendev.org/#/c/692276/

    Change-Id: Idddd6ae7b83bc40b805e85004994c48cd801ee75
    Partial-Bug: 1838659
    Signed-off-by: Mingyuan Qi <email address hidden>

OpenStack Infra (hudson-openstack) wrote : Fix proposed to integ (master)

Fix proposed to branch: master
Review: https://review.opendev.org/698408

OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/698624

OpenStack Infra (hudson-openstack) wrote : Change abandoned on integ (master)

Change abandoned by Mingyuan Qi (<email address hidden>) on branch: master
Review: https://review.opendev.org/698408
Reason: move the script to sysinv pkg in config repo: https://review.opendev.org/#/c/698624/

OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/698624
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=7b64d87a26fac2a5bcb2fce31ffa31b92b0e6f17
Submitter: Zuul
Branch: master

commit 7b64d87a26fac2a5bcb2fce31ffa31b92b0e6f17
Author: Mingyuan Qi <email address hidden>
Date: Thu Dec 12 05:57:59 2019 +0000

    Add k8s cert rotation script to sysinv pkg

    This commit is derived from https://review.opendev.org/#/c/692276

    This script checks the cert expiration date and rotates them if they
    expires within 90 days. After cert renewed, all the k8s master
    component configurations will be updated.

    An alarm will be sent to fm to notify the administrator to
    reboot the controllers or renew the certs manually if the automatic
    process fails.

    Change-Id: I286c38758ded2f661498367c7adf4dd59e603b83
    Partial-Bug: 1838659
    Signed-off-by: Mingyuan Qi <email address hidden>
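
The renewal rule described in the commit message above (rotate when a certificate expires within 90 days) can be sketched as follows. This is an illustration only, not the script from the sysinv package: the names `RENEWAL_WINDOW` and `needs_rotation` are hypothetical, and the expiry date is the one from the apiserver certificate in the bug description.

```python
from datetime import datetime, timedelta, timezone

# Threshold taken from the commit message: rotate within 90 days of expiry.
RENEWAL_WINDOW = timedelta(days=90)


def needs_rotation(not_after: datetime, now: datetime,
                   window: timedelta = RENEWAL_WINDOW) -> bool:
    """Return True when the certificate expires within 'window' of 'now'.

    An already expired certificate yields a negative remaining lifetime,
    so it is reported as needing rotation too.
    """
    return not_after - now <= window


if __name__ == "__main__":
    # Expiry of the apiserver certificate from the bug description.
    expiry = datetime(2020, 7, 24, 21, 6, 5, tzinfo=timezone.utc)
    for now in (datetime(2020, 1, 1, tzinfo=timezone.utc),
                datetime(2020, 6, 1, tzinfo=timezone.utc)):
        print(now.date(), needs_rotation(expiry, now))
```

Run once a day, a check of this shape would start reporting the certificate from the bug description as due for rotation on 25 April 2020, 90 days before its expiry.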

Ghada Khalil (gkhalil) wrote :

@Mingyuan, Are there additional commits required to address this bug? The commits above only used "Partial-Bug".

If additional commits are required, when do you expect to have these proposed/merged?
If no additional commits are required, please add a note and mark the bug as Fix Released. Then please proceed to cherry-pick the changes to r/stx.2.0 and r/stx.3.0 once sufficient testing is done in master.

Mingyuan Qi (myqi) wrote :

@Ghada, commit https://review.opendev.org/#/c/692276/ closes this bug. It has a +2 from Bart but is still waiting for Don's review; I'm not sure whether he is still on vacation.

OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-puppet (master)

Reviewed: https://review.opendev.org/692276
Committed: https://git.openstack.org/cgit/starlingx/stx-puppet/commit/?id=e86f8b90fd71c6c2df5613ac83dcb9a357f5a364
Submitter: Zuul
Branch: master

commit e86f8b90fd71c6c2df5613ac83dcb9a357f5a364
Author: Mingyuan Qi <email address hidden>
Date: Thu Oct 31 11:16:01 2019 +0800

    Rotate k8s certificate automatically

    By default, k8s cluster certificates generated by kubeadm have 1
    year expiration. After certificates expired, k8s will not rotate
    them automatically.

    This commit checks the cert expiration date every day and rotates
    them automatically if they expires within 90 days. After cert
    renewed, all the k8s master component configurations will be updated.

    An alarm will be sent to fm to notify the administrator to
    reboot the controllers or renew the certs manually if the automatic
    process fails.

    Change-Id: I383120b8904857bcf09ad6ca999900ce8eda9b95
    Closes-Bug: 1838659
    Depends-On: https://review.opendev.org/#/c/696224/
    Depends-On: https://review.opendev.org/#/c/698624/
    Signed-off-by: Mingyuan Qi <email address hidden>

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil) wrote :

@Mingyuan, now that all the code is merged in master, please cherry-pick to the r/stx.2.0 and r/stx.3.0 branches. Thanks.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (r/stx.2.0)

Fix proposed to branch: r/stx.2.0
Review: https://review.opendev.org/705382

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fault (r/stx.2.0)

Fix proposed to branch: r/stx.2.0
Review: https://review.opendev.org/705383

OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (r/stx.3.0)

Fix proposed to branch: r/stx.3.0
Review: https://review.opendev.org/705384

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fault (r/stx.3.0)

Fix proposed to branch: r/stx.3.0
Review: https://review.opendev.org/705385

OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-puppet (r/stx.3.0)

Fix proposed to branch: r/stx.3.0
Review: https://review.opendev.org/705386

OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (r/stx.2.0)

Fix proposed to branch: r/stx.2.0
Review: https://review.opendev.org/705387

OpenStack Infra (hudson-openstack) wrote : Fix merged to fault (r/stx.3.0)

Reviewed: https://review.opendev.org/705385
Committed: https://git.openstack.org/cgit/starlingx/fault/commit/?id=14b2ac71e9d616e3246d0a032c7f17b0cfaed988
Submitter: Zuul
Branch: r/stx.3.0

commit 14b2ac71e9d616e3246d0a032c7f17b0cfaed988
Author: Mingyuan Qi <email address hidden>
Date: Wed Nov 27 02:41:40 2019 +0000

    Add an alarm for k8s certificate rotation

    This alarm will be raised if the automatic k8s cert rotation failed
    The k8s cert automatic rotation is implemented in commit:
    https://review.opendev.org/#/c/692276/

    Change-Id: Idddd6ae7b83bc40b805e85004994c48cd801ee75
    Partial-Bug: 1838659
    Signed-off-by: Mingyuan Qi <email address hidden>
    (cherry picked from commit bc6796cdc7995575409fd47b4d4f9a8b31f91ebf)

OpenStack Infra (hudson-openstack) wrote : Fix merged to fault (r/stx.2.0)

Reviewed: https://review.opendev.org/705383
Committed: https://git.openstack.org/cgit/starlingx/fault/commit/?id=e1e4c671c31f2623f02b1a1fb67ad67612e6aa21
Submitter: Zuul
Branch: r/stx.2.0

commit e1e4c671c31f2623f02b1a1fb67ad67612e6aa21
Author: Mingyuan Qi <email address hidden>
Date: Wed Nov 27 02:41:40 2019 +0000

    Add an alarm for k8s certificate rotation

    This alarm will be raised if the automatic k8s cert rotation failed
    The k8s cert automatic rotation is implemented in commit:
    https://review.opendev.org/#/c/692276/

    Change-Id: Idddd6ae7b83bc40b805e85004994c48cd801ee75
    Partial-Bug: 1838659
    Signed-off-by: Mingyuan Qi <email address hidden>
    (cherry picked from commit bc6796cdc7995575409fd47b4d4f9a8b31f91ebf)

OpenStack Infra (hudson-openstack) wrote : Fix merged to config (r/stx.3.0)

Reviewed: https://review.opendev.org/705384
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=7b94ee471648367cdd8f289f2408b580f563c13a
Submitter: Zuul
Branch: r/stx.3.0

commit 7b94ee471648367cdd8f289f2408b580f563c13a
Author: Mingyuan Qi <email address hidden>
Date: Thu Dec 12 05:57:59 2019 +0000

    Add k8s cert rotation script to sysinv pkg

    This commit is derived from https://review.opendev.org/#/c/692276

    This script checks the cert expiration date and rotates them if they
    expires within 90 days. After cert renewed, all the k8s master
    component configurations will be updated.

    An alarm will be sent to fm to notify the administrator to
    reboot the controllers or renew the certs manually if the automatic
    process fails.

    Change-Id: I286c38758ded2f661498367c7adf4dd59e603b83
    Partial-Bug: 1838659
    Signed-off-by: Mingyuan Qi <email address hidden>
    (cherry picked from commit 7b64d87a26fac2a5bcb2fce31ffa31b92b0e6f17)

OpenStack Infra (hudson-openstack) wrote : Fix merged to config (r/stx.2.0)

Reviewed: https://review.opendev.org/705382
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=31ce7c575f182e96db9f513843253ab3737ec2d1
Submitter: Zuul
Branch: r/stx.2.0

commit 31ce7c575f182e96db9f513843253ab3737ec2d1
Author: Mingyuan Qi <email address hidden>
Date: Thu Dec 12 05:57:59 2019 +0000

    Add k8s cert rotation script to sysinv pkg

    This commit is derived from https://review.opendev.org/#/c/692276

    This script checks the cert expiration date and rotates them if they
    expires within 90 days. After cert renewed, all the k8s master
    component configurations will be updated.

    An alarm will be sent to fm to notify the administrator to
    reboot the controllers or renew the certs manually if the automatic
    process fails.

    Change-Id: I286c38758ded2f661498367c7adf4dd59e603b83
    Partial-Bug: 1838659
    Signed-off-by: Mingyuan Qi <email address hidden>
    (cherry picked from commit 7b64d87a26fac2a5bcb2fce31ffa31b92b0e6f17)

OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.opendev.org/705387
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=16f3c29c7db3b01d48f57437f731c053d11f03c1
Submitter: Zuul
Branch: r/stx.2.0

commit 16f3c29c7db3b01d48f57437f731c053d11f03c1
Author: Mingyuan Qi <email address hidden>
Date: Thu Oct 31 11:16:01 2019 +0800

    Rotate k8s certificate automatically

    By default, k8s cluster certificates generated by kubeadm have 1
    year expiration. After certificates expired, k8s will not rotate
    them automatically.

    This commit checks the cert expiration date every day and rotates
    them automatically if they expires within 90 days. After cert
    renewed, all the k8s master component configurations will be updated.

    An alarm will be sent to fm to notify the administrator to
    reboot the controllers or renew the certs manually if the automatic
    process fails.

    Change-Id: I383120b8904857bcf09ad6ca999900ce8eda9b95
    Closes-Bug: 1838659
    Depends-On: https://review.opendev.org/#/c/705383/
    Depends-On: https://review.opendev.org/#/c/705382/
    Signed-off-by: Mingyuan Qi <email address hidden>

OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-puppet (r/stx.3.0)

Reviewed: https://review.opendev.org/705386
Committed: https://git.openstack.org/cgit/starlingx/stx-puppet/commit/?id=5fdd0989ffc550ec1cfb38fdf4ad39440de5d96e
Submitter: Zuul
Branch: r/stx.3.0

commit 5fdd0989ffc550ec1cfb38fdf4ad39440de5d96e
Author: Mingyuan Qi <email address hidden>
Date: Thu Oct 31 11:16:01 2019 +0800

    Rotate k8s certificate automatically

    By default, k8s cluster certificates generated by kubeadm have 1
    year expiration. After certificates expired, k8s will not rotate
    them automatically.

    This commit checks the cert expiration date every day and rotates
    them automatically if they expires within 90 days. After cert
    renewed, all the k8s master component configurations will be updated.

    An alarm will be sent to fm to notify the administrator to
    reboot the controllers or renew the certs manually if the automatic
    process fails.

    Change-Id: I383120b8904857bcf09ad6ca999900ce8eda9b95
    Closes-Bug: 1838659
    Depends-On: https://review.opendev.org/#/c/705384/
    Depends-On: https://review.opendev.org/#/c/705385/
    Signed-off-by: Mingyuan Qi <email address hidden>
    (cherry picked from commit e86f8b90fd71c6c2df5613ac83dcb9a357f5a364)

OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-puppet (f/centos8)

Fix proposed to branch: f/centos8
Review: https://review.opendev.org/705852

Ghada Khalil (gkhalil)
tags: added: in-r-stx20 in-r-stx30
OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-puppet (f/centos8)

Reviewed: https://review.opendev.org/705852
Committed: https://git.openstack.org/cgit/starlingx/stx-puppet/commit/?id=e1f095eb112f76a133734a17f01afeb9828ebaf2
Submitter: Zuul
Branch: f/centos8

commit fc7b9b3d8d811fd50427b584dae5b7488947bb03
Author: Angie Wang <email address hidden>
Date: Tue Jan 28 13:57:52 2020 -0500

    Fix the image download failure on IPv6 system

    "crictl pull" failed to pull images on IPv6 system with
    proxy setting since Containerd doesn't work with the
    NO_PROXY environment variable that has IPv6 addresses
    with square brackets. This commit updates to strip out
    the square brackets from NO_PROXY environment variable.

    Change-Id: I6bb5ad0379f576f66d77a90dfdca94f5e0f28f0c
    Closes-Bug: 1859835
    Signed-off-by: Angie Wang <email address hidden>

commit 950670ac1f0bfaa43e29eeb3ffda71a94de66520
Author: Jim Somerville <email address hidden>
Date: Mon Jan 27 17:09:52 2020 -0500

    Security: Add nospectre_v1 to the security params

    Most of the v1 mitigation is baked into the kernel and not
    optional. The swapgs barriers are, however, optional.
    They have a negative performance impact so we disable them
    by using the nospectre_v1 kernel bootarg.

    Partial-Bug: 1860193
    Depends-On: https://review.opendev.org/#/c/704406
    Change-Id: Iaa11ba3f430fc064ebda679cf290474d3be413da
    Signed-off-by: Jim Somerville <email address hidden>

commit 83775d38804fb665af518127051b37a1daf31e36
Author: David Sullivan <email address hidden>
Date: Wed Jan 15 23:50:23 2020 -0500

    Install secondary controller nodes with kubeadm join

    Kubeadm init is no longer supported for installing secondary nodes in an
    HA kubernetes cluster. kubeadm join with the --controller-plane option
    should be used.

    Change-Id: I21a30b9e871d05c59a19e33a9d278f0217682da6
    Closes-Bug: 1846829
    Depends-On: https://review.opendev.org/702797
    Signed-off-by: David Sullivan <email address hidden>

commit c94fa4a0174b96e0716d39bbea7e6fbbbee415a9
Author: Shuicheng Lin <email address hidden>
Date: Thu Jan 23 02:45:31 2020 +0800

    Fix duplex system controller-1 fail to boot after unlock

    It is due to controller-1 doesn't have /opt/platform/config folder.
    And cause puppet failure due to using non-exist file as source.
    Restrict the code for worker node only, since controller node
    already has ca cert in the ssl folder.

    Test:
    Pass simplex/duplex/multi node deployment with vm created.

    Closes-Bug: 1860529
    Change-Id: I808ee15e5c78ebead114219d0ec428fb45cc9128
    Signed-off-by: Shuicheng Lin <email address hidden>

commit 27f167eb14a04bc67ecca59af3b617c115522101
Author: Angie Wang <email address hidden>
Date: Wed Jan 15 16:15:26 2020 -0500

    Remove puppet-manifests code made obsolete by ansible

    As a result of switch to Ansible, remove the obsolete erb
    templates and remove the dependency of is_initial_config_primary
    facter.

    Change-Id: I4ca6525f01a37da971dc66a11ee99ea4e115e3ad
    Partial-Bug: 1834218
    Depends-On: https://review.opendev.org/#/c/703517/
 ...


tags: added: in-f-centos8
Ghada Khalil (gkhalil)
tags: added: stx.4.0