Certificate cluster-host unit IP is removed from the SAN list after OAM IP change

Bug #2042982 reported by Andy
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: Andy

Bug Description

Brief Description
-----------------
On an AIO-SX system, after the OAM IP address is changed, new Kubernetes app deployments fail. kube-scheduler and kube-controller-manager log errors such as the following:

2023-10-12T17:29:49.450850308Z stderr F E1012 17:29:49.450720 1 reflector.go:138] vendor/k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.PersistentVolume: failed to list *v1.PersistentVolume: Get "https://[fd00:4888:0:2::2]:6443/api/v1/persistentvolumes?limit=500&resourceVersion=0": x509: certificate is valid for fd00:4888:0:1::1, 2001:4888:2a41:4063:406:40a:0:f410, fd00:4888:0:2::1, ::1, not fd00:4888:0:2::2
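
To read an error like this quickly, it helps to separate the addresses the certificate covers from the one the client actually dialed. The following is a minimal sketch, not part of the original report: the function name `diagnose_x509_ip_mismatch` is hypothetical, and the regular expression assumes the message wording shown in the log line above.

```python
import ipaddress
import re


def diagnose_x509_ip_mismatch(log_line):
    """Parse an x509 SAN mismatch error.

    Returns (valid_ips, rejected_ip), or None if the line does not
    contain the expected "certificate is valid for ..., not ..." text.
    """
    match = re.search(r"x509: certificate is valid for (.+), not (\S+)", log_line)
    if match is None:
        return None

    valid_ips = []
    for entry in match.group(1).split(","):
        try:
            valid_ips.append(ipaddress.ip_address(entry.strip()))
        except ValueError:
            continue  # skip entries that are not IP addresses (e.g. DNS names)

    try:
        rejected_ip = ipaddress.ip_address(match.group(2))
    except ValueError:
        return None  # the rejected name was not an IP address
    return valid_ips, rejected_ip


# Tail of the error message from the log line above.
line = (
    "x509: certificate is valid for fd00:4888:0:1::1, "
    "2001:4888:2a41:4063:406:40a:0:f410, fd00:4888:0:2::1, ::1, "
    "not fd00:4888:0:2::2"
)
valid, rejected = diagnose_x509_ip_mismatch(line)
print("covered by certificate:", [str(ip) for ip in valid])
print("rejected address:", rejected)
```

Here the rejected address, fd00:4888:0:2::2, is the one absent from the SAN list, which matches the missing cluster-host unit IP described in this report.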

Severity
--------
Critical: System/Feature is not usable due to the defect

Steps to Reproduce
------------------
- On an AIO-SX system, change the OAM IP address:
  system oam-modify oam_ip=<new IP>
- Apply an app. The deployment fails with certificate errors.
- Check /etc/kubernetes/pki/apiserver.crt. The cluster-host unit IP address is missing from the SAN list.
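
The last check can be scripted. The sketch below is an illustration, not something from the original report: it assumes the `openssl` command-line tool is available on the controller, and the helper names `parse_san_ips` and `cert_has_ip` are hypothetical. Addresses are compared as parsed values rather than strings, because `openssl` may print IPv6 SAN entries in an expanded, upper-case form.

```python
import ipaddress
import re
import subprocess


def parse_san_ips(openssl_text):
    """Collect the 'IP Address:' SAN entries from `openssl x509 -text` output."""
    ips = set()
    for raw in re.findall(r"IP Address:([0-9A-Fa-f:.]+)", openssl_text):
        try:
            ips.add(ipaddress.ip_address(raw))
        except ValueError:
            pass  # ignore anything that does not parse as an IP address
    return ips


def cert_has_ip(cert_path, ip):
    """Return True if the certificate at cert_path lists ip among its SANs."""
    text = subprocess.run(
        ["openssl", "x509", "-in", cert_path, "-noout", "-text"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return ipaddress.ip_address(ip) in parse_san_ips(text)


# Illustrative SAN text only; real output contains the full certificate dump.
sample = (
    "X509v3 Subject Alternative Name:\n"
    "    DNS:kubernetes, IP Address:FD00:4888:0:1:0:0:0:1, "
    "IP Address:0:0:0:0:0:0:0:1\n"
)
san_ips = parse_san_ips(sample)
print(ipaddress.ip_address("fd00:4888:0:1::1") in san_ips)
print(ipaddress.ip_address("fd00:4888:0:2::2") in san_ips)
```

On an affected system, calling `cert_has_ip("/etc/kubernetes/pki/apiserver.crt", ...)` with the cluster-host unit IP would be expected to return False until the fix is applied.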

Expected Behavior
-----------------
- App deployment is successful
- cluster-host unit IP address is in apiserver cert's SAN list.

Actual Behavior
---------------
- App deployment fails.
- cluster-host unit IP address is missing from apiserver cert's SAN list.

Reproducibility
---------------
100% reproducible.

System Configuration
--------------------
One node system.

Branch/Pull Time/Commit
-----------------------
STX master.

Last Pass
---------
Unknown.

Timestamp/Logs
--------------
See Description for the errors in logs.

Test Activity
-------------
Regression Testing.

Workaround
----------
N/A

OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-puppet (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/stx-puppet/+/900376

Changed in starlingx:
status: New → In Progress
Ghada Khalil (gkhalil)
Changed in starlingx:
assignee: nobody → Andy (andy.wrs)
OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-puppet (master)

Reviewed: https://review.opendev.org/c/starlingx/stx-puppet/+/900376
Committed: https://opendev.org/starlingx/stx-puppet/commit/094f9e87d70c62299644a09c5497ee41792dc51d
Submitter: "Zuul (22348)"
Branch: master

commit 094f9e87d70c62299644a09c5497ee41792dc51d
Author: Andy Ning <email address hidden>
Date: Tue Nov 7 16:51:54 2023 -0500

    Add cluster-host unit IP to apiserver cert SANs

    When OAM IP changes, platform::kubernetes::certsans::runtime class in
    kubernetes.pp is applied. The runtime puppet class will update
    apiserver's certificate among other configuration updates. However
    the cluster-host unit IP address is missing from the cert's SAN list
    after the update on SX system.

    This change fixed the issue by adding cluster-host unit IP address back
    to apiserver cert's SAN list for SX system.

    Test Plan:
    PASS: On a AIO-SX system, run "system oam-modify oam_ip=<new IP>" to
          change its OAM IP address. Verify controller's cluster-host unit
          IP address is in the cert's SAN list after OAM IP address has
          been changed.

    Closes-Bug: 2042982
    Signed-off-by: Andy Ning <email address hidden>
    Change-Id: Ib18644a4babf7a9549dc55653119d24bb34a97df

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Medium
tags: added: stx.9.0 stx.config stx.security