k8s upgrade fails if kubeadm configmap formatted in specific way

Bug #2016041 reported by Chris Friesen
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: Chris Friesen

Bug Description

There is an upstream issue in kubeadm (affecting versions up to at least 1.24.4) that is triggered when the "certSANs" field of the kubeadm configmap contains unquoted IPv6 addresses starting with a colon (for example "::1") in YAML "flow style". When this occurs, kubeadm fails while parsing the configmap, which in turn causes the K8s upgrade to fail.

The problematic formatting looks like this:

        ClusterConfiguration: |
            apiServer:
                certSANs: [::1, 192.168.206.1, 127.0.0.1, 10.20.7.3]

While this is fine:

        ClusterConfiguration: |
            apiServer:
                certSANs:
                - ::1
                - 192.168.206.1
                - 127.0.0.1
                - 10.20.7.3

It also works to wrap each IPv6 address in quotes.
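
As a rough illustration of the two working forms above, the sketch below rewrites a single-line flow-style certSANs list into block style and also quotes any entry containing a colon. It is not the StarlingX fix itself: the function name cert_sans_to_block_style is hypothetical, it uses plain text matching rather than a YAML parser, and it assumes the list fits on one line.

```python
import re

# Matches a one-line flow-style list such as "    certSANs: [::1, 10.20.7.3]".
# Assumption: the whole list sits on a single line.
FLOW_RE = re.compile(r"^(?P<indent>\s*)certSANs:\s*\[(?P<items>[^\]]*)\]\s*$")


def cert_sans_to_block_style(cluster_config: str) -> str:
    """Rewrite a flow-style certSANs list as a block-style list."""
    out = []
    for line in cluster_config.splitlines():
        match = FLOW_RE.match(line)
        if match is None:
            out.append(line)
            continue
        indent = match.group("indent")
        # Split on commas and drop any quotes that are already present.
        items = [i.strip().strip("'\"") for i in match.group("items").split(",")]
        items = [i for i in items if i]
        if not items:
            # An empty list has nothing to reformat; keep the line unchanged.
            out.append(line)
            continue
        out.append(f"{indent}certSANs:")
        for item in items:
            # Quote entries containing a colon (IPv6) so they stay plain strings.
            value = f'"{item}"' if ":" in item else item
            out.append(f"{indent}- {value}")
    result = "\n".join(out)
    # Preserve a trailing newline if the input had one.
    if cluster_config.endswith("\n"):
        result += "\n"
    return result


if __name__ == "__main__":
    sample = "apiServer:\n    certSANs: [::1, 192.168.206.1, 127.0.0.1, 10.20.7.3]\n"
    print(cert_sans_to_block_style(sample))
```

Quoting in addition to switching to block style is redundant according to the description above, since either change alone avoids the parse failure; the sketch does both only to be conservative.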

Chris Friesen (cbf123)
Changed in starlingx:
assignee: nobody → Chris Friesen (cbf123)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/config/+/880240

Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/c/starlingx/config/+/880240
Committed: https://opendev.org/starlingx/config/commit/5c58f00c11732f59bb559326659e1635f58587d5
Submitter: "Zuul (22348)"
Branch: master

commit 5c58f00c11732f59bb559326659e1635f58587d5
Author: Chris Friesen <email address hidden>
Date: Wed Apr 12 14:04:16 2023 -0600

    assorted kubeadm configmap compatibility issues

    There is an upstream issue in Kubeadm (affecting at least up till
    1.24.4) where if the "certSANs" field of the kubeadm configmap contains
    unquoted IPv6 addresses starting with colons in "flow style" it will
    choke while parsing.

    The problematic formatting looks like this:

            ClusterConfiguration: |
                apiServer:
                    certSANs: [::1, 192.168.206.1, 127.0.0.1, 10.20.7.3]

    While this is fine:

              ClusterConfiguration: |
                apiServer:
                    certSANs:
                    - ::1
                    - 192.168.206.1
                    - 127.0.0.1
                    - 10.20.7.3

    It also works to wrap each IPv6 address in quotes.

    It's not clear what causes the certSANs field to be formatted in flow
    style, but it was seen in testing after a platform upgrade followed
    by a k8s upgrade.

    The workaround is to modify the "upgrade first control plane" code
    to update the configmap 'certSANs' field to block style if it's in
    flow style and contains IPv6 addresses.

    I've opened an upstream issue:
    https://github.com/kubernetes/kubeadm/issues/2858

    We'll hit the same error in _get_kubernetes_join_cmd(), but since that
    code is run more frequently, rather than reformatting the configmap
    we modify the code to explicitly set the certificate key instead of
    passing in the whole kubeadm config file. This is arguably how it
    should have been done originally.

    In StarlingX 7 by default we set the "HugePageStorageMediumSize=true"
    feature gate in the kube-apiserver section of the kubeadm configmap.
    In k8s 1.24 it's no longer supported. In StarlingX 8 we remove it
    from various locations (kubelet config, service parameters, etc.)
    but we also need to remove it from the kubeadm configmap.

    Test Plan:
    PASS: platform upgrade from StarlingX 7 to 8, then K8s upgrade to 1.24
    PASS: add "::1" address to certSANs in configmap then upgrade k8s
    PASS: set HugePageStorageMediumSize in cm then upgrade k8s to 1.24

    Change-Id: I45e9e22585a5b2912a339ad5905d011e3adc29ab
    Closes-Bug: 2016041
    Signed-off-by: Chris Friesen <email address hidden>
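
The feature gate removal described in the commit message above can be pictured with a small sketch. It assumes the gates are stored as the usual comma-separated "Name=bool" string accepted by the kube-apiserver --feature-gates flag; the function name drop_feature_gate and the "ExampleGate" entry are hypothetical, and this is not the code from the StarlingX change.

```python
def drop_feature_gate(feature_gates: str, name: str) -> str:
    """Return the feature-gates string with the named gate removed.

    Assumes a comma-separated list of Name=bool entries.
    """
    kept = []
    for entry in feature_gates.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Compare only the gate name, ignoring its true/false value.
        gate_name = entry.split("=", 1)[0].strip()
        if gate_name != name:
            kept.append(entry)
    return ",".join(kept)


if __name__ == "__main__":
    # "ExampleGate" is a made-up gate used only for illustration.
    gates = "HugePageStorageMediumSize=true,ExampleGate=false"
    print(drop_feature_gate(gates, "HugePageStorageMediumSize"))
```

If the result is an empty string, a caller would presumably remove the feature-gates entry from the configmap altogether rather than leave an empty value behind.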

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Medium
tags: added: stx.9.0 stx.containers stx.update
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/config/+/880897

OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/c/starlingx/config/+/880897
Committed: https://opendev.org/starlingx/config/commit/ceb5852fcef1e643020f55b607dd6ad56a1c9ff2
Submitter: "Zuul (22348)"
Branch: master

commit ceb5852fcef1e643020f55b607dd6ad56a1c9ff2
Author: Chris Friesen <email address hidden>
Date: Wed Apr 19 18:45:01 2023 -0400

    fixup for kubeadm cert upload

    In commit 5c58f00c11 a change was made to use
    "kubeadm init phase upload-certs --upload-certs --certificate-key <key>"
    to upload the certs to a K8s Secret in order to work around an issue
    involving the YAML representation of IPv6 addresses.

    It turns out that when used in this way, kubeadm does not upload the
    external-etcd-ca.crt/external-etcd.crt/external-etcd.key entries to the
    K8s Secret. This breaks the install on multi-node labs.

    The fix is to revert this code back to the old way of doing it, but to
    call kubeadm_configmap_reformat() to reformat the ConfigMap if
    necessary prior to dumping it out. That way if it does contain IPv6
    addresses in the "wrong" YAML format, it will get corrected.

    Test Plan:
    PASSED: Install in AIO-DX virtualbox with IPv4.
    PASSED: Modified install in AIO-DX virtualbox with IPv6 address added
            to kubeadm-config ConfigMap before unlocking controller-1.

    Closes-Bug: 2017146
    Partial-Bug: 2016041
    Change-Id: I999a161e15a81a50085a1843cc80515b9af0f117
    Signed-off-by: Chris Friesen <email address hidden>
