Optimized restore fails if kubeadm config is missing during backup
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | Medium | Joshua Kraitberg |
Bug Description
Brief Description
-----------------
DC subcloud BnR: Subcloud restore, post k8s upgrade, failed to initialize kubernetes master
Failure:
TASK [optimized-
Thursday 14 December 2023 18:49:49 +0000 (0:00:01.831) 0:05:59.424 *****
fatal: [localhost]: FAILED! => changed=true
cmd:
- kubeadm
- init
- --ignore-
- --ignore-
- --ignore-
- --ignore-
- --config=
delta: '0:00:00.029620'
end: '2023-12-14 18:49:49.419911'
msg: non-zero return code
rc: 1
start: '2023-12-14 18:49:49.390291'
stderr: |-
W1214 18:49:49.412646 26252 common.go:84] your configuration file uses a deprecated API spec: "kubeadm.
W1214 18:49:49.413923 26252 common.go:84] your configuration file uses a deprecated API spec: "kubeadm.
W1214 18:49:49.414319 26252 initconfigurati
this version of kubeadm only supports deploying clusters with the control plane version >= 1.23.0. Current version: v1.21.8
To see the stack trace of this error execute with --v=5 or higher
stderr_lines: <omitted>
stdout: ''
stdout_lines: <omitted>
subcloud state:
kubeadm version
kubeadm version: &version.
## check nodes control-plane version
kubectl get nodes -n deployment
Error from server (ServiceUnavail
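The failure above is kubeadm's version gate: a newer kubeadm refuses to initialize a control plane older than its supported minimum (v1.23.0 here, while the subcloud is still at v1.21.8). A minimal sketch of that comparison, using the version strings from the log; the check is illustrative, not kubeadm's actual code:

```shell
#!/bin/sh
# Illustrative version gate: kubeadm refuses "kubeadm init" when the existing
# control plane (here v1.21.8, from the log above) is below its minimum.
MIN_SUPPORTED="1.23.0"
CURRENT="1.21.8"
# sort -V orders version strings numerically; the lowest comes first
lowest=$(printf '%s\n%s\n' "$MIN_SUPPORTED" "$CURRENT" | sort -V | head -n1)
if [ "$lowest" != "$MIN_SUPPORTED" ]; then
  echo "unsupported: control plane v$CURRENT < required v$MIN_SUPPORTED"
fi
```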
Severity
-----------------
<Critical: System/Feature is not usable after the defect>
Steps to Reproduce
-----------------
Run BnR post subcloud platform upgrade and k8s upgrade.
Steps:
1. Deploy systemcontroller and subclouds with 21.12P10
2. Upgrade systemcontroller and subclouds
3. Upgrade k8s on systemcontroller and subclouds
4. Backup subcloud
5. Restore subcloud
Expected Behavior
-----------------
The subcloud should be restored successfully
Actual Behavior
-----------------
The subcloud restore failed
Reproducibility
-----------------
100%
System Configuration
-----------------
DC / subcloud
Load info (eg: 2022-03-
cat /etc/build.info
SW_VERSION="22.12"
BUILD_TARGET="Host Installer"
BUILD_TYPE="Formal"
BUILD_ID=
SRC_BUILD_
BUILD_BY="jenkins"
BUILD_NUMBER="50"
BUILD_HOST=
BUILD_DATE=
[sysadmin@
Patch ID RR Release Patch State
=======
WRCP_21.
WRCP_21.
WRCP_21.
WRCP_21.
WRCP_21.
WRCP_21.
WRCP_21.
WRCP_21.
WRCP_21.
WRCP_21.
WRCP_22.
WRCP_22.
WRCP_22.
WRCP_22.
Last Pass
Timestamp/Logs
-----------------
Alarms
-----------------
no alarms
Test Activity
-----------------
Manual regression
Workaround
-----------------
Change kubeadm.
Changed in starlingx:
assignee: nobody → Joshua Kraitberg (jkraitbe-wr)
status: New → In Progress
importance: Undecided → Medium
tags: added: stx.9.0 stx.update
Reviewed: https://review.opendev.org/c/starlingx/ansible-playbooks/+/903720
Committed: https://opendev.org/starlingx/ansible-playbooks/commit/c9e16717e76ec375ad598f063c44018e9f24a33b
Submitter: "Zuul (22348)"
Branch: master
commit c9e16717e76ec375ad598f063c44018e9f24a33b
Author: Joshua Kraitberg <email address hidden>
Date: Thu Dec 14 17:17:54 2023 -0500
Do not use kubeadm during optimized restore
During backup, the kubeadm config file is not guaranteed to be present
on the system. This caused an issue during optimized restore because
that file was used to recreate the cluster.
A similar issue can also occur during restore after upgrade because
the kubeadm config will contain deprecated fields.
Rather than using "kubeadm init" to initialize a new cluster,
the K8s certificates, control-plane static pod manifests,
kubelet configuration, and etcd snapshot will be leveraged
to bring up the previous cluster by simply starting kubelet.
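The revised flow described in the commit message can be sketched roughly as follows. This is a hedged outline only: `BACKUP`, the file paths, and the function name are hypothetical placeholders, not the playbook's actual variables or tasks.

```shell
#!/bin/sh
# Hedged sketch of the revised optimized-restore flow. BACKUP and all paths
# below are hypothetical placeholders; the real ansible-playbooks tasks differ.
BACKUP=/opt/platform-backup

restore_cluster() {
    # 1. Restore the K8s CA and component certificates
    cp -r "$BACKUP/pki" /etc/kubernetes/
    # 2. Restore the control-plane static pod manifests
    cp -r "$BACKUP/manifests" /etc/kubernetes/
    # 3. Restore the kubelet configuration
    cp "$BACKUP/kubelet.conf" /etc/kubernetes/
    # 4. Restore the etcd snapshot
    etcdctl snapshot restore "$BACKUP/etcd-snapshot.db" --data-dir /var/lib/etcd
    # 5. Start kubelet; it launches the control-plane pods from the static
    #    manifests, so "kubeadm init" and its version/config checks never run
    systemctl start kubelet
}
```

Because kubelet alone re-launches the static pods, no kubeadm configuration file is needed at restore time, which sidesteps both the missing-file case and the deprecated-API-spec case after upgrade.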
TEST PLAN
PASS: Optimized restore on AIO-SX
* stx8
* stx9
PASS: Optimized restore after upgrade, stx9
Closes-Bug: 2047845
Change-Id: Ia0a0f83cf6111e854776cc8967e6cba99d186b66
Signed-off-by: Joshua Kraitberg <email address hidden>