Vim k8s upgrade strategy apply failed with timeout. Unexpected State: aborted
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | Medium | Heitor Matsui |
Bug Description
Brief Description
-----------------
The Kube version upgrade strategy failed to apply in subclouds because of a timeout.
Severity
--------
Major: failed to upgrade kube version in some subclouds
Steps to Reproduce
------------------
1. Install DC system with 1000 subclouds
2. Upgrade the Kube version on 250 subclouds in parallel
Expected Behavior
------------------
Kube version upgrade successful
Actual Behavior
----------------
Kube version upgrade not successful
Reproducibility
---------------
Reproducible
System Configuration
-------
DC with 1000 subclouds
Branch/Pull Time/Commit
-------
2022-01-10
Last Pass
---------
N/A
Timestamp/Logs
--------------
/var/log/
subcloud667
/var/log/
=======
log-id = 8
event-id = kube-upgrade-
event-type = action-event
event-context = admin
importance = high
entity = orchestration=
reason_text = Kubernetes upgrade auto-apply failed
additional_text =
timestamp = 2022-05-01 07:13:01.519352
=======
/var/log/
2022-05-
2022-05-
2022-05-
2022-05-
2022-05-
2022-05-
2022-05-
2022-05-
2022-05-
2022-05-
2022-05-
2022-05-
2022-05-
2022-05-
2022-05-
Test Activity
-------------
Regression Testing
Workaround
----------
Re-apply strategy
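The workaround above amounts to retrying the apply until the strategy no longer ends up aborted. A minimal sketch of that loop follows, assuming a hypothetical `apply_strategy` callable that returns the final strategy state as a string. The `"aborted"` state comes from the bug title; the `"applied"` state, the function name, and the retry parameters are assumptions made for illustration and do not correspond to any StarlingX API.

```python
import time


def reapply_until_done(apply_strategy, max_attempts=3, delay_secs=0.0):
    """Call apply_strategy() until it returns a state other than 'aborted'.

    apply_strategy is a hypothetical callable standing in for whatever
    actually re-applies the strategy; it must return the final state.
    Returns a tuple (attempts_used, last_state).
    """
    state = "aborted"
    for attempt in range(1, max_attempts + 1):
        state = apply_strategy()
        if state != "aborted":
            return attempt, state
        if attempt < max_attempts:
            # Wait before the next attempt; skipped after the last one.
            time.sleep(delay_secs)
    return max_attempts, state


# Simulated run: the first apply is aborted, the second one succeeds.
simulated_states = iter(["aborted", "applied"])
result = reapply_until_done(lambda: next(simulated_states))
print(result)
```

Bounding the number of attempts keeps a persistently failing subcloud from being retried forever.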
Changed in starlingx:
assignee: nobody → Heitor Matsui (heitormatsui)
Changed in starlingx:
status: New → In Progress
tags: added: stx.nfv
Changed in starlingx:
importance: Undecided → Medium
tags: added: stx.7.0
Reviewed: https://review.opendev.org/c/starlingx/nfv/+/841469
Committed: https://opendev.org/starlingx/nfv/commit/a55b65b234329d6a88f06d26697afca8ba1fddf1
Submitter: "Zuul (22348)"
Branch: master
commit a55b65b234329d6a88f06d26697afca8ba1fddf1
Author: Heitor Matsui <email address hidden>
Date: Wed May 11 17:38:11 2022 -0300
Increase timeout for networking step on k8s upgrade
Kubernetes upgrade might fail during the Upgrade Networking
Step with timeout message when upgrading subclouds. The default
timeout of 600s from the parent class does not seem to be enough
for some subclouds to download the networking images.
This commit increases the timeout of the Upgrade Networking Step.
Test Plan:
PASS: upgrade k8s version with 90 subclouds in parallel
Closes-bug: 1973781
Change-Id: I686b9582daf14f0520fc0c2cb7100816c372107e
Signed-off-by: Heitor Matsui <email address hidden>
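The fix described in the commit message, a step overriding the 600s default it inherits from its parent class, could look roughly like the sketch below. This is not the actual nfv code: the class names, the `timeout_in_secs` parameter, and the 900s value are assumptions chosen for illustration. Only the 600s default is taken from the commit message, which does not state the new timeout.

```python
class Step:
    """Hypothetical stand-in for the parent step class named in the commit."""

    DEFAULT_TIMEOUT_SECS = 600  # default cited in the commit message

    def __init__(self, name, timeout_in_secs=DEFAULT_TIMEOUT_SECS):
        self.name = name
        self.timeout_in_secs = timeout_in_secs

    def timed_out(self, elapsed_secs):
        """Return True once the step has run for its full timeout."""
        return elapsed_secs >= self.timeout_in_secs


class UpgradeNetworkingStep(Step):
    """Hypothetical networking step that needs more time than the default."""

    # Illustrative value only; the commit message does not give the new timeout.
    NETWORKING_TIMEOUT_SECS = 900

    def __init__(self):
        super().__init__(
            "upgrade-networking",
            timeout_in_secs=self.NETWORKING_TIMEOUT_SECS,
        )


generic = Step("generic-step")
networking = UpgradeNetworkingStep()

# A subcloud that needs 700s to download the networking images:
print(generic.timed_out(700))
print(networking.timed_out(700))
```

Overriding the timeout only in the networking step leaves the default in place for every other step, so only the step that downloads images waits longer.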