During root CA update some statefulsets may not complete rollout restart within time limit and cause update to fail
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | Medium | Andy |
Bug Description
Brief Description
-----------------
During a k8s root CA update, deployments, daemonsets and statefulsets are rollout restarted so that the pods they deploy pick up the new root CA certificates. Some statefulsets have been observed to take longer than the time limit (10 minutes) to complete the rollout restart, which causes the puppet manifest apply to time out and the update to fail.
Severity
--------
Major
Steps to Reproduce
------------------
- Deploy an application as a statefulset with multiple replicas (such as a MySQL database).
- Check whether the statefulset takes more than 10 minutes to complete a rollout restart. This can be checked with the following commands:
kubectl rollout restart statefulset <the statefulset> -n <namespace>
kubectl rollout status statefulset <the statefulset> -n <namespace>
If the second command (checking status) takes more than 10 minutes, the puppet apply will time out and the update will fail.
- Run the system commands to update the k8s root CA certificate.
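The manual check in the second step can be scripted. The following is a minimal sketch, not part of the fix: it wraps the two kubectl commands above, measures the elapsed time, and compares it with the 10-minute limit described in this report. The function names (exceeds_limit, time_rollout, check_statefulset) and the LIMIT_SECONDS variable are hypothetical and chosen only for illustration.

```shell
#!/usr/bin/env bash
# Sketch: time a statefulset rollout restart against the 10-minute limit
# described in this report. All function and variable names here are
# illustrative assumptions, not part of StarlingX.

LIMIT_SECONDS=600   # 10 minutes, per the bug description

# exceeds_limit ELAPSED [LIMIT]
# Succeeds (exit 0) when ELAPSED seconds is strictly greater than LIMIT.
exceeds_limit() {
    local elapsed="$1" limit="${2:-$LIMIT_SECONDS}"
    [ "$elapsed" -gt "$limit" ]
}

# time_rollout NAME NAMESPACE
# Restarts the statefulset, waits for the rollout to finish, and prints
# the elapsed seconds on stdout (kubectl output goes to stderr).
time_rollout() {
    local name="$1" namespace="$2" start end
    start=$(date +%s)
    kubectl rollout restart statefulset "$name" -n "$namespace" >&2 || return 1
    kubectl rollout status statefulset "$name" -n "$namespace" >&2 || return 1
    end=$(date +%s)
    echo $(( end - start ))
}

# check_statefulset NAME NAMESPACE
# Reports whether the rollout restart would exceed the limit.
check_statefulset() {
    local elapsed
    elapsed=$(time_rollout "$1" "$2") || { echo "rollout failed" >&2; return 2; }
    if exceeds_limit "$elapsed"; then
        echo "rollout took ${elapsed}s: exceeds ${LIMIT_SECONDS}s limit"
        return 1
    fi
    echo "rollout took ${elapsed}s: within ${LIMIT_SECONDS}s limit"
}
```

After sourcing the file, running check_statefulset <the statefulset> <namespace> restarts the pods and reports whether the restart stayed within the limit. Note that this performs a real rollout restart, so it is best run on a test system.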
Expected Behavior
------------------
The "system kube-rootca-
Actual Behavior
----------------
"system kube-rootca-
Reproducibility
---------------
Reproducible if the statefulset takes more than 10 minutes to complete the rollout restart.
System Configuration
-------
Any
Branch/Pull Time/Commit
-------
Latest from STX master
Last Pass
---------
N/A
Timestamp/Logs
--------------
puppet.log:
1401 2021-12-
1402 2021-12-
1403 2021-12-
1404 2021-12-
1405 2021-12-
1406 2021-12-
1407 2021-12-
1408 2021-12-
1409 2021-12-
1410 2021-12-
1411 2021-12-
1412 2021-12-
1413 2021-12-
1414 2021-12-
1415 2021-12-
1416 2021-12-
1417 2021-12-
1418 2021-12-
1419 2021-12-
1420 2021-12-
1421 2021-12-
1422 2021-12-
1423 2021-12-
1424 2021-12-
1425 2021-12-
1426 2021-12-
1427 2021-12-
1428 2021-12-
Test Activity
-------------
Developer Testing
Workaround
----------
N/A
Changed in starlingx:
assignee: nobody → Andy (andy.wrs)
Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/stx-puppet/+/821274