Once k8s comes up after the etcd restore, there is a span of time
(around 20s) during which the pod states have not yet been updated and
are still reported as they were at the point in time when the backup was
taken. As a result, the ic-nginx-ingress-ingress-nginx-controller-XXX
pod is reported as "Ready" when it is not. In several instances during
my tests, the pod was restarted 3-10 seconds after the task "Launch
Armada with Helm v3" failed because it could not call the webhook. The
proposed solution is to delete the pod preemptively and wait for it to
be recreated and "Ready".
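A minimal sketch of the proposed restore step as Ansible tasks shelling out to kubectl. The task names, namespace, label selector, KUBECONFIG path, and timeout are illustrative assumptions, not the exact tasks merged in the change below:

```yaml
# Hedged sketch only -- names, labels, and paths are assumptions.
- name: Delete stale nginx ingress controller pod after etcd restore
  command: >-
    kubectl -n kube-system delete pod
    -l app.kubernetes.io/name=ingress-nginx
  environment:
    KUBECONFIG: /etc/kubernetes/admin.conf

- name: Wait for the recreated ingress controller pod to become Ready
  command: >-
    kubectl -n kube-system wait pod
    -l app.kubernetes.io/name=ingress-nginx
    --for=condition=Ready --timeout=120s
  environment:
    KUBECONFIG: /etc/kubernetes/admin.conf
```

Deleting by label rather than by the exact pod name sidesteps the random -XXX suffix, and `kubectl wait --for=condition=Ready` blocks until the replacement pod reports Ready or the timeout expires.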
Reviewed: https://review.opendev.org/c/starlingx/ansible-playbooks/+/852677
Committed: https://opendev.org/starlingx/ansible-playbooks/commit/c2e5db4305bca4f39a3391afd136b46216cb7d3f
Submitter: "Zuul (22348)"
Branch: master
commit c2e5db4305bca4f39a3391afd136b46216cb7d3f
Author: Thiago Brito <email address hidden>
Date: Tue Aug 9 18:34:43 2022 -0300
Deleting ic-nginx-ingress-controller at restore
Once k8s comes up after the etcd restore, there is a span of time
(around 20s) that the pod states have not been updated and are reported
as they were at the point in time where the backup was taken. This
returns that the ic-nginx-ingress-ingress-nginx-controller-XXX pod is
"Ready", but it is not... in several instances during my tests, the pod
was restarted 3-10 seconds after the task "Launch Armada with Helm v3"
failed due to not being able to call the webhook. The proposed solution
is to delete the pod preemptively and wait for it to be recreated and
"Ready".
TEST PLAN
PASS restore on virtual AIO-SX (CentOS)
Closes-Bug: #1978899
Signed-off-by: Thiago Brito <email address hidden>
Change-Id: I20bec1fbbf809bfcf5d515ef55c6d47ab968dbf3