sysinv kube_app armada pod running and ready check
This modifies the sysinv kube_app check for armada pod readiness.
There can be multiple Kubernetes armada pods in the system, each
in a different condition. The check now prefers armada pods that are
running and ready, i.e., status.phase 'Running' and status.conditions
'Ready=True', instead of just selecting the first pod found.
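The selection rule described above can be sketched as follows. This is a
minimal illustration, not the code from kube_app.py: the function names
are hypothetical, pods are modelled as plain dicts shaped like the
Kubernetes pod JSON, and returning None when no pod qualifies is an
assumption about the fallback behaviour.

```python
def is_running_and_ready(pod):
    """True if status.phase is 'Running' and a 'Ready' condition is 'True'."""
    status = pod.get("status", {})
    if status.get("phase") != "Running":
        return False
    # conditions may be absent or None on a pod that was just created
    for cond in status.get("conditions") or []:
        if cond.get("type") == "Ready" and cond.get("status") == "True":
            return True
    return False


def select_armada_pod(pods):
    """Return the first running-and-ready pod, or None (assumed fallback)."""
    for pod in pods:
        if is_running_and_ready(pod):
            return pod
    return None


# Hypothetical example: two armada pods, only the second one is ready.
pods = [
    {"metadata": {"name": "armada-old"},
     "status": {"phase": "Running",
                "conditions": [{"type": "Ready", "status": "False"}]}},
    {"metadata": {"name": "armada-new"},
     "status": {"phase": "Running",
                "conditions": [{"type": "Ready", "status": "True"}]}},
]
chosen = select_armada_pod(pods)
```

Picking pods[0] here would hand back a pod that cannot serve requests,
which is the failure mode the scenarios below exercise.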
Several scenarios with multiple armada pods were recreated in which
the previous check was insufficient, e.g.:
- Manually delete the armada pod and re-apply the application;
  the old pod stays in the 'Terminating' phase for some time while
  a new armada pod is created and eventually becomes Running.
- During backup and restore (B&R) of AIO-DX: deploy AIO-DX and
  remove the armada label on node controller-0. Delete the armada
  pod; it moves to controller-1. Take a backup. Do the restore up to
  and including the controller-0 unlock. Repeatedly re-apply
  cert-manager.
- Change the armada pod helm overrides to have 2 replicas, or scale
  the armada pod to 2 replicas.
Note that kube_app.py uses helmv2-cli, which also requires
checking for pod readiness.
Reviewed: https://review.opendev.org/752305
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=e0963bff80f7e1202f19f109aa7f5ce0074972b1
Submitter: Zuul
Branch: master
commit e0963bff80f7e1202f19f109aa7f5ce0074972b1
Author: Jim Gauld <email address hidden>
Date: Wed Sep 16 15:26:09 2020 -0400
Change-Id: I45523c8da36618b8ac465269ad95ea28429c03e7
Partial-Bug: 1886429
Depends-On: https://review.opendev.org/752304
Signed-off-by: Jim Gauld <email address hidden>