Activity log for bug #1928018

Date Who What changed Old value New value Message
2021-05-10 21:54:49 Angie Wang bug added bug
2021-05-10 21:54:54 Angie Wang starlingx: assignee Angie Wang (angiewang)
2021-05-10 23:39:31 Angie Wang description
Old value: identical to the new value below, except that the second sentence of the Brief Description read "Same issue with but this impacts Armada pod" (the words "the following LPs" were missing).
New value:

Brief Description
-----------------
After a reboot or lock/unlock of an AIO-SX, Armada pod stuck in an unknown state and does not recover.
Same issue with the following LPs but this impacts Armada pod
https://bugs.launchpad.net/starlingx/+bug/1874858
https://bugs.launchpad.net/starlingx/+bug/1893977

Severity
--------
Medium

Steps to Reproduce
------------------
Apply stx-openstack application to an AIO-SX
system host-lock controller-0
system host-unlock controller-0

Expected Behavior
------------------
All pods should recover and be in a ready/running state shortly after the controller recovers.

Actual Behavior
----------------
Armada pod stuck in unknown state

Reproducibility
---------------
Intermittent - seen rarely

System Configuration
--------------------
AIO-SX

Branch/Pull Time/Commit
-----------------------
stx master

Timestamp/Logs
--------------
[2021-04-21 19:50:21,796] 314 DEBUG MainThread ssh.send :: Send 'kubectl get pod --all-namespaces --field-selector=status.phase=Running -o=wide | grep --color=never -v -E '([0-9])+/\1''
[2021-04-21 19:50:22,133] 436 DEBUG MainThread ssh.expect :: Output:
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
armada armada-api-84f66996f6-ztjmv 0/2 Unknown 0 8h <none> controller-0 <none> <none>

  Warning FailedMount 105m kubelet, controller-0 Unable to attach or mount volumes: unmounted volumes=[armada-etc], unattached volumes=[armada-etc armada-api-token-g846b pod-tmp pod-etc-armada]: timed out waiting for the condition
  Warning FailedMount 103m kubelet, controller-0 Unable to attach or mount volumes: unmounted volumes=[armada-etc], unattached volumes=[pod-etc-armada armada-etc armada-api-token-g846b pod-tmp]: timed out waiting for the condition
  Warning FailedMount 97m kubelet, controller-0 Unable to attach or mount volumes: unmounted volumes=[armada-etc], unattached volumes=[pod-tmp pod-etc-armada armada-etc armada-api-token-g846b]: timed out waiting for the condition
  Warning FailedMount 37m (x22 over 101m) kubelet, controller-0 Unable to attach or mount volumes: unmounted volumes=[armada-etc], unattached volumes=[armada-api-token-g846b pod-tmp pod-etc-armada armada-etc]: timed out waiting for the condition
  Warning FailedMount 32m (x43 over 108m) kubelet, controller-0 MountVolume.SetUp failed for volume "armada-etc" : stat /var/lib/kubelet/pods/10faba32-eea1-4af5-91fa-7ce8072f7114/volumes/kubernetes.io~configmap/armada-etc: no such file or directory
  Warning FailedMount 18m kubelet, controller-0 Unable to attach or mount volumes: unmounted volumes=[armada-etc], unattached volumes=[pod-tmp pod-etc-armada armada-etc armada-api-token-g846b]: timed out waiting for the condition
  Warning FailedMount 8m11s (x3 over 14m) kubelet, controller-0 Unable to attach or mount volumes: unmounted volumes=[armada-etc], unattached volumes=[pod-etc-armada armada-etc armada-api-token-g846b pod-tmp]: timed out waiting for the condition
  Warning FailedMount 4m4s (x3 over 16m) kubelet, controller-0 Unable to attach or mount volumes: unmounted volumes=[armada-etc], unattached volumes=[armada-etc armada-api-token-g846b pod-tmp pod-etc-armada]: timed out waiting for the condition
  Warning FailedMount 2m (x3 over 20m) kubelet, controller-0 Unable to attach or mount volumes: unmounted volumes=[armada-etc], unattached volumes=[armada-api-token-g846b pod-tmp pod-etc-armada armada-etc]: timed out waiting for the condition
  Warning FailedMount 103s (x18 over 22m) kubelet, controller-0 MountVolume.SetUp failed for volume "armada-etc" : stat /var/lib/kubelet/pods/10faba32-eea1-4af5-91fa-7ce8072f7114/volumes/kubernetes.io~configmap/armada-etc: no such file or directory

Test Activity
-------------
Sanity

Workaround
----------
Delete the unknown pod
2021-05-12 16:09:10 OpenStack Infra starlingx: status New In Progress
2021-05-13 18:44:01 OpenStack Infra starlingx: status In Progress Fix Released
2021-05-14 19:01:09 Ghada Khalil starlingx: importance Undecided Medium
2021-05-14 19:01:55 Ghada Khalil tags stx.6.0 stx.containers
2021-06-07 17:18:23 OpenStack Infra tags stx.6.0 stx.containers in-f-centos8 stx.6.0 stx.containers
2021-06-07 17:18:24 OpenStack Infra bug watch added https://bugzilla.redhat.com/show_bug.cgi?id=1793527
2021-06-07 17:18:24 OpenStack Infra bug watch added https://bugzilla.redhat.com/show_bug.cgi?id=1819868
2021-06-07 17:18:24 OpenStack Infra bug watch added https://github.com/golang/go/issues/40213
2021-06-07 17:18:24 OpenStack Infra cve linked 2018-15473
2021-06-07 17:18:24 OpenStack Infra cve linked 2019-18634
2021-06-07 17:18:24 OpenStack Infra cve linked 2019-6470
2021-06-07 17:18:24 OpenStack Infra cve linked 2020-13817
2021-06-07 17:18:24 OpenStack Infra cve linked 2020-15705
2021-06-07 17:18:24 OpenStack Infra cve linked 2020-15707
2021-06-07 17:18:24 OpenStack Infra cve linked 2021-3156
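
The bug description in this log finds the stuck pod by scanning `kubectl get pod` output for a READY column that does not read N/N, and gives "Delete the unknown pod" as the workaround. The following is a minimal sketch of that check, not the fix that was released for this bug. The function names `find_unready_pods` and `delete_command`, the trimmed sample table, and the `coredns-example` row are hypothetical and exist only for illustration. The sketch parses text and builds a command line; it does not call kubectl.

```python
from typing import List, Tuple

# Trimmed sample in the column layout of `kubectl get pod --all-namespaces`.
# The armada row is taken from the bug description; the second row is a
# hypothetical healthy pod added for contrast.
SAMPLE = """\
NAMESPACE    NAME                          READY   STATUS    RESTARTS   AGE
armada       armada-api-84f66996f6-ztjmv   0/2     Unknown   0          8h
kube-system  coredns-example               1/1     Running   0          8h
"""


def find_unready_pods(kubectl_output: str) -> List[Tuple[str, str, str, str]]:
    """Return (namespace, name, ready, status) for pods that are not fully ready."""
    unready = []
    for line in kubectl_output.strip().splitlines():
        fields = line.split()
        # Skip the header row and anything too short to be a pod row.
        if len(fields) < 4 or fields[0] == "NAMESPACE":
            continue
        namespace, name, ready, status = fields[:4]
        ready_count, _, total = ready.partition("/")
        # Flag pods whose ready count differs from the container total,
        # or whose status is Unknown (the state reported in this bug).
        if ready_count != total or status == "Unknown":
            unready.append((namespace, name, ready, status))
    return unready


def delete_command(namespace: str, name: str) -> List[str]:
    """Build (but do not run) the kubectl command for the documented workaround."""
    return ["kubectl", "delete", "pod", name, "-n", namespace]


if __name__ == "__main__":
    for namespace, name, ready, status in find_unready_pods(SAMPLE):
        print(name, ready, status)
        print(" ".join(delete_command(namespace, name)))
```

Building the command as a list instead of executing it keeps the sketch safe to run anywhere; an operator would review the output before deleting anything on a live controller.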