Standby controller reboots and becomes available, but many pods are not-ready/unreachable
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
StarlingX | Invalid | Medium | Alexander Kozyrev | | |
Bug Description
Brief Description
-----------------
Four minutes after the standby controller rebooted and became available/online again, many pods were still in Pending status because node.kubernetes.io not-ready/
[2019-08-18 05:52:32,829] 301 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2019-08-18 05:52:34,457] 423 DEBUG MainThread ssh.expect :: Output:
+----+-
| id | hostname | personality | administrative | operational | availability |
+----+-
| 1 | controller-0 | controller | unlocked | enabled | available |
| 2 | compute-0 | worker | unlocked | enabled | available |
| 3 | compute-1 | worker | unlocked | enabled | available |
| 4 | compute-2 | worker | unlocked | enabled | available |
| 5 | compute-3 | worker | unlocked | enabled | available |
| 6 | compute-4 | worker | unlocked | enabled | available |
| 7 | controller-1 | controller | unlocked | enabled | available |
| 8 | storage-0 | storage | unlocked | enabled | available |
| 9 | storage-1 | storage | unlocked | enabled | available |
+----+-
+------
| application | version | manifest name | manifest file | status | progress |
+------
| platform-integ-apps | 1.0-7 | platform-
| stx-openstack | 1.0-17-
+------
[2019-08-18 05:55:00,088] 466 DEBUG MainThread ssh.exec_cmd:: Executing command...
[2019-08-18 05:55:00,088] 301 DEBUG MainThread ssh.send :: Send 'kubectl get pod --field-
[2019-08-18 05:55:00,323] 423 DEBUG MainThread ssh.expect :: Output:
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system coredns-
kube-system ingress-6c47p 0/1 Init:0/1 0 2m34s 192.168.222.4 controller-1 <none> <none>
kube-system ingress-
kube-system kube-multus-
kube-system rbd-provisioner
openstack cinder-
openstack cinder-
openstack cinder-
openstack cinder-
openstack glance-
openstack heat-api-
openstack ingress-
openstack keystone-
openstack neutron-
openstack nova-api-
openstack nova-api-
openstack nova-api-
openstack nova-conductor-
openstack nova-novncproxy
openstack nova-scheduler-
openstack osh-openstack-
openstack placement-
[sysadmin@
[2019-08-18 05:55:00,323] 301 DEBUG MainThread ssh.send :: Send 'echo $?'
[2019-08-18 05:55:00,426] 423 DEBUG MainThread ssh.expect :: Output:
0
Severity
--------
Major
Steps to Reproduce
------------------
1. Make sure the system is installed and in good health, with no alarms.
2. Force reboot the standby controller.
3. Check pod status.
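The pod check in step 3 could be scripted roughly as follows. This is only a sketch, not the test framework's actual code: it assumes the table is produced by `kubectl get pods --all-namespaces -o wide` (the exact command in the log above is truncated and is not reproduced here), and the helper names `unhealthy_pods` and `fetch_pod_table` are hypothetical.

```python
import subprocess


def unhealthy_pods(table_text):
    """Return (namespace, name, ready, status) for pods that look unhealthy.

    Assumes the plain-text table printed by
    `kubectl get pods --all-namespaces -o wide`.
    """
    bad = []
    for line in table_text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and rows too short to parse.
        if len(fields) < 4 or fields[0] == "NAMESPACE":
            continue
        namespace, name, ready, status = fields[:4]
        # Finished jobs report 0/1 Completed; that is not a failure.
        if status == "Completed":
            continue
        ready_now, _, ready_want = ready.partition("/")
        # Flag anything not Running, or Running with containers not ready.
        if status != "Running" or ready_now != ready_want:
            bad.append((namespace, name, ready, status))
    return bad


def fetch_pod_table():
    """Run kubectl (assumed to be on PATH and configured) and return stdout."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "--all-namespaces", "-o", "wide"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout
```

On the active controller, `unhealthy_pods(fetch_pod_table())` would return an empty list once every pod has recovered; any remaining entries name the pods still pending or initializing.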
Expected Behavior
------------------
All pods recover.
Actual Behavior
----------------
Many pods remain in Pending status.
Reproducibility
---------------
Seen once
System Configuration
-------
Multi-node system
Lab-name: WCP_113-121
Branch/Pull Time/Commit
-------
2019-08-16_20-59-00
Last Pass
---------
2019-08-09_20-59-00
Timestamp/Logs
--------------
[2019-08-18 05:52:34,561]
Appears to be reporting the same issue as: https://bugs.launchpad.net/starlingx/+bug/1836787