Comment 6 for bug 1833609

Lin Shuicheng (shuicheng) wrote :

From the log, the unlock of controller-0 was requested at 10:40:12:
2019-06-24T10:40:12.000 controller-1 -sh: info HISTORY: PID=782269 UID=42425 system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.2:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-unlock controller-0

The application re-apply began at 10:40:19:
2019-06-24 10:40:19.668 333801 INFO sysinv.api.controllers.v1.host [-] Reapplying the stx-openstack app

Processing of the osh-openstack-ceph-rgw chart began at 10:51:59:
2019-06-24 10:51:59.235 332533 INFO sysinv.conductor.kube_app [-] processing chart: osh-openstack-ceph-rgw, overall completion: 46.0%

The corresponding armada log shows the pod wait starting, with no completion logged afterwards:
2019-06-24 10:51:58.243 36 DEBUG armada.handlers.wait [-] [chart=openstack-ceph-rgw]: Starting to wait on: namespace=openstack, resource type=pod, label_selector=(release_group=osh-openstack-ceph-rgw), timeout=1800 _watch_resource_completions /usr/local/lib/python3.6/dist-packages/armada/handlers/wait.py:362
2019-06-24 10:52:34.524 36 DEBUG armada.handlers.lock [-] Updating lock update_lock /usr/local/lib/python3.6/dist-packages/armada/handlers/lock.py:173

controller-0 became unlocked-enabled-available at 10:46:38:
2019-06-24T10:46:38.000 controller-1 sm: debug time[3887.884] log<913> INFO: sm[95837]: sm_main_event_handler.c(171): Set node (controller-0) requested, action=2, admin_state=unlocked, oper_state=enabled, avail_status=available, seqno=5.
2019-06-24T10:46:38.000 controller-1 sm: debug time[3887.895] log<914> INFO: sm[95837]: sm_failover.c(1129): controller-1 unlocked-enabled-available, controller-0 unlocked-enabled-available

Then controller-1 was force rebooted at 10:58:22:
2019-06-24T10:58:22.000 controller-1 -sh: info HISTORY: PID=782269 UID=42425 sudo reboot -f
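
As a rough cross-check of the timeline, the sketch below orders the events quoted above and computes how long armada had been waiting on the osh-openstack-ceph-rgw pods when controller-1 was rebooted. The timestamps are copied from the log excerpts (truncated to whole seconds). The event labels and helper names are made up for illustration, and it assumes the timeout=1800 in the armada line is in seconds.

```python
from datetime import datetime

# Timestamps copied from the log excerpts above (all on 2019-06-24),
# truncated to whole seconds. The labels are illustrative only.
EVENTS = {
    "host-unlock controller-0 issued": "10:40:12",
    "stx-openstack re-apply started": "10:40:19",
    "controller-0 unlocked-enabled-available": "10:46:38",
    "armada starts waiting on ceph-rgw pods": "10:51:58",
    "controller-1 force rebooted": "10:58:22",
}

# Assumption: timeout=1800 in the armada wait log line is in seconds.
WAIT_TIMEOUT_S = 1800


def seconds_between(start, end):
    """Return whole seconds from start to end (HH:MM:SS, same day)."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


def timeline(events):
    """Return (label, time) pairs sorted chronologically."""
    return sorted(events.items(), key=lambda item: item[1])


waited = seconds_between(
    EVENTS["armada starts waiting on ceph-rgw pods"],
    EVENTS["controller-1 force rebooted"],
)
remaining = WAIT_TIMEOUT_S - waited

if __name__ == "__main__":
    for label, when in timeline(EVENTS):
        print(when, label)
    print("waited %d s, %d s of the timeout left" % (waited, remaining))
```

Under that assumption, the reboot happened well before the pod wait could have timed out, so the logs alone do not show whether the wait would eventually have succeeded or failed.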

Further debugging is needed to determine why the apply was stuck waiting for the pods to become ready in osh-openstack-ceph-rgw.