Activity log for bug #1881899

Date Who What changed Old value New value Message
2020-06-03 13:08:43 Ovidiu Poncea bug added bug
2020-06-03 13:25:10 Ovidiu Poncea description updated. Old and new values are near-identical; the update filled in the Actual Behavior section (previously the template placeholder "State what is the actual behavior") and removed the unused Last Pass template text. Final description:

Brief Description
-----------------
Rebooted both controllers but Openstack fails to come back up.

[root@controller-1 sysadmin(keystone_admin)]# openstack endpoint list
Failed to discover available identity versions when contacting http://keystone.openstack.svc.cluster.local/v3. Attempting to parse version from URL.
Service Unavailable (HTTP 503)

Node status:

[root@controller-1 sysadmin(keystone_admin)]# system host-list
+----+--------------+-------------+----------------+-------------+--------------+
| id | hostname     | personality | administrative | operational | availability |
+----+--------------+-------------+----------------+-------------+--------------+
| 1  | controller-0 | controller  | unlocked       | enabled     | degraded     |
| 2  | controller-1 | controller  | unlocked       | enabled     | available    |
+----+--------------+-------------+----------------+-------------+--------------+

[root@controller-1 sysadmin(keystone_admin)]# fm alarm-list
| Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
| 400.001 | Service group cloud-services warning; dbmon(enabled-active, ) | service_domain=controller.service_group=cloud-services.host=controller-1 | minor | 2020-06-03T13:02:13.385662 |
| 400.001 | Service group cloud-services warning; dbmon(enabled-standby, ) | service_domain=controller.service_group=cloud-services.host=controller-0 | minor | 2020-06-03T13:01:13.112434 |
| 200.006 | controller-0 is degraded due to the failure of its 'pci-irq-affinity-agent' process. Auto recovery of this major process is in progress. | host=controller-0.process=pci-irq-affinity-agent | major | 2020-06-03T12:46:13.918380 |

[root@controller-1 sysadmin(keystone_admin)]# kubectl get pods -o wide -n openstack | grep -v Running | grep -v Completed
NAME                                               READY STATUS           RESTARTS AGE   IP             NODE         NOMINATED NODE READINESS GATES
cinder-api-59979594ff-25hrl                        0/1   Init:0/2         0        26m   172.16.192.109 controller-0 <none>         <none>
cinder-backup-6dd95fc9dd-svp5r                     0/1   Init:0/4         0        26m   172.16.192.123 controller-0 <none>         <none>
cinder-scheduler-76c65f6979-5tmt5                  0/1   Init:0/2         0        26m   172.16.192.82  controller-0 <none>         <none>
cinder-volume-b7dfbb7b9-f47bk                      0/1   Init:0/4         0        26m   172.16.192.90  controller-0 <none>         <none>
cinder-volume-b7dfbb7b9-mtkz8                      0/1   Init:3/4         7        9h    172.16.166.185 controller-1 <none>         <none>
cinder-volume-usage-audit-1591187700-g64xf         0/1   Init:0/1         0        27m   172.16.166.157 controller-1 <none>         <none>
fm-rest-api-8b5b97bf8-qdlbx                        0/1   Init:0/1         0        26m   172.16.192.67  controller-0 <none>         <none>
fm-rest-api-8b5b97bf8-v5wz9                        0/1   CrashLoopBackOff 9        8h    172.16.166.140 controller-1 <none>         <none>
glance-api-6b74f659d-w9t4g                         0/1   Init:0/3         0        26m   172.16.192.101 controller-0 <none>         <none>
heat-api-846d848bd9-hd46z                          0/1   Init:0/1         0        26m   172.16.192.108 controller-0 <none>         <none>
heat-cfn-9d6f7ffc5-rvb4d                           0/1   Init:0/1         0        26m   172.16.192.73  controller-0 <none>         <none>
heat-engine-6487ff65c6-zk4n7                       0/1   Init:0/1         0        26m   172.16.192.80  controller-0 <none>         <none>
heat-engine-cleaner-1591187700-kd2pn               0/1   Init:0/1         0        27m   172.16.166.156 controller-1 <none>         <none>
horizon-65d4b5bdcf-ltms2                           0/1   Init:0/1         0        21m   172.16.192.83  controller-0 <none>         <none>
keystone-api-6c76774bf7-l7c4d                      0/1   Init:0/1         0        26m   172.16.192.113 controller-0 <none>         <none>
libvirt-libvirt-default-4mf4v                      0/1   Init:0/3         1        9h    192.168.204.2  controller-0 <none>         <none>
mariadb-server-0                                   0/1   CrashLoopBackOff 8        9h    172.16.166.158 controller-1 <none>         <none>
neutron-dhcp-agent-controller-0-937646f6-r5skk     0/1   Init:0/1         1        9h    192.168.204.2  controller-0 <none>         <none>
neutron-l3-agent-controller-0-937646f6-fk5hc       0/1   Init:0/1         1        9h    192.168.204.2  controller-0 <none>         <none>
neutron-metadata-agent-controller-0-937646f6-nzkxz 0/1   Init:0/2         1        9h    192.168.204.2  controller-0 <none>         <none>
neutron-ovs-agent-controller-0-937646f6-dbfc2      0/1   Init:0/3         1        9h    192.168.204.2  controller-0 <none>         <none>
neutron-server-7c9678cf58-dq52p                    0/1   Init:0/1         0        26m   172.16.192.74  controller-0 <none>         <none>
neutron-server-7c9678cf58-s85dg                    0/1   CrashLoopBackOff 8        9h    172.16.166.191 controller-1 <none>         <none>
neutron-sriov-agent-controller-0-937646f6-qxt9g    0/1   Init:0/2         1        9h    192.168.204.2  controller-0 <none>         <none>
nova-api-metadata-b9b4fdb9b-d2gr6                  0/1   CrashLoopBackOff 8        9h    172.16.166.177 controller-1 <none>         <none>
nova-api-metadata-b9b4fdb9b-kg859                  0/1   Init:0/2         0        26m   172.16.192.110 controller-0 <none>         <none>
nova-api-osapi-856679d49f-4ljnl                    0/1   Init:0/1         0        26m   172.16.192.117 controller-0 <none>         <none>
nova-compute-controller-0-937646f6-9lrqs           0/2   Init:0/6         1        9h    192.168.204.2  controller-0 <none>         <none>
nova-conductor-6cbc75dd89-nxvwc                    0/1   Init:0/1         0        26m   172.16.192.66  controller-0 <none>         <none>
nova-novncproxy-5bd676cfc4-82r8x                   0/1   Init:0/3         0        26m   172.16.192.72  controller-0 <none>         <none>
nova-scheduler-7fbf5cdd4-ckmkd                     0/1   CrashLoopBackOff 6        9h    172.16.166.172 controller-1 <none>         <none>
nova-scheduler-7fbf5cdd4-h65j5                     0/1   Init:0/1         0        26m   172.16.192.121 controller-0 <none>         <none>
nova-service-cleaner-1591189200-927h5              0/1   Init:0/1         0        6m56s 172.16.192.92  controller-0 <none>         <none>

Severity
--------
Critical: openstack is unusable

Steps to Reproduce
------------------
1. Reboot both controllers with 'reboot -f'
2. Wait for them to come back up

Expected Behavior
-----------------
'openstack endpoint list' should work

Actual Behavior
---------------
[root@controller-1 sysadmin(keystone_admin)]# openstack endpoint list
Failed to discover available identity versions when contacting http://keystone.openstack.svc.cluster.local/v3. Attempting to parse version from URL.
Service Unavailable (HTTP 503)

Reproducibility
---------------
100% reproducible

System Configuration
--------------------
AIO-DX ipv4

Branch/Pull Time/Commit
-----------------------
master

Test Activity
-------------
Developer Testing

Workaround
----------
None
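The pod listing in the description can be summarized offline to see the failure pattern. A minimal sketch (not part of the original report; the temp filename and sample rows are illustrative, copied from the listing above) that tallies pod STATUS per node:

```shell
# Offline triage sketch: tally kubectl pod STATUS values per NODE
# from a saved listing. Sample rows copied from the report above;
# column 3 is STATUS, column 7 is NODE in 'kubectl get pods -o wide'.
cat > /tmp/openstack-pods.txt <<'EOF'
fm-rest-api-8b5b97bf8-v5wz9     0/1 CrashLoopBackOff 9 8h  172.16.166.140 controller-1
mariadb-server-0                0/1 CrashLoopBackOff 8 9h  172.16.166.158 controller-1
neutron-server-7c9678cf58-s85dg 0/1 CrashLoopBackOff 8 9h  172.16.166.191 controller-1
keystone-api-6c76774bf7-l7c4d   0/1 Init:0/1         0 26m 172.16.192.113 controller-0
EOF
awk '{print $7, $3}' /tmp/openstack-pods.txt | sort | uniq -c
```

Applied to the full listing, every CrashLoopBackOff pod sits on controller-1 alongside mariadb-server-0, while the controller-0 pods are stuck in Init, consistent with services waiting on the database.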
2020-06-03 16:06:11 Ghada Khalil tags stx.distro.openstack
2020-06-03 19:41:08 Ghada Khalil tags stx.distro.openstack stx.4.0 stx.distro.openstack
2020-06-03 19:41:13 Ghada Khalil starlingx: importance Undecided High
2020-06-03 19:41:15 Ghada Khalil starlingx: status New Triaged
2020-06-03 19:41:27 Ghada Khalil bug added subscriber Bill Zvonar
2020-06-03 19:41:41 Ghada Khalil starlingx: assignee yong hu (yhu6)
2020-06-04 11:11:52 Bill Zvonar removed subscriber Bill Zvonar
2020-06-04 11:12:00 Bill Zvonar bug added subscriber Frank Miller
2020-06-05 02:14:11 zhipeng liu starlingx: assignee yong hu (yhu6) zhipeng liu (zhipengs)
2020-06-05 07:06:10 zhipeng liu starlingx: status Triaged Confirmed
2020-07-14 12:50:55 yong hu tags stx.4.0 stx.distro.openstack stx.4.0 stx.distro.openstack stx.retestneeded
2020-07-23 02:39:03 yong hu starlingx: status Confirmed Fix Released
2021-10-27 13:58:51 Ghada Khalil tags stx.4.0 stx.distro.openstack stx.retestneeded stx.4.0 stx.distro.openstack