@yhu6, we have retested this bug. The steps are:
1. Create an instance on compute-0 (with any flavor extra specs and image properties).
2. Reboot compute-0.
3. Wait for the evacuation to succeed.
4. Wait for compute-0's services to come back up.
5. Live migrate the instance back to compute-0 (a sketch of the commands for steps 1 and 5 follows this list).
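For reference, the commands for steps 1 and 5 look roughly like the following; the flavor, image, network and instance names are placeholders, and the exact live-migration syntax depends on the python-openstackclient version:

# step 1: boot the instance directly on compute-0 (admin-only zone:host syntax);
# <flavor>, <image>, <network> and test-vm are placeholders
openstack server create --flavor <flavor> --image <image> --network <network> \
  --availability-zone nova:compute-0 test-vm
# step 5: once compute-0 is back, live migrate the instance to it
openstack server migrate --live compute-0 test-vm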
At the beginning, our test showed that live migration works well. Then we found that our step 4 differs from the reporter's steps. We use `openstack compute service list` to make sure that compute-0's service is available.
controller-0:~$ openstack compute service list
+----+------------------+-----------------------------------+----------+---------+-------+----------------------------+
| ID | Binary           | Host                              | Zone     | Status  | State | Updated At                 |
+----+------------------+-----------------------------------+----------+---------+-------+----------------------------+
| 29 | nova-compute     | compute-1                         | nova     | enabled | up    | 2019-07-31T13:55:37.000000 |
| 32 | nova-compute     | compute-0                         | nova     | enabled | up    | 2019-07-31T13:55:37.000000 |
| 50 | nova-consoleauth | nova-consoleauth-748bffc767-vslb5 | internal | enabled | up    | 2019-07-31T13:55:38.000000 |
| 52 | nova-conductor   | nova-conductor-5977dbb7c5-jjfwm   | internal | enabled | up    | 2019-07-31T13:55:35.000000 |
| 54 | nova-scheduler   | nova-scheduler-6f78459858-gbtnb   | internal | enabled | up    | 2019-07-31T13:55:36.000000 |
| 61 | nova-consoleauth | nova-consoleauth-748bffc767-s5lsp | internal | enabled | up    | 2019-07-31T13:55:32.000000 |
| 62 | nova-scheduler   | nova-scheduler-6f78459858-46shx   | internal | enabled | up    | 2019-07-31T13:55:39.000000 |
| 63 | nova-conductor   | nova-conductor-5977dbb7c5-l5f6x   | internal | enabled | up    | 2019-07-31T13:55:39.000000 |
+----+------------------+-----------------------------------+----------+---------+-------+----------------------------+
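For step 4, a convenient way to keep an eye on just the compute services is something along these lines (a sketch, not necessarily the exact command we ran):

watch -n5 "openstack compute service list --service nova-compute"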
We also check the status of the Kubernetes pods.
controller-0:/home/sysadmin# kubectl get pod -n openstack
NAME                                                 READY   STATUS      RESTARTS   AGE
cinder-api-64955798b7-9c9mh                          1/1     Running     0          4h47m
cinder-api-64955798b7-gbjlx                          1/1     Running     1          10h
cinder-backup-77fb945bb-6qpz5                        1/1     Running     0          10h
cinder-backup-77fb945bb-mxg78                        1/1     Running     0          4h47m
cinder-scheduler-974d7bc9b-nbkxl                     1/1     Running     0          10h
cinder-scheduler-974d7bc9b-w259r                     1/1     Running     0          4h47m
cinder-volume-56fd7994cc-vsb2h                       1/1     Running     0          10h
cinder-volume-56fd7994cc-xwr9w                       1/1     Running     0          4h47m
cinder-volume-usage-audit-1564580700-qfqj7           0/1     Completed   0          12m
cinder-volume-usage-audit-1564581000-lp9vj           0/1     Completed   0          7m33s
cinder-volume-usage-audit-1564581300-lmmk4           0/1     Completed   0          2m30s
glance-api-7574db7ff9-pfg9s                          1/1     Running     1          10h
glance-api-7574db7ff9-q8z7m                          1/1     Running     1          4h47m
heat-api-7c5c769c99-9sgzw                            1/1     Running     0          4h47m
heat-api-7c5c769c99-lhrv7                            1/1     Running     1          10h
heat-cfn-7f76797c9f-8nptm                            1/1     Running     1          10h
heat-cfn-7f76797c9f-v8xqn                            1/1     Running     0          4h47m
heat-engine-6b94b76595-7j28d                         1/1     Running     0          10h
heat-engine-6b94b76595-dz8vx                         1/1     Running     0          4h47m
heat-engine-cleaner-1564580700-4d8wm                 0/1     Completed   0          12m
heat-engine-cleaner-1564581000-br2zd                 0/1     Completed   0          7m33s
heat-engine-cleaner-1564581300-tzmmj                 0/1     Completed   0          2m30s
horizon-6fcbcdcbfb-8cpw8                             1/1     Running     0          4h41m
ingress-bdfbc4ccc-2wlnk                              1/1     Running     0          4h47m
ingress-bdfbc4ccc-w9kxv                              1/1     Running     0          10h
ingress-error-pages-7b789b5df8-hrtr9                 1/1     Running     0          4h47m
ingress-error-pages-7b789b5df8-qbtfb                 1/1     Running     0          10h
keystone-api-6fffbb6f7c-hgmlk                        1/1     Running     5          10h
keystone-api-6fffbb6f7c-qcd62                        1/1     Running     0          4h47m
keystone-fernet-rotate-1564574400-6xfpv              0/1     Completed   0          117m
libvirt-libvirt-default-k9qg2                        1/1     Running     2          6d7h
libvirt-libvirt-default-v5pgf                        1/1     Running     2          6d7h
mariadb-ingress-6ff964556d-5btgl                     1/1     Running     0          10h
mariadb-ingress-6ff964556d-bjmwh                     1/1     Running     0          4h47m
mariadb-ingress-error-pages-764cfd869b-bdv94         1/1     Running     0          8h
mariadb-server-0                                     1/1     Running     0          4h40m
mariadb-server-1                                     1/1     Running     0          10h
neutron-dhcp-agent-compute-0-75ea0372-vv4xq          1/1     Running     2          6d7h
neutron-dhcp-agent-compute-1-271fedba-74znj          1/1     Running     2          6d7h
neutron-l3-agent-compute-0-75ea0372-5hlh6            1/1     Running     2          6d7h
neutron-l3-agent-compute-1-eae26dba-h8v4q            1/1     Running     2          6d7h
neutron-metadata-agent-compute-0-75ea0372-dlwq2      1/1     Running     2          6d7h
neutron-metadata-agent-compute-1-eae26dba-2wfg2      1/1     Running     2          6d7h
neutron-ovs-agent-compute-0-75ea0372-b9kqh           1/1     Running     2          6d7h
neutron-ovs-agent-compute-1-eae26dba-p58rm           1/1     Running     24         6d7h
neutron-server-df4d6757b-wnq2l                       1/1     Running     1          10h
neutron-server-df4d6757b-xjlrt                       1/1     Running     0          4h47m
neutron-sriov-agent-compute-0-75ea0372-rwcp7         1/1     Running     2          6d7h
neutron-sriov-agent-compute-1-eae26dba-f76mk         1/1     Running     2          6d7h
nova-api-metadata-7c64c8458f-467r6                   1/1     Running     1          10h
nova-api-metadata-7c64c8458f-5q645                   1/1     Running     0          4h47m
nova-api-osapi-56798b4b68-62bf9                      1/1     Running     0          10h
nova-api-osapi-56798b4b68-8ljzh                      1/1     Running     1          4h47m
nova-api-proxy-598684846d-7jtsv                      1/1     Running     0          8h
nova-compute-compute-0-75ea0372-kpzgj                2/2     Running     4          6d7h
nova-compute-compute-1-eae26dba-w4f9m                2/2     Running     26         6d7h
nova-conductor-5977dbb7c5-jjfwm                      1/1     Running     0          10h
nova-conductor-5977dbb7c5-l5f6x                      1/1     Running     0          4h47m
nova-consoleauth-748bffc767-s5lsp                    1/1     Running     0          4h47m
nova-consoleauth-748bffc767-vslb5                    1/1     Running     0          10h
nova-novncproxy-ff6456d77-jksz7                      1/1     Running     0          4h47m
nova-novncproxy-ff6456d77-qthmd                      1/1     Running     1          10h
nova-scheduler-6f78459858-46shx                      1/1     Running     0          4h47m
nova-scheduler-6f78459858-gbtnb                      1/1     Running     0          10h
openvswitch-db-6nctl                                 1/1     Running     2          6d7h
openvswitch-db-jghgk                                 1/1     Running     2          6d7h
openvswitch-vswitchd-4vfp2                           1/1     Running     2          6d7h
openvswitch-vswitchd-qv9bz                           1/1     Running     2          6d7h
osh-openstack-garbd-garbd-8d64b6886-498qt            1/1     Running     0          3h54m
osh-openstack-memcached-memcached-6c94979765-rtsmc   1/1     Running     0          8h
osh-openstack-rabbitmq-rabbitmq-0                    1/1     Running     0          4h41m
osh-openstack-rabbitmq-rabbitmq-1                    1/1     Running     1          10h
placement-api-5798c855bc-8hq8b                       1/1     Running     0          10h
placement-api-5798c855bc-l62f6                       1/1     Running     0          4h47m
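To narrow this down to compute-0 instead of scanning the whole namespace, a filter along these lines works (a sketch; with `-o wide`, pods merely running on the compute-0 node also match, which is convenient here):

kubectl get pod -n openstack -o wide | grep compute-0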
When both show that compute-0 is ready, we start the live migration and it works well.
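As a sanity check, something like the following confirms which host the instance landed on after the migration (the instance name is a placeholder):

openstack server show test-vm -f value -c "OS-EXT-SRV-ATTR:host"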
@anujeyan is using StarlingX's command to ensure that compute-0's service is available, so we tried testing with that method as well, using `watch -n5 "system host-list"` to monitor compute-0's status.
When compute-0 starts rebooting, it shows as offline; after waiting for a while, it shows as available. At that moment, `openstack compute service list` reports compute-0's status as 'enabled' but its state still 'down', and `kubectl get pod -n openstack` shows that compute-0's nova-compute pod is still in Init.
If we try to live-migrate the instance to compute-0 at this point, it fails, and the log shows the scheduler failing with "NoValidHost: No valid host was found", the same as what @anujeyan uploaded.
@yong, can you re-test it and make sure the nova-compute service is available before the live migration?
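For reference, a rough way to wait for the service itself rather than the host state is something like this (a sketch, assuming admin credentials are sourced; not necessarily the exact command we used):

# poll until compute-0's nova-compute reports state "up"
until openstack compute service list --host compute-0 --service nova-compute -f value -c State | grep -qx up; do
  sleep 5
done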
If it still does not work, please paste the nova-scheduler debug log here.
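A sketch of how to pull it from the scheduler pods (the pod names below are taken from the listing above and will differ on your system):

kubectl -n openstack logs nova-scheduler-6f78459858-gbtnb | grep -i novalidhost
kubectl -n openstack logs nova-scheduler-6f78459858-46shx | grep -i novalidhost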