VM boot-up ERROR: "Failed to allocate the network(s), not rescheduling"

Bug #1833622 reported by Peng Peng
Affects: StarlingX
Status: Fix Released
Importance: High
Assigned to: zhipeng liu

Bug Description

Brief Description
-----------------
Failed to boot an ubuntu_14 VM with virtio vifs. The fault message shows: "Failed to allocate the network(s), not rescheduling."

Severity
--------
Critical

Steps to Reproduce
------------------
Create a flavor with a dedicated CPU policy
Create a volume from the ubuntu_14 image
Boot an ubuntu_14 VM with virtio vifs from the above flavor and volume (see the CLI sketch below)

TC-name: test_ping_vms.py::test_ping_between_two_vms[ubuntu_14-virtio-virtio]
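For reference, a minimal CLI sketch of the three steps above (assuming tenant credentials are already sourced; the flavor, volume, and network names below are illustrative placeholders, not the ones used in this run):

# Step 1: create a flavor with a dedicated CPU policy
openstack flavor create --vcpus 2 --ram 1024 --disk 2 \
  --property hw:cpu_policy=dedicated flavor-dedicated

# Step 2: create a bootable volume from the ubuntu_14 image
openstack volume create --image ubuntu_14 --size 10 vol-ubuntu14

# Step 3: boot the VM from that flavor and volume with a virtio NIC
# (virtio is the default vif model, so no extra NIC option is needed)
openstack server create --flavor flavor-dedicated --volume vol-ubuntu14 \
  --nic net-id=<tenant-net-uuid> tenant1-ubuntu14-test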

Expected Behavior
------------------
VM boots up successfully

Actual Behavior
----------------
VM fails to boot

Reproducibility
---------------
Reproducible

System Configuration
--------------------
Multi-node system

Lab-name: WCP_71-75

Branch/Pull Time/Commit
-----------------------
stx master as of 20190620T013000Z

Last Pass
---------
Lab: WCP_113_121
Load: 20190620T013000Z

Timestamp/Logs
--------------
[2019-06-20 18:44:25,213] 268 DEBUG MainThread ssh.send :: Send 'nova --os-username 'tenant1' --os-password 'Li69nux*' --os-project-name tenant1 --os-auth-url http://keystone.openstack.svc.cluster.local/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne boot --boot-volume=c3078008-36e8-4fad-8924-4e45e48d4bee --key-name=keypair-tenant1 --user-data=/home/sysadmin//userdataubuntu_1nic_cloud_init_if_conf.sh --poll --flavor=313e7a49-133d-4575-9642-f77dec94d702 --nic net-id=6fa3c05a-b858-42f9-a228-fd8005f8b52e tenant1-ubuntu_14_vifs-3'
[2019-06-20 18:49:40,364] 387 DEBUG MainThread ssh.expect :: Output:
+--------------------------------------+------------------------------------------------------------+
| Property | Value |
+--------------------------------------+------------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | |
| OS-EXT-STS:power_state | 0 |

Server building... 0% complete
Server building... 0% complete
Server building... 0% complete
Error building server
ERROR (ResourceInErrorState):
controller-0:~$
[2019-06-20 18:49:40,365] 268 DEBUG MainThread ssh.send :: Send 'echo $?'
[2019-06-20 18:49:40,468] 387 DEBUG MainThread ssh.expect :: Output:
1
controller-0:~$

[2019-06-20 18:49:40,470] 268 DEBUG MainThread ssh.send :: Send 'openstack --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://keystone.openstack.svc.cluster.local/v3 --os-user-domain-name Default --os-project-domain-name Default --os-identity-api-version 3 --os-interface internal --os-region-name RegionOne server show f47ce471-f0ca-4dc3-93b6-53ff6cf9ba4b'
[2019-06-20 18:49:42,517] 387 DEBUG MainThread ssh.expect :: Output:
-+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | |
| OS-EXT-SRV-ATTR:host | None |
| OS-EXT-SRV-ATTR:hypervisor_hostname | None |
| OS-EXT-SRV-ATTR:instance_name | instance-00000016 |
| created | 2019-06-20T18:44:27Z |
| fault | {u'message': u'Build of instance f47ce471-f0ca-4dc3-93b6-53ff6cf9ba4b aborted: Failed to allocate the network(s), not rescheduling.', u'code': 500, u'details': u' File "/var/lib/openstack/lib/python2.7/site-packages/nova/compute/manager.py", line 1963, in _do_build_and_run_instance\n filter_properties, request_spec)\n File "/var/lib/openstack/lib/python2.7/site-packages/nova/compute/manager.py", line 2301, in _build_and_run_instance\n reason=msg)\n', u'created': u'2019-06-20T18:49:36Z'} |
| flavor | dedicated (313e7a49-133d-4575-9642-f77dec94d702)

Test Activity
-------------
Sanity

zhipeng liu (zhipengs)
Changed in starlingx:
assignee: nobody → zhipeng liu (zhipengs)
Revision history for this message
zhipeng liu (zhipengs) wrote :

From build.info in the log, I can see the info below.
OS="centos"
SW_VERSION="19.01"
BUILD_TARGET="Host Installer"
BUILD_TYPE="Formal"
BUILD_ID="20190620T013000Z"

This daily build has not started the placement pod, which will impact VM creation.
This may be a duplicate of bug 1833497.
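A quick way to verify this on a live system would be something like the following (a sketch; it assumes kubectl access on the active controller and that the placement chart runs in the openstack namespace, as it does here):

# list the placement pods and confirm they are Running
kubectl -n openstack get pods | grep placement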

Revision history for this message
Ghada Khalil (gkhalil) wrote :

@Zhipeng, I thought the placement pod not starting is only an issue when vswitch_type is set to none. This is confirmed by the sanity report Maria sent on 2019-06-20.

@Peng, can you confirm the vswitch type for this occurrence?
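For reference, a sketch of how the vswitch type can be checked from the active controller (this assumes the vswitch_type field is reported by 'system show', which may vary by release):

# load the platform credentials, then query the system configuration
source /etc/platform/openrc
system show | grep vswitch_type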

tags: added: stx.containers stx.sanity
Changed in starlingx:
status: New → Incomplete
Revision history for this message
Peng Peng (ppeng) wrote :

VSWITCH_TYPE : ovs-dpdk

Revision history for this message
Peng Peng (ppeng) wrote :

Issue was reproduced on
Lab: WCP_71_75
Load: 20190623T233000Z

VSWITCH_TYPE : ovs-dpdk

New log attached

Revision history for this message
Ghada Khalil (gkhalil) wrote :

Given that the issue is reproduced again with a more recent load, this proves that this is not a duplicate of https://bugs.launchpad.net/starlingx/+bug/1833497 as the fix for that bug was merged on 2019-06-20.

Zhipeng, please review the provided logs to determine why nova is returning a failure on the network allocation. If you need help from the networking team, please contact Forrest Zhao.

Changed in starlingx:
status: Incomplete → New
importance: Undecided → High
status: New → Triaged
Revision history for this message
Peng Peng (ppeng) wrote :

The last successful VM boot-up was on:
Lab: WCP_71_75
Load: 20190612T013000Z

Revision history for this message
zhipeng liu (zhipengs) wrote :

OK, I will check further and also involve Forrest's team.

Zhipeng

Revision history for this message
ChenjieXu (midone) wrote :

Hi Peng,

Is it possible to create a normal VM on your multi-node system? I notice that Nova can't connect to the database in both ALL_NODES_20190620.210149.tar and ALL_NODES_20190624.161925.tar. If Nova really can't connect to the database, no VM can be created.

You can find the logs at the following paths:
ALL_NODES_20190624.161925/controller-0_20190624.161925/var/log/containers/nova-scheduler-755c98cf54-bqhqg_openstack_nova-scheduler-e5a2a9a3b58aa25e1684c73caf9525af925e342396bd668a2c03e30f5173aac6.log
ALL_NODES_20190624.161925/controller-0_20190624.161925/var/log/containers/nova-conductor-6787ff97bc-qrklj_openstack_nova-conductor-94e02872794725b867494c9872c98a525e6bcdc196e13db2092ed138f61f7e01.log
ALL_NODES_20190620.210149/controller-1_20190620.210149/var/log/containers/nova-conductor-78c98f5d74-mbgxt_openstack_nova-conductor-9e807cb9e82a0def0ac6dcba6d19525d875290f265a500bfd6a72aa29e6714e7.log
ALL_NODES_20190620.210149/controller-1_20190620.210149/var/log/containers/nova-scheduler-8489d97b99-wls9s_openstack_nova-scheduler-bfc78d486a114c955b38fc8e8deaeb9de40fed9e5cc459ac7a4d6c5c04b05708.log

{"log":"2019-06-20 19:59:00.612 1 ERROR oslo_db.sqlalchemy.engines [req-70252872-3d13-472c-bfc7-a79e30e2ae48 - - - - -] Database connection was found disconnected; reconnecting: DBConnectionError: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query') [SQL: u'SELECT 1'] (Background on this error at: http://sqlalche.me/e/e3q8)\n","stream":"stdout","time":"2019-06-20T19:59:00.613656542Z"}
{"log":"2019-06-20 19:59:00.612 1 ERROR oslo_db.sqlalchemy.engines Traceback (most recent call last):\n","stream":"stdout","time":"2019-06-20T19:59:00.613685821Z"}
{"log":"2019-06-20 19:59:00.612 1 ERROR oslo_db.sqlalchemy.engines File \"/var/lib/openstack/lib/python2.7/site-packages/oslo_db/sqlalchemy/engines.py\", line 73, in _connect_ping_listener\n","stream":"stdout","time":"2019-06-20T19:59:00.613689557Z"}
{"log":"2019-06-20 19:59:00.612 1 ERROR oslo_db.sqlalchemy.engines connection.scalar(select([1]))\n","stream":"stdout","time":"2019-06-20T19:59:00.613692685Z"}
{"log":"2019-06-20 19:59:00.612 1 ERROR oslo_db.sqlalchemy.engines File \"/var/lib/openstack/lib/python2.7/site-packages/sqlalchemy/engine/base.py\", line 912, in scalar\n","stream":"stdout","time":"2019-06-20T19:59:00.613695482Z"}
{"log":"2019-06-20 19:59:00.612 1 ERROR oslo_db.sqlalchemy.engines return self.execute(object_, *multiparams, **params).scalar()\n","stream":"stdout","time":"2019-06-20T19:59:00.613698414Z"}
{"log":"2019-06-20 19:59:00.612 1 ERROR oslo_db.sqlalchemy.engines File \"/var/lib/openstack/lib/python2.7/site-packages/sqlalchemy/engine/base.py\", line 980, in execute\n","stream":"stdout","time":"2019-06-20T19:59:00.613701252Z"}

Revision history for this message
zhipeng liu (zhipengs) wrote :

Hi Chenjie,
The sanity test already includes VM creation, and I also tested VM creation last week; it should be OK.

Zhipeng

Revision history for this message
Peng Peng (ppeng) wrote :

The first VM creation succeeded during the whole test suite run, but the rest of the VM creations all failed.

[2019-06-20 17:52:41,191] 268 DEBUG MainThread ssh.send :: Send 'nova --os-username 'tenant1' --os-password 'Li69nux*' --os-project-name tenant1 --os-auth-url http://keystone.openstack.svc.cluster.local/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne boot --boot-volume=ed3cef11-0f1f-4005-b8aa-f4acc26df0b8 --key-name=keypair-tenant1 --poll --flavor=e5282554-4e4f-483b-9cd8-4491a6364a4c --nic net-id=6fa3c05a-b858-42f9-a228-fd8005f8b52e --nic net-id=ccaedbe7-495f-4aa1-a05c-c0d1bf611a1a tenant1-vm-1'
[2019-06-20 17:53:00,223] 387 DEBUG MainThread ssh.expect :: Output:
+--------------------------------------+-------------------------------------------------+
| Property | Value |
+--------------------------------------+-------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | |
| OS-EXT-STS:power_state | 0 |
| OS-EXT-STS:task_state | scheduling |
| OS-EXT-STS:vm_state | building |
| OS-SRV-USG:launched_at | - |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| adminPass | m8XPHUJi5d3i |
| config_drive | |
| created | 2019-06-20T17:52:44Z |
| description | - |
| flavor:disk | 2 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {"hw:mem_page_size": "2048"} |
| flavor:original_name | flavor-default-size2 |
| flavor:ram | 1024 |
| flavor:swap | 0 |
| flavor:vcpus | 1 |
| hostId | |
| id | 1174d120-8b62-4fa9-af24-08734d1bc7f2 |
| image | Attempt to boot from volume - no image supplied |
| key_name | keypair-tenant1 ...


Revision history for this message
ChenjieXu (midone) wrote :

Hi Peng,

Was the first VM you created the ubuntu_14 VM with virtio vifs? For the remaining VMs, what kind of VMs did you try to create? I'm wondering whether this is related to "creating an ubuntu_14 VM with virtio vifs" or not.

Revision history for this message
Peng Peng (ppeng) wrote :

The test suite tried to launch 11 VMs. The first VM launch used the tis-centos-guest image and succeeded. The following 8 VM launches failed. The images included:
tis-centos-guest vm with virtio vifs
ubuntu_14 vm with virtio vifs

The last two VM launches also succeeded.

Revision history for this message
zhipeng liu (zhipengs) wrote :

Hi Peng,
I need your confirmation on the following:

The first VM launched with tis-centos-guest successfully.
For the following 8 VMs, which image was used, ubuntu_14?
For the last 2 VMs, which image was used?

Thanks!
Zhipeng

Revision history for this message
Peng Peng (ppeng) wrote :

The images varied; please search "RegionOne boot" in the attached execution log for details.
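For example, something like this over the attached execution log (the TIS_AUTOMATION.log file name is taken from a later comment; adjust to the actual attachment name):

# list every boot command issued during the run, including the flavor/volume/net IDs used
grep "RegionOne boot" TIS_AUTOMATION.log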

Ghada Khalil (gkhalil)
tags: added: stx.2.0
Revision history for this message
ChenjieXu (midone) wrote :

Hi Pengpeng,

Based on the log TIS_AUTOMATION.log, this bug is caused by "system application-apply stx-openstack" getting stuck. It is not caused by booting an ubuntu_14 VM with virtio vifs.

The log can be summarized as follows:
1. The first VM was booted successfully.

2. stx-openstack was re-applied for some reason, but unfortunately the process got stuck at "processing chart: osh-openstack-ceph-rgw, overall completion: 44.0%".

[2019-06-20 18:43:13,476] 268 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.2:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne application-list'
[2019-06-20 18:43:14,997] 387 DEBUG MainThread ssh.expect :: Output:
+---------------------+------------------------------+-------------------------------+----------------+---------------+------------------------------------------------------------------------------------------------------------------------------+
| application | version | manifest name | manifest file | status | progress |
+---------------------+------------------------------+-------------------------------+----------------+---------------+------------------------------------------------------------------------------------------------------------------------------+
| hello-kitty | 1.0 | hello-kitty | manifest.yaml | upload-failed | Upload of application hello-kitty (1.0) failed: Command '['helm-upload', 'starlingx', u'/scratch/apps/hello-kitty/1.0/charts |
| | | | | | /hello-kitty.tgz']' returned non-zero exit status 1 |
| | | | | | |
| platform-integ-apps | 1.0-7 | platform-integration-manifest | manifest.yaml | applied | completed |
| stx-openstack | 1.0-16-centos-stable- | armada-manifest | stx-openstack. | applying | processing chart: osh-openstack-ceph-rgw, overall completion: 44.0% |
| | versioned | | yaml | | |
| | | | | | ...
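For reference, the apply status can be monitored and, once the underlying issue is resolved, re-triggered with the standard StarlingX commands (a sketch; whether a plain re-apply is safe here depends on the root cause):

# watch the overall progress of the application
system application-list

# re-trigger the apply after the blocking condition is cleared
system application-apply stx-openstack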

Revision history for this message
ChenjieXu (midone) wrote :

Hi all,

An email has been sent to Tee Ngo, who is responsible for the following bug, asking him to take a look at this one. That bug is caused by rebooting the host before system application-apply completes successfully:
https://bugs.launchpad.net/starlingx/+bug/1833323

I think we need a better mechanism to deal with the failures of system application-apply.

Revision history for this message
ChenjieXu (midone) wrote :

Hi all,

According to Shuicheng, this bug may be a duplicate of the following bug:
https://bugs.launchpad.net/starlingx/+bug/1833609

Revision history for this message
ChenjieXu (midone) wrote :

Hi all,

The following logs are similar to those in the bug (https://bugs.launchpad.net/starlingx/+bug/1833609):

2019-06-24 10:51:58.243 36 INFO armada.handlers.wait [-] [chart=openstack-ceph-rgw]: Waiting for resource type=pod, namespace=openstack labels=release_group=osh-openstack-ceph-rgw required=True for 1800s
2019-06-24 10:51:58.243 36 DEBUG armada.handlers.wait [-] [chart=openstack-ceph-rgw]: Starting to wait on: namespace=openstack, resource type=pod, label_selector=(release_group=osh-openstack-ceph-rgw), timeout=1800 _watch_resource_completions /usr/local/lib/python3.6/dist-packages/armada/handlers/wait.py:362

You can find the above logs in:
ALL_NODES_20190624.161925/controller-1_20190624.161925/var/log/armada/stx-openstack-apply.log
ALL_NODES_20190620.210149/controller-1_20190620.210149/var/log/armada/stx-openstack-apply.log
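The same wait condition can also be inspected directly on the cluster; a sketch using the label selector from the Armada log above:

# list the pods Armada is waiting on and check why any of them are not Ready
kubectl -n openstack get pods -l release_group=osh-openstack-ceph-rgw
kubectl -n openstack describe pods -l release_group=osh-openstack-ceph-rgw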

Revision history for this message
ChenjieXu (midone) wrote :

Hi all,

Shuicheng has confirmed that this bug should be a duplicate of the bug (https://bugs.launchpad.net/starlingx/+bug/1833609), per the following log:

2019-06-24 10:51:58.243 36 INFO armada.handlers.wait [-] [chart=openstack-ceph-rgw]: Waiting for resource type=pod, namespace=openstack labels=release_group=osh-openstack-ceph-rgw required=True for 1800s
2019-06-24 10:51:58.243 36 DEBUG armada.handlers.wait [-] [chart=openstack-ceph-rgw]: Starting to wait on: namespace=openstack, resource type=pod, label_selector=(release_group=osh-openstack-ceph-rgw), timeout=1800 _watch_resource_completions /usr/local/lib/python3.6/dist-packages/armada/handlers/wait.py:362

Revision history for this message
Ghada Khalil (gkhalil) wrote :

Duplicate bug is fixed by: https://review.opendev.org/670445
Merged on 2019-07-17

Marking as Fix Released

Changed in starlingx:
status: Triaged → Fix Released