
Zun API status Error

Bug #1930801 reported by ty
This bug affects 1 person
Affects: Zun
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

When I try to start a container from the controller by specifying the host of a compute node via the Zun API, the container's status becomes "Error".

| status | Error
| status_reason | There are not enough hosts available.

I checked the status of both hosts.

"compute5" is a compute node
"compute1" is the controller node
If I use zun-client on the compute node, i can start the container.
The error occurs only when the host is specified from the controller.

root@username:~# openstack appcontainer host show compute5
+----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| uuid | 8117ee69-981b-4130-9943-d525baa3e55f |
| links | [{'href': 'http://controller:9517/v1/hosts/8117ee69-981b-4130-9943-d525baa3e55f', 'rel': 'self'}, {'href': 'http://controller:9517/hosts/8117ee69-981b-4130-9943-d525baa3e55f', 'rel': 'bookmark'}] |
| hostname | compute5 |
| mem_total | 64242 |
| mem_used | 1845 |
| total_containers | 0 |
| cpus | 6 |
| cpu_used | 0.0 |
| architecture | x86_64 |
| os_type | linux |
| os | Ubuntu 20.04.2 LTS |
| kernel_version | 5.8.0-53-generic |
| labels | {} |
| disk_total | 365 |
| disk_used | 0 |
| disk_quota_supported | False |
| runtimes | ['io.containerd.runc.v2', 'io.containerd.runtime.v1.linux', 'runc'] |
| enable_cpu_pinning | False |
+----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
root@username:~# openstack appcontainer host show compute1
+----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| uuid | 2149f13e-d178-417d-9e89-93d6d5d20d42 |
| links | [{'href': 'http://controller:9517/v1/hosts/2149f13e-d178-417d-9e89-93d6d5d20d42', 'rel': 'self'}, {'href': 'http://controller:9517/hosts/2149f13e-d178-417d-9e89-93d6d5d20d42', 'rel': 'bookmark'}] |
| hostname | compute1 |
| mem_total | 32074 |
| mem_used | 6158 |
| total_containers | 0 |
| cpus | 4 |
| cpu_used | 0.0 |
| architecture | x86_64 |
| os_type | linux |
| os | Ubuntu 20.04.2 LTS |
| kernel_version | 5.8.0-45-generic |
| labels | {} |
| disk_total | 706 |
| disk_used | 0 |
| disk_quota_supported | True |
| runtimes | ['io.containerd.runc.v2', 'io.containerd.runtime.v1.linux', 'runc'] |
| enable_cpu_pinning | False |
+----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I get the same error when I specify the availability zone (an example of that form follows after the output below).

root@username:~# openstack appcontainer run --name cirros --host compute5 --net network=cca6ddf1-2b5a-4056-956f-280848174524 cirros
+-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| uuid | e70d998c-5791-4b08-a8e9-833926207fe1 |
| links | [{'href': 'http://controller:9517/v1/containers/e70d998c-5791-4b08-a8e9-833926207fe1', 'rel': 'self'}, {'href': 'http://controller:9517/containers/e70d998c-5791-4b08-a8e9-833926207fe1', 'rel': 'bookmark'}] |
| name | cirros2 |
| project_id | 7133870e4213486abaace91149ea0771 |
| user_id | 9a6140daebee4084bf1b15a2eaf9835e |
| image | cirros |
| cpu | 1.0 |
| cpu_policy | shared |
| memory | 512 |
| command | [] |
| status | Error |
| status_reason | There are not enough hosts available. |
| task_state | None |
| environment | {} |
| workdir | None |
| auto_remove | False |
| ports | None |
| hostname | None |
| labels | {} |
| addresses | None |
| image_pull_policy | None |
| host | None |
| restart_policy | None |
| status_detail | None |
| interactive | False |
| tty | False |
| image_driver | docker |
| security_groups | None |
| runtime | None |
| disk | 0 |
| auto_heal | False |
| privileged | False |
| healthcheck | None |
| registry_id | None |
| entrypoint | None |
+-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
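For reference, the availability-zone form of the same request would look roughly like the following (the exact --availability-zone flag name is assumed from the zun client and may differ between versions; <zone-name> is a placeholder):

root@username:~# openstack appcontainer run --name cirros --availability-zone <zone-name> --net network=cca6ddf1-2b5a-4056-956f-280848174524 cirros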

ty (t3y) wrote :

# openstack allocation candidate list --resource VCPU=1 --resource MEMORY_MB=512
+---+----------------------+--------------------------------------+-------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| # | allocation | resource provider | inventory used/capacity | traits |
+---+----------------------+--------------------------------------+-------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------...


ty (t3y) wrote :

This is the zun-api log:

2021-06-13 03:49:03.553 1188376 DEBUG oslo_db.sqlalchemy.engines [req-d70c2850-2991-4969-80aa-d282fc27edad - - - - -] MySQL server mode set to STRICT_TRANS_TABLES,STRICT_ALL_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,TRADITIONAL,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION _check_effective_sql_mode /usr/lib/python3/dist-packages/oslo_db/sqlalchemy/engines.py:304
2021-06-13 03:49:03.574 1188376 DEBUG zun.common.quota [req-d70c2850-2991-4969-80aa-d282fc27edad - - - - -] Getting quotas for project 7133870e4213486abaace91149ea0771. Resources:dict_keys(['containers', 'cpu', 'memory', 'disk']) _get_quotas /usr/local/lib/python3.8/dist-packages/zun/common/quota.py:156
2021-06-13 03:49:03.738 1188376 DEBUG zun.scheduler.client.query [req-d70c2850-2991-4969-80aa-d282fc27edad - - - - -] Starting to schedule for containers: ['5d1f33ce-b79d-412e-ac11-0f72f3a26eb1'] select_destinations /usr/local/lib/python3.8/dist-packages/zun/scheduler/client/query.py:47
2021-06-13 03:49:03.987 1188376 DEBUG zun.scheduler.client.report [req-d70c2850-2991-4969-80aa-d282fc27edad - - - - -] REQ: curl -g -i -X GET http://controller:8778/traits?name=in:CUSTOM_ZUN_COMPUTE_STATUS_DISABLED -H "OpenStack-API-Version: placement 1.6" -H "User-Agent: zun-api keystoneauth1/4.2.1 python-requests/2.23.0 CPython/3.8.5" -H "X-Auth-Token: {SHA256}58e03064d3433b1f314dbaa485f73ad5c651f16513a356c9f4844a163aeef04e" -H "X-Openstack-Request-Id: req-e1d0b438-8e5c-4dee-b5e1-544fff2ea45b" -H "accept: application/json" _http_log_request /usr/lib/python3/dist-packages/keystoneauth1/session.py:517
2021-06-13 03:49:04.296 1188376 DEBUG zun.scheduler.client.report [req-d70c2850-2991-4969-80aa-d282fc27edad - - - - -] RESP: [200] Connection: Keep-Alive Content-Length: 50 Content-Type: application/json Date: Sat, 12 Jun 2021 18:49:03 GMT Keep-Alive: timeout=5, max=100 Server: Apache/2.4.41 (Ubuntu) openstack-api-version: placement 1.6 vary: openstack-api-version x-openstack-request-id: req-44bfa9be-7990-40ef-8580-ef017770b52c _http_log_response /usr/lib/python3/dist-packages/keystoneauth1/session.py:548
2021-06-13 03:49:04.296 1188376 DEBUG zun.scheduler.client.report [req-d70c2850-2991-4969-80aa-d282fc27edad - - - - -] RESP BODY: {"traits": ["CUSTOM_ZUN_COMPUTE_STATUS_DISABLED"]} _http_log_response /usr/lib/python3/dist-packages/keystoneauth1/session.py:580
2021-06-13 03:49:04.296 1188376 DEBUG zun.scheduler.client.report [req-d70c2850-2991-4969-80aa-d282fc27edad - - - - -] GET call to placement for http://controller:8778/traits?name=in:CUSTOM_ZUN_COMPUTE_STATUS_DISABLED used request id req-44bfa9be-7990-40ef-8580-ef017770b52c request /usr/lib/python3/dist-packages/keystoneauth1/session.py:944
2021-06-13 03:49:04.296 1188376 DEBUG zun.scheduler.request_filter [req-d70c2850-2991-4969-80aa-d282fc27edad - - - - -] compute_status_filter request filter added forbidden trait CUSTOM_ZUN_COMPUTE_STATUS_DISABLED compute_status_filter /usr/local/lib/python3.8/dist-packages/zun/scheduler/request_filter.py:33
2021-06-13 03:49:04.299 1188376 DEBUG zun.scheduler.client.report [req-d70c2850-2991-4969-80aa-d282fc27edad - - - - -] REQ: curl -g -i -X GET http://controller:8778/alloca...


hongbin (hongbin034) wrote :

See the following message in the log:

2021-06-13 03:49:03.987 1188376 DEBUG zun.scheduler.client.report [req-d70c2850-2991-4969-80aa-d282fc27edad - - - - -] REQ: curl -g -i -X GET http://controller:8778/traits?name=in:CUSTOM_ZUN_COMPUTE_STATUS_DISABLED -H "OpenStack-API-Version: placement 1.6" -H "User-Agent: zun-api keystoneauth1/4.2.1 python-requests/2.23.0 CPython/3.8.5" -H "X-Auth-Token: {SHA256}58e03064d3433b1f314dbaa485f73ad5c651f16513a356c9f4844a163aeef04e" -H "X-Openstack-Request-Id: req-e1d0b438-8e5c-4dee-b5e1-544fff2ea45b" -H "accept: application/json" _http_log_request /usr/lib/python3/dist-packages/keystoneauth1/session.py:517

It means this zun-compute service is disabled. Could you double-check the service status via the API? The command should be something like:

$ zun service-list

ty (t3y) wrote :

Thanks for the reply.
I'll post the result of running "$ zun service-list".
When I checked the zun-api.log, I also found "CUSTOM_ZUN_COMPUTE_STATUS_DISABLED" in the log, so I checked the status, but it didn't seem to be disabled.
Nova can boot an instance when I specify a host.
However, this problem occurs when I use Zun with compute1 or compute5, which are physically different hosts, and try to boot with the "--host" or "--availability-zone" option.

~# zun service-list
+----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+
| Id | Host | Binary | State | Disabled | Disabled Reason | Updated At | Availability Zone |
+----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+
| 2 | compute0 | zun-compute | up | False | None | 2021-06-14T06:16:22.000000 | zun-compute |
| 7 | compute1 | zun-compute | up | False | None | 2021-06-14T06:16:42.000000 | zun-compute |
| 8 | compute5 | zun-compute | up | False | None | 2021-06-14T06:16:33.000000 | zun-compute |
+----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+

ty (t3y) wrote :

I'm not sure about the cause and effect here, but I found that only the first host registered in the Zun DB works.
Therefore, if you delete 'compute0' and re-register another host, you can start containers on that host (a command sketch follows below).
There may be a problem with the Zun API's resource search or host search.
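A rough sketch of that workaround, assuming the zun client provides a service-delete command and that zun-compute runs as a systemd service (both are assumptions; command and unit names may differ in your deployment):

# Delete the stale service record for the first-registered host (command name assumed from the zun client; verify with "zun help"):
$ zun service-delete compute0 zun-compute
# On the host you want to use, restart zun-compute so it re-registers itself:
$ systemctl restart zun-compute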

ty (t3y) wrote :

The problem has been solved.
This problem probably occurs when the hostname known to Placement and the hostname specified via the Zun API are different (see the check below).
Since the hostname is registered on the Zun side but not on the Placement side, no error occurs on the Zun side; however, an empty list is returned when querying for resources, resulting in the status error "There are not enough hosts available".

| zun host  | placement (nova) host | result                                 |
|-----------|-----------------------|----------------------------------------|
| hostA_zun | hostA_nova            | "There are not enough hosts available" |
| hostA     | hostA                 | No problem                             |
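One way to check for this mismatch is to compare the hostnames registered by zun-compute with the resource provider names registered in Placement (the second command assumes the osc-placement plugin is installed):

# Hostnames as registered by zun-compute:
$ openstack appcontainer host list -c hostname
# Resource provider names as registered in Placement by nova-compute (requires the osc-placement plugin):
$ openstack resource provider list -c name
# If a host appears as e.g. "hostA_zun" in the first list but "hostA_nova" in the second, scheduling with --host fails with "There are not enough hosts available".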

It was not a bug, but a problem with my configuration.
However, I think the error status information should be more detailed.
In this case, I had enough resources; there simply were no matching hosts to search.
I feel that the status reason "There are not enough hosts available" is not informative enough for this situation.
