empty nova service and hypervisor list

Bug #1682060 reported by Eduardo Gonzalez
This bug affects 2 people
| Affects                  | Status       | Importance | Assigned to      | Milestone |
| OpenStack Compute (nova) | Fix Released | Medium     | Matt Riedemann   |           |
| kolla-ansible            | Fix Released | High       | Eduardo Gonzalez |           |
| Ocata                    | Fix Released | Undecided  | Unassigned       |           |
| openstack-manuals        | Fix Released | Undecided  | Matt Riedemann   |           |

Bug Description

In current master, 'openstack compute service list' and 'openstack hypervisor list' both return an empty list (the nova CLI has the same issue).
However, the services are registered in the database:

[root@controller tools]# docker exec -ti kolla_toolbox mysql -unova -pYVDa3l8vA57Smnbu9Q5qdgSKJckNxP3Q3rYvVxsD -h 192.168.100.10 nova -e "SELECT * from services WHERE topic = 'compute'";
+---------------------+---------------------+------------+----+------------+--------------+---------+--------------+----------+---------+-----------------+---------------------+-------------+---------+
| created_at | updated_at | deleted_at | id | host | binary | topic | report_count | disabled | deleted | disabled_reason | last_seen_up | forced_down | version |
+---------------------+---------------------+------------+----+------------+--------------+---------+--------------+----------+---------+-----------------+---------------------+-------------+---------+
| 2017-04-12 09:12:10 | 2017-04-12 09:14:33 | NULL | 9 | controller | nova-compute | compute | 13 | 0 | 0 | NULL | 2017-04-12 09:14:33 | 0 | 17 |
+---------------------+---------------------+------------+----+------------+--------------+---------+--------------+----------+---------+-----------------+---------------------+-------------+---------+
[root@controller tools]# openstack compute service list --long

[root@controller tools]#
[root@controller tools]# openstack hypervisor list --long

[root@controller tools]#

Logs from kolla deploy gates http://logs.openstack.org/08/456108/1/check/gate-kolla-dsvm-deploy-centos-source-centos-7-nv/9cf1e73/

Environment:
- source code
- OS: centos/ubuntu/oraclelinux
- Deployment type: kolla-ansible

Please let me know if more info is needed or if there is a workaround.
Regards

Tags: cells doc
Matt Riedemann (mriedem) wrote :

Do you have the compute_nodes (hosts) mapped to a cell? Have you run the nova-manage cell_v2 discover_hosts command? Could be related to bug 1682001.

no longer affects: python-novaclient
Matt Riedemann (mriedem) wrote :

It's likely a result of https://review.openstack.org/#/c/442162/ which requires that the hosts are mapped to the cell before you can list them out of the API.

See the setup sections in this doc for more details on discover_hosts:

https://docs.openstack.org/developer/nova/cells.html

And:

https://docs.openstack.org/developer/nova/man/nova-manage.html#nova-cells-v2

Matt Riedemann (mriedem) wrote :

The problem described to me in the kolla IRC channel was that kolla polls 'nova service-list' until the computes show up. It does this because simple_cell_setup stops before creating the first (main) cell if the computes aren't there when it runs.

With this change in the API: https://review.openstack.org/#/c/442162/21

We iterate the cells to list services, so if you don't have the cell created yet, you'll get nothing back.

The simple_cell_setup command is not great for a fresh install, since it turns out not to be so simple. It works a bit more easily for upgrades from non-cells to cells, but for a fresh install I wouldn't use it. I'd change the tooling for a fresh install to follow the steps in this document:

https://docs.openstack.org/developer/nova/cells.html#fresh-install
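
The fresh-install ordering recommended above can be sketched as a pair of small shell helpers. This is a minimal sketch, not kolla's actual change: the function names, the NOVA_MANAGE override variable, and the cell name "cell1" are assumptions made for illustration, and it presumes the nova API and cell databases have already been synced. The subcommands themselves (map_cell0, create_cell, discover_hosts) are the ones covered in the nova-manage documentation linked earlier.

```shell
#!/bin/sh
# Sketch of the manual cells v2 setup for a fresh install.
# NOVA_MANAGE is a hypothetical override so the command can be swapped
# for a wrapper (for example a docker exec prefix); default is nova-manage.
NOVA_MANAGE="${NOVA_MANAGE:-nova-manage}"

# Create the cell mappings. This does not need any nova-compute to be up.
create_cell_mappings() {
    # Map cell0 first.
    $NOVA_MANAGE cell_v2 map_cell0 || return 1
    # Create the first (main) cell; "cell1" is an assumed default name.
    $NOVA_MANAGE cell_v2 create_cell --name "${1:-cell1}" || return 1
}

# Map compute hosts that have registered but are not yet in a cell.
# Run this only after the nova-compute services have started.
discover_compute_hosts() {
    $NOVA_MANAGE cell_v2 discover_hosts --verbose
}
```

A deployment tool would call create_cell_mappings once, start the nova-compute services, and then call discover_compute_hosts, so that the cell already exists by the time anything polls the API for services.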

Changed in nova:
status: New → Invalid
Changed in kolla-ansible:
status: New → In Progress
assignee: nobody → Eduardo Gonzalez (egonzalez90)
importance: Undecided → High
milestone: none → pike-1
Matt Riedemann (mriedem) wrote :

Hmm, I see that https://review.openstack.org/#/c/442162/ also breaks step 6 in our cells v2 install guide which tells you to start your nova-compute services and then poll for them using the 'nova hypervisor-list' command, but that relies on the hosts being mapped in the cell first, and you need to run discover_hosts for that (which is step 7 in the install guide). So we have a bit of a catch-22 here. I'm going to re-open this bug for nova simply because we have a documentation issue that needs to be fixed.

We might also need to raise that as a DocImpact bug for the install guide in openstack-manuals.

Changed in nova:
status: Invalid → In Progress
assignee: nobody → Matt Riedemann (mriedem)
tags: added: cells doc
Changed in nova:
importance: Undecided → Medium
Changed in kolla-ansible:
status: In Progress → Confirmed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla-ansible (master)

Fix proposed to branch: master
Review: https://review.openstack.org/456920

Changed in kolla-ansible:
status: Confirmed → In Progress
Matt Riedemann (mriedem) wrote :

I've added openstack-manuals to this because the install guide also tells you to use "nova hypervisor-list" to tell when a compute host is up and running, before discover_hosts has been run on it.

https://docs.openstack.org/ocata/install-guide-ubuntu/nova-compute-install.html#add-the-compute-node-to-the-cell-database

So we need to update that as well.

Changed in openstack-manuals:
status: New → Confirmed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/456923

OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-manuals (master)

Fix proposed to branch: master
Review: https://review.openstack.org/456944

Changed in openstack-manuals:
assignee: nobody → Matt Riedemann (mriedem)
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-manuals (master)

Reviewed: https://review.openstack.org/456944
Committed: https://git.openstack.org/cgit/openstack/openstack-manuals/commit/?id=b14a671f2259b5e053796a3318c7d9872d6b784d
Submitter: Jenkins
Branch: master

commit b14a671f2259b5e053796a3318c7d9872d6b784d
Author: Matt Riedemann <email address hidden>
Date: Fri Apr 14 14:18:48 2017 -0400

    Update nova-compute-install steps for multi-cell support

    Similar to the nova change with the same change ID, we
    need to update the install guide for Pike due to some
    limitations in the compute API when supporting multiple
    cells. The hypervisor list command will not work before the
    hosts are discovered, and with the steps as they are now in the
    guide, you wait to discover the hosts until the hypervisors
    show up, which is a chicken-and-egg problem.

    The solution proposed is to use the openstack compute service
    list command instead, which does not have the same host
    discovery pre-requisite limitation. The nova cells install guide
    docs are also being updated in the nova tree for the same issue.

    Kolla is also making a similar change for their Pike deployment
    scripts:

    Id061e8039e72de77a04c51657705457193da2d0f

    Change-Id: If2baab40c2e2a3de20e561bba50688d615b002ef
    Closes-Bug: #1682060

Changed in openstack-manuals:
status: In Progress → Fix Released
Changed in nova:
assignee: Matt Riedemann (mriedem) → Dan Smith (danms)
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/456923
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=9a5c3cd7da76a7340861d552718e7e46640f15be
Submitter: Jenkins
Branch: master

commit 9a5c3cd7da76a7340861d552718e7e46640f15be
Author: Matt Riedemann <email address hidden>
Date: Fri Apr 14 11:21:17 2017 -0400

    Add release note and update cell install guide for multi-cell limitations

    As of change If1e03c9343b8cc9c34bd51c2b4d25acdb21131ff, using
    "nova hypervisor-list" before compute hosts are mapped to a cell
    will result in an empty list.

    Our cells v2 install steps mention using 'nova hypervisor-list' after
    creating a cell and starting compute services to tell when to run
    the discover_hosts command, but now hypervisor-list won't work until
    you've run discover_hosts, so it's a catch-22.

    This change adds a release note to let people writing deployment tools
    know about the change in behavior and also updates the install steps
    to use service-list instead of hypervisor-list, since service-list does
    not require the compute host to be mapped to the cell first.

    We are going to need to make a similar change in the OpenStack install
    guide since that also mentions using 'nova hypervisor-list' before
    discover_hosts.

    Change-Id: If2baab40c2e2a3de20e561bba50688d615b002ef
    Closes-Bug: #1682060
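
The ordering described in this commit message (poll the service list, then run discover_hosts) might look like the following in a deployment script. It is a sketch under assumptions: the function name, the LIST_COMPUTE_HOSTS override variable, and the retry count and delay are all hypothetical, and it assumes the openstack client is already authenticated and that the cell itself has been created.

```shell
#!/bin/sh
# Command that prints one nova-compute host per line.
# LIST_COMPUTE_HOSTS is a hypothetical override, mainly for testing.
LIST_COMPUTE_HOSTS="${LIST_COMPUTE_HOSTS:-openstack compute service list --service nova-compute -f value -c Host}"

# Wait until at least one nova-compute service is registered.
# $1 = maximum attempts (default 30), $2 = seconds between attempts (default 10).
wait_for_compute_services() {
    attempts="${1:-30}"
    delay="${2:-10}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # The service list does not need the host to be mapped to a cell,
        # unlike the hypervisor list.
        hosts="$($LIST_COMPUTE_HOSTS 2>/dev/null)"
        if [ -n "$hosts" ]; then
            printf '%s\n' "$hosts"
            return 0
        fi
        i=$((i + 1))
        # Only sleep if another attempt remains.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

A script could then chain the two steps, for example: wait_for_compute_services && nova-manage cell_v2 discover_hosts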

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla-ansible (master)

Reviewed: https://review.openstack.org/456920
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=5dfb81efc8975d68ed9b343ab85f6f937bdbec5d
Submitter: Jenkins
Branch: master

commit 5dfb81efc8975d68ed9b343ab85f6f937bdbec5d
Author: Eduardo Gonzalez <email address hidden>
Date: Fri Apr 14 16:02:40 2017 +0100

    Update simple_cell_setup to manual creation

    Simple_cell_setup is not recommended.
    It is better to run map_cell0 manually, create a base
    cell for non-cell deployments and run discover_hosts.

    This PS migrates the current config to use the workflow
    described at [1]. With our current workflow we're running
    into the issue that services are not mapped until cells
    are present, breaking deployment while waiting for compute
    services to appear.

    [1] https://docs.openstack.org/developer/nova/cells.html#fresh-install

    Change-Id: Id061e8039e72de77a04c51657705457193da2d0f
    Closes-Bug: #1682060

Changed in kolla-ansible:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 16.0.0.0b2

This issue was fixed in the openstack/nova 16.0.0.0b2 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla-ansible 5.0.0.0b2

This issue was fixed in the openstack/kolla-ansible 5.0.0.0b2 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla-ansible (stable/ocata)

Reviewed: https://review.openstack.org/482513
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=045ce3ebb8f13b6efa6a1b9d33575e507d214d7f
Submitter: Jenkins
Branch: stable/ocata

commit 045ce3ebb8f13b6efa6a1b9d33575e507d214d7f
Author: Eduardo Gonzalez <email address hidden>
Date: Fri Apr 14 16:02:40 2017 +0100

    Update simple_cell_setup to manual creation

    Simple_cell_setup is not recommended.
    It is better to run map_cell0 manually, create a base
    cell for non-cell deployments and run discover_hosts.

    This PS migrates the current config to use the workflow
    described at [1]. With our current workflow we're running
    into the issue that services are not mapped until cells
    are present, breaking deployment while waiting for compute
    services to appear.

    [1] https://docs.openstack.org/developer/nova/cells.html#fresh-install

    Change-Id: Id061e8039e72de77a04c51657705457193da2d0f
    Closes-Bug: #1682060
    (cherry picked from commit 5dfb81efc8975d68ed9b343ab85f6f937bdbec5d)

Khawar Munir Abbasi (ekhawmu) wrote :

I am facing this problem in a fresh installation of the newton (3.0.3) release.

nova service-list
+----+------------------+-----------------+----------+---------+-------+----------------------------+-----------------+-------------+
| Id | Binary | Host | Zone | Status | State | Updated_at | Disabled Reason | Forced down |
+----+------------------+-----------------+----------+---------+-------+----------------------------+-----------------+-------------+
| 1 | nova-consoleauth | silpixa00379576 | internal | enabled | up | 2017-11-08T09:49:11.000000 | - | False |
| 5 | nova-scheduler | silpixa00379576 | internal | enabled | up | 2017-11-08T09:49:15.000000 | - | False |
| 11 | nova-conductor | silpixa00379576 | internal | enabled | up | 2017-11-08T09:49:16.000000 | - | False |
+----+------------------+-----------------+----------+---------+-------+----------------------------+-----------------+-------------+

nova hypervisor-list

+----+---------------------+-------+--------+
| ID | Hypervisor hostname | State | Status |
+----+---------------------+-------+--------+
+----+---------------------+-------+--------+

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla-ansible 4.0.3

This issue was fixed in the openstack/kolla-ansible 4.0.3 release.

Matt Riedemann (mriedem)
Changed in nova:
assignee: Dan Smith (danms) → Matt Riedemann (mriedem)