test_aggregate_add_host_create_server_with_az fails with remote compute connection scenario

Bug #1294511 reported by Yu Liu
This bug affects 1 person
Affects                    Status        Importance  Assigned to  Milestone
OpenStack Compute (nova)   Invalid       Undecided   Unassigned
tempest                    Fix Released  Undecided   Unassigned

Bug Description

Problem:
In an environment that is not all-in-one, where the controller node connects to a remote nova compute node, the tempest test case test_aggregate_add_host_create_server_with_az fails. When the test creates a server with an availability zone (az), the server ends up in error status, as shown below.

{"message": "NV-67B7376 No valid host was found. ", "code": 500, "details": " File \"/usr/lib/python2.6/site-packages/nova/scheduler/filter_scheduler.py\", line 108, in schedule_run_instance

Basic investigation:

By default, the test code adds a host to the aggregate on the assumption that the nova compute host is the same as the controller node. In the scenario above the compute node is not the controller but a remote nova compute node, so the scheduler reports "No valid host was found".
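
To illustrate the host selection problem described above, here is a minimal sketch of choosing an aggregate host only from nodes that actually run nova-compute. The helper name pick_compute_host and the sample host names are hypothetical; the dictionary keys (binary, host, status, state) mirror the fields returned by the nova os-services listing. This is an illustration under those assumptions, not the code used by tempest.

```python
def pick_compute_host(services):
    """Return the host of the first enabled, running nova-compute service.

    `services` is a list of dicts shaped like entries of the nova
    os-services listing (assumed keys: binary, host, status, state).
    """
    for svc in services:
        if (svc.get("binary") == "nova-compute"
                and svc.get("status") == "enabled"
                and svc.get("state") == "up"):
            return svc["host"]
    raise LookupError("no usable nova-compute host found")


# Hypothetical two-node deployment: the controller runs no nova-compute,
# the remote node does.
services = [
    {"binary": "nova-scheduler", "host": "controller",
     "status": "enabled", "state": "up"},
    {"binary": "nova-conductor", "host": "controller",
     "status": "enabled", "state": "up"},
    {"binary": "nova-compute", "host": "compute-1",
     "status": "enabled", "state": "up"},
]

print(pick_compute_host(services))  # prints "compute-1"
```

Picking the first entry of an unfiltered host list would return "controller" here, a node the scheduler can never place an instance on.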

Yu Liu (liuyu342)
summary: test_aggregate_add_host_create_server_with_az fails with remote compute
- connetion
+ connection scenario
Yu Liu (liuyu342) wrote :

Root cause:

2014-03-19 07:26:34.070 15316 WARNING nova.scheduler.filters.compute_filter [req-13d9cfa3-1d8f-4c26-a46a-6636e8bd9a24 3d7ffb47dd41404491d4a0021c0a4d58 1fa8989d15164378b200a03ca4d125ed] NV-ACBDB7A (192-168-0-6, 192-168-0-6.scecd.ibm.com) ram:5340 disk:20480 io_ops:0 instances:0 has not been heard from in a while
2014-03-19 07:26:34.071 15316 INFO nova.filters [req-13d9cfa3-1d8f-4c26-a46a-6636e8bd9a24 3d7ffb47dd41404491d4a0021c0a4d58 1fa8989d15164378b200a03ca4d125ed] NV-9EF7356 Filter ComputeFilter returned 0 hosts
2014-03-19 07:26:34.071 15316 WARNING nova.scheduler.driver [req-13d9cfa3-1d8f-4c26-a46a-6636e8bd9a24 3d7ffb47dd41404491d4a0021c0a4d58 1fa8989d15164378b200a03ca4d125ed] [instance: 4c40d930-dc98-4d6b-9f68-d3bd67a339e0] NV-EAF7DD6 Setting instance to ERROR state.

compute_filter.py:

    def host_passes(self, host_state, filter_properties):
        """Returns True for only active compute nodes."""
        service = host_state.service
        if service['disabled']:
            LOG.debug(_("%(host_state)s is disabled, reason: %(reason)s"),
                      {'host_state': host_state,
                       'reason': service.get('disabled_reason')})
            return False
        else:
            if not self.servicegroup_api.service_is_up(service):
                LOG.warn(_("%(host_state)s has not been heard from in a "
                           "while"), {'host_state': host_state})
                return False
        return True
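
The "has not been heard from in a while" warning comes from the service_is_up check in the filter above. As a rough sketch of what such a liveness check does, assuming a heartbeat-timestamp approach like nova's database servicegroup driver: a service counts as up when its last heartbeat is no older than a configured down time. The 60-second value, the updated_at/created_at keys, and the standalone function below are assumptions for illustration, not the actual nova implementation.

```python
from datetime import datetime, timedelta, timezone

# Assumed threshold in seconds (nova exposes this as service_down_time).
SERVICE_DOWN_TIME = 60


def service_is_up(service, now=None, down_time=SERVICE_DOWN_TIME):
    """Return True if the service reported a heartbeat recently enough."""
    now = now or datetime.now(timezone.utc)
    # Fall back to the creation time if no heartbeat was ever recorded.
    last_heartbeat = service.get("updated_at") or service["created_at"]
    elapsed = (now - last_heartbeat).total_seconds()
    return abs(elapsed) <= down_time


now = datetime(2014, 3, 19, 7, 26, 34, tzinfo=timezone.utc)
fresh = {"created_at": now - timedelta(days=1),
         "updated_at": now - timedelta(seconds=10)}
stale = {"created_at": now - timedelta(days=1),
         "updated_at": now - timedelta(minutes=5)}

print(service_is_up(fresh, now=now))  # prints "True"
print(service_is_up(stale, now=now))  # prints "False"
```

With a check of this kind, a compute node whose heartbeats stop reaching the controller (for example because of clock skew or a broken message queue connection) is filtered out even though it is enabled.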

Mauro S M Rodrigues (maurorodrigues) wrote :

Thanks for your investigation!

Can you provide more information about your setup? And the logs of your run?

I currently have no multi-node deployment, so I have to ask: does it happen every time?

Changed in tempest:
status: New → Incomplete
Tracy Jones (tjones-i)
tags: added: testing
Changed in tempest:
status: Incomplete → New
status: New → Incomplete
Attila Fazekas (afazekas) wrote :

It looks like this is fixed, but test_aggregates_basic_ops has a similar issue.

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tempest (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/94203

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tempest (master)

Reviewed: https://review.openstack.org/94203
Committed: https://git.openstack.org/cgit/openstack/tempest/commit/?id=7ddb14f2d3ecaa4bb6dd9e8bceebcfe3ea5a1913
Submitter: Jenkins
Branch: master

commit 7ddb14f2d3ecaa4bb6dd9e8bceebcfe3ea5a1913
Author: Attila Fazekas <email address hidden>
Date: Mon May 19 16:42:22 2014 +0200

    test_aggregates_basic_ops picks a non compute node

    test_aggregates tries to add a non-hypervisor node
    (one that does not run a nova compute service) to an aggregate.

    It can cause failures, if the openstack deployment has a node,
    with one of the following services n-net, n-cond, n-sch, n-net,
     but without n-cpu.

    It is a similar issue, what we had with the aggregate api test.

    Change-Id: Idbe037da73169e0ebce8a8bb5d7652dcc39eb92b
    Closing-Bug: #1318578
    Related-Bug: #1294511

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tempest (stable/havana)

Related fix proposed to branch: stable/havana
Review: https://review.openstack.org/97724

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tempest (stable/havana)

Reviewed: https://review.openstack.org/97724
Committed: https://git.openstack.org/cgit/openstack/tempest/commit/?id=8755bc9ec768eab3174812afad06bb70ca27575c
Submitter: Jenkins
Branch: stable/havana

commit 8755bc9ec768eab3174812afad06bb70ca27575c
Author: Attila Fazekas <email address hidden>
Date: Mon May 19 16:42:22 2014 +0200

    test_aggregates_basic_ops picks a non compute node

    test_aggregates tries to add a non-hypervisor node
    (one that does not run a nova compute service) to an aggregate.

    It can cause failures, if the openstack deployment has a node,
    with one of the following services n-net, n-cond, n-sch, n-net,
     but without n-cpu.

    It is a similar issue, what we had with the aggregate api test.

    Change-Id: Idbe037da73169e0ebce8a8bb5d7652dcc39eb92b
    Closing-Bug: #1318578
    Related-Bug: #1294511
    (cherry picked from commit 7ddb14f2d3ecaa4bb6dd9e8bceebcfe3ea5a1913)

tags: added: in-stable-havana
Joe Gordon (jogo)
Changed in tempest:
status: Incomplete → Fix Committed
Changed in nova:
status: New → Invalid
Changed in tempest:
status: Fix Committed → Fix Released