some nova services are down after deployment cluster after redeployment cluster with enabled plugin

Bug #1570847 reported by Artem Hrechanychenko
This bug affects 2 people
Affects             Status        Importance  Assigned to       Milestone
Fuel for OpenStack  Fix Released  High        Matthew Mosesohn
Mitaka              Fix Released  High        Matthew Mosesohn

Bug Description

Detailed bug description:
There are 2 problems:
1. Some nova services are reported as down after a cluster with an enabled plugin is redeployed. Services from deleted controllers stay in the nova DB and, as a result, are marked as down. This problem looks like a duplicate of https://bugs.launchpad.net/fuel/+bug/1471172

2. The OSTF sanity test cannot filter out deleted/offline controllers. This is addressed by the patch https://review.openstack.org/#/c/308387/

OSTF test fuel_health.tests.sanity.test_sanity_infrastructure.SanityInfrastructureTest.test_001_services_state failed with

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/unittest2/case.py", line 67, in testPartExecutor
    yield
  File "/usr/lib/python2.7/site-packages/unittest2/case.py", line 601, in run
    testMethod()
  File "/usr/lib/python2.7/site-packages/fuel_health/tests/sanity/test_sanity_infrastructure.py", line 82, in test_001_services_state
    downstate not in output, 'Step 2 failed: Some nova services '
  File "/usr/lib/python2.7/site-packages/fuel_health/common/test_mixins.py", line 164, in verify_response_true
    self.fail(message.format(failed_step_msg, msg))
  File "/usr/lib/python2.7/site-packages/unittest2/case.py", line 666, in fail
    raise self.failureException(msg)
AssertionError: Step 2 failed: Some nova services have not been started.. Please refer to OpenStack logs for more details.

from ostf log:
| 1  | nova-consoleauth | node-1.test.domain.local | internal | enabled | down | 2016-04-15T03:24:47.000000 | - |
| 4  | nova-cert        | node-1.test.domain.local | internal | enabled | down | 2016-04-15T03:24:46.000000 | - |
| 7  | nova-scheduler   | node-1.test.domain.local | internal | enabled | down | 2016-04-15T03:24:46.000000 | - |
| 10 | nova-conductor   | node-1.test.domain.local | internal | enabled | down | 2016-04-15T03:24:46.000000 | - |
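
The traceback shows the test failing because a "down" state appears in the service listing. As a rough illustration of how such rows can be inspected, the sketch below parses table lines like the ones in the OSTF log and returns the services whose state is down. The function names are hypothetical, and the column names are an assumption based on the usual `nova service-list` layout (the header row is not shown in the log); this is not the OSTF implementation.

```python
# Assumed column order for rows like those in the OSTF log above.
COLUMNS = ("id", "binary", "host", "zone", "status", "state",
           "updated_at", "disabled_reason")


def parse_service_rows(table_text):
    """Turn pipe-delimited service rows into a list of dicts.

    Border lines, header rows, and malformed rows are skipped.
    """
    rows = []
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # border such as +----+ or unrelated log text
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != len(COLUMNS) or not cells[0].isdigit():
            continue  # header row or a row with an unexpected shape
        rows.append(dict(zip(COLUMNS, cells)))
    return rows


def down_services(rows):
    """Return only the rows whose state column is 'down'."""
    return [row for row in rows if row["state"] == "down"]
```

Applied to the four log rows above, every row would be returned by `down_services`, and all of them belong to `node-1.test.domain.local`.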

Steps to reproduce:
 1. Enable plugin
 2. Re-deploy 1 controller node in the cluster (Node Under Test)
 3. Run network verification
 4. Check plugin on ALL controller nodes
 5. Run OSTF (fails here)

Expected results:
 OSTF tests pass

Actual result:
 OSTF test fuel_health.tests.sanity.test_sanity_infrastructure.SanityInfrastructureTest.test_001_services_state failed

Reproducibility:
 https://product-ci.infra.mirantis.net/job/9.0.system_test.ubuntu.plugins.fuel_plugin_example/78/testReport/%28root%29/three_ctrl_enable_installed_after_create_redeploy/three_ctrl_enable_installed_after_create_redeploy/

Workaround:
 -
Impact:
 swarm test

Description of the environment:
cat /etc/fuel_build_id:
 201
cat /etc/fuel_build_number:
 201
cat /etc/fuel_release:
 9.0
cat /etc/fuel_openstack_version:
 mitaka-9.0
rpm -qa | egrep 'fuel|astute|network-checker|nailgun|packetary|shotgun':
 fuel-release-9.0.0-1.mos6333.noarch
 fuel-misc-9.0.0-1.mos8294.noarch
 python-packetary-9.0.0-1.mos131.noarch
 fuel-openstack-metadata-9.0.0-1.mos8650.noarch
 fuel-provisioning-scripts-9.0.0-1.mos8650.noarch
 python-fuelclient-9.0.0-1.mos306.noarch
 fuel-9.0.0-1.mos6333.noarch
 fuel-nailgun-9.0.0-1.mos8650.noarch
 rubygem-astute-9.0.0-1.mos738.noarch
 fuel-library9.0-9.0.0-1.mos8294.noarch
 fuel-agent-9.0.0-1.mos272.noarch
 fuel-ui-9.0.0-1.mos2659.noarch
 fuel-setup-9.0.0-1.mos6333.noarch
 nailgun-mcagents-9.0.0-1.mos738.noarch
 shotgun-9.0.0-1.mos87.noarch
 network-checker-9.0.0-1.mos72.x86_64
 fuel-bootstrap-cli-9.0.0-1.mos272.noarch
 fuel-migrate-9.0.0-1.mos8294.noarch
 fuelmenu-9.0.0-1.mos268.noarch
 fuel-notify-9.0.0-1.mos8294.noarch
 fuel-ostf-9.0.0-1.mos924.noarch
 fuel-mirror-9.0.0-1.mos131.noarch
 fuel-utils-9.0.0-1.mos8294.noarch

Revision history for this message
Artem Hrechanychenko (agrechanichenko) wrote :
Changed in fuel:
importance: Undecided → High
Revision history for this message
Artem Hrechanychenko (agrechanichenko) wrote :
tags: added: swarm-blocker
Changed in fuel:
assignee: nobody → Fuel Library Team (fuel-library)
status: New → Confirmed
tags: added: area-library
Changed in fuel:
assignee: Fuel Library Team (fuel-library) → Matthew Mosesohn (raytrac3r)
summary: - some nova cervices are down after deployment cluster after redeployment
+ some nova services are down after deployment cluster after redeployment
cluster with enabled plugin
Changed in fuel:
milestone: 9.0 → 10.0
Revision history for this message
Matthew Mosesohn (raytrac3r) wrote :

The issue is probably an old bug, actually. We don't purge deleted controllers from the nova service table, and therefore it's going to fail OSTF any time you delete a controller from an environment.

Cleaning up this list is important, but we should fix this swarm blocker quickly. OSTF should simply ignore results that pertain to nodes not located in the deployment configuration.

I will propose a fix to OSTF first, and then an enhancement to clean up nova services list. We should do this also with cinder and neutron agents.
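
The proposed OSTF behaviour (ignore services on nodes that are not part of the deployment) can be sketched roughly as follows. This is a minimal illustration, not the code from the actual patch: the function name is hypothetical, each service is assumed to be a dict with `host` and `state` keys, and the list of active hostnames is assumed to come from the deployment configuration in the same form that nova reports them.

```python
def down_services_on_active_hosts(services, active_hosts):
    """Return down services, ignoring hosts outside the deployment.

    services     -- iterable of dicts with at least 'host' and 'state'
    active_hosts -- hostnames of nodes currently in the deployment
                    (assumed to match the names nova reports)
    """
    active = set(active_hosts)
    return [
        service for service in services
        if service["host"] in active and service["state"] == "down"
    ]
```

With this kind of filter, stale entries left behind by a deleted controller no longer fail the check, while a service that is genuinely down on a deployed node still does.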

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-ostf (master)

Fix proposed to branch: master
Review: https://review.openstack.org/308387

Changed in fuel:
status: Confirmed → In Progress
description: updated
description: updated
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-ostf (master)

Reviewed: https://review.openstack.org/308387
Committed: https://git.openstack.org/cgit/openstack/fuel-ostf/commit/?id=1f9e4d7bd977e55246e580fb5c84bf7b30cf6177
Submitter: Jenkins
Branch: master

commit 1f9e4d7bd977e55246e580fb5c84bf7b30cf6177
Author: Matthew Mosesohn <email address hidden>
Date: Wed Apr 20 17:54:05 2016 +0300

    Run nova service-list only for online controllers

    nova service-list only needs to report the status
    of online controllers. Down and deleted nodes should
    not cause the test to fail.

    Change-Id: I56765f6cf889b6afb9780b32857a164e2b62c340
    Related-Bug: #1570847

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-ostf (stable/mitaka)

Related fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/309082

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-qa (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/309094

Revision history for this message
Matthew Mosesohn (raytrac3r) wrote :

https://review.openstack.org/309094 is still in development. Passing to Vladimir Khlyunev.

Changed in fuel:
assignee: Matthew Mosesohn (raytrac3r) → Vladimir Khlyunev (vkhlyunev)
tags: added: area-ostf
removed: area-library area-plugins
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-qa (master)

Reviewed: https://review.openstack.org/309094
Committed: https://git.openstack.org/cgit/openstack/fuel-qa/commit/?id=73d421b70f4935e725cf192f32dd712793c04189
Submitter: Jenkins
Branch: master

commit 73d421b70f4935e725cf192f32dd712793c04189
Author: Matthew Mosesohn <email address hidden>
Date: Thu Apr 21 19:38:07 2016 +0300

    Decrement should_fail tests for nova services

    Now OSTF only considers nova services for online
    (according to Nailgun) computes that are part of
    the active cluster. It will skip deleted nodes,
    and therefore there should be less failures.

    Change-Id: Ie94eccf2608db1d3d800e017a9c91541461f81ee
    Related-Bug: #1570847

Revision history for this message
Vladimir Khlyunev (vkhlyunev) wrote :

Matt, this bug is actually yours. I will only apply the fix for fuel-qa.

Changed in fuel:
assignee: Vladimir Khlyunev (vkhlyunev) → Matthew Mosesohn (raytrac3r)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-qa (stable/mitaka)

Related fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/310445

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-ostf (stable/mitaka)

Reviewed: https://review.openstack.org/309082
Committed: https://git.openstack.org/cgit/openstack/fuel-ostf/commit/?id=d4b5d964890686090d3ca188739717d35cc183c6
Submitter: Jenkins
Branch: stable/mitaka

commit d4b5d964890686090d3ca188739717d35cc183c6
Author: Matthew Mosesohn <email address hidden>
Date: Wed Apr 20 17:54:05 2016 +0300

    Run nova service-list only for online controllers

    nova service-list only needs to report the status
    of online controllers. Down and deleted nodes should
    not cause the test to fail.

    Change-Id: I56765f6cf889b6afb9780b32857a164e2b62c340
    Related-Bug: #1570847
    (cherry picked from commit 1f9e4d7bd977e55246e580fb5c84bf7b30cf6177)

tags: added: in-stable-mitaka
Revision history for this message
Matthew Mosesohn (raytrac3r) wrote :

This bug is fixed in that it fixes swarm. There is a larger task to actually purge nova compute services after deleting a node. I will track it with bug https://bugs.launchpad.net/fuel/+bug/1513401 and mark this bug closed.

Changed in fuel:
status: In Progress → Fix Committed
Revision history for this message
Tatyanka (tatyana-leontovich) wrote :

Verified on ISO 285 for Mitaka.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-qa (stable/mitaka)

Reviewed: https://review.openstack.org/310445
Committed: https://git.openstack.org/cgit/openstack/fuel-qa/commit/?id=d1438f50cd4d0348721990917eb52946fa859592
Submitter: Jenkins
Branch: stable/mitaka

commit d1438f50cd4d0348721990917eb52946fa859592
Author: Matthew Mosesohn <email address hidden>
Date: Thu Apr 21 19:38:07 2016 +0300

    Decrement should_fail tests for nova services

    Now OSTF only considers nova services for online
    (according to Nailgun) computes that are part of
    the active cluster. It will skip deleted nodes,
    and therefore there should be less failures.

    Depends-On: I56765f6cf889b6afb9780b32857a164e2b62c340
    Change-Id: Ie94eccf2608db1d3d800e017a9c91541461f81ee
    Related-Bug: #1570847
    (cherry picked from commit 73d421b70f4935e725cf192f32dd712793c04189)

Revision history for this message
Dmitry Belyaninov (dbelyaninov) wrote :
Changed in fuel:
status: Fix Committed → Fix Released