[10.0][swarm] TimeoutError: Node failed to become offline

Bug #1652000 reported by Dmitry Belyaninov
This bug affects 1 person
Affects: Fuel for OpenStack
Status: Fix Released
Importance: High
Assigned to: Dmitry Belyaninov
Changed in fuel:
assignee: Fuel Sustaining (fuel-sustaining-team) → Fuel CI (fuel-ci)
tags: added: swarm-fail
tags: added: swarm-blocker
removed: swarm-fail
Dmitry Kaigarodеsev (dkaiharodsev) wrote :

Please describe how the CI team can help solve this bug.

Changed in fuel:
assignee: Fuel CI (fuel-ci) → Oleksiy Molchanov (omolchanov)
Oleksiy Molchanov (omolchanov) wrote :

@Dmitry, my bad, I thought the problem was related to getting the node online.

@QA team, I tried to diagnose the root cause using the https://product-ci.infra.mirantis.net/job/10.0.system_test.ubuntu.numa_cpu_pinning/147/console test case.

04:00:29 - Node was rebooted using 'shutdown +1'
04:04:30 - Test failed with message 'Node failed to become offline'

The node had about 3 minutes to go offline (roughly 4 minutes between the two log entries, minus the 1-minute delay from 'shutdown +1'), but it did not make it. I think this was related to high load, but I am not sure, because the snapshot contains no logs for this node.

You can try extending the time we wait for the node to become offline. Also, please don't use 'shutdown +1'.
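The suggestion to extend the wait can be sketched as a generic polling helper. This is a minimal Python 3 sketch, not the fuel-qa implementation: `wait_for_offline`, its parameters, and the `is_online` callable are hypothetical names introduced for illustration, and the 10x60 default is taken from the timeout discussed in this report.

```python
import time


def wait_for_offline(is_online, timeout=10 * 60, interval=5,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll is_online() until it returns False or `timeout` seconds elapse.

    `is_online` is a hypothetical callable supplied by the caller; it stands
    in for whatever check the test framework uses and is not the fuel-qa API.
    Returns the number of seconds waited.
    """
    start = clock()
    while is_online():
        # Give up once the whole budget has been spent.
        if clock() - start >= timeout:
            raise TimeoutError("Node failed to become offline")
        sleep(interval)
    return clock() - start
```

The clock and sleep functions are injectable so the helper can be exercised in a unit test without actually waiting for minutes.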

Changed in fuel:
assignee: Oleksiy Molchanov (omolchanov) → Fuel QA Team (fuel-qa)
Changed in fuel:
assignee: Fuel QA Team (fuel-qa) → Dmitry Belyaninov (dbelyaninov)
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-qa (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/428642

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-qa (stable/newton)

Reviewed: https://review.openstack.org/428642
Committed: https://git.openstack.org/cgit/openstack/fuel-qa/commit/?id=18e4d59570c9334cbf32ccb6412102ef884871b0
Submitter: Jenkins
Branch: stable/newton

commit 18e4d59570c9334cbf32ccb6412102ef884871b0
Author: Dmitry Belyaninov <email address hidden>
Date: Fri Feb 3 08:45:12 2017 +0000

    Timeout changing for restart procedure

    A few tests failed on the "warm_shutdown_nodes" call
    with a 4x60 timeout, while a few tests passed with a
    10x60 timeout, so the longer timeout should be used
    for all tests.

    Change-Id: I0c7e63b4e9156f35d6679a9dcdfb7b9249c6796a
    Partial-Bug: 1652000
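
To make the numbers concrete, the following Python sketch works through the timing using the two console timestamps quoted earlier in this report and the 4x60 and 10x60 timeouts from the commit message. It assumes that the timeout clock starts when the shutdown command is issued and that 'shutdown +1' delays the shutdown by 60 seconds; `effective_window` and `SHUTDOWN_DELAY` are hypothetical names, not part of fuel-qa.

```python
from datetime import datetime

FMT = "%H:%M:%S"

# Timestamps taken from the console log quoted in this report.
reboot_issued = datetime.strptime("04:00:29", FMT)
test_failed = datetime.strptime("04:04:30", FMT)
elapsed = (test_failed - reboot_issued).total_seconds()

# Assumption: 'shutdown +1' schedules the shutdown one minute ahead.
SHUTDOWN_DELAY = 60


def effective_window(timeout, delay=SHUTDOWN_DELAY):
    """Seconds the node really has to go offline once the delay has passed."""
    return timeout - delay


print(elapsed)                    # seconds between the two log entries
print(effective_window(4 * 60))   # window under the old 4x60 timeout
print(effective_window(10 * 60))  # window under the new 10x60 timeout
```

Under these assumptions the old timeout leaves the node 180 seconds to go offline after the delayed shutdown starts, while the new one leaves 540 seconds.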

tags: added: in-stable-newton
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-qa (master)

Fix proposed to branch: master
Review: https://review.openstack.org/430625

Changed in fuel:
status: In Progress → Fix Committed
status: Fix Committed → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-qa (master)

Reviewed: https://review.openstack.org/430625
Committed: https://git.openstack.org/cgit/openstack/fuel-qa/commit/?id=185b824873d4fbc603c7a36a70b80e662340bf20
Submitter: Jenkins
Branch: master

commit 185b824873d4fbc603c7a36a70b80e662340bf20
Author: Dmitry Belyaninov <email address hidden>
Date: Fri Feb 3 08:45:12 2017 +0000

    Timeout changing for restart procedure

    A few tests failed on the "warm_shutdown_nodes" call
    with a 4x60 timeout, while a few tests passed with a
    10x60 timeout, so the longer timeout should be used
    for all tests.

    Change-Id: I0c7e63b4e9156f35d6679a9dcdfb7b9249c6796a
    Partial-Bug: 1652000
    (cherry picked from commit 18e4d59570c9334cbf32ccb6412102ef884871b0)
