[astute] Problem with provision status detection in case of a large number of nodes

Bug #1377105 reported by Vladimir Sharshov
This bug affects 1 person
Affects             Status         Importance  Assigned to        Milestone
Fuel for OpenStack  Fix Committed  High        Vladimir Sharshov
5.1.x               Fix Committed  High        Vladimir Sharshov

Bug Description

All nodes are provisioned, but each mcollective request returns only a subset of these nodes.

2014-10-03T09:23:41 debug: [440] Not provisioned: 91, got target OSes: 28,40,41,48,50,52,54,77,78,79,80,81,82,83,84,85,89,90,93,94,96,97,98

--- From this point on all nodes are already provisioned, but we still keep
polling node statuses ---
2014-10-03T09:23:46 debug: [440] Not provisioned: , got target OSes: 28,40,41,48,50,52,54,77,78,79,80,81,82,83,84,85,89,90,93,94,91,96,98,97
2014-10-03T09:23:51 debug: [440] Not provisioned: , got target OSes: 28,40,41,48,50,52,54,77,78,79,80,81,82,83,84,85,89,90,91,93,94,96,97,98
2014-10-03T09:23:56 debug: [440] Not provisioned: , got target OSes: 28,40,48,41,50,52,54,77,78,79,80,81,82,83,84,85,89,90,91,93,94,96,97,98
2014-10-03T09:24:01 debug: [440] Not provisioned: , got target OSes: 40,28,41,48,50,54,52,84,82,79,80,83,81,85,78,77,90,89,93,94,91,96,98,97
2014-10-03T09:24:06 debug: [440] Not provisioned: , got target OSes:

Why? Because we try to get the status of all nodes in every request and, as the
log below shows, this does not work with a large number of nodes.

2014-10-03T09:23:56 debug: [440] Nodes list length is not equal to target nodes list length: 40 != 24
2014-10-03T09:24:01 debug: [440] Nodes list length is not equal to target nodes list length: 40 != 24
2014-10-03T09:24:06 debug: [440] Nodes list length is not equal to target nodes list length: 40 != 35
2014-10-03T09:24:11 debug: [440] Nodes list length is not equal to target nodes list length: 40 != 35
2014-10-03T09:24:16 debug: [440] Nodes list length is not equal to target nodes list length: 40 != 36
2014-10-03T09:24:21 debug: [440] Nodes list length is not equal to target nodes list length: 40 != 36
2014-10-03T09:24:26 debug: [440] Nodes list length is not equal to target nodes list length: 40 != 36
2014-10-03T09:24:31 debug: [440] Nodes list length is not equal to target nodes list length: 40 != 36
2014-10-03T09:24:36 debug: [440] Nodes list length is not equal to target nodes list length: 40 != 37
2014-10-03T09:24:41 debug: [440] Nodes list length is not equal to target nodes list length: 40 != 37
2014-10-03T09:24:46 debug: [440] Nodes list length is not equal to target nodes list length: 40 != 38
2014-10-03T09:24:51 debug: [440] Nodes list length is not equal to target nodes list length: 40 != 38
2014-10-03T09:24:56 debug: [440] Nodes list length is not equal to target nodes list length: 40 != 38
2014-10-03T09:25:01 debug: [440] Nodes list length is not equal to target nodes list length: 40 != 38
2014-10-03T09:25:06 debug: [440] Nodes list length is not equal to target nodes list length: 40 != 38
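
The failure mode in the log above can be sketched as follows. This is a minimal illustration in Python, not Astute's actual (Ruby) code: the function name and node IDs are hypothetical, and it assumes that in "40 != 24" the first number is how many nodes were queried and the second is how many replied. The reply sizes are taken from the log lines.

```python
def response_is_complete(expected_ids, responded_ids):
    """Naive check: a single reply must cover every expected node."""
    return len(set(responded_ids)) == len(set(expected_ids))


# Hypothetical 40-node deployment; the IDs are illustrative only.
expected = list(range(1, 41))

# Reply sizes as seen in the log: 24, 35, 36, 37, 38 nodes per request.
reply_sizes = [24, 35, 36, 37, 38]
replies = [expected[:size] for size in reply_sizes]

# Each poll is judged in isolation, so the check can keep failing even
# when every node has already been seen in some earlier reply.
complete = [response_is_complete(expected, reply) for reply in replies]
```

With replies like these the check is false on every poll, which matches the repeated "list length is not equal" messages.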

Tags: astute
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-astute (master)

Fix proposed to branch: master
Review: https://review.openstack.org/125955

Changed in fuel:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-astute (master)

Reviewed: https://review.openstack.org/125955
Committed: https://git.openstack.org/cgit/stackforge/fuel-astute/commit/?id=33e357c06de708bb324885e529136cdc23361a6d
Submitter: Jenkins
Branch: master

commit 33e357c06de708bb324885e529136cdc23361a6d
Author: Vladimir Sharshov <email address hidden>
Date: Fri Oct 3 16:20:09 2014 +0400

    New node provision status detection

    Use accumulated knowledge about nodes statuses
    instead of trying to get info about all nodes
    using one request.

    Also:
    * decrease polling frequency for node status
    detection;
    * increase timeout limit for mcollective request
    about nodes statuses.

    Change-Id: I12dd30bc2fe304ac13f40d1cee1e2d25643dcacb
    Closes-Bug: #1377105
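
The idea in this commit message, accumulating knowledge about node statuses across requests, might look roughly like the sketch below. It is written in Python for illustration and is not the merged Astute code. `poll_until_provisioned` and `fetch_statuses` are hypothetical names; `fetch_statuses` stands in for the mcollective request and is assumed to return a possibly partial mapping of node ID to node type, with "target" meaning the target OS is up (as the "got target OSes" log line suggests). Asking only about the still-pending nodes is also an assumption.

```python
def poll_until_provisioned(expected_ids, fetch_statuses, max_polls):
    """Accumulate results across polls instead of trusting any single reply.

    fetch_statuses(pending) is a hypothetical stand-in for the mcollective
    request; it may return only part of the nodes it was asked about.
    """
    pending = set(expected_ids)
    provisioned = set()
    for _ in range(max_polls):
        if not pending:
            break
        reply = fetch_statuses(pending)
        for node_id, node_type in reply.items():
            if node_type == "target":
                provisioned.add(node_id)
        # Nodes seen as provisioned once are never asked about again.
        pending -= provisioned
    return provisioned, pending
```

A real poller would also sleep between polls and apply a request timeout; both are left out here to keep the sketch short.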

Changed in fuel:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-astute (stable/5.1)

Fix proposed to branch: stable/5.1
Review: https://review.openstack.org/134491

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-astute (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/134886

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-astute (stable/5.1)

Reviewed: https://review.openstack.org/134491
Committed: https://git.openstack.org/cgit/stackforge/fuel-astute/commit/?id=fce051a6d013b1c30aa07320d225f9af734545de
Submitter: Jenkins
Branch: stable/5.1

commit fce051a6d013b1c30aa07320d225f9af734545de
Author: Vladimir Sharshov <email address hidden>
Date: Fri Oct 3 16:20:09 2014 +0400

    New node provision status detection

    Use accumulated knowledge about nodes statuses
    instead of trying to get info about all nodes
    using one request.

    Also:
    * decrease polling frequency for node status
    detection;
    * increase timeout limit for mcollective request
    about nodes statuses.

    Also backport fix from master:
    0085021fe327f6f910901b3ca55051b1df33a96e

    Change-Id: I12dd30bc2fe304ac13f40d1cee1e2d25643dcacb
    Closes-Bug: #1377105
    (cherry picked from commit 33e357c06de708bb324885e529136cdc23361a6d)

OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-astute (master)

Reviewed: https://review.openstack.org/134886
Committed: https://git.openstack.org/cgit/stackforge/fuel-astute/commit/?id=513a034f834a0496f8d3c84afcfcd357409ea5d3
Submitter: Jenkins
Branch: master

commit 513a034f834a0496f8d3c84afcfcd357409ea5d3
Author: Vladimir Sharshov (warpc) <email address hidden>
Date: Mon Nov 17 12:39:58 2014 +0300

    New node provision status detection (test)

    Change-Id: I1c0ee068fd3537a1c0dd7d6604f752befa9093f3
    Related-Bug: #1377105

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-astute (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/135325

OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-astute (master)

Reviewed: https://review.openstack.org/135325
Committed: https://git.openstack.org/cgit/stackforge/fuel-astute/commit/?id=7720992a297a478638b9623c53356e0b355abc24
Submitter: Jenkins
Branch: master

commit 7720992a297a478638b9623c53356e0b355abc24
Author: Vladimir Sharshov (warpc) <email address hidden>
Date: Tue Nov 18 18:20:59 2014 +0300

    Recognize 'bootstrap' status like not booted

    We change algorithm which detect node status
    for provision operation. But if some of node
    do no rebooted (see https://bugs.launchpad.net/fuel/+bug/1318567
    for details). Without this fix we will recognize
    such nodes as bootstraped instead of not booted.

    Change-Id: I285ed61231452473b21f7cf8739a19248efa00b9
    Related-Bug: #1377105
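
The distinction this commit draws can be sketched as below: a node that answers with type 'bootstrap' has replied, but it has not rebooted into the target OS, so it must not be counted as done. This is an illustrative Python sketch, not the Astute change itself; `classify_reply` and the reply shape (node ID mapped to node type) are assumptions.

```python
def classify_reply(reply):
    """Split a {node_id: node_type} reply into provisioned and not booted.

    The 'target' and 'bootstrap' type names follow the wording used in
    this bug; the reply structure itself is hypothetical.
    """
    provisioned = {n for n, t in reply.items() if t == "target"}
    not_booted = {n for n, t in reply.items() if t == "bootstrap"}
    return provisioned, not_booted


# Node 91 answered, but it is still running the bootstrap image.
provisioned, not_booted = classify_reply({28: "target", 91: "bootstrap"})
```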

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-astute (stable/5.1)

Related fix proposed to branch: stable/5.1
Review: https://review.openstack.org/135598

Vladimir Sharshov (vsharshov) wrote :

A potential problem in the new mechanism is related to this bug: https://bugs.launchpad.net/fuel/+bug/1318567

OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-astute (stable/5.1)

Reviewed: https://review.openstack.org/135598
Committed: https://git.openstack.org/cgit/stackforge/fuel-astute/commit/?id=51087c92a50be982071a074ff2bea01f1a5ddb76
Submitter: Jenkins
Branch: stable/5.1

commit 51087c92a50be982071a074ff2bea01f1a5ddb76
Author: Vladimir Sharshov (warpc) <email address hidden>
Date: Tue Nov 18 18:20:59 2014 +0300

    Recognize 'bootstrap' status like not booted

    We change algorithm which detect node status
    for provision operation. But if some of node
    do no rebooted (see https://bugs.launchpad.net/fuel/+bug/1318567
    for details). Without this fix we will recognize
    such nodes as bootstraped instead of not booted.

    Change-Id: I285ed61231452473b21f7cf8739a19248efa00b9
    Related-Bug: #1377105
    (cherry picked from commit 7720992a297a478638b9623c53356e0b355abc24)
