We should increase the time to live for messages and queues to the maximum value in MCollective

Bug #1316720 reported by Tatyanka
This bug affects 1 person
Affects              Status         Importance   Assigned to                 Milestone
Fuel for OpenStack   Fix Released   High         Vladimir Sharshov
4.1.x                Fix Released   High         Fuel Library (Deprecated)

Bug Description

{"build_id": "2014-05-06_01-31-29", "mirantis": "yes", "build_number": "183", "ostf_sha": "fe718434f88f2ab167779770828a195f06eb29f8", "nailgun_sha": "c61100f34a12c597df32f7498697acd84035957f", "production": "docker", "api": "1.0", "fuelmain_sha": "185fac4a937b970fc82aa48a9c31c7981f8f7659", "astute_sha": "3cffebde1e5452f5dbf8f744c6525fc36c7afbf3", "release": "5.0", "fuellib_sha": "edaecb643f34ca73be3716c5a722bfdd40e06128"}

Steps to Reproduce:
1. Deploy a one-node cluster using KVM
2. Delete it
3. Deploy a Neutron GRE cluster with 4 nodes (1 controller, 2 computes, 1 cinder + mongo) with murano/savanna/ceilometer

Expected:
Deployment passes, cluster is ready, OSTF passes

Actual:
Deployment of the controller fails with the message: [7ff473ade700] (receiver) Node slave-01_controller not answered by RPC, removing from db

Info: per Dima P's comment, the TTL for MCollective needs to be set to the maximum value

Revision history for this message
Tatyanka (tatyana-leontovich) wrote :

We have failing tests on CI, especially for HA, with a very similar problem. We mark the node with error status on:
2014-05-06T14:56:36 err: [394] MCollective agents '1' didn't respond within the allotted time.
and do not wait a little longer, but then we can see in the logs that the node is responsive and online.

Changed in fuel:
status: New → Confirmed
importance: Undecided → Critical
Tatyanka (tatyana-leontovich) wrote :
Andrew Woodward (xarses)
no longer affects: qemu-kvm (Ubuntu)
Dmitry Pyzhov (dpyzhov)
Changed in fuel:
importance: Critical → High
Dmitry Pyzhov (dpyzhov)
Changed in fuel:
assignee: Fuel Python Team (fuel-python) → Andrey Danin (gcon-monolake)
Changed in fuel:
assignee: Andrey Danin (gcon-monolake) → Vladimir Sharshov (vsharshov)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/92766

Changed in fuel:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/92766
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=32f25307a69c69b002f3f7a220fa6bae2d4d110d
Submitter: Jenkins
Branch: master

commit 32f25307a69c69b002f3f7a220fa6bae2d4d110d
Author: Vladimir Sharshov <email address hidden>
Date: Thu May 8 07:44:30 2014 +0400

    Set max avaliable TTL for Mcollective

    We use virtual machine snapshots for CI. When we
    up it from snapshots, we get error:
    "MCollective agents '<ID>' didn't respond within
    the allotted time.", but node is online. Big TTL
    in Mcollective should solve this problem.

    Change-Id: Id3f2f5ddf26a9d31de214e9d5596a47f450b983f
    Closes-Bug: #1316720
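
The reasoning in the commit message can be illustrated with a simplified model of a message TTL check. This is only a sketch, not MCollective's code: the `Message` class, the `is_expired` helper, and the numeric values are hypothetical, and it assumes the receiver discards a request once its apparent age exceeds the TTL. Under that assumption, a clock disagreement after a snapshot restore could make a fresh request look stale, and a large TTL would tolerate it.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical stand-in for a request sent to a node."""
    msgtime: int  # sender's clock (seconds since epoch) when the request was created
    ttl: int      # number of seconds the request is considered valid


def is_expired(msg: Message, now: int) -> bool:
    """Return True if the receiver would discard the message as too old."""
    return now - msg.msgtime > msg.ttl


# Assumed scenario: the receiver's clock is 300 seconds ahead of the sender's.
CLOCK_SKEW = 300
receiver_now = 1_400_000_000
sender_now = receiver_now - CLOCK_SKEW

SHORT_TTL = 60         # illustrative small value
LARGE_TTL = 1_000_000  # illustrative large value, not the value used in the fix

short_lived = Message(msgtime=sender_now, ttl=SHORT_TTL)
long_lived = Message(msgtime=sender_now, ttl=LARGE_TTL)

# With a short TTL the fresh request already looks stale to the receiver;
# with a large TTL the same request is still accepted.
print(is_expired(short_lived, receiver_now))
print(is_expired(long_lived, receiver_now))
```

In this model the node is online the whole time; only the age check causes the request to be ignored, which matches the symptom described above of an unanswered agent on a reachable node.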

Changed in fuel:
status: In Progress → Fix Committed
Aleksey Kasatkin (alekseyk-ru) wrote :

The problem reproduced on #199.

2014-05-13 14:24:12 ERR [416] Timeout of deployment is exceeded.
2014-05-13 14:24:12 ERR [416] MCollective agents '1' didn't respond within the allotted time.

Aleksey Kasatkin (alekseyk-ru) wrote :
Changed in fuel:
status: Fix Committed → Confirmed
Vladimir Sharshov (vsharshov) wrote :

Aleksey, the "Timeout of deployment" problem is a different bug: https://bugs.launchpad.net/fuel/+bug/1312443. Thanks for the report.

Changed in fuel:
status: Confirmed → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/4.1)

Fix proposed to branch: stable/4.1
Review: https://review.openstack.org/96852

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/4.1)

Reviewed: https://review.openstack.org/96852
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=6de87343ddf5a939695c1178f9946d2d08c3e876
Submitter: Jenkins
Branch: stable/4.1

commit 6de87343ddf5a939695c1178f9946d2d08c3e876
Author: Vladimir Sharshov <email address hidden>
Date: Thu May 8 07:44:30 2014 +0400

    Set max avaliable TTL for Mcollective

    We use virtual machine snapshots for CI. When we
    up it from snapshots, we get error:
    "MCollective agents '<ID>' didn't respond within
    the allotted time.", but node is online. Big TTL
    in Mcollective should solve this problem.

    Change-Id: Id3f2f5ddf26a9d31de214e9d5596a47f450b983f
    Closes-Bug: #1316720

Meg McRoberts (dreidellhasa) wrote :

Documented as fixed in 4.1.1

Tatyanka (tatyana-leontovich) wrote :

Verified on 6.1 RC3

Changed in fuel:
status: Fix Committed → Fix Released