[10.0 swarm] Network verification is not possible because some nodes are not available via mcollective

Bug #1673743 reported by Vladimir Khlyunev
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
Fuel for OpenStack | Invalid | High | Vladimir Sharshov |

Bug Description

ISO 1481

Scenario:
1. Create cluster
2. Add 3 nodes with controller role
3. Verify network
4. Deploy cluster
5. Stop deployment
6. Add 2 nodes with compute role
7. Re-deploy cluster

Result:
Deployment successful, but node-3 was not deployed:
[root@nailgun ~]# fuel node
...
 3 | discover | slave-03_controller | 1 | 10.109.15.6 | 64:c8:c9:b1:3b:90 | controller |
[root@nailgun ~]# ssh node-3
ssh: Could not resolve hostname node-3: Name or service not known
[root@nailgun ~]# ssh 10.109.15.6
Warning: Permanently added '10.109.15.6' (ECDSA) to the list of known hosts.
root@bootstrap:~#
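
The check above (looking for nodes still in the `discover` state after deployment) could be automated. This is a minimal sketch, assuming the `fuel node` table is pipe-separated with columns in the order id, status, name, cluster, ip, mac, roles. That order is inferred from the single row shown here, and the helper names `parse_node_rows` and `nodes_in_status` are hypothetical:

```python
def parse_node_rows(output):
    """Parse pipe-separated rows of `fuel node` output into dicts.

    Assumed column order (inferred from the row in this report):
    id, status, name, cluster, ip, mac, roles.
    """
    columns = ["id", "status", "name", "cluster", "ip", "mac", "roles"]
    nodes = []
    for line in output.splitlines():
        fields = [field.strip() for field in line.split("|")]
        # Skip header rows, separator rows, and anything whose first
        # field is not a numeric node id.
        if len(fields) < len(columns) or not fields[0].isdigit():
            continue
        nodes.append(dict(zip(columns, fields)))
    return nodes


def nodes_in_status(output, status):
    """Return the parsed nodes whose status column equals `status`."""
    return [node for node in parse_node_rows(output) if node["status"] == status]


sample = (
    " 3 | discover | slave-03_controller | 1 | 10.109.15.6 "
    "| 64:c8:c9:b1:3b:90 | controller |"
)
stuck = nodes_in_status(sample, "discover")
print([(node["id"], node["ip"]) for node in stuck])
```

Any node this reports after a deployment that claims success is a candidate for the problem described in this bug.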

Snapshot https://product-ci.infra.mirantis.net/job/10.0.system_test.ubuntu.cluster_actions_ha/210/artifact/logs/fail_error_deploy_stop_reset_on_ha-fuel-snapshot-2017-03-17_02-08-55.tar

tags: added: swarm-fail
Changed in fuel:
status: New → Confirmed
tags: added: area-python
Changed in fuel:
assignee: Fuel Sustaining (fuel-sustaining-team) → Vladimir Kozhukalov (kozhukalov)
Vladimir Kozhukalov (kozhukalov) wrote :

There is an error in astute.log:

2017-03-28 00:39:12 ERROR [17836] Error running RPC method verify_networks: Network verification not available because nodes ["1"] not available via mcollective, trace:
["/usr/share/gems/gems/astute-10.0.0/lib/astute/orchestrator.rb:218:in `validate_nodes_access'",
 "/usr/share/gems/gems/astute-10.0.0/lib/astute/orchestrator.rb:170:in `check_dhcp'",
 "/usr/share/gems/gems/astute-10.0.0/lib/astute/server/dispatcher.rb:126:in `check_dhcp'",
 "/usr/share/gems/gems/astute-10.0.0/lib/astute/server/dispatcher.rb:110:in `block in verify_networks'",
 "/usr/share/gems/gems/astute-10.0.0/lib/astute/server/dispatcher.rb:108:in `each'",
 "/usr/share/gems/gems/astute-10.0.0/lib/astute/server/dispatcher.rb:108:in `verify_networks'",
 "/usr/share/gems/gems/astute-10.0.0/lib/astute/server/server.rb:172:in `dispatch_message'",
 "/usr/share/gems/gems/astute-10.0.0/lib/astute/server/server.rb:131:in `block in dispatch'",
 "/usr/share/gems/gems/astute-10.0.0/lib/astute/server/task_queue.rb:64:in `call'",
 "/usr/share/gems/gems/astute-10.0.0/lib/astute/server/task_queue.rb:64:in `block in each'",
 "/usr/share/gems/gems/astute-10.0.0/lib/astute/server/task_queue.rb:56:in `each'",
 "/usr/share/gems/gems/astute-10.0.0/lib/astute/server/task_queue.rb:56:in `each'",
 "/usr/share/gems/gems/astute-10.0.0/lib/astute/server/server.rb:128:in `each_with_index'",
 "/usr/share/gems/gems/astute-10.0.0/lib/astute/server/server.rb:128:in `dispatch'",
 "/usr/share/gems/gems/astute-10.0.0/lib/astute/server/server.rb:106:in `block in perform_main_job'"]

There are no errors in the mcollective.log file on the node that was unavailable. It looks like the cause is an unstable network connection (possibly due to high CPU load on the host at that time).
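
To illustrate the kind of check that produces this message, here is a minimal sketch in Python. It is not Astute's actual Ruby implementation: the name `validate_nodes_access` is borrowed from the trace only for readability, `probe` is a hypothetical callable standing in for an mcollective ping, and the retry loop is an assumption motivated by the unstable-connection hypothesis above.

```python
import json
import time


class NodesUnavailableError(Exception):
    """Raised when some requested nodes never answered the availability probe."""


def missing_nodes(requested_uids, responding_uids):
    """Return requested uids absent from the responders, in request order."""
    responding = set(responding_uids)
    return [uid for uid in requested_uids if uid not in responding]


def validate_nodes_access(requested_uids, probe, attempts=3, delay=0.0):
    """Probe nodes up to `attempts` times; fail only if some never respond.

    `probe` is a hypothetical callable that takes a list of uids and
    returns the uids that answered.
    """
    missing = list(requested_uids)
    for attempt in range(attempts):
        # Re-probe only the nodes that have not answered yet.
        missing = missing_nodes(missing, probe(missing))
        if not missing:
            return
        if attempt < attempts - 1:
            time.sleep(delay)
    raise NodesUnavailableError(
        "Network verification not available because nodes %s "
        "not available via mcollective" % json.dumps(missing)
    )
```

With `attempts=1` this reduces to a single strict check, so one dropped reply from a loaded host is enough to fail network verification. Retrying trades a short delay for tolerance of such transient losses.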

Vladimir Kozhukalov (kozhukalov) wrote :
Changed in fuel:
assignee: Vladimir Kozhukalov (kozhukalov) → Vladimir Sharshov (vsharshov)
summary: - After deployment stop and restart node still in bootstrap/discover state
+ [10.0 swarm] Network verification is not possible because some nodes are
+ not available via mcollective
Vladimir Kozhukalov (kozhukalov) wrote :

This no longer seems to be relevant.

Vladimir Kozhukalov (kozhukalov) wrote :
Changed in fuel:
status: Confirmed → Invalid
Vladimir Sharshov (vsharshov) wrote :

Marking this as a duplicate because the test has succeeded 2 days in a row. It looks like the fix for stop deployment resolved this problem. More details: https://bugs.launchpad.net/fuel/+bug/1672964
