Environment deployment failed with Fuel (Too many nodes failed to provision)
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Fuel for OpenStack | Invalid | Medium | Fuel DevOps | | |
Bug Description
ERROR FROM FUEL:
Deployment Failed
Error
Too many nodes failed to provision
Some nodes have an error status after deployment. Redeployment is needed.
After OS installation during deployment, the nodes went offline.
**The main point is that the same actions, performed with the same versions but on another server, finished successfully.**
So the problem is probably in the host server configuration.
ENVIRONMENT:
Host: srv94-bud.
OS: Ubuntu 14.04.3 LTS
Version: MirantisOpenSta
Nodes: 3x controller, 1x compute, 1x cinder
Snapshots:
Before deploy:
# source /home/akoryagin
# dos.py revert-resume fuelweb_
The deployment is now in a failed state on the server, and the machine has not been touched since.
MORE INFORMATION:
In Fuel Master, Astute log:
2015-11-25 17:40:47 ERR [655] Timeout of provisioning is exceeded. Nodes not booted: ["1", "2", "3", "4", "5"]
2015-11-25 17:40:47 DEBUG [655] Aborting provision. To many nodes failed: ["1", "2", "3", "4", "5"]
2015-11-25 17:40:47 INFO [655] Node timed out to provision: 5
2015-11-25 17:40:47 INFO [655] Node timed out to provision: 4
2015-11-25 17:40:47 INFO [655] Node timed out to provision: 3
2015-11-25 17:40:47 INFO [655] Node timed out to provision: 2
2015-11-25 17:40:47 INFO [655] Node timed out to provision: 1
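For a longer Astute log, the timed-out node IDs can be pulled out with a small filter. This is only a sketch: the function name `timed_out_nodes` is made up for illustration, and it assumes the message text matches the "Node timed out to provision: N" lines quoted above.

```shell
# Print the sorted, unique IDs of nodes reported as timed out.
# Reads Astute log text on stdin; the message format is assumed to be
# the one shown in this report.
timed_out_nodes() {
    sed -n 's/.*Node timed out to provision: \([0-9][0-9]*\).*/\1/p' | sort -n -u
}
```

It could be fed the Astute log file from the Fuel Master, for example `timed_out_nodes < astute.log`, where `astute.log` stands in for wherever the log lives on a given installation.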
From the host machine:
$ virsh list --all
Id Name State
---
2 fuelweb_
3 fuelweb_
4 fuelweb_
5 fuelweb_
6 fuelweb_
7 fuelweb_
From Admin Node:
[root@nailgun ~]# fuel node
id | status | name | cluster | ip | mac | roles | pending_roles | online | group_id
---
1 | error | Untitled (9a:fc) | 1 | 10.109.0.4 | 64:d0:df:47:9a:fc | controller | | False | 1
5 | error | Untitled (9c:f6) | 1 | 10.109.0.6 | 64:1d:35:2b:9c:f6 | compute | | False | 1
3 | error | Untitled (7f:33) | 1 | 10.109.0.7 | 64:b9:3f:2d:7f:33 | controller | | False | 1
4 | error | Untitled (e4:09) | 1 | 10.109.0.3 | 64:de:b9:e5:e4:09 | cinder | | False | 1
2 | error | Untitled (35:6e) | 1 | 10.109.0.5 | 64:b2:c4:c9:35:6e | controller | | False | 1
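The offline nodes and their addresses can be extracted from this listing with a short filter. This is a sketch that assumes the ten-column layout shown above (ID first, IP fifth, `online` ninth); other Fuel client versions may print different columns. The function name `offline_nodes` is hypothetical.

```shell
# Print "<id> <ip>" for every node whose "online" column is False.
# Reads `fuel node` output on stdin; column positions are assumed to
# match the listing in this report.
offline_nodes() {
    awk -F'|' 'NF >= 10 {
        id = $1; ip = $5; online = $9
        gsub(/[ \t]/, "", id)
        gsub(/[ \t]/, "", ip)
        gsub(/[ \t]/, "", online)
        if (online == "False") print id, ip
    }'
}
```

On the Admin Node this would be used as `fuel node | offline_nodes`.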
The nodes went offline:
[root@nailgun ~]# for num in 3 4 5 6 7; do ping -q -w 5 10.109.0.${num}; done
PING 10.109.0.3 (10.109.0.3) 56(84) bytes of data.
--- 10.109.0.3 ping statistics ---
4 packets transmitted, 0 received, +3 errors, 100% packet loss, time 3000ms
pipe 3
PING 10.109.0.4 (10.109.0.4) 56(84) bytes of data.
--- 10.109.0.4 ping statistics ---
3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 3000ms
pipe 3
PING 10.109.0.5 (10.109.0.5) 56(84) bytes of data.
--- 10.109.0.5 ping statistics ---
4 packets transmitted, 0 received, +3 errors, 100% packet loss, time 3000ms
pipe 3
PING 10.109.0.6 (10.109.0.6) 56(84) bytes of data.
--- 10.109.0.6 ping statistics ---
3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 3000ms
pipe 3
PING 10.109.0.7 (10.109.0.7) 56(84) bytes of data.
--- 10.109.0.7 ping statistics ---
4 packets transmitted, 0 received, +3 errors, 100% packet loss, time 3001ms
pipe 3
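The ping output above can be condensed to one line per address. The following is a sketch that assumes the `ping -q` summary format shown in this report (a `--- <address> ping statistics ---` line followed by a line containing `packet loss`); the function name `ping_loss_summary` is made up for illustration.

```shell
# Summarise `ping -q` output read from stdin as "<address> <loss>" lines.
ping_loss_summary() {
    awk '
        # Remember the address from the statistics header line.
        /^--- .* ping statistics ---$/ { host = $2 }
        # On the summary line, print the field that ends in a percent sign.
        /packet loss/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) print host, $i
        }
    '
}
```

For example, the loop from the report could be piped through it: `for num in 3 4 5 6 7; do ping -q -w 5 10.109.0.${num}; done | ping_loss_summary`.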
tags: | added: area-devops |
This issue was marked as a duplicate following a Skype conversation with Georgy Duldin. He has seen this issue before and appears to know its root cause.