Detailed bug description:
One of compute-osd node failed on deployment stage. On that node ceph-osd daemon stuck on bootstrapping.
Steps to reproduce:
1. Start deploying 191 node cluster(3 controllers, 187 computes, 20 compute-osd, 1 just-os)
Expected results:
Cluster successfully deployed
Actual result:
One of compute-osd node is failed on deploying stage.
Reproducibility:
Have faced two last times
Workaround:
-
Impact:
Deploy cluster
Description of the environment:
Operation system: Ubuntu 14.04
Versions of components: MOS-9.0
Reference architecture: 3 controllers, 187 computes, 20 compute-osd, 1 just-os
Network model: vxlan-dvr
Related projects installed: -
Additional information:
ceph status commands is not working on failed node but MON's nodes is accessible: http://paste.openstack.org/show/556110/
Part log from puppet.log from failed node: http://paste.openstack.org/show/556108/
logs from fuel and nodes: http://mos-scale-share.mirantis.com/fuel-snapshot-2016-08-12_13-36-52.tar.gz
I am marking this as Incomplete, there is no logs like commands, etc from the nodes in the snapshot. Please give me an access to the environment with such errors or ping me directly when you have one.