{"build_id": "2014-11-30_11-15-26", "ostf_sha": "dc66fd39d4d035bb972e4c0225591290593c459d", "build_number": "24", "auth_required": true, "api": "1.0", "nailgun_sha": "58e5f47457a0e832c005ce350e01b75a0c01b90a", "production": "docker", "fuelmain_sha": "f324b592399c544eace2f64cb499564da01ab38c", "astute_sha": "1da516b88d1a8d0014d78ab0d796e5b08379a59b", "feature_groups": ["mirantis"], "release": "6.0", "release_versions": {"2014.2-6.0": {"VERSION": {"build_id": "2014-11-30_11-15-26", "ostf_sha": "dc66fd39d4d035bb972e4c0225591290593c459d", "build_number": "24", "api": "1.0", "nailgun_sha": "58e5f47457a0e832c005ce350e01b75a0c01b90a", "production": "docker", "fuelmain_sha": "f324b592399c544eace2f64cb499564da01ab38c", "astute_sha": "1da516b88d1a8d0014d78ab0d796e5b08379a59b", "feature_groups": ["mirantis"], "release": "6.0", "fuellib_sha": "bbf26b499bf47ca41302ba6f62c3ebc5a493013d"}}}, "fuellib_sha": "bbf26b499bf47ca41302ba6f62c3ebc5a493013d"}
When a node with a critical role fails, all other nodes with the same role are also marked as failed. This makes it hard to troubleshoot which node actually caused the failure. It is especially noticeable when one ceph-osd node fails and there are 90 other nodes with that role.
Expected result: when a critical role fails, only the node that failed is marked as failed, and the deployment is stopped.
Actual result: when a critical role fails, all nodes with that role are marked as failed, and the deployment is stopped.
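The difference between the two behaviors can be sketched as follows. This is a minimal illustration only, not the actual Nailgun/Astute code: the `Node` class, status values, and function names are all hypothetical.

```python
# Hypothetical sketch contrasting the expected and actual failure handling.
# None of these names come from the real Fuel codebase.

class Node:
    def __init__(self, name, roles):
        self.name = name
        self.roles = roles
        self.status = "deploying"

def expected_handling(failed_node, all_nodes, critical_roles):
    """Desired behavior: mark only the failing node, stop everything else."""
    if not set(failed_node.roles) & critical_roles:
        return False  # non-critical failure: deployment continues
    failed_node.status = "error"  # only the node that actually failed
    for node in all_nodes:
        if node is not failed_node and node.status == "deploying":
            node.status = "stopped"  # stopped, but not marked as failed
    return True

def actual_handling(failed_node, all_nodes, critical_roles):
    """Reported behavior: every node sharing the critical role is marked
    as failed, which hides the node that caused the error."""
    if not set(failed_node.roles) & critical_roles:
        return False
    for node in all_nodes:
        if set(node.roles) & set(failed_node.roles):
            node.status = "error"
    return True
```

With 90 ceph-osd nodes, `actual_handling` leaves 90 nodes in the "error" state after a single failure, while `expected_handling` leaves exactly one.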
Please provide logs; they will help in investigating this problem. Today I have seen logs from 90+ nodes with a ceph problem (https://bugs.launchpad.net/fuel/+bug/1398096).