Comment 21 for bug 1719436

Chris MacNaughton (chris.macnaughton) wrote :

I agree that the ceph cluster would not have reached completion in that test run; however, it looks like the monitor cluster still hit the same or a similar issue with a unit joining the cluster.

The error logs in #19 include the ceph-mon unit logs for the failing unit, and they show the recently added retry code path being executed. We added that retry logic to combat intermittent failures of a monitor node to join the cluster, so it looks like something failed before the monitor cluster tried to relate to the OSD machines.