Comment 35 for bug 1333814

Dmitry Borodaenko (angdraug) wrote:

The failure in comment #33 is due to an unrelated problem:

2014-07-31T12:29:44.193624+01:00 notice: (/Stage[main]/Ceph::Osd/Exec[ceph-deploy osd activate]/returns) [node-6][INFO ] Running command: ceph-disk-activate --mark-init sysvinit --mount /dev/sdc2
2014-07-31T12:29:44.193624+01:00 notice: (/Stage[main]/Ceph::Osd/Exec[ceph-deploy osd activate]/returns) [node-6][DEBUG ] === osd.3 ===
2014-07-31T12:29:44.199063+01:00 notice: (/Stage[main]/Ceph::Osd/Exec[ceph-deploy osd activate]/returns) [node-6][DEBUG ] failed: 'timeout 30 /usr/bin/ceph -c /etc/ceph/ceph.conf --name=osd.3 --keyring=/var/lib/ceph/osd/ceph-3/keyring osd crush create-or-move -- 3 0.06 host=node-6 root=default'
2014-07-31T12:29:44.208757+01:00 notice: (/Stage[main]/Ceph::Osd/Exec[ceph-deploy osd activate]/returns) [node-6][WARNING] INFO:ceph-disk:ceph osd.3 already mounted in position; unmounting ours.
2014-07-31T12:29:44.208757+01:00 notice: (/Stage[main]/Ceph::Osd/Exec[ceph-deploy osd activate]/returns) [node-6][WARNING] ceph-disk: Error: ceph osd start failed: Command '['/usr/sbin/service', 'ceph', 'start', 'osd.3']' returned non-zero exit status 1

Soon after that, node-6 (the primary controller) went offline and deployment of the rest of the cluster failed. The failure above could by itself be related to bug #1322230, the controller going offline could be related to bug #1348839, or the combination of symptoms could indicate a new bug.

Bug description updated to help identify recurrences of this problem and distinguish them from other ceph failures.
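
When triaging similar reports, the failed step from the log above can be re-run by hand on the affected OSD node. This is only a rough sketch, assuming the osd.3 / node-6 / weight 0.06 values from this particular run; substitute the values from the report being triaged:

# values (osd.3, node-6, 0.06) are taken from the log above, not fixed
timeout 30 /usr/bin/ceph -c /etc/ceph/ceph.conf --name=osd.3 \
    --keyring=/var/lib/ceph/osd/ceph-3/keyring \
    osd crush create-or-move -- 3 0.06 host=node-6 root=default
/usr/sbin/service ceph start osd.3   # the step that returned non-zero above
ceph osd tree                        # check whether osd.3 comes up at all

Noting whether the crush create-or-move command times out or the service start itself fails should help tell this failure apart from other ceph deployment errors.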