Comment 3 for bug 1352335

Dmitry Borodaenko (angdraug) wrote :

There are two separate problems in the logs:

1) cinder-volume tries and fails to connect to the Ceph cluster (log fragments from the bug description). This happens after ceph-mon is initialized, but before any OSDs were added to the cluster. This works fine on CentOS, but is reproducible on Ubuntu. Unfortunately, exception handling in the cinder rbd driver is broken, so there is no indication of why it failed to connect to the Ceph cluster.

Since this can be reproduced on Ubuntu, I'm updating the status to Confirmed. Since there is a trivial workaround (restarting the cinder-volume service makes the problem go away), I'm updating the priority to Medium. Leaving the target milestone as 5.1 for now due to the likely impact on system tests (OSTF will keep failing until cinder-volume is restarted).
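To illustrate the exception-handling issue described above, here is a minimal sketch (not the actual cinder code; all names are hypothetical) of how swallowing the connect error leaves no indication of why the driver could not reach the Ceph cluster, and how re-raising or logging the cause would fix that:

```python
# Hypothetical stand-in for a failing rados cluster connect.
def connect_to_cluster():
    raise OSError("error connecting to the cluster")

def broken_check():
    try:
        connect_to_cluster()
        return "ok"
    except Exception:
        return "failed"          # cause discarded: nothing to debug with

def fixed_check():
    try:
        connect_to_cluster()
        return "ok"
    except Exception as e:
        return "failed: %s" % e  # surface the underlying error

print(broken_check())  # -> failed
print(fixed_check())   # -> failed: error connecting to the cluster
```

With the broken pattern, all the operator sees is a generic failure; with the fixed one, the log at least names the underlying connect error.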

2) when OSD processes are started by ceph-deploy, they report the following errors:

2014-08-03 15:23:46.163693 7f4f5b0da7c0 -1 asok(0x7f4f5e382230) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/var/run/ceph/ceph-osd.0.aso
2014-08-03 15:23:46.163729 7f4f5b0da7c0 0 filestore(/var/lib/ceph/osd/ceph-0) lock_fsid failed to lock /var/lib/ceph/osd/ceph-0/fsid, is another ceph-osd still running? (11) Resource temporarily unav
2014-08-03 15:23:46.163738 7f4f5b0da7c0 -1 filestore(/var/lib/ceph/osd/ceph-0) FileStore::mount: lock_fsid failed
2014-08-03 15:23:46.163743 7f4f5b0da7c0 -1 ** ERROR: error converting store /var/lib/ceph/osd/ceph-0: (16) Device or resource busy

Still, the OSDs show up as "up" and "in" in the ceph-mon logs, at the end of deployment the cluster is operational, and the TestVM image is successfully uploaded to Glance.
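The lock_fsid failure can be reproduced in miniature: ceph-osd takes an exclusive flock on the OSD's fsid file, so a second process (or a duplicate start) locking the same file gets errno 11, "Resource temporarily unavailable". A minimal Python sketch of that behavior, using a temporary file as a stand-in for /var/lib/ceph/osd/ceph-0/fsid:

```python
import errno
import fcntl
import tempfile

# Temporary file standing in for the OSD's fsid file.
path = tempfile.NamedTemporaryFile(delete=False).name

first = open(path, "w")
fcntl.flock(first, fcntl.LOCK_EX | fcntl.LOCK_NB)   # first ceph-osd holds the lock

second = open(path, "w")
try:
    fcntl.flock(second, fcntl.LOCK_EX | fcntl.LOCK_NB)  # duplicate start fails
except OSError as e:
    print(e.errno == errno.EAGAIN)  # -> True: (11) Resource temporarily unavailable
```

This matches the "is another ceph-osd still running?" hint in the log: the error means some process already holds the lock on that fsid file.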

Problem #2 should be reported as a separate Low-priority bug.