'ceph-deploy osd prepare' failed because the OSD partition is already mounted

Bug #1246513 reported by Dmitry Borodaenko
This bug affects 1 person
Affects: Fuel for OpenStack
Status: Fix Released
Importance: High
Assigned to: Ryan Moe
Milestone: 3.2.1

Bug Description

The problem is more likely to occur when multiple disks are allocated for Ceph OSD. Example log output:

2013-10-30 13:29:44,955 [ceph_deploy.cli][INFO ] Invoked (1.2.7): /usr/bin/ceph-deploy osd prepare node-23:/dev/sdb4 node-23:/dev/sdc4
2013-10-30 13:29:44,956 [ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks node-23:/dev/sdb4: node-23:/dev/sdc4:
2013-10-30 13:29:44,958 [ceph_deploy.sudo_pushy][DEBUG ] will use a local connection without sudo
2013-10-30 13:29:45,266 [ceph_deploy.osd][INFO ] Distro info: CentOS 6.4 Final
2013-10-30 13:29:45,267 [ceph_deploy.osd][DEBUG ] Deploying osd to node-23
2013-10-30 13:29:45,285 [node-23][INFO ] write cluster configuration to /etc/ceph/{cluster}.conf
2013-10-30 13:29:45,327 [node-23][ERROR ] Traceback (most recent call last):
2013-10-30 13:29:45,329 [node-23][ERROR ] File "/usr/lib/python2.6/site-packages/ceph_deploy/util/decorators.py", line 10, in inner
2013-10-30 13:29:45,332 [node-23][ERROR ] def inner(*args, **kwargs):
2013-10-30 13:29:45,334 [node-23][ERROR ] File "/usr/lib/python2.6/site-packages/ceph_deploy/conf.py", line 12, in write_conf
2013-10-30 13:29:45,337 [node-23][ERROR ] line = self.fp.readline()
2013-10-30 13:29:45,339 [node-23][ERROR ] RuntimeError: config file /etc/ceph/ceph.conf exists with different content; use --overwrite-conf to overwrite
2013-10-30 13:29:45,354 [node-23][INFO ] Running command: udevadm trigger --subsystem-match=block --action=add
2013-10-30 13:29:45,673 [ceph_deploy.osd][DEBUG ] Preparing host node-23 disk /dev/sdb4 journal None activate False
2013-10-30 13:29:45,673 [node-23][INFO ] Running command: ceph-disk-prepare --fs-type xfs --cluster ceph -- /dev/sdb4
2013-10-30 13:29:48,016 [node-23][INFO ] meta-data=/dev/sdb4 isize=2048 agcount=4, agsize=2056960 blks
2013-10-30 13:29:48,017 [node-23][INFO ] = sectsz=512 attr=2, projid32bit=0
2013-10-30 13:29:48,017 [node-23][INFO ] data = bsize=4096 blocks=8227840, imaxpct=25
2013-10-30 13:29:48,018 [node-23][INFO ] = sunit=0 swidth=0 blks
2013-10-30 13:29:48,018 [node-23][INFO ] naming =version 2 bsize=4096 ascii-ci=0
2013-10-30 13:29:48,021 [node-23][INFO ] log =internal log bsize=4096 blocks=4017, version=2
2013-10-30 13:29:48,022 [node-23][INFO ] = sectsz=512 sunit=0 blks, lazy-count=1
2013-10-30 13:29:48,022 [node-23][INFO ] realtime =none extsz=4096 blocks=0, rtextents=0
2013-10-30 13:29:48,038 [ceph_deploy.osd][DEBUG ] Host node-23 is now ready for osd use.
2013-10-30 13:29:48,039 [ceph_deploy.sudo_pushy][DEBUG ] will use a local connection without sudo
2013-10-30 13:29:48,652 [ceph_deploy.osd][INFO ] Distro info: CentOS 6.4 Final
2013-10-30 13:29:48,653 [ceph_deploy.osd][DEBUG ] Preparing host node-23 disk /dev/sdc4 journal None activate False
2013-10-30 13:29:48,654 [node-23][INFO ] Running command: ceph-disk-prepare --fs-type xfs --cluster ceph -- /dev/sdc4
2013-10-30 13:29:48,968 [node-23][ERROR ] Traceback (most recent call last):
2013-10-30 13:29:48,968 [node-23][ERROR ] File "/usr/lib/python2.6/site-packages/ceph_deploy/osd.py", line 126, in prepare_disk
2013-10-30 13:29:48,976 [node-23][ERROR ] File "/usr/lib/python2.6/site-packages/ceph_deploy/util/decorators.py", line 10, in inner
2013-10-30 13:29:48,978 [node-23][ERROR ] def inner(*args, **kwargs):
2013-10-30 13:29:48,980 [node-23][ERROR ] File "/usr/lib/python2.6/site-packages/ceph_deploy/util/wrappers.py", line 6, in remote_call
2013-10-30 13:29:48,983 [node-23][ERROR ] This allows us to only remote-execute the actual calls, not whole functions.
2013-10-30 13:29:48,985 [node-23][ERROR ] File "/usr/lib64/python2.6/subprocess.py", line 505, in check_call
2013-10-30 13:29:48,987 [node-23][ERROR ] raise CalledProcessError(retcode, cmd)
2013-10-30 13:29:48,989 [node-23][ERROR ] CalledProcessError: Command '['ceph-disk-prepare', '--fs-type', 'xfs', '--cluster', 'ceph', '--', '/dev/sdc4']' returned non-zero exit status 1
2013-10-30 13:29:49,000 [node-23][ERROR ] ceph-disk: Error: Device is mounted: /dev/sdc4
2013-10-30 13:29:49,009 [ceph_deploy.osd][ERROR ] Failed to execute command: ceph-disk-prepare --fs-type xfs --cluster ceph -- /dev/sdc4
2013-10-30 13:29:49,010 [ceph_deploy][ERROR ] GenericError: Failed to create 1 OSDs

As can be seen from the log, ceph-disk cannot deal with an OSD partition that is already mounted. Because the OSD partitions are assigned the Ceph OSD GPT type UUID and formatted as XFS during Fuel provisioning, the Ceph automount udev rules fire when 'ceph-deploy osd prepare' calls udevadm, and the Ceph partitions get mounted. Since it takes udev some time to mount the partitions, the first one or two drives slip by and are successfully prepared by ceph-disk (including mkfs.xfs and mount), while preparation of all subsequent drives fails.
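
As a rough manual check on an affected node (device names are taken from the log above as examples; the GUID shown is the standard Ceph OSD partition type):

# Show the GPT type GUID the Ceph udev rules match on; a Ceph OSD partition
# reports "Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D".
sgdisk --info=4 /dev/sdc
# If udev has already auto-mounted the partition, ceph-disk-prepare will fail
# with "Device is mounted", exactly as in the log above.
grep /dev/sdc4 /proc/mounts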

To work around these ceph-disk and ceph-deploy limitations, Fuel should not format the OSD partitions during provisioning (it should still set the GPT type UUID), and it should not pass device names corresponding to mounted OSD partitions to ceph-deploy (the facter should use the GPT type UUID instead of the filesystem label to identify Ceph partitions, and should exclude mounted devices).
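
A minimal shell sketch of the selection logic described above (illustration only, not the actual Fuel facter code; assumes GPT-partitioned disks and the standard Ceph OSD type GUID):

# Ceph OSD partition type GUID set during provisioning
CEPH_OSD_TYPE_GUID="4fbd7e29-9d25-41b8-afd0-062c0ceff05d"
for part in /dev/sd?[0-9]*; do
    disk="${part%%[0-9]*}"        # e.g. /dev/sdb4 -> /dev/sdb
    num="${part##*[a-z]}"         # e.g. /dev/sdb4 -> 4
    # identify Ceph partitions by GPT type UUID, not by filesystem label
    guid=$(sgdisk --info="$num" "$disk" | awk '/Partition GUID code/ {print tolower($4)}')
    [ "$guid" = "$CEPH_OSD_TYPE_GUID" ] || continue
    # skip partitions that udev has already mounted
    grep -q "^$part " /proc/mounts && continue
    echo "$part"                  # safe to pass to 'ceph-deploy osd prepare'
done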

Ryan Moe (rmoe)
Changed in fuel:
assignee: nobody → Ryan Moe (rmoe)
Mike Scherbakov (mihgen)
Changed in fuel:
importance: Undecided → High
Andrew Woodward (xarses)
Changed in fuel:
status: New → In Progress
tags: added: ceph library provisioning
Changed in fuel:
status: In Progress → Fix Committed
Changed in fuel:
milestone: none → 3.2.1
Nastya Urlapova (aurlapova) wrote :

Please, provide pull request id or github reference.

Dmitry Borodaenko (angdraug) wrote :
Changed in fuel:
status: Fix Committed → Fix Released
Roman Sokolkov (rsokolkov) wrote :

I'm experiencing the same issue when trying to redeploy an already deployed "ceph-osd" node.

I've figured out that XFS should be explicitly wiped.

In my case this works:
dd if=/dev/zero of=/dev/sdl4 count=2048

Where sdl4 is the partition used for ceph-osd.
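
(With dd's default 512-byte block size, count=2048 zeroes the first 1 MiB of the partition, which overwrites the XFS superblock located at the start of the partition.)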

Dmitry Borodaenko (angdraug) wrote :

Roman, what you're seeing is a different problem: we're not mounting XFS partitions anymore. What I think is happening in your case is that ceph-deploy doesn't pass the -f flag to mkfs.xfs during deployment, so mkfs.xfs detects remnants of the old filesystem and refuses to proceed. Please confirm whether that's the case.

At the moment we zap disks in two places: astute (on node reboot) and the cobbler pmanager script (during provisioning). In both places we only zap the beginning and end of the whole block device; we should also zap each partition on that device before we wipe the partition table and MBR. Doing it in both astute and pmanager will make sure the disk is fully zapped both for a new deployment on a non-blank system (pmanager) and for a re-deployment (astute).
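
For illustration, the zapping order described above might look like this for a single hypothetical disk /dev/sdb (a sketch only, not the actual astute or pmanager code):

# 1. zero the start of every partition so no filesystem signatures survive
for part in /dev/sdb[0-9]*; do
    dd if=/dev/zero of="$part" bs=1M count=10
done
# 2. only then destroy the GPT and MBR partition tables
sgdisk --zap-all /dev/sdb
# 3. and zero the beginning of the whole device as is done today
dd if=/dev/zero of=/dev/sdb bs=1M count=10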

Roman Sokolkov (rsokolkov) wrote :

Dmitry,

I'm not an expert in ceph-deploy, so I'm not sure about "-f".

In ceph.log:
Running command: ceph-disk-prepare --fs-type xfs --cluster ceph -- /dev/sdd4

"we should zap each partition on that device before we wipe the partition table and MBR." - agree.

Ryan Moe (rmoe) wrote :

Roman, which version of Fuel are you using? It seems that Ceph passes -f by default when creating XFS filesystems, so I think we're hitting a different problem.

Dmitry Borodaenko (angdraug) wrote :

A new bug has been created for the problem described above:
https://bugs.launchpad.net/fuel/+bug/1317296
