'ceph-deploy osd prepare' fails on re-deploy if xfs filesystem is not zapped

Bug #1317296 reported by Dmitry Borodaenko
This bug affects 1 person
Affects: Fuel for OpenStack
Status: Fix Committed
Importance: High
Assigned to: Ryan Moe

Bug Description

Environment: CentOS, Ceph, re-deploy of ceph-osd on the same node using the same partition size.

Based on a comment to an older related bug:
https://bugs.launchpad.net/fuel/+bug/1246513/comments/5

The symptoms are almost identical, but the root cause is different. Because we only zap the beginning of the disk, the xfs file system remains in place. When a partition is created at the same spot during provisioning, the ceph automount udev rule picks it up and mounts it, which then causes ceph-deploy to fail.
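
To illustrate why zapping only the start of the disk is not enough: an XFS superblock begins with the magic bytes "XFSB" at offset 0 of the partition, not of the disk. The following is a minimal sketch that uses an in-memory image in place of a real device. The function name, the 4 MiB image, and the 1 MiB partition offset and wipe size are assumptions made for illustration, not values taken from Fuel.

```python
import io

XFS_MAGIC = b"XFSB"  # first four bytes of an XFS superblock
MIB = 1024 * 1024


def has_xfs_signature(dev, partition_offset):
    """Return True if the XFS magic is present at partition_offset."""
    dev.seek(partition_offset)
    return dev.read(len(XFS_MAGIC)) == XFS_MAGIC


# Fake 4 MiB "disk" with one partition starting at 1 MiB (assumed layout).
disk = io.BytesIO(bytes(4 * MIB))
part_start = 1 * MIB
disk.seek(part_start)
disk.write(XFS_MAGIC)

# Zap only the beginning of the disk (first 1 MiB).
disk.seek(0)
disk.write(bytes(MIB))
survives_disk_zap = has_xfs_signature(disk, part_start)

# Zap the beginning of the partition as well.
disk.seek(part_start)
disk.write(bytes(MIB))
survives_partition_zap = has_xfs_signature(disk, part_start)
```

In this sketch the signature is still found after the disk-level zap and is gone only after the partition-level zap, which matches the behaviour described above when a new partition is created at the same boundaries.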

A fix is to zap the partitions before wiping the partition table.
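
The ordering can be sketched as a small helper that builds the wipe commands, partitions first and the whole disk last. This is only an illustration under stated assumptions: the helper name, the device paths, and the 10 MiB wipe size are hypothetical, and the change that actually landed in fuel-astute may be implemented differently.

```python
def build_wipe_commands(disk, partitions, mib=10):
    """Return dd argument lists that zero the first `mib` MiB of each
    partition, followed by the same for the disk itself.

    The partitions come first because their device nodes are only
    addressable while the partition table still exists.
    """
    def zero_start(device):
        return ["dd", "if=/dev/zero", "of=" + device, "bs=1M", "count=%d" % mib]

    commands = [zero_start(part) for part in partitions]
    commands.append(zero_start(disk))  # wipes the partition table last
    return commands


# Hypothetical device names, for illustration only.
plan = build_wipe_commands("/dev/sdb", ["/dev/sdb1", "/dev/sdb2"])
```

The function only builds the commands and does not run them, so the plan can be inspected before anything is executed as root on a node.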

Tags: ceph
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-astute (master)

Fix proposed to branch: master
Review: https://review.openstack.org/92737

Changed in fuel:
status: Triaged → In Progress
Vladimir Kozhukalov (kozhukalov) wrote :

It is true that the problem is that an xfs file system is already in place when we create a partition with the same boundaries. However, we create the partition during the provisioning stage, while the udev rules that mount ceph partitions automatically come with the ceph package, which is installed after reboot. It turns out the udev rules are applied right after ceph installation (Hmm... interesting).

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-astute (master)

Reviewed: https://review.openstack.org/92737
Committed: https://git.openstack.org/cgit/stackforge/fuel-astute/commit/?id=7ee0d73b9d043ee67c3e94706ef15c41d7b2ed1d
Submitter: Jenkins
Branch: master

commit 7ee0d73b9d043ee67c3e94706ef15c41d7b2ed1d
Author: Ryan Moe <email address hidden>
Date: Wed May 7 17:42:50 2014 -0700

    Erase beginning of all partitions when erasing nodes

    When re-deploying nodes with an identical disk layout previous
    filesystems can cause problems with Ceph.

    Change-Id: I3507472e805278924daff89a146dfdf7f89b4f3c
    Closes-bug: #1317296

Changed in fuel:
status: In Progress → Fix Committed
tags: added: backports-4.1.1
Changed in fuel:
milestone: 5.0 → 4.1.1
status: Fix Committed → In Progress
Ryan Moe (rmoe)
Changed in fuel:
milestone: 4.1.1 → 5.0
Ryan Moe (rmoe)
tags: removed: backports-4.1.1
Ryan Moe (rmoe) wrote :

This won't be backported to 4.1 because the code that handles erasing nodes has moved from mcagents/ to ssh_actions/ and has been changed.

Mike Scherbakov (mihgen)
Changed in fuel:
status: In Progress → Fix Committed