kolla-ansible

Ceph OSD Containers Fail when Device Names Change

Bug #1701148 reported by James McEvoy on 2017-06-29

This bug affects 3 people

Affects	Status	Importance	Assigned to	Milestone
kolla-ansible	Won't Fix	Undecided	Unassigned

Bug Description

When a system reboots, the discovery order of disks can change for various reasons. If the device name changes for a Ceph OSD disk with a co-resident journal, the Ceph OSD container will no longer start. The data partition is correctly referenced by UUID, so it remains tied to the correct disk partition. The journal on the same disk, however, is referenced by its traditional device name, such as /dev/sdj2, which is unreliable across boots. This mismatch between the data partition and its journal puts the container into a restart loop.

The fix for this issue is to reference the journal by the UUID of its partition, as found in /dev/disk/by-partuuid/, so that the journal and its data stay linked together across reboots.
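As a rough illustration of the lookup this fix relies on, the sketch below finds the /dev/disk/by-partuuid/ symlink that currently points at a given kernel device name. It is not taken from kolla-ansible; the function name find_partuuid_link and its parameters are hypothetical, and it assumes a Linux host where udev populates that directory.

```python
import os
from typing import Optional


def find_partuuid_link(device: str,
                       by_partuuid_dir: str = "/dev/disk/by-partuuid") -> Optional[str]:
    """Return the by-partuuid symlink that currently resolves to `device`.

    Hypothetical helper for illustration only. Returns None when the
    directory is missing or no symlink points at the device.
    """
    target = os.path.realpath(device)
    try:
        names = sorted(os.listdir(by_partuuid_dir))
    except FileNotFoundError:
        return None
    for name in names:
        link = os.path.join(by_partuuid_dir, name)
        # Each entry is a symlink such as <partuuid> -> ../../sdj2
        if os.path.islink(link) and os.path.realpath(link) == target:
            return link
    return None
```

Calling find_partuuid_link("/dev/sdj2") at deploy time and storing the returned path for the journal, rather than /dev/sdj2 itself, would keep the reference valid even if the disk is later discovered under a different name.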

A suggested way to reproduce this issue is to attach two or more qcow disks to a VM for use as Ceph OSDs. Label the disks for co-resident journals and data. After the kolla deploy completes, note the UUIDs of the Ceph devices using the blkid command. Shut down the VM and swap the names of the qcow files so that the data changes device names. Restart the VM and verify that the device names of the Ceph disks have changed. Then start the Ceph OSD container to see the error.
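To make the "verify that the device names have changed" step less error-prone, the blkid output saved before and after the swap can be compared by partition UUID. The following is a minimal sketch, assuming the usual one-line-per-device blkid text output with a PARTUUID="..." field; the function names parse_blkid and renamed_partitions are hypothetical.

```python
import re
from typing import Dict, Tuple

# Matches lines like: /dev/sdb1: UUID="..." TYPE="xfs" PARTUUID="..."
_PARTUUID_RE = re.compile(r'^(?P<dev>/dev/[^:]+):.*\bPARTUUID="(?P<partuuid>[^"]+)"')


def parse_blkid(output: str) -> Dict[str, str]:
    """Map PARTUUID -> device name from saved `blkid` text output."""
    mapping = {}
    for line in output.splitlines():
        match = _PARTUUID_RE.match(line.strip())
        if match:
            mapping[match.group("partuuid")] = match.group("dev")
    return mapping


def renamed_partitions(before: str, after: str) -> Dict[str, Tuple[str, str]]:
    """Return {partuuid: (old_device, new_device)} for partitions that moved."""
    old, new = parse_blkid(before), parse_blkid(after)
    return {
        partuuid: (old[partuuid], new[partuuid])
        for partuuid in old
        if partuuid in new and old[partuuid] != new[partuuid]
    }
```

Any partition listed in the result is one whose kernel device name differs between the two boots, which is the condition that triggers the journal mismatch described above.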

Michal Nasiadka (mnasiadka) on 2017-08-07

Changed in kolla-ansible:
status:	New → Confirmed
status:	Confirmed → In Progress
assignee:	nobody → Michal Nasiadka (mnasiadka)

Michal Nasiadka (mnasiadka) on 2018-05-09

Changed in kolla-ansible:
assignee:	Michal Nasiadka (mnasiadka) → nobody

Eduardo Gonzalez (egonzalez90) on 2018-09-28

Changed in kolla-ansible:
status:	In Progress → Confirmed

Mark Goddard (mgoddard) on 2020-09-03

Changed in kolla-ansible:
status:	Confirmed → Won't Fix
