Error when bootstrapping cache OSD on NVMe drive.

Bug #1847014 reported by Eddie Yen
This bug affects 1 person
Affects   Status         Importance   Assigned to   Milestone
kolla     Fix Released   Medium       Eddie Yen
Rocky     Fix Released   Medium       Unassigned
Stein     Fix Released   Medium       Unassigned
Train     Fix Released   Medium       Eddie Yen

Bug Description

When I tried to deploy Ceph with a cache tier, I got an error while bootstrapping the cache OSDs on an NVMe drive.

The error output is in the attachment.

I think the root cause is the partition naming. A normal drive uses "1, 2, ..." as the partition suffix, like "sda1, sda2, ...".
But an NVMe drive uses "p1, p2, ..." as the partition suffix, like "nvme0n1p1, nvme0n1p2".
Kolla-ansible just "forgets" to add the "p" to the device path. It produced "nvme0n11, nvme0n12" when generating the command, which then failed because those paths do not exist.
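
To illustrate the mismatch described above, here is a minimal shell sketch. The variable names are made up for this example and are not taken from kolla-ansible; the device does not need to exist, since only the strings are built.

```shell
#!/usr/bin/env bash
# Hypothetical variable names, used only to illustrate the path mismatch.
dev=/dev/nvme0n1
partnum=1

naive_path="${dev}${partnum}"      # plain concatenation, as in the failing command
correct_path="${dev}p${partnum}"   # NVMe partitions carry a "p" separator

echo "$naive_path"     # /dev/nvme0n11
echo "$correct_path"   # /dev/nvme0n1p1
```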

But I don't know how to fix it, because I don't know where the value is generated.
Please help.

OS: Ubuntu 18.04
Kolla Release: stable-rocky

Eddie Yen (aksn74) wrote :
description: updated
tags: added: ceph stable-rocky
Mark Goddard (mgoddard) wrote :

I have seen bugs like this before. It's not just NVMe, it happens with loopbacks too. The rule is "if the device ends in a number, add a 'p' before the partition number".

Changed in kolla-ansible:
status: New → Triaged
importance: Undecided → Medium
Mark Goddard (mgoddard) wrote :

Here is a fix for the same issue with loopback devices: https://review.opendev.org/#/c/668222/1/docker/ceph/ceph-osd/extend_start.sh

I think we need to make a new function in that file to get the partition device name. Can you do it?

Eddie Yen (aksn74) wrote :

Hmm, I'm not very good at coding, but I think we could add "p" in front of PARTNUM if the last character of DEV is a number. Is that what you want?
If so, perhaps we can create a function for this, then add it to the "if [[ "${OSD_BS_DEV}" =~ "/dev/loop" ]];" check with an OR. Like:

if [[ "${OSD_BS_DEV}" =~ "/dev/loop" || ${OSD_BS_NVME_DEV} = "True" ]];

It's a bit of a challenge for me, but I may try it out.

Mark Goddard (mgoddard) wrote :

Here's a starting point: a function that generates a partition device name:

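function part_name {
    # Print the partition device path for disk $1 and partition number $2.
    # Disk names that end in a digit (NVMe, loop) take a "p" separator.
    if [[ "$1" =~ [0-9]$ ]]; then
        echo "${1}p${2}"
    else
        echo "${1}${2}"
    fi
}

part_name /dev/sda 1      # /dev/sda1
part_name /dev/loop1 2    # /dev/loop1p2
part_name /dev/nvme1 1    # /dev/nvme1p1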
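
As a quick sanity check of the "ends in a digit" rule against a few more device families, something like the following could be used. This is only a sketch: it restates the helper so it runs on its own, the device names are illustrative and need not exist on the host, and it is not the code that was merged into extend_start.sh.

```shell
#!/usr/bin/env bash
# Standalone restatement of the rule: insert "p" when the disk name ends in a digit.
part_name() {
    local dev="$1" num="$2"
    if [[ "$dev" =~ [0-9]$ ]]; then
        printf '%s\n' "${dev}p${num}"
    else
        printf '%s\n' "${dev}${num}"
    fi
}

# Illustrative device names only; nothing here touches real devices.
for dev in /dev/sda /dev/vdb /dev/nvme0n1 /dev/loop0 /dev/mmcblk0; do
    printf '%s -> %s\n' "$dev" "$(part_name "$dev" 1)"
done
```

Because the check only looks at the last character of the disk name, it covers NVMe, loop and MMC devices without listing each family explicitly.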

Eddie Yen (aksn74) wrote :

Thanks for the hint! I'll try it out. It may take a little longer since I'm very busy these days.

Mark Goddard (mgoddard)
no longer affects: kolla-ansible/stein
no longer affects: kolla-ansible/rocky
Changed in kolla:
importance: Undecided → Medium
no longer affects: kolla-ansible/train
no longer affects: kolla-ansible
OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla (stable/stein)

Reviewed: https://review.opendev.org/688926
Committed: https://git.openstack.org/cgit/openstack/kolla/commit/?id=6bc6469bc6750ccc388971d5b2f7e3fe98aba8f9
Submitter: Zuul
Branch: stable/stein

commit 6bc6469bc6750ccc388971d5b2f7e3fe98aba8f9
Author: Eddie Yen <email address hidden>
Date: Mon Oct 14 05:24:46 2019 +0000

    Add disk dev name check function

    This patch will add new function in extend_start.sh for OSD
    creation. Not only support loop device but also others that
    disk dev layout is end with numbers.

    Change-Id: Iee5f8b8581d70166de6eba1bdc9e42766fe8cb48
    Closes-Bug: #1847014
    (cherry picked from commit 1d5f753fb13bcc3659b4abd1bb768de8550a6dc4)

OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla (stable/rocky)

Reviewed: https://review.opendev.org/688918
Committed: https://git.openstack.org/cgit/openstack/kolla/commit/?id=5da1c35cc3d113ef702e8e2515c8178c413a4af2
Submitter: Zuul
Branch: stable/rocky

commit 5da1c35cc3d113ef702e8e2515c8178c413a4af2
Author: Eddie Yen <email address hidden>
Date: Mon Oct 14 05:24:46 2019 +0000

    Add disk dev name check function

    This patch will add new function in extend_start.sh for OSD
    creation. Not only support loop device but also others that
    disk dev layout is end with numbers.

    Change-Id: Iee5f8b8581d70166de6eba1bdc9e42766fe8cb48
    Closes-Bug: #1847014

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla 9.0.0.0rc1

This issue was fixed in the openstack/kolla 9.0.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla 7.1.0

This issue was fixed in the openstack/kolla 7.1.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla 8.0.2

This issue was fixed in the openstack/kolla 8.0.2 release.

Mark Goddard (mgoddard)
Changed in kolla:
status: Fix Committed → Fix Released
milestone: 9.0.0 → none