Installing multipathd with Ceph as the storage back-end causes a ceph_osd error

Bug #1741489 reported by mhhli
This bug affects 4 people
Affects Status Importance Assigned to Milestone
kolla-ansible
Invalid
Undecided
Unassigned

Bug Description

I am also trying to install multipathd with Ceph as the storage back-end.
The bootstrap_osd container then exits, because "mkfs.xfs /dev/sdb1" fails with "mkfs cannot open <device>: Device or resource busy".

I do not know in which order the containers are started. I see that the multipathd container is running, and then the bootstrap_osd container cannot run.

The following link is what I found about this error on the internet.

https://www.linuxquestions.org/questions/fedora-35/mkfs-cannot-open-device-device-or-resource-busy-4175484362/

Revision history for this message
Filippo Gaudenzi (gaudenzi-filippo) wrote :

We are facing the same issue.
We have two nodes with multipath enabled with related labels KOLLA_CEPH_OSD_BOOTSTRAP and KOLLA_CEPH_OSD_CACHE_BOOTSTRAP.

From docker logs of the centos-binary-ceph-osd:5.0.1 we get:

INFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json
INFO:__main__:Validating config file
INFO:__main__:Kolla config strategy set to: COPY_ALWAYS
INFO:__main__:Copying service configuration files
INFO:__main__:Copying /var/lib/kolla/config_files/ceph.conf to /etc/ceph/ceph.conf
INFO:__main__:Setting permission for /etc/ceph/ceph.conf
INFO:__main__:Copying /var/lib/kolla/config_files/ceph.client.admin.keyring to /etc/ceph/ceph.client.admin.keyring
INFO:__main__:Setting permission for /etc/ceph/ceph.client.admin.keyring
INFO:__main__:Writing out command to execute
{"election_epoch":4,"quorum":[0,1],"quorum_names":["172.25.41.248","172.25.41.249"],"quorum_leader_name":"172.25.41.248","monmap":{"epoch":1,"fsid":"e2b1db1c-a0f9-41f0-b826-d5250a0c98d5","modified":"2018-02-27 18:03:43.946171","created":"2018-02-27 18:03:43.946171","mons":[{"rank":0,"name":"172.25.41.248","addr":"172.25.41.248:6789\/0"},{"rank":1,"name":"172.25.41.249","addr":"172.25.41.249:6789\/0"}]}}
GPT data structures destroyed! You may now partition the disk using fdisk or
other utilities.
Creating new GPT entries.
The operation has completed successfully.
The operation has completed successfully.
device-mapper: remove ioctl on mpath-sperca1 failed: Device or resource busy
Warning: parted was unable to re-read the partition table on /dev/mapper/mpath-sperca (Device or resource busy). This means Linux won't know anything about the modifications you made.
device-mapper: create ioctl on mpath-sperca1 part1-mpath-36d09466006fbc5002184ce7c1fb8d583 failed: Device or resource busy
device-mapper: remove ioctl on mpath-sperca1 failed: Device or resource busy
device-mapper: remove ioctl on mpath-spercc1 failed: Device or resource busy
Warning: parted was unable to re-read the partition table on /dev/mapper/mpath-spercc (Device or resource busy). This means Linux won't know anything about the modifications you made.
device-mapper: create ioctl on mpath-spercc1 part1-mpath-36d09466006fbc50021ef5dde1a13d1f4 failed: Device or resource busy
device-mapper: remove ioctl on mpath-spercc1 failed: Device or resource busy
mkfs.xfs: cannot open /dev/sdb1: Device or resource busy

Please take into account that CACHE is enabled; that is why there are both mpath-spercc and mpath-sperca.

Thanks.

Revision history for this message
Filippo Gaudenzi (gaudenzi-filippo) wrote :

Looking at the find_disks.py (https://github.com/openstack/kolla/blob/master/docker/kolla-toolbox/find_disks.py) script, I saw that kolla looks for candidate devices under the block subsystem using pyudev. Mapper devices are not listed under this subsystem, so they are not discovered. Putting the labels directly on the underlying storage devices, such as /dev/sdb or /dev/sdc, results in duplicate storage devices from the find_disks function.
I'm still looking for a way to list multipath disks using pyudev.
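A minimal sketch of one way this could work: device-mapper maps do appear in the block subsystem as /dev/dm-N nodes, and udev reports a DM_UUID property that begins with "mpath-" for multipath maps, so filtering on that property can pick them out. The helper name `is_multipath` is hypothetical, and the pyudev enumeration is left commented out because it needs real hardware; this is an illustration, not the actual find_disks.py logic.

```python
def is_multipath(props):
    # udev sets DM_UUID to "mpath-<wwid>" for multipath device-mapper
    # maps (e.g. "mpath-36d09466006fbc5002184ce7c1fb8d583"); other
    # block devices either lack DM_UUID or use a different prefix.
    return props.get("DM_UUID", "").startswith("mpath-")

# With pyudev, dm-N nodes are enumerated in the 'block' subsystem,
# so they could be filtered like this (needs pyudev and real devices):
#
# import pyudev
# ctx = pyudev.Context()
# for dev in ctx.list_devices(subsystem='block'):
#     if is_multipath(dev.properties):
#         print(dev.device_node, dev.properties.get('DM_NAME'))

# Example property dicts as udev might report them:
mpath_props = {"DM_UUID": "mpath-36d09466006fbc5002184ce7c1fb8d583",
               "DM_NAME": "mpath-sperca"}
plain_props = {"ID_SERIAL": "36d09466006fbc5002184ce7c1fb8d583"}
print(is_multipath(mpath_props))  # True
print(is_multipath(plain_props))  # False
```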

affects: kolla → kolla-ansible
Changed in kolla-ansible:
importance: Undecided → High
status: New → Confirmed
Revision history for this message
Jonatan Maggesi (jmaggesi) wrote :

I have the same issue on a Dell VRTX with a PERC8 controller.

Version of kolla-ansible: 5.0.1
Version of OpenStack images: Pike
Operating System: CentOS 7

If I put the labels on /dev/sdb or /dev/sdd, the deploy finds the disks but I get the same errors as Filippo. I tried putting the labels directly on the /dev/mapper/mpath-sparca device, but then the deploy doesn't find the disks.

Probably the find_disks.py script should look at the /dev/dm-n devices, or change the way it searches with pyudev.

Revision history for this message
Michal Nasiadka (mnasiadka) wrote :

Is this still an issue for recent k-a versions (and OpenStack releases)?

Revision history for this message
wangwei (wangwei-david) wrote :

These two commits add support for iSCSI disks and multipath disks.

kolla: https://review.opendev.org/#/c/599961/
kolla-ansible: https://review.opendev.org/#/c/599962/

If there is any problem in use, please let me know.

Revision history for this message
hotkhwan (hotkhwan) wrote :

hi wangwei
I tried https://review.opendev.org/#/c/599962/ (in two ways).
The deployment completes normally, but the Ceph OSDs are 0 up and OpenStack shows a disk capacity of 0.

Revision history for this message
wangwei (wangwei-david) wrote :

hi hotkhwan:

kolla: https://review.opendev.org/#/c/599961/
kolla-ansible: https://review.opendev.org/#/c/599962/

The above two patches are used together to support multipath disks: you need to build the ceph and kolla-toolbox images with the kolla patch, and then deploy with the kolla-ansible patch.

Revision history for this message
hotkhwan (hotkhwan) wrote :

hi wangwei
I followed the two patches, but the "Bootstrapping Ceph OSDs" task fails.
kolla version: 8.1.0.dev76
kolla-ansible version: 8.1.0.dev130

TASK [ceph : pause] *****************************************************************************************************************************************
 [WARNING]: conditional statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: not {{ item.external_journal | bool if
item.external_journal is defined else item.external_block | bool }}

skipping: [ops-st1] => (item={u'fs_uuid': u'bf49f96d-4b9d-47d2-af95-b4ad65d139d9', u'bs_blk_partition_type': u'mpath', u'partition_uuid': u'0f372de0-491c-4c68-810e-c8cbcc254b1d', u'partition': u'/dev/mapper/mpatha1', u'external_block': True, u'fs_label': u'', u'bs_blk_partition': u'/dev/mapper/mpatha2', u'bs_blk_device': u'/dev/mapper/mpatha', u'partition_num': u'1', u'bs_blk_partition_num': u'2', u'device': u'/dev/mapper/mpatha', u'bs_blk_partition_uuid': u'c81785cf-6c2d-4b33-9645-1df5a764700d', u'partition_type': u'mpath'})

TASK [ceph : Bootstrapping Ceph OSDs] ***********************************************************************************************************************
failed: [ops-st2] (item=[0, {u'fs_uuid': u'fe1ecb2b-88b4-484d-ae39-80187edd3881', u'bs_blk_partition_type': u'mpath', u'partition_uuid': u'924cbe89-19f1-4c2f-89f8-c5912a35c946', u'partition': u'/dev/mapper/mpatha1', u'external_block': True, u'fs_label': u'', u'bs_blk_partition': u'/dev/mapper/mpatha2', u'bs_blk_device': u'/dev/mapper/mpatha', u'partition_num': u'1', u'bs_blk_partition_num': u'2', u'device': u'/dev/mapper/mpatha', u'bs_blk_partition_uuid': u'25444a43-4b61-4092-9802-8cf26378bc69', u'partition_type': u'mpath'}]) => {"ansible_loop_var": "item", "changed": true, "item": [0, {"bs_blk_device": "/dev/mapper/mpatha", "bs_blk_partition": "/dev/mapper/mpatha2", "bs_blk_partition_num": "2", "bs_blk_partition_type": "mpath", "bs_blk_partition_uuid": "25444a43-4b61-4092-9802-8cf26378bc69", "device": "/dev/mapper/mpatha", "external_block": true, "fs_label": "", "fs_uuid": "fe1ecb2b-88b4-484d-ae39-80187edd3881", "partition": "/dev/mapper/mpatha1", "partition_num": "1", "partition_type": "mpath", "partition_uuid": "924cbe89-19f1-4c2f-89f8-c5912a35c946"}], "msg": "Container exited with non-zero return code 1", "rc": 1, "stderr": "+ sudo -E kolla_set_configs\nINFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json\nINFO:__main__:Validating config file\nINFO:__main__:Kolla config strategy set to: COPY_ALWAYS\nINFO:__main__:Copying service configuration files\nINFO:__main__:Copying /var/lib/kolla/config_files/ceph.conf to /etc/ceph/ceph.conf\nINFO:__main__:Setting permission for /etc/ceph/ceph.conf\nINFO:__main__:Copying /var/lib/kolla/config_files/ceph.client.admin.keyring to /etc/ceph/ceph.client.admin.keyring\nINFO:__main__:Setting permission for /etc/ceph/ceph.client.admin.keyring\nINFO:__main__:Writing out command to execute\n++ cat /run_command\n+ CMD='/usr/bin/ceph-osd -f --public-addr 192.168.244.42 --cluster-addr 192.168.245.42'\n+ ARGS=\n+ [[ ! -n '' ]]\n+ . kolla_extend_start\n++ [[ ! -d /var/log/kolla/ceph ]]\n++ mkdir -p /var/log/ko...
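As an aside, the [WARNING] about jinja2 templating delimiters in the pause task is Ansible objecting to the style of the `when:` condition, not to the multipath failure itself: conditions are already evaluated as Jinja2 expressions, so they are written without `{{ }}`. A hypothetical cleanup sketch (not the actual kolla-ansible task) would look like:

```yaml
# Instead of:
#   when: not {{ item.external_journal | bool if item.external_journal is defined
#                else item.external_block | bool }}
# the condition is written as a bare expression, without {{ }} delimiters:
when: not (item.external_journal | bool if item.external_journal is defined
           else item.external_block | bool)
```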

Revision history for this message
Dincer Celik (dincercelik) wrote :

I'm having the same issue with stable/stein. I'll try the patches, but kolla's patch is failing to apply.

patching file docker/ceph/ceph-osd/extend_start.sh
Hunk #1 FAILED at 7.
Hunk #2 FAILED at 137.
Hunk #3 FAILED at 144.
Hunk #4 FAILED at 161.
4 out of 4 hunks FAILED -- saving rejects to file docker/ceph/ceph-osd/extend_start.sh.rej
patching file docker/kolla-toolbox/find_disks.py

Revision history for this message
Dincer Celik (dincercelik) wrote :

Ignore previous output and see below for the right one.

patching file docker/ceph/ceph-osd/extend_start.sh
Hunk #1 FAILED at 7.
Hunk #2 succeeded at 154 with fuzz 1 (offset 17 lines).
Hunk #3 succeeded at 163 (offset 17 lines).
Hunk #4 succeeded at 182 (offset 17 lines).
1 out of 4 hunks FAILED -- saving rejects to file docker/ceph/ceph-osd/extend_start.sh.rej
patching file docker/kolla-toolbox/find_disks.py

Changed in kolla-ansible:
assignee: nobody → wangwei (wangwei-david)
status: Confirmed → In Progress
Revision history for this message
wangwei (wangwei-david) wrote :

hi hotkhwan

I am sorry to see your message so late; this message has no email reminder, so if you have any problems, please send them to my email (<email address hidden>).

Your problem should be caused by a version inconsistency. The patch I submitted was for the Ceph Luminous version, but the community recently upgraded Ceph to Nautilus, so '--no-mon-config' needs to be added when using the ceph-osd command.

I have rebased the code, and it can now be used for deployment.

kolla: https://review.opendev.org/#/c/599961/
kolla-ansible: https://review.opendev.org/#/c/599962/

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on kolla-ansible (master)

Change abandoned by "Radosław Piliszek <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/kolla-ansible/+/599962
Reason: kolla ansible no longer deploys ceph

Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

kolla ansible no longer deploys ceph

Changed in kolla-ansible:
status: In Progress → Invalid
importance: High → Undecided
assignee: wangwei (wangwei-david) → nobody