Comment 3 for bug 1828617

Revision history for this message
Xav Paice (xavpaice) wrote :

I'm seeing this in a slightly different manner, on Bionic/Queens.

We have LVMs encrypted (thanks Vault), and rebooting a host results in at least one OSD not returning fairly consistently. The LVs appear in the list, however the difference between a working and a non-working OSD is the lack of links to block.db and block.wal on a non-working OSD.

See https://pastebin.canonical.com/p/rW3VgMMkmY/ for some info.

If I made the links manually:

cd /var/lib/ceph/osd/ceph-4
ln -s /dev/ceph-wal-4de27554-2d05-440e-874a-9921dfc6f47e/osd-db-7478edfc-f321-40a2-a105-8e8a2c8ca3f6 block.db
ln -s /dev/ceph-wal-4de27554-2d05-440e-874a-9921dfc6f47e/osd-wal-7478edfc-f321-40a2-a105-8e8a2c8ca3f6 block.wal

This resulted in a perms error accessing the device "bluestore(/var/lib/ceph/osd/ceph-4) _open_db /var/lib/ceph/osd/ceph-4/block.db symlink exists but target unusable: (13) Permission denied"

ls -l /dev/ceph-wal-4de27554-2d05-440e-874a-9921dfc6f47e/
total 0
lrwxrwxrwx 1 ceph ceph 8 May 22 23:04 osd-db-053e000a-76ed-427e-98b3-e5373e263f2d -> ../dm-20
lrwxrwxrwx 1 ceph ceph 8 May 22 23:04 osd-db-12e68fcb-d2b6-459f-97f2-d3eb4e28c75e -> ../dm-24
lrwxrwxrwx 1 ceph ceph 8 May 22 23:04 osd-db-33de740d-bd8c-4b47-a601-3e6e634e489a -> ../dm-14
lrwxrwxrwx 1 root root 8 May 22 23:04 osd-db-7478edfc-f321-40a2-a105-8e8a2c8ca3f6 -> ../dm-12
lrwxrwxrwx 1 root root 8 May 22 23:04 osd-db-c2669da2-63aa-42e2-b049-cf00a478e076 -> ../dm-22
lrwxrwxrwx 1 root root 8 May 22 23:04 osd-db-d38a7e91-cf06-4607-abbe-53eac89ac5ea -> ../dm-18
lrwxrwxrwx 1 ceph ceph 8 May 22 23:04 osd-db-eb5270dc-1110-420f-947e-aab7fae299c9 -> ../dm-16
lrwxrwxrwx 1 ceph ceph 8 May 22 23:04 osd-wal-053e000a-76ed-427e-98b3-e5373e263f2d -> ../dm-19
lrwxrwxrwx 1 ceph ceph 8 May 22 23:04 osd-wal-12e68fcb-d2b6-459f-97f2-d3eb4e28c75e -> ../dm-23
lrwxrwxrwx 1 ceph ceph 8 May 22 23:04 osd-wal-33de740d-bd8c-4b47-a601-3e6e634e489a -> ../dm-13
lrwxrwxrwx 1 root root 8 May 22 23:04 osd-wal-7478edfc-f321-40a2-a105-8e8a2c8ca3f6 -> ../dm-11
lrwxrwxrwx 1 root root 8 May 22 23:04 osd-wal-c2669da2-63aa-42e2-b049-cf00a478e076 -> ../dm-21
lrwxrwxrwx 1 root root 8 May 22 23:04 osd-wal-d38a7e91-cf06-4607-abbe-53eac89ac5ea -> ../dm-17
lrwxrwxrwx 1 ceph ceph 8 May 22 23:04 osd-wal-eb5270dc-1110-420f-947e-aab7fae299c9 -> ../dm-15

I tried to change the perms to ceph.ceph ownership, but no change.

I have also tried (using `systemctl edit lvm2-monitor.service`) adding the following to lvm2, but that's not changed the behavior either:

# cat /etc/systemd/system/lvm2-monitor.service.d/override.conf
[Service]
ExecStartPre=/bin/sleep 60