Misconfiguration of bluestore wal/db block size options leads to non-related error message

Bug #1803767 reported by Vladimir Grevtsev
This bug affects 2 people
Affects: Ceph OSD Charm
Status: Triaged
Importance: Low
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Problem statement:

The charm expects "bluestore-block-db-size" and "bluestore-block-wal-size" to be configured per OSD device, not per unit. This is not clear from the config option descriptions; review [1] fixes that.

If these config parameters are oversized, units switch to "blocked" status with an unrelated "Non-pristine devices detected" error message, even though the actual failure is that the charm could not execute the "lvcreate" CLI command when trying to create the WAL/DB partitions. In my case I specified 480GB for DB and 20GB for WAL, assuming a single 500GB partition would hold the WAL/DB data for all OSDs, but the charm applied these values per OSD.

In this particular case /dev/disk/by-dname/nvme0n1-part2 has 500G in total, but the charm config mistakenly contained per-unit sizing rather than per-OSD sizing, which triggered this error. After cleaning up and dividing the WAL/DB sizes by the OSD count, everything works fine.
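The division described above can be sketched as follows. This is only an illustration of the arithmetic, not charm code: the helper name per_osd_sizes, the OSD count of 4, and the use of GiB as the unit are assumptions, since the report does not state how many OSD devices each unit had.

```python
GIB = 1024 ** 3  # assumption: sizes are handled in bytes, 1 GiB = 2**30 bytes


def per_osd_sizes(db_total, wal_total, osd_count):
    """Split a unit-wide DB/WAL budget evenly across OSD devices.

    Hypothetical helper: the charm options take per-OSD values, so a
    budget planned for the whole unit must be divided before it is set.
    """
    if osd_count < 1:
        raise ValueError("osd_count must be at least 1")
    return db_total // osd_count, wal_total // osd_count


# 480 GiB DB + 20 GiB WAL planned for the whole unit, 4 OSDs (assumed count)
db, wal = per_osd_sizes(480 * GIB, 20 * GIB, osd_count=4)
print(db // GIB, wal // GIB)  # 120 5
```

With these assumed numbers, the per-OSD values to configure would be 120 GiB for DB and 5 GiB for WAL instead of 480 GiB and 20 GiB.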

Expected output:

An error message like "Insufficient space on device" or similar. "Non-pristine devices detected" appeared here because the first OSD device was initialized successfully while the others were not; as far as I can tell, it is not related to the actual failure.
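A minimal sketch of the kind of pre-flight check that could produce such a message is shown below. It is not taken from the charm: the function name check_wal_db_fit and the message wording are hypothetical, and the check deliberately ignores LVM metadata overhead, so a real implementation would need some headroom.

```python
GIB = 1024 ** 3  # assumption: sizes are handled in bytes


def check_wal_db_fit(device_bytes, db_bytes, wal_bytes, osd_count):
    """Return None if the WAL/DB volumes fit, otherwise an error message.

    Hypothetical check: db_bytes and wal_bytes are per-OSD values, so the
    shared device must hold osd_count copies of each. LVM metadata
    overhead is not accounted for here.
    """
    required = (db_bytes + wal_bytes) * osd_count
    if required > device_bytes:
        return (
            "Insufficient space on device: {} OSDs need {} GiB for WAL/DB, "
            "but only {} GiB are available".format(
                osd_count, required // GIB, device_bytes // GIB
            )
        )
    return None


# Values from the report, with an assumed count of 4 OSDs per unit
print(check_wal_db_fit(500 * GIB, 480 * GIB, 20 * GIB, osd_count=4))
```

Running such a check before any "lvcreate" call would let the unit report the sizing problem directly, rather than surfacing it later as a non-pristine device.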

Additional information:

juju status: https://pastebin.canonical.com/p/M8YcGgcpcZ/
bundle: https://pastebin.canonical.com/p/4TRwF8hcQJ/
unit logs: https://pastebin.canonical.com/p/rhtK5SMXkC/

[1] https://review.openstack.org/#/c/618527/

James Page (james-page)
Changed in charm-ceph-osd:
status: New → Triaged
importance: Undecided → Low
Ryan Beisner (1chb1n) wrote :

The added clarity in the charm config documentation is quite helpful and appreciated (merged 8 months ago).

I also agree that we can and should improve the UX in the scenarios highlighted above.

This would be a good bug task for engineering onboarding and/or engineering rotations.
