ceph-osd-replication-count should be set based on the number of nodes instead of OSDs

Bug #2065698 reported by Nobuto Murata
This bug affects 1 person
Affects: OpenStack Snap
Status: Fix Committed
Importance: High
Assigned to: Hemanth Nakkina

Bug Description

$ snap list openstack
Name       Version  Rev  Tracking     Publisher   Notes
openstack  2024.1   503  2024.1/edge  canonical✓  -

It looks like the current logic will set ceph-osd-replication-count to the number of OSDs discovered.

https://github.com/canonical/snap-openstack/blob/599e01aa263729d8f411241531bc424934b9ce05/sunbeam-python/sunbeam/commands/openstack.py#L139-L153

However, in the case below, during the bootstrap phase, I think replica=1 should be used, since Ceph cannot place multiple replicas on the same host and this tree has only one host.

$ sudo ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME           STATUS  REWEIGHT  PRI-AFF
-1         0.03119  root default
-2         0.03119      host sunbeam-1
 1         0.01559          osd.1           up   1.00000  1.00000
 2         0.01559          osd.2           up   1.00000  1.00000

$ juju run microceph/leader list-disks
Running operation 13 with 1 task
  - task 14 on unit-microceph-0

Waiting for task 14...
osds: '[{''osd'': 1, ''path'': ''/dev/disk/by-path/virtio-pci-0000:06:00.0'', ''location'':
  ''sunbeam-1''}, {''osd'': 2, ''path'': ''/dev/disk/by-path/virtio-pci-0000:07:00.0'',
  ''location'': ''sunbeam-1''}]'
unpartitioned-disks: '[]'

So in this case the replica count should be 1, based on the number of unique "location" values (only sunbeam-1), instead of 2, based on the number of OSDs.
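The difference between the two counts can be illustrated with a short Python sketch. It is not the snap-openstack code: the function name `unique_locations` is hypothetical, and it assumes the `osds` value from list-disks, once the YAML quote escaping (`''` for `'`) is undone, can be read as a Python literal.

```python
import ast

# The "osds" value from the list-disks output above, with the YAML
# quote escaping ('' -> ') undone. Reading it as a Python literal is
# an assumption based on how the output looks.
RAW_OSDS = (
    "[{'osd': 1, 'path': '/dev/disk/by-path/virtio-pci-0000:06:00.0', "
    "'location': 'sunbeam-1'}, "
    "{'osd': 2, 'path': '/dev/disk/by-path/virtio-pci-0000:07:00.0', "
    "'location': 'sunbeam-1'}]"
)


def unique_locations(raw_osds):
    """Return the set of distinct 'location' values in the OSD list."""
    osds = ast.literal_eval(raw_osds)
    return {osd["location"] for osd in osds}


osd_count = len(ast.literal_eval(RAW_OSDS))    # what the current logic counts
host_count = len(unique_locations(RAW_OSDS))   # what this report suggests counting
print(osd_count, host_count)  # prints "2 1"
```

With the data above, counting OSDs gives 2 while counting unique locations gives 1, which is the value a single-host deployment can actually satisfy.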

Nobuto Murata (nobuto) wrote :

I mean the logic can be:

ceph-osd-replication-count = the number of unique hosts when the number of hosts is < 3
or ceph-osd-replication-count = 3 when the number of hosts is >= 3

or something like that.
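As a minimal sketch of the rule proposed here (not the actual fix): the names `replication_count` and `MAX_REPLICAS` are hypothetical, and clamping the result to at least 1 when no hosts are known yet is an added assumption, not something stated in this report.

```python
MAX_REPLICAS = 3  # upper bound proposed in this comment


def replication_count(num_hosts):
    """Return min(num_hosts, 3), clamped to at least 1.

    The lower clamp is an assumption so that an empty host list
    never yields a replication count of 0.
    """
    return max(1, min(num_hosts, MAX_REPLICAS))


for hosts in (1, 2, 3, 5):
    print(hosts, replication_count(hosts))
```

For 1, 2, 3 and 5 hosts this yields 1, 2, 3 and 3 respectively, so the single-host case from the description ends up with one replica.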

Nobuto Murata (nobuto)
description: updated
Billy Olsen (billy-olsen) wrote :

Yes, I think it would be ideal if the microceph piece simply handled the replication factor based on the number of hosts that are available by default (e.g. RF = min(num_hosts, 3)).

This is something that the sunbeam configuration can handle; however, I really think we should push it down into the microceph piece, as that would solve it more broadly for other users of microceph.

Nobuto Murata (nobuto) wrote :

One correction to the original description: the logic was already min(osds, 3), so the feedback here is a slight change to min(hosts, 3).

https://github.com/canonical/snap-openstack/blob/599e01aa263729d8f411241531bc424934b9ce05/sunbeam-python/sunbeam/commands/openstack.py#L95-L96

James Page (james-page)
Changed in snap-openstack:
status: New → Triaged
status: Triaged → In Progress
importance: Undecided → High
assignee: nobody → Hemanth Nakkina (hemanth-n)
Hemanth Nakkina (hemanth-n) wrote :
Changed in snap-openstack:
status: In Progress → Fix Committed