ceph: default crush rule does not suit multi-OSD deployments
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| Ubuntu Cloud Archive | Fix Released | Medium | Unassigned | |
| ceph (Ubuntu) | Fix Released | Medium | Unassigned | |
| Quantal | Won't Fix | Undecided | Unassigned | |
| Raring | Fix Released | Undecided | Unassigned | |
Bug Description
Version: 0.48.2-
Our Ceph deployments typically involve multiple OSDs per host with no disk redundancy. However, the default CRUSH rules appear to distribute replicas by OSD, not by host, which I believe will not prevent replicas from landing on the same host.
I've been working around this by updating the crush rules as follows and installing the resulting crushmap in the cluster, but since we aim for fully automated deployment (using Juju and MaaS) this is suboptimal.
--- crushmap.txt 2013-01-10 20:33:21.265809301 +0000
+++ crushmap.new 2013-01-10 20:32:49.496745778 +0000
@@ -104,7 +104,7 @@
min_size 1
max_size 10
step take default
- step choose firstn 0 type osd
+ step chooseleaf firstn 0 type host
step emit
}
rule metadata {
@@ -113,7 +113,7 @@
min_size 1
max_size 10
step take default
- step choose firstn 0 type osd
+ step chooseleaf firstn 0 type host
step emit
}
rule rbd {
@@ -122,7 +122,7 @@
min_size 1
max_size 10
step take default
- step choose firstn 0 type osd
+ step chooseleaf firstn 0 type host
step emit
}
Changed in ceph (Ubuntu): status: New → Confirmed; importance: Undecided → Medium
Changed in cloud-archive: status: New → Confirmed; importance: Undecided → Medium
Changed in ceph (Ubuntu Quantal): status: New → Won't Fix
Changed in ceph (Ubuntu Raring): status: New → Fix Released
Changed in cloud-archive: status: Confirmed → Fix Released
Changed in ceph (Ubuntu): status: Confirmed → Fix Released
This has been fixed on upstream's master branch by commit c236a51a8040508ee893e4c64b206e40f9459a62 and cherry-picked to the bobtail branch as 6008b1d8e4587d5a3aea60684b1d871401496942. The change does not seem to have been applied to argonaut.