some parts replicas assigned to duplicate devices in the ring
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Object Storage (swift) | Fix Released | High | Samuel Merritt |
Bug Description
With rings whose replica count is constrained by the device count, when the weights of the available devices are not evenly distributed, the RingBuilder will sometimes place more than one replica of a partition on the same device.
This is more likely with EC rings, where the replica count often approaches the device count.
You can work around the issue by setting overload very high before the initial balance. After a rebalance it can take some time for overload to pull the duplicated replicas off the hungry devices.
Here's an example ring builder output that demonstrates the issue:
ec-test.builder, build version 15
1024 partitions, 14.000000 replicas, 1 regions, 1 zones, 15 devices, 0.10 balance, 29.98 dispersion
The minimum number of hours before a partition can be reassigned is 0
The overload factor is 0.00% (0.000000)
Devices: id region zone ip address port replication ip replication port name weight partitions balance meta
0 1 1 127.0.0.1 6000 127.0.0.1 6000 d0 2.00 1146 -0.08
1 1 1 127.0.0.1 6001 127.0.0.1 6001 d1 2.00 1147 0.01
2 1 1 127.0.0.1 6002 127.0.0.1 6002 d2 2.00 1147 0.01
3 1 1 127.0.0.1 6003 127.0.0.1 6003 d3 2.00 1146 -0.08
4 1 1 127.0.0.1 6004 127.0.0.1 6004 d4 2.00 1146 -0.08
5 1 1 127.0.0.1 6005 127.0.0.1 6005 d5 2.00 1146 -0.08
6 1 1 127.0.0.1 6006 127.0.0.1 6006 d6 2.00 1147 0.01
7 1 1 127.0.0.1 6007 127.0.0.1 6007 d7 2.00 1147 0.01
8 1 1 127.0.0.1 6008 127.0.0.1 6008 d8 2.00 1147 0.01
9 1 1 127.0.0.1 6009 127.0.0.1 6009 d9 2.00 1147 0.01
10 1 1 127.0.0.1 6010 127.0.0.1 6010 d10 1.00 574 0.10
11 1 1 127.0.0.1 6011 127.0.0.1 6011 d11 1.00 574 0.10
12 1 1 127.0.0.1 6012 127.0.0.1 6012 d12 1.00 574 0.10
13 1 1 127.0.0.1 6013 127.0.0.1 6013 d13 1.00 574 0.10
14 1 1 127.0.0.1 6014 127.0.0.1 6014 d14 1.00 574 0.10
15 devices and 14 replicas, but some of the devices have half the weight. Towards the end of partition placement, the hungry devices get assigned multiple replicas of the same part. Note that the partition counts still add up to replicas * parts, because a device that holds multiple replicas of the same part happily increments its part count for each one.
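The numbers in the builder output above can be checked by hand. The sketch below is only arithmetic on the reported weights and partition counts, not RingBuilder code; the helper name `weighted_share` is made up for illustration. It shows that each weight-2.00 device is asked to hold more replica assignments than there are partitions, so duplicates on those devices are unavoidable at overload 0.

```python
from fractions import Fraction

PARTS = 1024
REPLICAS = 14
# Weights copied from the builder output: ten devices at 2.00, five at 1.00.
WEIGHTS = [2] * 10 + [1] * 5
# Partition counts copied from the "partitions" column, in device id order.
REPORTED = [1146, 1147, 1147, 1146, 1146, 1146, 1147, 1147, 1147, 1147,
            574, 574, 574, 574, 574]


def weighted_share(weights, parts, replicas):
    """Hypothetical helper: replica assignments each device wants by weight."""
    total_weight = sum(weights)
    total_assignments = parts * replicas
    return [Fraction(total_assignments * w, total_weight) for w in weights]


shares = weighted_share(WEIGHTS, PARTS, REPLICAS)

# The per-device counts really do add up to replicas * parts.
total_reported = sum(REPORTED)

# A device can hold at most PARTS assignments without repeating a partition,
# so anything above PARTS must be a duplicate replica.
over_committed = [dev for dev, count in enumerate(REPORTED) if count > PARTS]
min_duplicates = sum(count - PARTS for count in REPORTED if count > PARTS)

print(float(shares[0]), float(shares[10]))  # weight-based targets
print(total_reported == PARTS * REPLICAS)
print(over_committed, min_duplicates)
```

Every weight-2.00 device wants about 1146.88 assignments against only 1024 partitions, which is consistent with the report that the totals look balanced while duplicates exist.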
I've attached a script that I think would be a good addition to the RingBuilder's post-rebalance validate method once we decide how we want to fix it.
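The attached script is not reproduced here. As a rough illustration of the kind of check being proposed, the sketch below scans a replica-to-partition-to-device table for partitions that list the same device more than once. It assumes the table is a plain list of rows (one row per replica, one device id per partition); the function name `find_duplicate_assignments` and the toy data are hypothetical and not part of swift.

```python
from collections import Counter


def find_duplicate_assignments(replica2part2dev):
    """Return {part: {dev_id: count}} for partitions with a repeated device.

    Assumes replica2part2dev[replica][part] is a device id. Rows shorter
    than the longest row are tolerated and simply skipped for the
    partitions they do not cover.
    """
    duplicates = {}
    if not replica2part2dev:
        return duplicates
    parts = max(len(row) for row in replica2part2dev)
    for part in range(parts):
        counts = Counter(
            row[part] for row in replica2part2dev if part < len(row))
        repeated = {dev: n for dev, n in counts.items() if n > 1}
        if repeated:
            duplicates[part] = repeated
    return duplicates


# Toy table: 3 replicas, 4 partitions. Partition 2 has device 0 twice.
table = [
    [0, 1, 2, 0],
    [1, 2, 0, 1],
    [2, 0, 0, 2],
]
result = find_duplicate_assignments(table)
print(result)  # {2: {0: 2}}
```

A validate step could raise an error whenever the returned mapping is non-empty, since per-device partition counts alone do not reveal the problem.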
summary:
- some parts replicas assigned to duplicate devices
+ some parts replicas assigned to duplicate devices in the ring
tags: added: ec
Changed in swift:
assignee: nobody → Samuel Merritt (torgomatic)
Changed in swift:
importance: Undecided → Medium
tags: removed: ec
@torgomatic should be able to confirm this bug