swift-ring-builder write_builder does not respect fractional replica count, always sets int value
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
OpenStack Object Storage (swift) | Fix Released | Low | Unassigned | | |
Bug Description
If you have a ring file with a fractional replica count and use swift-ring-builder write_builder to create a builder file for that ring, the builder will have ceil(fractional replica count) replicas.
We should at least document/warn that this happens.
Example:
Create ring with 3.2 replicas...
swift@vm-
swift@vm-
Device d0r1z1-
swift@vm-
Device d1r1z1-
swift@vm-
Device d2r1z1-
swift@vm-
Device d3r1z1-
swift@vm-
Device d4r1z1-
swift@vm-
Reassigned 48 (75.00%) partitions. Balance is now 2.34. Dispersion is now 0.00
Write ring file, check ring state...
swift@vm-
swift@vm-
fractional.builder, build version 6
64 partitions, 3.200000 replicas, 1 regions, 1 zones, 5 devices, 2.34 balance, 0.00 dispersion
The minimum number of hours before a partition can be reassigned is 0 (0:00:00 remaining)
The overload factor is 0.00% (0.000000)
Ring file fractional.
Devices: id region zone ip address:port replication ip:port name weight partitions balance flags meta
0 1 1 127.0.0.1:6201 127.0.0.1:6201 sda 100.00 41 0.10
1 1 1 127.0.0.1:6201 127.0.0.1:6201 sdb 100.00 41 0.10
2 1 1 127.0.0.1:6201 127.0.0.1:6201 sdc 100.00 41 0.10
3 1 1 127.0.0.1:6201 127.0.0.1:6201 sdd 100.00 41 0.10
4 1 1 127.0.0.1:6201 127.0.0.1:6201 sde 100.00 40 -2.34
"Lose" the builder file...
swift@vm-
Write a replacement builder file...
swift@vm-
Note: using fractional.builder instead of fractional.ring.gz as builder file
WARNING: default min_part_hours may not match the value in the lost builder.
...which will have 4 replicas, not 3.2...
swift@vm-
fractional.builder, build version 0
64 partitions, 4.000000 replicas, 1 regions, 1 zones, 5 devices, 21.88 balance
The minimum number of hours before a partition can be reassigned is 24 (0:00:00 remaining)
The overload factor is 0.00% (0.000000)
Ring file fractional.ring.gz is up-to-date
Devices: id region zone ip address:port replication ip:port name weight partitions balance flags meta
0 1 1 127.0.0.1:6201 127.0.0.1:6201 sda 100.00 41 -19.92
1 1 1 127.0.0.1:6201 127.0.0.1:6201 sdb 100.00 41 -19.92
2 1 1 127.0.0.1:6201 127.0.0.1:6201 sdc 100.00 41 -19.92
3 1 1 127.0.0.1:6201 127.0.0.1:6201 sdd 100.00 41 -19.92
4 1 1 127.0.0.1:6201 127.0.0.1:6201 sde 100.00 40 -21.88
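The jump in balance from 2.34 to 21.88 follows from the replica count alone, since the partition assignments are unchanged. The following is a minimal sketch of that arithmetic. It assumes balance is the percent deviation of a device's assigned partitions from its weight-proportional share; the function name `device_balance` is hypothetical and is not part of swift. Under that assumption the values match the two listings above.

```python
def device_balance(assigned, parts, replicas, weight, total_weight):
    """Percent deviation of assigned partitions from the weighted share.

    Hypothetical helper for illustration; not a swift API.
    """
    desired = parts * replicas * weight / total_weight
    return 100.0 * assigned / desired - 100.0


PARTS = 64
TOTAL_WEIGHT = 5 * 100.0  # five devices, weight 100.00 each

# Original builder: 3.2 replicas, so each device wants 40.96 partitions.
orig_41 = device_balance(41, PARTS, 3.2, 100.0, TOTAL_WEIGHT)  # roughly 0.10
orig_40 = device_balance(40, PARTS, 3.2, 100.0, TOTAL_WEIGHT)  # roughly -2.34

# Recreated builder: 4.0 replicas, so each device wants 51.2 partitions.
new_41 = device_balance(41, PARTS, 4.0, 100.0, TOTAL_WEIGHT)  # roughly -19.92
new_40 = device_balance(40, PARTS, 4.0, 100.0, TOTAL_WEIGHT)  # roughly -21.88

for label, value in [("3.2 replicas, 41 parts", orig_41),
                     ("3.2 replicas, 40 parts", orig_40),
                     ("4.0 replicas, 41 parts", new_41),
                     ("4.0 replicas, 40 parts", new_40)]:
    print(f"{label}: {value:.2f}")
```

The overall balance reported by the builder corresponds to the largest absolute per-device value, which is why the recreated builder reports 21.88.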
Makes perfect sense given how we write our ring metadata [1]. Would it be enough to fix write_ring, or do we also want to fix write_builder to eat old rings better?
[1] https://github.com/openstack/swift/blob/2.13.0/swift/common/ring/ring.py#L126
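To illustrate why the replica count comes back as 4, here is a minimal sketch in plain Python with no swift imports. It assumes the ring stores one row of device assignments per replica, with a shorter final row for the fractional replica, and that the metadata records only the number of rows. The helper `build_rows` is hypothetical. The row lengths it produces (64, 64, 64, 12) agree with the 204 partition assignments in the device listing above (41 + 41 + 41 + 41 + 40).

```python
import math


def build_rows(parts, replicas):
    """Return a replica-to-partition table as a list of rows.

    Hypothetical helper: whole replicas get a full row of `parts`
    entries, the fractional replica gets a truncated partial row.
    """
    fractional, whole = math.modf(replicas)
    lengths = [parts] * int(whole)
    if fractional:
        lengths.append(int(parts * fractional))
    # The device ids do not matter here, so every entry is 0.
    return [[0] * n for n in lengths]


parts = 64
rows = build_rows(parts, 3.2)

# Counting rows treats the partial row as a full replica.
row_count = len(rows)

# Summing row lengths keeps the fractional part that the ring actually holds.
total_assignments = sum(len(row) for row in rows)
recovered = total_assignments / parts

print(row_count)          # 4
print(total_assignments)  # 204
print(recovered)          # 3.1875
```

Note that the recovered value is 3.1875 rather than 3.2, because the partial row holds a whole number of partitions. A builder recreated from row lengths would therefore be close to, but not necessarily identical to, the value originally requested.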