With the failed device on the server that's desperately trying to shed parts, and enough replicas, balance will not change significantly from one invocation to the next while rebalance is busy fixing dispersion. We should treat that as desirable behavior and use delta_dispersion to get over the hump:
ubuntu@saio:/vagrant/.scratch/rings/tata$ swift-ring-builder stuck.builder rebalance
Cowardly refusing to save rebalance as it did not change at least 1%.
ubuntu@saio:/vagrant/.scratch/rings/tata$ swift-ring-builder stuck.builder |head
stuck.builder, build version 63, id a5b9fbd213bb4c20ab60eff2a2bb3a75
256 partitions, 13.000000 replicas, 1 regions, 1 zones, 52 devices, 100.00 balance, 100.00 dispersion
...
ubuntu@saio:/vagrant/.scratch/rings/tata$ swift-ring-builder stuck.builder rebalance
Cowardly refusing to save rebalance as it did not change at least 1%.
ubuntu@saio:/vagrant/.scratch/rings/tata$ swift-ring-builder stuck.builder rebalance -f
Reassigned 256 (100.00%) partitions. Balance is now 100.00. Dispersion is now 0.00
-------------------------------------------------------------------------------
NOTE: Balance of 100.00 indicates you should push this
ring, wait at least 0 hours, and rebalance/repush.
-------------------------------------------------------------------------------
ubuntu@saio:/vagrant/.scratch/rings/tata$ swift-ring-builder stuck.builder rebalance
Reassigned 255 (99.61%) partitions. Balance is now 1.56. Dispersion is now 0.00
Notice that delta_dispersion is *HUGE* at the point where rebalance is "cowardly refusing to save". With enough replicas and a failed device it's easy to see that we should look at delta_dispersion in addition to delta_balance:
https://gist.github.com/clayg/b0d0d41a382e70356bb58a1ee94d1b73
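The save decision being argued for can be sketched roughly like this. This is a hypothetical illustration, not swift-ring-builder's actual internals; the function name, argument names, and the 1% threshold are assumptions for the sake of the example:

```python
# Hypothetical sketch: in addition to delta_balance, a large
# delta_dispersion should also justify saving a rebalance.
# The 1% threshold mirrors the "did not change at least 1%" message;
# all names here are illustrative, not swift-ring-builder's real code.

def should_save(old_balance, new_balance,
                old_dispersion, new_dispersion,
                min_change=1.0):
    """Save if either balance or dispersion improved by at least min_change."""
    delta_balance = old_balance - new_balance
    delta_dispersion = old_dispersion - new_dispersion
    return delta_balance >= min_change or delta_dispersion >= min_change

# The stuck case above: balance stays pinned at 100.00 while dispersion
# drops from 100.00 to 0.00, so delta_dispersion alone would allow the
# save instead of requiring -f to force it.
print(should_save(100.00, 100.00, 100.00, 0.00))  # True
```

With a check like this the forced `-f` rebalance in the transcript wouldn't be necessary: the dispersion improvement alone would get past the "cowardly refusing" guard.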