Ring refuses to save even when 100% parts move

Bug #1697543 reported by clayg on 2017-06-12
This bug affects 2 people
Affects: OpenStack Object Storage (swift)
Importance: Medium
Assigned to: Unassigned

Bug Description

If you're gradually adding weight while adding devices in multiple zones to an EC ring with multiple replicas in each zone, it's possible that the ring's preference to move replicas of parts that need to disperse to the new zone *first* leaves some new capacity in the original zone waiting for enough frags to move before it can start taking the part-replicas it wants.

A reproduction example script is attached.
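
The attached script isn't reproduced here, but a minimal sketch of that kind of setup might look like the following. It assumes a Swift where swift.common.ring.RingBuilder exposes add_dev(), rebalance(), pretend_min_part_hours_passed(), get_balance() and a dispersion attribute; all device parameters (part power, replica count, IPs, ports, weights) are made up for illustration.

# Minimal sketch only, NOT the attached reproduction script.
from swift.common.ring import RingBuilder

builder = RingBuilder(8, 13, 1)   # part_power=8, replicas=13, min_part_hours=1

# Original zone at full weight.
for i in range(4):
    builder.add_dev({'region': 1, 'zone': 1, 'ip': '10.0.0.1',
                     'port': 6200, 'device': 'sdb%d' % i, 'weight': 100.0})
builder.rebalance()

# New zone added at a small fraction of its eventual weight, as when
# ramping capacity in gradually.
for i in range(4):
    builder.add_dev({'region': 1, 'zone': 2, 'ip': '10.0.0.2',
                     'port': 6200, 'device': 'sdb%d' % i, 'weight': 10.0})
builder.pretend_min_part_hours_passed()
builder.rebalance()

print('balance=%.2f dispersion=%.2f' % (builder.get_balance(),
                                        builder.dispersion))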

One workaround is to run "rebalance -f" multiple times, until enough frags get assigned to the other zone that the balance-change detection code starts working again.

Another is to change weights before you rebalance, until the over-assignment in the standing zone becomes more apparent in the balance (a rough sketch follows).
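
Done programmatically, that second workaround could look roughly like this sketch; the builder file name 'stuck.builder', the choice of zone 2 as the new zone, and the 1.5x factor are all assumptions.

# Hedged sketch of the weight-change workaround: bump the new zone's weights
# a bit before rebalancing so the over-assignment in the standing zone shows
# up in the balance numbers.
from swift.common.ring import RingBuilder

builder = RingBuilder.load('stuck.builder')
for dev in builder.devs:
    if dev is not None and dev['zone'] == 2:
        builder.set_dev_weight(dev['id'], dev['weight'] * 1.5)
builder.rebalance()
builder.save('stuck.builder')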

The fix would be to look at dispersion or changed_parts coming out of rebalance (in addition to delta balance) before we decide whether the rebalance is worth saving.
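
Something along these lines, as a sketch of the idea rather than the actual swift-ring-builder code (the function name, arguments and 1% threshold are assumptions):

# Hypothetical decision helper: save the rebalance if the balance moved
# enough, OR dispersion improved, OR part-replicas actually changed.
def worth_saving(delta_balance, delta_dispersion, parts_changed,
                 min_balance_change=1.0):
    if abs(delta_balance) >= min_balance_change:
        return True
    if delta_dispersion < 0:       # dispersion dropped, i.e. improved
        return True
    return parts_changed > 0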

clayg (clay-gerrard) wrote :

With enough replicas and a failed device it's easier to see that we should look at delta_dispersion in addition to delta_balance:

https://gist.github.com/clayg/b0d0d41a382e70356bb58a1ee94d1b73

With the failed device on the server that's desperately trying to shed parts, and enough replicas, balance will not change significantly from one invocation to the next while rebalance is busy fixing dispersion...

We should accept that as desirable behavior and use delta_dispersion to get over the hump:

ubuntu@saio:/vagrant/.scratch/rings/tata$ swift-ring-builder stuck.builder rebalance
Cowardly refusing to save rebalance as it did not change at least 1%.
ubuntu@saio:/vagrant/.scratch/rings/tata$ swift-ring-builder stuck.builder |head
stuck.builder, build version 63, id a5b9fbd213bb4c20ab60eff2a2bb3a75
256 partitions, 13.000000 replicas, 1 regions, 1 zones, 52 devices, 100.00 balance, 100.00 dispersion
...
ubuntu@saio:/vagrant/.scratch/rings/tata$ swift-ring-builder stuck.builder rebalance
Cowardly refusing to save rebalance as it did not change at least 1%.
ubuntu@saio:/vagrant/.scratch/rings/tata$ swift-ring-builder stuck.builder rebalance -f
Reassigned 256 (100.00%) partitions. Balance is now 100.00. Dispersion is now 0.00
-------------------------------------------------------------------------------
NOTE: Balance of 100.00 indicates you should push this
      ring, wait at least 0 hours, and rebalance/repush.
-------------------------------------------------------------------------------
ubuntu@saio:/vagrant/.scratch/rings/tata$ swift-ring-builder stuck.builder rebalance
Reassigned 255 (99.61%) partitions. Balance is now 1.56. Dispersion is now 0.00

Notice that the delta_dispersion at the point where it's "cowardly refusing to save rebalance" is *HUGE*.
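
One way to see that delta without touching the saved builder is to run the rebalance in memory only (a sketch, assuming RingBuilder.load(), get_balance(), rebalance() and the dispersion attribute; nothing is written because save() is never called):

# Load the stuck builder, rebalance in memory, and compare the balance and
# dispersion deltas that the CLI's save-or-not decision is based on.
from swift.common.ring import RingBuilder

builder = RingBuilder.load('stuck.builder')
old_balance = builder.get_balance()
old_dispersion = builder.dispersion     # as of the last saved rebalance

builder.rebalance()

print('balance:    %.2f -> %.2f' % (old_balance, builder.get_balance()))
print('dispersion: %.2f -> %.2f' % (old_dispersion, builder.dispersion))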

Changed in swift:
importance: Low → Medium
clayg (clay-gerrard) wrote :

I was pretty sure this change would solve this:

https://review.openstack.org/#/c/479012/

But maybe there's something else going on? I'd like to know what...

clayg (clay-gerrard) wrote :

So if you have a sufficient number of replicas to move, you can have 100 dispersion even after moving the maximum whole replicanth.

If you go from 6 replicanths in z1 and 1 in z2, to 5 replicanths in z1 and 2 in z2, the dispersion metric should probably represent some kind of improvement - but I'm not sure how exactly... and we'd still need the related change.
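
As a rough back-of-the-envelope (this is only an approximation of what the dispersion metric measures, not Swift's exact formula, and the equal-zone-weight assumption is mine): if z1 is over its share of replicas either way, every partition keeps counting against dispersion in both states.

# Rough illustration, NOT Swift's actual dispersion computation: with two
# equally weighted zones and 7 replicas, each zone is "entitled" to about
# 3.5 replicas of every partition. z1 holding 6 or 5 replicas is
# over-entitled either way, so every partition still counts as badly
# dispersed and the metric can read 100 in both states, even though the
# 6 -> 5 move is real progress.
replicas = 7
entitled_per_zone = replicas / 2.0          # ~3.5 with equal zone weights

for z1_replicas in (6, 5):
    over_entitled = z1_replicas > entitled_per_zone
    print('z1 holds %d replicanths -> over-entitled: %s'
          % (z1_replicas, over_entitled))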
