Ring refuses to save even when 100% parts move

Bug #1697543 reported by clayg on 2017-06-12
This bug affects 2 people
Affects Status Importance Assigned to Milestone
OpenStack Object Storage (swift)

Bug Description

If you're gradually adding weight while adding devices in multiple zones in an EC ring with multiple replicas in each zone, the ring's preference to *first* move replicas of parts that need to disperse to the new zone can leave some new capacity in the original zone waiting for enough frags to move before it can start taking the part-replicas it wants. The balance then changes so little that the rebalance is not saved, even though many parts moved.

A script that reproduces the problem is attached.

Workarounds are either to run "rebalance -f" multiple times, until enough frags are assigned to the other zone that the balance-change detection code starts working again, or to change weights before you rebalance until the over-assignment in the standing zone becomes more apparent in the balance.

The fix would be to look at dispersion or changed_parts coming out of rebalance (in addition to delta balance) before we decide whether the rebalance is worth saving.
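
A rough sketch of what that check might look like, to make the proposal concrete. This is not swift's actual code: RebalanceResult, worth_saving, and the 1.0 percentage-point threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class RebalanceResult:
    """Hypothetical summary of one rebalance pass (not swift's return type)."""
    changed_parts: int   # number of parts reassigned by the rebalance
    total_parts: int     # total number of parts in the ring
    balance: float       # balance after the rebalance, in percent
    dispersion: float    # dispersion after the rebalance, in percent


def worth_saving(last_balance, last_dispersion, result, min_delta=1.0):
    """Return True if the rebalance changed enough to justify saving.

    min_delta is an assumed threshold, in percentage points, applied to
    each of the three signals independently.
    """
    balance_delta = abs(last_balance - result.balance)
    dispersion_delta = abs(last_dispersion - result.dispersion)
    percent_moved = 100.0 * result.changed_parts / result.total_parts
    # Save if ANY signal shows meaningful change, not only the balance.
    return (balance_delta >= min_delta
            or dispersion_delta >= min_delta
            or percent_moved >= min_delta)
```

With a check like this, a rebalance that moves every part but shifts the balance by only 0.2 would still be saved, which is the case described in this bug.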
