Activity log for bug #1605841

Date Who What changed Old value New value Message
2016-07-23 08:40:20 Cheng Li bug added bug
2016-07-23 08:40:29 Cheng Li swift: assignee Cheng Li (shcli)
2016-07-23 08:47:51 OpenStack Infra swift: status New In Progress
2016-09-12 20:22:17 clayg swift: importance Undecided Medium
2016-09-12 20:22:41 clayg summary add dev_losing_part['parts_wanted'] += 1 Bad balance and excess part movement when reducing fractional replica count
2016-09-17 12:19:45 Cheng Li description
    Old value:
        in swift/common/ring/builder.py the rebalance function calls self._set_parts_wanted(replica_plan) at 443 line.
        dev['parts_wanted'] = parts_by_tier[tier] - dev['parts']
        so dev_losing_part['parts'] -= 1 should be followed with dev_losing_part['parts_wanted'] += 1
        this will cause problem when you reduce replicas and then rebalance.

        current case:
        swift-ring-builder test.builder create 8 3.5 1
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d1 1.0
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d2 1.0
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d3 1.0
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d4 1.0
        swift-ring-builder test.builder rebalance -s 1
        Reassigned 512 (200.00%) partitions. Balance is now 0.00. Dispersion is now 0.00
        swift-ring-builder test.builder set_replicas 3
        swift-ring-builder test.builder pretend_min_part_hours_passed
        swift-ring-builder test.builder rebalance -s 1
        Reassigned 128 (50.00%) partitions. Balance is now 3.65. Dispersion is now 0.00

        after this issue fixed:
        swift-ring-builder test.builder create 8 3.5 1
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d1 1.0
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d2 1.0
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d3 1.0
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d4 1.0
        swift-ring-builder test.builder rebalance -s 1
        Reassigned 512 (200.00%) partitions. Balance is now 0.00. Dispersion is now 0.00
        swift-ring-builder test.builder set_replicas 3
        swift-ring-builder test.builder pretend_min_part_hours_passed
        swift-ring-builder test.builder rebalance -s 1
        Reassigned 9 (3.52%) partitions. Balance is now 0.00. Dispersion is now 0.00
    New value:
        in swift/common/ring/builder.py the rebalance function calls self._set_parts_wanted(replica_plan) at 443 line.
        dev['parts_wanted'] = parts_by_tier[tier] - dev['parts']
        so `dev_losing_part['parts'] -= 1` should be followed with dev_losing_part['parts_wanted'] += 1
        this will cause problem when you reduce replicas and then rebalance.

        current case:
        swift-ring-builder test.builder create 8 3.5 1
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d1 1.0
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d2 1.0
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d3 1.0
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d4 1.0
        swift-ring-builder test.builder rebalance -s 1
        Reassigned 512 (200.00%) partitions. Balance is now 0.00. Dispersion is now 0.00
        swift-ring-builder test.builder set_replicas 3
        swift-ring-builder test.builder pretend_min_part_hours_passed
        swift-ring-builder test.builder rebalance -s 1
        Reassigned 128 (50.00%) partitions. Balance is now 3.65. Dispersion is now 0.00

        after this issue fixed:
        swift-ring-builder test.builder create 8 3.5 1
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d1 1.0
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d2 1.0
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d3 1.0
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d4 1.0
        swift-ring-builder test.builder rebalance -s 1
        Reassigned 512 (200.00%) partitions. Balance is now 0.00. Dispersion is now 0.00
        swift-ring-builder test.builder set_replicas 3
        swift-ring-builder test.builder pretend_min_part_hours_passed
        swift-ring-builder test.builder rebalance -s 1
        Reassigned 9 (3.52%) partitions. Balance is now 0.00. Dispersion is now 0.00
2016-09-17 12:24:12 Cheng Li description
    Old value:
        in swift/common/ring/builder.py the rebalance function calls self._set_parts_wanted(replica_plan) at 443 line.
        dev['parts_wanted'] = parts_by_tier[tier] - dev['parts']
        so `dev_losing_part['parts'] -= 1` should be followed with dev_losing_part['parts_wanted'] += 1
        this will cause problem when you reduce replicas and then rebalance.

        current case:
        swift-ring-builder test.builder create 8 3.5 1
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d1 1.0
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d2 1.0
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d3 1.0
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d4 1.0
        swift-ring-builder test.builder rebalance -s 1
        Reassigned 512 (200.00%) partitions. Balance is now 0.00. Dispersion is now 0.00
        swift-ring-builder test.builder set_replicas 3
        swift-ring-builder test.builder pretend_min_part_hours_passed
        swift-ring-builder test.builder rebalance -s 1
        Reassigned 128 (50.00%) partitions. Balance is now 3.65. Dispersion is now 0.00

        after this issue fixed:
        swift-ring-builder test.builder create 8 3.5 1
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d1 1.0
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d2 1.0
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d3 1.0
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d4 1.0
        swift-ring-builder test.builder rebalance -s 1
        Reassigned 512 (200.00%) partitions. Balance is now 0.00. Dispersion is now 0.00
        swift-ring-builder test.builder set_replicas 3
        swift-ring-builder test.builder pretend_min_part_hours_passed
        swift-ring-builder test.builder rebalance -s 1
        Reassigned 9 (3.52%) partitions. Balance is now 0.00. Dispersion is now 0.00
    New value:
        in swift/common/ring/builder.py the rebalance function calls self._set_parts_wanted(replica_plan) at 443 line.
        dev['parts_wanted'] = parts_by_tier[tier] - dev['parts']
        so dev_losing_part['parts'] -= 1 should be followed with dev_losing_part['parts_wanted'] += 1
        but in _adjust_replica2part2dev_size(...), dev_losing_part['parts_wanted'] += 1 does exist.
        This gets wrong parts_wanted number.
        You will meet this case if you reduce replicas and then rebalance

        current case:
        swift-ring-builder test.builder create 8 3.5 1
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d1 1.0
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d2 1.0
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d3 1.0
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d4 1.0
        swift-ring-builder test.builder rebalance -s 1
        Reassigned 512 (200.00%) partitions. Balance is now 0.00. Dispersion is now 0.00
        swift-ring-builder test.builder set_replicas 3
        swift-ring-builder test.builder pretend_min_part_hours_passed
        swift-ring-builder test.builder rebalance -s 1
        Reassigned *128 (50.00%)* partitions. Balance is now 3.65. Dispersion is now 0.00

        after this issue fixed:
        swift-ring-builder test.builder create 8 3.5 1
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d1 1.0
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d2 1.0
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d3 1.0
        swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d4 1.0
        swift-ring-builder test.builder rebalance -s 1
        Reassigned 512 (200.00%) partitions. Balance is now 0.00. Dispersion is now 0.00
        swift-ring-builder test.builder set_replicas 3
        swift-ring-builder test.builder pretend_min_part_hours_passed
        swift-ring-builder test.builder rebalance -s 1
        Reassigned *9 (3.52%)* partitions. Balance is now 0.00. Dispersion is now 0.00
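
The bookkeeping argument in the description can be illustrated with a small standalone sketch. It is not Swift's RingBuilder code: the helper names (drop_part, is_consistent, percent_reassigned) and the keep_in_sync flag are hypothetical, and it assumes the 128 dropped replica assignments are spread evenly over the four devices. It only shows that if parts_wanted was computed as target - parts, then decrementing parts without incrementing parts_wanted leaves the counter stale, and it checks the percentages quoted in the rebalance output.

```python
# Hypothetical illustration only; these helpers are not part of Swift.
PART_POWER = 8
PARTS = 2 ** PART_POWER  # "create 8 3.5 1" gives 256 partitions


def percent_reassigned(moved, parts=PARTS):
    """Percentage of partitions reported as reassigned by a rebalance."""
    return round(100.0 * moved / parts, 2)


def drop_part(dev, keep_in_sync):
    """Remove one partition replica from a device dict.

    keep_in_sync=True applies the adjustment proposed in the report;
    keep_in_sync=False models the behaviour the report complains about.
    """
    dev['parts'] -= 1
    if keep_in_sync:
        dev['parts_wanted'] += 1


def is_consistent(dev, target):
    """Check the relation parts_wanted == target - parts."""
    return dev['parts_wanted'] == target - dev['parts']


# 3.5 replicas * 256 partitions / 4 equally weighted devices = 224 each.
target = 224
stale = {'parts': 224, 'parts_wanted': 0}
synced = {'parts': 224, 'parts_wanted': 0}

# Assumption: each device loses 32 of the 128 dropped assignments.
for _ in range(32):
    drop_part(stale, keep_in_sync=False)
    drop_part(synced, keep_in_sync=True)

print(is_consistent(stale, target), is_consistent(synced, target))
print(percent_reassigned(128), percent_reassigned(9))
```

In this sketch only the device updated with keep_in_sync=True still satisfies the relation against the target it was planned with; the other reports parts_wanted of 0 although it is 32 parts short of that target.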