2016-07-23 08:40:20 |
Cheng Li |
bug |
|
|
added bug |
2016-07-23 08:40:29 |
Cheng Li |
swift: assignee |
|
Cheng Li (shcli) |
|
2016-07-23 08:47:51 |
OpenStack Infra |
swift: status |
New |
In Progress |
|
2016-09-12 20:22:17 |
clayg |
swift: importance |
Undecided |
Medium |
|
2016-09-12 20:22:41 |
clayg |
summary |
add dev_losing_part['parts_wanted'] += 1 |
Bad balance and excess part movement when reducing fractional replica count |
|
2016-09-17 12:19:45 |
Cheng Li |
description |
in swift/common/ring/builder.py
the rebalance function calls self._set_parts_wanted(replica_plan) at line 443.
dev['parts_wanted'] = parts_by_tier[tier] - dev['parts']
so dev_losing_part['parts'] -= 1
should be followed with
dev_losing_part['parts_wanted'] += 1
this will cause a problem when you reduce replicas and then rebalance.
current case:
swift-ring-builder test.builder create 8 3.5 1
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d1 1.0
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d2 1.0
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d3 1.0
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d4 1.0
swift-ring-builder test.builder rebalance -s 1
Reassigned 512 (200.00%) partitions. Balance is now 0.00. Dispersion is now 0.00
swift-ring-builder test.builder set_replicas 3
swift-ring-builder test.builder pretend_min_part_hours_passed
swift-ring-builder test.builder rebalance -s 1
Reassigned 128 (50.00%) partitions. Balance is now 3.65. Dispersion is now 0.00
after this issue is fixed:
swift-ring-builder test.builder create 8 3.5 1
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d1 1.0
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d2 1.0
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d3 1.0
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d4 1.0
swift-ring-builder test.builder rebalance -s 1
Reassigned 512 (200.00%) partitions. Balance is now 0.00. Dispersion is now 0.00
swift-ring-builder test.builder set_replicas 3
swift-ring-builder test.builder pretend_min_part_hours_passed
swift-ring-builder test.builder rebalance -s 1
Reassigned 9 (3.52%) partitions. Balance is now 0.00. Dispersion is now 0.00 |
in swift/common/ring/builder.py
the rebalance function calls self._set_parts_wanted(replica_plan) at line 443.
dev['parts_wanted'] = parts_by_tier[tier] - dev['parts']
so `dev_losing_part['parts'] -= 1`
should be followed with
dev_losing_part['parts_wanted'] += 1
this will cause a problem when you reduce replicas and then rebalance.
current case:
swift-ring-builder test.builder create 8 3.5 1
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d1 1.0
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d2 1.0
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d3 1.0
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d4 1.0
swift-ring-builder test.builder rebalance -s 1
Reassigned 512 (200.00%) partitions. Balance is now 0.00. Dispersion is now 0.00
swift-ring-builder test.builder set_replicas 3
swift-ring-builder test.builder pretend_min_part_hours_passed
swift-ring-builder test.builder rebalance -s 1
Reassigned 128 (50.00%) partitions. Balance is now 3.65. Dispersion is now 0.00
after this issue is fixed:
swift-ring-builder test.builder create 8 3.5 1
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d1 1.0
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d2 1.0
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d3 1.0
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d4 1.0
swift-ring-builder test.builder rebalance -s 1
Reassigned 512 (200.00%) partitions. Balance is now 0.00. Dispersion is now 0.00
swift-ring-builder test.builder set_replicas 3
swift-ring-builder test.builder pretend_min_part_hours_passed
swift-ring-builder test.builder rebalance -s 1
Reassigned 9 (3.52%) partitions. Balance is now 0.00. Dispersion is now 0.00 |
|
2016-09-17 12:24:12 |
Cheng Li |
description |
in swift/common/ring/builder.py
the rebalance function calls self._set_parts_wanted(replica_plan) at line 443.
dev['parts_wanted'] = parts_by_tier[tier] - dev['parts']
so `dev_losing_part['parts'] -= 1`
should be followed with
dev_losing_part['parts_wanted'] += 1
this will cause a problem when you reduce replicas and then rebalance.
current case:
swift-ring-builder test.builder create 8 3.5 1
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d1 1.0
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d2 1.0
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d3 1.0
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d4 1.0
swift-ring-builder test.builder rebalance -s 1
Reassigned 512 (200.00%) partitions. Balance is now 0.00. Dispersion is now 0.00
swift-ring-builder test.builder set_replicas 3
swift-ring-builder test.builder pretend_min_part_hours_passed
swift-ring-builder test.builder rebalance -s 1
Reassigned 128 (50.00%) partitions. Balance is now 3.65. Dispersion is now 0.00
after this issue is fixed:
swift-ring-builder test.builder create 8 3.5 1
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d1 1.0
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d2 1.0
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d3 1.0
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d4 1.0
swift-ring-builder test.builder rebalance -s 1
Reassigned 512 (200.00%) partitions. Balance is now 0.00. Dispersion is now 0.00
swift-ring-builder test.builder set_replicas 3
swift-ring-builder test.builder pretend_min_part_hours_passed
swift-ring-builder test.builder rebalance -s 1
Reassigned 9 (3.52%) partitions. Balance is now 0.00. Dispersion is now 0.00 |
in swift/common/ring/builder.py
the rebalance function calls self._set_parts_wanted(replica_plan) at line 443.
dev['parts_wanted'] = parts_by_tier[tier] - dev['parts']
so dev_losing_part['parts'] -= 1
should be followed with
dev_losing_part['parts_wanted'] += 1
but in _adjust_replica2part2dev_size(...),
dev_losing_part['parts_wanted'] += 1 does not exist. This leaves a wrong parts_wanted number.
You will hit this case if you reduce replicas and then rebalance.
current case:
swift-ring-builder test.builder create 8 3.5 1
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d1 1.0
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d2 1.0
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d3 1.0
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d4 1.0
swift-ring-builder test.builder rebalance -s 1
Reassigned 512 (200.00%) partitions. Balance is now 0.00. Dispersion is now 0.00
swift-ring-builder test.builder set_replicas 3
swift-ring-builder test.builder pretend_min_part_hours_passed
swift-ring-builder test.builder rebalance -s 1
Reassigned *128 (50.00%)* partitions. Balance is now 3.65. Dispersion is now 0.00
after this issue is fixed:
swift-ring-builder test.builder create 8 3.5 1
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d1 1.0
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d2 1.0
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d3 1.0
swift-ring-builder test.builder add r1z1-127.0.0.1:6000/d4 1.0
swift-ring-builder test.builder rebalance -s 1
Reassigned 512 (200.00%) partitions. Balance is now 0.00. Dispersion is now 0.00
swift-ring-builder test.builder set_replicas 3
swift-ring-builder test.builder pretend_min_part_hours_passed
swift-ring-builder test.builder rebalance -s 1
Reassigned *9 (3.52%)* partitions. Balance is now 0.00. Dispersion is now 0.00 |
|
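The bookkeeping described in the final description can be illustrated with a small toy model. This is only a sketch and not Swift's RingBuilder code: the helper names (`set_parts_wanted`, `truncate`, `make_devs`) and the per-device split in `DROPPED` are hypothetical. Only the formula `parts_wanted = target - parts`, the four equally weighted devices, and the 256-partition, 3.5-to-3 replica change come from the report.

```python
# Toy model only: this is not Swift's RingBuilder code.

def set_parts_wanted(devs, target):
    # Formula quoted in the report: parts_wanted = target - parts.
    for dev in devs:
        dev['parts_wanted'] = target - dev['parts']


def truncate(devs, dropped, sync_wanted):
    # Remove dropped[i] partition replicas from device i.
    for dev, count in zip(devs, dropped):
        dev['parts'] -= count
        if sync_wanted:
            # The adjustment the report proposes.
            dev['parts_wanted'] += count


def make_devs():
    # 256 partitions * 3.5 replicas / 4 devices = 224 parts per device.
    return [{'parts': 224, 'parts_wanted': 0} for _ in range(4)]


TARGET = 256 * 3 // 4        # 192 parts per device at 3 replicas
DROPPED = [30, 34, 31, 33]   # hypothetical split of the 128 truncated replicas

# Without the adjustment, parts_wanted is computed once and goes stale.
stale = make_devs()
set_parts_wanted(stale, TARGET)
truncate(stale, DROPPED, sync_wanted=False)

# With the adjustment, parts_wanted tracks every removed replica.
synced = make_devs()
set_parts_wanted(synced, TARGET)
truncate(synced, DROPPED, sync_wanted=True)

print([d['parts_wanted'] for d in stale])   # [-32, -32, -32, -32]
print([d['parts_wanted'] for d in synced])  # [-2, 2, -1, 1]
```

In the stale case every device still appears to hold 32 parts too many even though the truncation already removed them, which is consistent with the excess movement shown in the "current case" output. In the synced case only the small leftover imbalance remains to be corrected.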