Reconstructor jobs are ordered by disk instead of randomized
Bug #1491605 reported by Caleb Tennis
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| OpenStack Object Storage (swift) | Fix Released | High | Unassigned | |
Bug Description
We create jobs by disk and then run them in that order (vs. the replicator, which essentially randomizes jobs). The problem is that a single slow disk can bog down the reconstructor so that it never progresses. Now imagine a scenario where the reconstructor restarts periodically (every hour?) due to ring changes: it never progresses past the one disk it's working on.
Changed in swift:
importance: Undecided → Low
Changed in swift:
status: New → Confirmed
Changed in swift:
assignee: nobody → Paul Dardeau (paul-dardeau)
Changed in swift:
assignee: Paul Dardeau (paul-dardeau) → nobody
I think we can fix this in collect_parts.
The trick is to do the quick `for policy; for part in listdir` loop to append to a list and shuffle it - then iterate over the shuffled list doing all the hard work.
In particular we want to avoid doing the isdir check on every partition directory before we start yielding out part_info dicts [1].
Ideally, though, once the part_info dicts start coming out, they would be splayed randomly across policies and - most importantly - devices.
1. https://review.openstack.org/#/c/140178/
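A minimal sketch of the ordering described above - do a cheap listdir pass to gather candidates, shuffle them, and defer the expensive per-partition checks until iteration time. The function and dict keys here (`collect_parts`, `datadirs`, `part_path`) are illustrative, not the actual Swift reconstructor code:

```python
import os
import random


def collect_parts(datadirs):
    """Yield part_info dicts in a shuffled order across policies/devices.

    datadirs: list of (policy, device_path) pairs (illustrative shape,
    not Swift's real signature).
    """
    # Cheap pass: just listdir each device's partition dir; no expensive
    # per-partition filesystem checks yet.
    candidates = []
    for policy, device_path in datadirs:
        try:
            partitions = os.listdir(device_path)
        except OSError:
            continue
        for part in partitions:
            candidates.append({
                'policy': policy,
                'partition': part,
                'part_path': os.path.join(device_path, part),
            })
    # Shuffle so the work is splayed across policies and devices, and a
    # single slow disk can't monopolize the front of the queue.
    random.shuffle(candidates)
    for part_info in candidates:
        # Expensive check (isdir) deferred until after the shuffle, done
        # lazily as each part_info is yielded.
        if not os.path.isdir(part_info['part_path']):
            continue
        yield part_info
```

Because `collect_parts` is a generator, a restart partway through still leaves the completed work spread across devices rather than concentrated on whichever disk happened to sort first.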