Migration script needs to be parallelized.
Bug #1517675 reported by Robert Bruce Park
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
CI Train [cu2d] | Fix Released | Undecided | Robert Bruce Park |
Bug Description
When the migration script was only checking published silos, it ran in about 30 seconds.
When we upgraded it to check for new commits, it ran in about 4 minutes (for ~55 silos).
Now the migration script has to iterate over every build in every package in every silo, and it takes 18 minutes to run. I should really find a way to parallelize each silo to speed things up.
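As a rough illustration of what "parallelize each silo" could look like inside a single script, here is a minimal sketch using a thread pool. It is not the fix that was merged. `check_silo`, the silo names, and `max_workers` are all hypothetical stand-ins, and it assumes the per-silo work is mostly network-bound so that threads actually help.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def check_silo(silo_name):
    """Hypothetical stand-in for the per-silo migration check."""
    time.sleep(0.01)  # simulates waiting on a remote service
    return silo_name, "checked"


def check_all_silos(silo_names, max_workers=8):
    """Run check_silo for every silo concurrently and collect the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order; each is a (name, status) pair.
        return dict(pool.map(check_silo, silo_names))


if __name__ == "__main__":
    silos = ["silo-{:03d}".format(i) for i in range(55)]  # made-up names
    results = check_all_silos(silos)
    print(len(results), "silos checked")
```

With a pool like this the wall-clock time is bounded by the slowest silos rather than by the sum of all of them, at the cost of more simultaneous load on whatever service the checks query.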
Related branches
lp:~robru/cupstream2distro/parallelize-migration
- Robert Bruce Park (community): Approve
- PS Jenkins bot: Approve (continuous-integration)
Diff: 400 lines (+108/-134), 6 files modified
citrain/jenkins-templates/status.xml.tmpl (+31/-11)
citrain/setup_citrain.py (+6/-2)
citrain/status.py (+16/-30)
files/config.xml (+22/-0)
tests/unit/test_script_setup_citrain.py (+5/-2)
tests/unit/test_script_status.py (+28/-89)
Changed in cupstream2distro:
status: Fix Committed → Fix Released
Ok I think I have a reasonable parallelization plan:
Instead of having one global check-publication-migration script, there needs to be a per-silo 'status' script, which does the same thing but considers only one silo, rather than one job having a for loop over all silos.
Then these jobs can each be set on their own timers.
Also this job should collect the same artifacts as the build job (diffs), so that those are easy to get to all in one place.
This may also require an increase in executors, as having 60 jobs fire off every 30 minutes, even if they individually only take a minute, will still overwhelm the 20 executors available, and may interfere with people trying to build.
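The executor concern above can be put into rough numbers. The sketch below is only back-of-the-envelope arithmetic using the figures from this comment (60 jobs, 30-minute period, about 1 minute per job, 20 executors). The function names are made up for illustration, and it assumes every job takes exactly one minute.

```python
import math
from collections import Counter


def burst_waves(jobs, executors):
    """Number of back-to-back batches needed if all jobs fire at the same moment."""
    return math.ceil(jobs / executors)


def average_busy_executors(jobs, minutes_per_job, period_minutes):
    """Average number of executors kept busy over one full period."""
    return jobs * minutes_per_job / period_minutes


def staggered_offsets(jobs, period_minutes):
    """Spread job start times evenly over the period, as whole-minute offsets."""
    return [(i * period_minutes) // jobs for i in range(jobs)]


def peak_starts_per_minute(offsets):
    """Largest number of jobs that start within the same minute."""
    return max(Counter(offsets).values())


if __name__ == "__main__":
    print(burst_waves(60, 20))                 # 3
    print(average_busy_executors(60, 1, 30))   # 2.0
    print(peak_starts_per_minute(staggered_offsets(60, 30)))  # 2
```

Under these assumptions, firing all 60 jobs at once occupies all 20 executors for about three consecutive minutes, while the average load is only two executors. Staggering the timers evenly would keep the peak near that average, which may reduce how many extra executors are needed.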