Efficiently mirroring sftp hosted branches with minimal latency
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Launchpad itself | Fix Released | Critical | Jonathan Lange | | |
Bug Description
SFTP hosted branches should be mirrored http://
We control the SFTP server, so we know when they are updated. Hosted branches are also stored locally, so mirroring them to HTTP should be very fast.
There is a hack in place that mirrors them all every time the branch puller cron script runs; this has brought the latency down to minutes (rather than a whole day!). That latency is acceptable (but not great), but the solution isn't scalable. The vast majority of hosted branches won't change between cron script invocations, yet the script still has to examine each one of them. As more and more hosted branches are added (and as those branches accumulate revisions), the script will become slower and slower, out of proportion to the number of new branches and revisions that actually need to be mirrored.
To fix the scalability, the SFTP server should tell the branch puller that a branch has been updated, probably via an XML-RPC call that updates the database, so the puller knows which branches need attention. This would have no impact on the latency (except that the cron script would be a bit faster because it wouldn't need to examine unchanged hosted branches).
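A minimal sketch of that notification idea, assuming an in-memory set stands in for the database flag. The names `MirrorRequests`, `request_mirror`, `take_pending`, and `serve`, along with the host and port, are hypothetical and are not Launchpad's actual API:

```python
import threading
from xmlrpc.server import SimpleXMLRPCServer


class MirrorRequests:
    """Tracks which hosted branches changed since the puller last ran.

    Hypothetical stand-in for the database flag; not Launchpad code.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = set()

    def request_mirror(self, branch_id):
        """Called by the SFTP server after a write to a branch."""
        with self._lock:
            self._pending.add(branch_id)
        return True  # XML-RPC cannot marshal None by default

    def take_pending(self):
        """Called by the puller: return changed branches and clear them."""
        with self._lock:
            pending, self._pending = sorted(self._pending), set()
        return pending


def serve(requests, host="localhost", port=8000):
    """Expose the registry over XML-RPC (the address is an assumption)."""
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_instance(requests)
    server.serve_forever()
```

With something like this, each cron run would call `take_pending()` and examine only the branches it returns, so the cost of a run would scale with the number of changed branches rather than with the total number of hosted branches.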
Also, it would be nice to further reduce the latency to almost nothing by having the SFTP server immediately kick off a mirroring attempt for a branch, rather than updating the database and waiting for the next run of the cron script. Possibly this should be the topic of a new bug.
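One way the immediate kick-off could look, as a rough sketch only: the SFTP server hands the branch to a background worker and returns at once. `ImmediateMirror`, `branch_changed`, and the caller-supplied `mirror_branch` function are hypothetical names, not existing Launchpad code:

```python
import queue
import threading


class ImmediateMirror:
    """Runs mirror_branch(branch_id) soon after each notification.

    mirror_branch is a caller-supplied function; hypothetical sketch.
    """

    _STOP = object()

    def __init__(self, mirror_branch):
        self._mirror_branch = mirror_branch
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def branch_changed(self, branch_id):
        """Called by the SFTP server; returns without waiting."""
        self._queue.put(branch_id)

    def stop(self):
        """Finish queued work, then shut the worker down."""
        self._queue.put(self._STOP)
        self._worker.join()

    def _run(self):
        while True:
            branch_id = self._queue.get()
            if branch_id is self._STOP:
                return
            try:
                self._mirror_branch(branch_id)
            except Exception:
                # A failed attempt must not kill the worker; a real
                # system would log it and let the cron script retry.
                pass
```

A single worker keeps mirroring attempts serialized and in arrival order. The cron script could remain in place as a fallback for attempts that fail.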
Some discussion on how to implement that can be found in bug 49989.
Changed in launchpad-bazaar:
importance: Untriaged → High
status: Unconfirmed → Confirmed
description: updated
description: updated
There's a workaround now in place that hopefully makes this much better. Let's see how it goes.