Merge detection times out during ref scans in repositories with deep history
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Launchpad itself |
Fix Released
|
High
|
Colin Watson | ||
turnip |
Fix Released
|
High
|
Colin Watson |
Bug Description
When there is a lot of references and there is an open merge-proposal against the repository the git_repository refs_collection may end up not matching the output of git ls-remote; some of the refs are absent.
===
apw
13:41
@cristiangsp hey ... i have a git repository for which a git ls-remote --refs shows we have 340 refs in the repository, but the repo.refs iterator for the launchpad API element only has 217.
apw
13:41
git://git.
13:42
i assume this is not to be expected.
cristiangsp
14:06
checking what exactly repo.refs is returning
apw
14:07
looking at the Ubuntu-4.15 prefix tags it seems to stop at -66.75. Whereas a git ls-remote on it they go up to -159.167.
apw
14:14
i will note that the git ls-remote output is 26K, if i truncate it at 16K it is 221 entries, really close to the 217 that is reported in launchpad.
cristiangsp
14:32
@cjwatson do you think this could be related with some issue with the GitRefScanJob?
cjwatson
14:32
I guess it must be but I don't know
cristiangsp
14:32
ok, checking logs
apw
14:32
i have not requested any kind of rescan, is that a thing?
cjwatson
14:32
You shouldn't need to
14:33
Hm, though https:/
14:33
So maybe try poking that before looking at anything else
apw
14:34
should i hit the rescan button, or is "poke" here "look into" and intended for cristian
cjwatson
14:35
Probably look into in case it's reliably timing out
14:35
It'll be in ackee's celeryd-
cristiangsp
14:35
yep already there
apw
14:36
i do know how to torture git.l.n. i am glad we don't have everything in the one repo, that is for sure.
cristiangsp
14:44
I have got the OOPS:
[2021-09-22 10:49:02,278: INFO/ForkPoolWo
[2021-09-22 10:49:02,280: INFO/ForkPoolWo
cjwatson
14:45
Looks like a timeout in merge detection
cristiangsp
14:45
https:/
14:45
is not loading :slightly_
cjwatson
14:45
It loads eventually, just very big and slow
cristiangsp
14:45
oh ok
cjwatson
14:46
Shows >31s in a /repo/17131/
14:47
We should probably start by finding out (somehow) how long that code actually takes if allowed to finish, and seeing whether we should be looking at optimizing it or bumping timeouts
cristiangsp
14:48
ok, will create an issue with all these details
cjwatson
14:49
Detecting merges is fairly tricky; the implementation goes to some effort to avoid having to scan the entire history, but maybe something is going wrong there
cristiangsp
14:50
how much this is blocking you @apw?
apw
14:51
it is currently blocking my processing, but i likely can switch back to the old code.
cjwatson
14:51
Should be possible to analyse it by figuring out what the MP in question is (I'm fairly sure it's https:/
apw
14:52
ironic as we don't use merges pretty much :slightly_
cjwatson
14:52
Right, but it still needs to check
14:53
If all else fails, either merging or rejecting that MP would likely fix it, but obviously it would be better for the infrastructure to cope
apw
14:54
i have turned off the git launchpad integration in swm, so it should fall back to git ls-remote for its data; so i can leave things as they are
cjwatson
14:55
It's also possible it might help to change the detect-merges API to send the previously-scanned head commits as explicit stop commits for the commit graph walk
14:55
Since it may in fact just be trying to walk back through the full history and that obviously won't work for Linux ...
apw
14:56
ok confirmed we are unblocked for now
cjwatson
14:58
If it can wait a few weeks, and if it is what I think it is, this could make a good new-starter task (not the first thing to give somebody, but a good intro to git hosting mechanisms)
cristiangsp
14:58
agreed
apw
15:34
its preventing me implementing an optimisation; but i can live with that for now
15:35
(LP git repositories have a date_last_changed) which i cannot see with git ls-remote.
Related branches
- Simone Pelosi: Approve
-
Diff: 196 lines (+61/-17)7 files modifiedlib/lp/code/interfaces/githosting.py (+3/-1)
lib/lp/code/interfaces/gitrepository.py (+4/-1)
lib/lp/code/model/githosting.py (+14/-4)
lib/lp/code/model/gitrepository.py (+2/-1)
lib/lp/code/model/tests/test_githosting.py (+13/-1)
lib/lp/code/model/tests/test_gitrepository.py (+17/-8)
lib/lp/code/subscribers/git.py (+8/-1)
- Guruprasad: Approve
-
Diff: 85 lines (+39/-4)3 files modifiedturnip/api/store.py (+5/-2)
turnip/api/tests/test_api.py (+30/-0)
turnip/api/views.py (+4/-2)
Changed in turnip: | |
status: | Triaged → In Progress |
assignee: | nobody → Colin Watson (cjwatson) |
Changed in launchpad: | |
status: | Triaged → In Progress |
assignee: | nobody → Colin Watson (cjwatson) |
Changed in turnip: | |
status: | In Progress → Fix Released |
Changed in launchpad: | |
status: | In Progress → Fix Released |
The number of references is probably not relevant, but the depth of history is.