TranslationBranchApprover uses 'walkdirs', will be very slow on large bzr trees
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Launchpad itself |
Triaged
|
High
|
Unassigned |
Bug Description
On 2009-03-27 on #launchpad-code
[13:38] <abentley> henninge: Also, I think you should not be calling to_tree.walkdirs in _init_translati
[13:38] <henninge> abentley: No, I need a full list.
[13:39] <abentley> henninge: Oh, why?
[13:39] <henninge> abentley: for the approver to work.
...
[13:44] <abentley> henninge: There are other ways, I'm not sure if they're easier or harder. Tree.iter_
[13:45] <abentley> henninge: But why does the approver want to know about unchanged files?
[13:47] <henninge> abentley: to get the full picture and to know which of the existing database entries have corresponding files in the tree.
[13:47] <henninge> abentley: to catch template files being moved or renamed.
[13:48] <abentley> henninge: Moved or renamed counts as "changed"
[13:49] <henninge> abentley: If done properly using "bzr move"
[13:49] <henninge> yeah, maybe we should expect people to do it properly...
[13:49] <abentley> henninge: If done improperly, it's a create and a delete, and those definitely count as changed.
[13:50] <henninge> abentley: will I see deleted files?
[13:50] <abentley> iter_changes will tell you about deleted files.
[13:51] <abentley> It will say they went from versioned to unversioned, and from kind='file' to kind=None
[13:51] <henninge> but I am only using the to_tree information ...
[13:51] <henninge> so those will not be treated by my code.
[13:51] <abentley> henninge: You are doing to_tree.
[13:52] <henninge> abentley: yes, those file_names I gathered during walktree
[13:52] <abentley> henninge: So stop doing that.
[13:52] <henninge> ;-)
[13:52] <abentley> Just do normal iter_changes.
[13:53] <abentley> henninge: Sorry to be pushy about this, but whole-tree operations are a scalability problem.
[13:55] <henninge> abentley: that changes the whole concept of the approver so I will have to think about it.
[13:55] <flacoste> salgado: but yeah, a box with jaunty is a good idea anyway
[13:55] <abentley> henninge: It's certainly something you can fix in a future release.
[13:56] <henninge> abentley: so the scalabilty issue is not imminent?
[13:57] <abentley> henninge: I have no idea. We'll see when it hits production.
[13:57] * henninge shudders.
[13:57] <abentley> You have a much better idea than I how often these jobs will run against how many branches.
[13:59] <henninge> abentley: well, the feature is new so I have the hope that it will not be used widely at first.
[13:59] <abentley> henninge: It must be explicitly enabled? We're fine, then.
[13:59] <henninge> abentley: but it is also a cool feature so people may jump to it...
[13:59] <henninge> abentley: oh yeah, it must!
Changed in rosetta: | |
importance: | Undecided → Medium |
status: | New → Triaged |
tags: | added: translations-branch |
If the approver cannot gather information about all template files in the source tree from bazaar it must get this information from the database. Two solution come to mind:
1. It assumes that each potemplate entry in the database has been uploaded from the tree at one point and therefore corresponds to a template file in the source tree. So this list plus added files that iter_changes reports make up the full list of template files. This implies that manual uploads must be prohibited when import from branches is active to avoid that the owner breaks the above assumption by uploading templates. This may be viewed by the user as a limitation.
2. We add a flag to potemplate to indicate its origin. The approver will only consider those templates that are flagged as having been imported from the branch. For new templates arriving in the branch it checks for an existing template of the same name without the flag and uses that (while setting the flag) instead of creating a new template. This way existing templates can be re-used if the project switches to import from branches. If the owner chooses to deactivate import from branches, all flags are removed.