Commit holds fulltext longer than it needs to
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Bazaar | Confirmed | Low | Unassigned | |
Bug Description
Pulled out of bug #566940, specifically comment #1: https:/
The detail is that the commit code passes the fulltext content down many layers of function calls. At the bottom, we turn the fulltext into a compressed text. At that point, we no longer need the fulltext. However, because Python frees objects by reference counting, the fulltext stays in memory: every frame above us still holds a reference to that content.
Some thoughts:
1) Wrap the content in an object. The intermediate frames will hold references to the outer object, which can be told to release its hold on the text itself.
2) Try to rework the APIs so that we can pass down something like "an iterable of content". This would allow us to use something like a File object, which we then iterate (though ideally we'd want something better than line-based iteration, like 64kB chunk iteration). In that case we could teach commit to track only the compressed content size.
It does complicate a few places, though, so we'd want to only use it carefully.
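As a rough sketch of option 2, the snippet below reads a file-like object in 64kB chunks and feeds them to an incremental compressor, so the caller never holds the whole fulltext at once. The names `iter_chunks` and `compress_chunks` are hypothetical, not bzrlib APIs, and `zlib` is only a stand-in for whatever compression the repository format actually uses.

```python
import io
import zlib

CHUNK_SIZE = 64 * 1024  # 64kB chunks rather than line-based iteration


def iter_chunks(fileobj, chunk_size=CHUNK_SIZE):
    """Yield successive byte chunks from a binary file-like object."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            return
        yield chunk


def compress_chunks(chunks):
    """Compress an iterable of byte chunks incrementally.

    Only one uncompressed chunk is alive at a time; the compressed
    pieces are what accumulate.
    """
    compressor = zlib.compressobj()
    pieces = []
    for chunk in chunks:
        piece = compressor.compress(chunk)
        if piece:
            pieces.append(piece)
    pieces.append(compressor.flush())
    return b"".join(pieces)


if __name__ == "__main__":
    source = io.BytesIO(b"some file content\n" * 50000)
    compressed = compress_chunks(iter_chunks(source))
    # Commit would only need to track this size, not the fulltext.
    print(len(compressed))
```

With this shape, the intermediate layers pass around an iterator, so no frame pins the full content. The cost is that anything needing the whole text (for example, a hash or a delta against a parent) has to be computed incrementally as well.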
The easier option is to define an object that we can pass down through the levels and tell when we are done with the fulltext. We *might* want it to have a way to get the text back, in case we need it.
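A minimal sketch of the wrapper idea, assuming nothing about the real commit code: `ReleasableContent`, `commit_file`, `add_to_repository` and `store` are all hypothetical names, `zlib` stands in for the real compressor, and the optional `reload` callback is one possible way to "get the text back".

```python
import zlib


class ReleasableContent:
    """Hold a fulltext that the innermost consumer can drop explicitly."""

    def __init__(self, text, reload=None):
        self._text = text
        # Optional zero-argument callable that re-reads the text on demand.
        self._reload = reload

    def get_text(self):
        if self._text is None:
            if self._reload is None:
                raise ValueError("fulltext was released and cannot be reloaded")
            self._text = self._reload()
        return self._text

    def release(self):
        """Drop our reference so the fulltext can be freed."""
        self._text = None


def store(content):
    # Bottom layer: compress, then release the fulltext immediately.
    compressed = zlib.compress(content.get_text())
    content.release()
    return compressed


def add_to_repository(content):
    # Intermediate frame: references the wrapper, not the text itself.
    return store(content)


def commit_file(content):
    return add_to_repository(content)


if __name__ == "__main__":
    content = ReleasableContent(b"line of text\n" * 100000)
    compressed = commit_file(content)
    print(len(compressed))
```

This only helps if no caller keeps its own direct reference to the text, so the text should be handed to the wrapper at the point it is read and accessed through `get_text()` afterwards.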
tags: | added: check-for-breezy |
This probably won't lower our specific peak memory (1 fulltext + 1 compressed text), but it can shorten the time during which we hold the full amount.