Cleanup action removes packages still in use

Bug #2046088 reported by Danny Cocks
12
This bug affects 2 people
Affects Status Importance Assigned to Milestone
charm-apt-mirror
Won't Fix
High
Robert Gildein

Bug Description

We found that the automatically-run cleanup action on a synchronise can remove packages that are still referenced by existing `Packages` lists. This was observed by deb files in all repositories except for "main" being removed, after running the synchronize action.

I believe it is due to [1] in the `locate_package_indicies` function. The line of code correctly iterates over all dists, but it then takes only the first `Packages*` file it can find, alphabetically. This just happens to be in the `main` repository, because `universe`, `multiverse`, `restricted` are all alphabetically larger.

[1] https://github.com/canonical/charm-apt-mirror/blob/23cd8fa1296d2ce557a39a00d4615760699827e4/src/utils.py#L86

On a related note, this destroyed the snapshots as well as the main mirror because they symlink to the pool directory. This was very surprising! I guess the symlink is to avoid the snapshots taking up unnecessary space. Can I suggest an alternative to use hardlinks for the deb files in the snapshots? This way an unexpected deletion wouldn't happen, and no unnecessary disk space would be used.

Tags: bseng-1865
Eric Chen (eric-chen)
tags: added: bseng-1865
Changed in charm-apt-mirror:
status: New → Triaged
importance: Undecided → Medium
importance: Medium → High
Revision history for this message
Paul Goins (vultaire) wrote :

I've also reviewed this. I agree with Danny's findings. The code was likely written the way it was because there are often Packages, Packages.bz2 and Packages.gz files (maybe even Packages.xz as well?), and these all have the same real content, thus it makes sense to only parse one copy. However, the code in question is not actually looking just at the packages files in the same immediate directory, but across multiple directories within the same mirrored repository subfolder - thus the code unfortunately ends up discarding far more than it should.

Revision history for this message
Paul Goins (vultaire) wrote :

I thought I had some code which could be used here, but upon further reflection my repository "health check" code actually ends up parsing the compressed Packages files as well as the uncompressed ones - thus it's not really a good reference for correcting the broken logic here.

I think the key thing is: for each directory which contains package files, at least one Package file should be parsed. Rather than searching for all Packages files within a “dists” folder, it’s necessary to recurse to the directories within the dists folder where the Packages files are directly contained.

Or, alternatively (and perhaps a better long term solution): when we create a snapshot, rather than the current method of symlinking the “pool” subdirs and copying everytihing else, it’s probably better to do a “cp -r --link”-style copy of the pool.. That is, for files within the pool directory, create hard links rather than normal copies. This has the benefit of cleanup being implicit via the filesystem - the hard-linked “copies” don’t take up any extra disk space beyond the little that the hard links themselves consume in the filesystem, and snapshot removal simply becomes a “rm -rf” of the snapshot directory.

…The above all stated, the first method is probably the simpler one to do with the current code.

Changed in charm-apt-mirror:
status: Triaged → In Progress
assignee: nobody → Robert Gildein (rgildein)
Revision history for this message
Robert Gildein (rgildein) wrote :
Changed in charm-apt-mirror:
status: In Progress → Fix Committed
Revision history for this message
Robert Gildein (rgildein) wrote :

The fix was merger in [1].

There is also proposal [2] to use hardlinks instead of symlinks in snapshots. However,
Paul has good concern that those changes could break snapshots done by previous charm
version. I did not test it properly, but I think it should be fine, since the change
is only in way how new snapshots are done and not how packages are cleaned.

I believe that [2] can be merged, but I would like if someone can test or provide
real example how I can test it.

---
[1]: https://github.com/canonical/charm-apt-mirror/pull/28
[2]: https://github.com/canonical/charm-apt-mirror/pull/29

Revision history for this message
Eric Chen (eric-chen) wrote :
Changed in charm-apt-mirror:
status: Fix Committed → Won't Fix
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.