expire_archive_files.py doesn't expire enough cruft

Bug #844945 reported by Julian Edwards on 2011-09-08
This bug affects 2 people
Affects: Launchpad itself
Importance: High
Assigned to: William Grant

Bug Description

expire_archive_files.py currently expires files only for sources and binaries that were superseded or deleted. It should also expire files for obsoleted publications, at least for Ubuntu, since those files are moved to old-releases.ubuntu.com.

In addition, we need to expire files for rejected PackageUploads.

(Any more, William?)

Changed in launchpad:
status: New → Triaged
importance: Undecided → High
tags: added: tech-debt
William Grant (wgrant) wrote :

Here are notes I had:

 * Find all unexpired BPRs.
 * Remove any where datecreated > (now - stay of execution)
 * Remove any from a build which is referenced by a PackageUpload that
   is not DONE or REJECTED.
 * Remove any with publications where any of these hold:
   + dateremoved is NULL
   + dateremoved > (now - stay of execution)
   + archive is private
   + archive is a PPA and blacklisted
   + archive is primary and any of these hold:
     - series is not expirable
       (expirable series are currently: warty, hoary, breezy, dapper,
       edgy, feisty, gutsy, intrepid, jaunty)
     - status is PUBLISHED or OBSOLETE (the final packages in a
       release are kept forever)
 * Expire all BPRs that remain.
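
The notes above can be sketched as a filter over in-memory records. This is only an illustration under stated assumptions, not Launchpad's actual code or schema: the `Publication` and `BinaryRelease` classes, their field names, and the `find_expirable` function are all hypothetical, and the length of the stay of execution is left as a parameter because the notes do not give it.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional

# Series named in the notes as currently expirable.
EXPIRABLE_SERIES = frozenset({
    "warty", "hoary", "breezy", "dapper", "edgy",
    "feisty", "gutsy", "intrepid", "jaunty",
})


@dataclass
class Publication:
    """Hypothetical stand-in for one publication of a BPR."""
    dateremoved: Optional[datetime]
    status: str            # e.g. "SUPERSEDED", "DELETED", "PUBLISHED", "OBSOLETE"
    series: str
    archive_purpose: str   # e.g. "PRIMARY", "PPA", "COPY"
    archive_private: bool = False
    archive_blacklisted: bool = False


@dataclass
class BinaryRelease:
    """Hypothetical stand-in for an unexpired BPR."""
    name: str
    datecreated: datetime
    # Statuses of the PackageUploads that reference this BPR's build.
    upload_statuses: List[str] = field(default_factory=list)
    publications: List[Publication] = field(default_factory=list)


def publication_blocks_expiry(pub: Publication, cutoff: datetime) -> bool:
    """Return True if this publication keeps its BPR from expiring."""
    if pub.dateremoved is None or pub.dateremoved > cutoff:
        return True
    if pub.archive_private:
        return True
    if pub.archive_purpose == "PPA" and pub.archive_blacklisted:
        return True
    if pub.archive_purpose == "PRIMARY":
        if pub.series not in EXPIRABLE_SERIES:
            return True
        # The final packages in a release are kept forever.
        if pub.status in ("PUBLISHED", "OBSOLETE"):
            return True
    return False


def find_expirable(bprs, now: datetime, stay_of_execution: timedelta):
    """Apply each removal rule in turn; whatever remains may be expired."""
    cutoff = now - stay_of_execution
    remaining = []
    for bpr in bprs:
        if bpr.datecreated > cutoff:
            continue  # created too recently
        if any(s not in ("DONE", "REJECTED") for s in bpr.upload_statuses):
            continue  # an upload is still in flight
        if any(publication_blocks_expiry(p, cutoff) for p in bpr.publications):
            continue  # at least one publication still needs the files
        remaining.append(bpr)
    return remaining
```

A real implementation would presumably express these rules as database queries rather than looping in Python, but the order of the checks would be the same.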

What about copy archives? They are not published, so their
PackageUploads never leave ACCEPTED. We probably want a separate thing
to reject them, making them eligible for expiration?

William Grant (wgrant) wrote :

Note that I would be extremely averse to extending the current expire_archive_files approach: it misses stuff, and catches stuff that it shouldn't. We should rewrite it along the lines of librarian-gc.

On Friday 09 September 2011 00:34:56 you wrote:
> What about copy archives? They are not published, so their
> PackageUploads never leave ACCEPTED. We probably want a separate thing
> to reject them, making them eligible for expiration?

We need something separate for those. As you noted, we need a GC approach
that runs continuously for main archives and PPAs. We can't do that for
COPY archives.

William Grant (wgrant) wrote :

So, we can just implicitly blacklist all copy archives for now, if we want. Remember that there are still two barriers to their expiration anyway: firstly, the PackageUploads are ACCEPTED, so nothing can be expired unless and until the archive is published. And even if it is published, dateremoved IS NULL until the contents of the archive are deleted.

So we don't have to blacklist. We could just leave them under the !primary rules, and provide a script to reject all the uploads when we want them to be expired.
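
A rough sketch of what such a reject script might do, assuming a hypothetical in-memory `Upload` record (the class, its fields, and `reject_copy_archive_uploads` are illustrative names, not Launchpad APIs):

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Upload:
    """Hypothetical stand-in for a PackageUpload."""
    archive_name: str
    archive_purpose: str  # e.g. "PRIMARY", "PPA", "COPY"
    status: str           # e.g. "ACCEPTED", "DONE", "REJECTED"


def reject_copy_archive_uploads(uploads: Iterable[Upload], archive_name: str) -> int:
    """Mark every ACCEPTED upload in the named copy archive as REJECTED.

    Returns the number of uploads changed.
    """
    rejected = 0
    for upload in uploads:
        if (upload.archive_purpose == "COPY"
                and upload.archive_name == archive_name
                and upload.status == "ACCEPTED"):
            upload.status = "REJECTED"
            rejected += 1
    return rejected
```

This only clears the first barrier. Any publications in the archive would still need dateremoved set before the files could be expired.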

William Grant (wgrant) on 2013-02-11
Changed in launchpad:
assignee: nobody → William Grant (wgrant)
status: Triaged → In Progress