retracer requires a lot of storage for instance sandboxes

Bug #1295400 reported by Brian Murray on 2014-03-20
This bug affects 1 person
Affects: Daisy
Importance: High
Assigned to: Unassigned

Bug Description

I'm fairly certain that the cleanup of caches used by the retracers is never done. In setup_cache of retracer.py we can see the following:

        instance_sandbox = tempfile.mkdtemp(prefix='cache-', dir=sandbox_release)
        atexit.register(shutil.rmtree, instance_sandbox)

So when retracer.py exits, the instance_sandbox directory should be deleted. However, this isn't happening and the sandbox directories are growing quite large. When running the retracer manually and pressing Ctrl-C to interrupt it, we see "Shutting down." (also logged via atexit.register) in the retracer log file corresponding to the architecture.

However, when running the retracer via the upstart job there is no such "Shutting down." log message, leading me to believe that the atexit handlers never run, and that this is why the cleanup of caches is not being done regularly.
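One possible explanation for the difference: Python only runs atexit handlers on a normal interpreter exit. Ctrl-C arrives as SIGINT, which Python turns into KeyboardInterrupt, so the handlers run. Stopping an upstart job sends SIGTERM by default, and if the process has no handler for it the interpreter dies without calling anything registered with atexit. The following is a minimal sketch of turning SIGTERM into a normal exit. It assumes retracer.py does not already install its own SIGTERM handler, and the function names are hypothetical, not taken from retracer.py.

```python
import atexit
import shutil
import signal
import sys
import tempfile


def setup_instance_sandbox(parent_dir=None):
    """Create a temporary sandbox and schedule its removal at exit.

    Hypothetical helper modelled on the two lines quoted from setup_cache.
    """
    sandbox = tempfile.mkdtemp(prefix='cache-', dir=parent_dir)
    # ignore_errors avoids a traceback if the directory is already gone.
    atexit.register(shutil.rmtree, sandbox, ignore_errors=True)
    return sandbox


def exit_on_sigterm(signum, frame):
    # sys.exit raises SystemExit, which unwinds normally and lets the
    # atexit handlers run. The default SIGTERM action would skip them.
    sys.exit(0)


def install_sigterm_handler():
    """Route SIGTERM through exit_on_sigterm (call once at startup)."""
    signal.signal(signal.SIGTERM, exit_on_sigterm)
```

Calling install_sigterm_handler() early in the main program would be enough for the registered cleanup to run on a plain "stop" of the job, provided the cleanup finishes before the kill timeout expires. Nothing can be done in-process about SIGKILL.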

Changed in daisy:
importance: Undecided → High
Brian Murray (brian-murray) wrote :

Actually, it seems more likely that the retracers are taking too long to exit and so they get killed. From the Upstart cookbook:

Upstart waits for up to kill timeout seconds (default 5 seconds) for the process to end.

If the process is still running after the timeout, a SIGKILL signal is sent to the process which cannot be ignored and will forcibly stop the processes in the process group.

So perhaps we should increase the kill timeout for the upstart jobs.

Brian Murray (brian-murray) wrote :

In addition to increasing the kill timeout, I also think the instance_sandbox directories should be removed whenever the retracing process exits, so that disk space needs are kept to a minimum.

Brian Murray (brian-murray) wrote :

I think I misread a bit of code or did not notice that setup_cache will return if a sandbox has already been set up for the release. So any one retracer process will have one sandbox per release; it's just that those grow and grow as newer and different packages are downloaded.

summary: - cleanup of instance sandboxes is never done
+ retracer requires a lot of storage or instance sandboxes
Brian Murray (brian-murray) wrote :

Speaking to David Ames about this, he thinks the sandbox dir in $release-name/cache-$tmpdir/ is consuming the most space, so I've submitted a merge proposal that removes it after every retrace. We'll keep the cache dir in there, though, as it contains downloaded deb files which are useful for other retrace attempts.
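For illustration, the per-retrace cleanup could look roughly like the sketch below. This is not the code from the merge proposal. It assumes the two subdirectories of the instance directory are literally named "sandbox" and "cache", and clean_sandbox is a hypothetical name.

```python
import os
import shutil


def clean_sandbox(instance_dir):
    """Empty the unpacked sandbox tree but leave the deb cache alone.

    Assumed layout (not confirmed by the bug report):
        instance_dir/sandbox/  unpacked packages, removed here
        instance_dir/cache/    downloaded .deb files, kept
    """
    sandbox = os.path.join(instance_dir, 'sandbox')
    # Remove everything that was unpacked for the previous retrace.
    shutil.rmtree(sandbox, ignore_errors=True)
    # Recreate an empty directory so the next retrace can unpack into it.
    os.makedirs(sandbox, exist_ok=True)
    return sandbox
```

Calling this in a try/finally around each retrace would keep the sandbox small even when a retrace fails partway through.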

However, for Trusty it'd make sense to clean those up regularly as new packages are created frequently.
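If "those" is read as the cached deb files, a periodic cleanup could be as simple as deleting files that have not been modified for some number of days. The sketch below is only one way to do it. The function name, the seven-day default, and the use of modification time as the age criterion are all assumptions.

```python
import os
import time


def prune_old_debs(cache_dir, max_age_days=7, now=None):
    """Delete .deb files under cache_dir older than max_age_days.

    Age is judged by modification time. Returns the removed paths.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    removed = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            if name.endswith('.deb') and os.path.getmtime(path) < cutoff:
                os.remove(path)
                removed.append(path)
    return removed
```

This could be run from cron or between retraces for the development release, where packages are superseded quickly.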

summary: - retracer requires a lot of storage or instance sandboxes
+ retracer requires a lot of storage for instance sandboxes