apt-cacher has poor childpid management

Bug #882874 reported by Craig Miskell on 2011-10-28
This bug affects 1 person
Affects: apt-cacher (Ubuntu)
Importance: Undecided
Assigned to: Unassigned

Bug Description

apt-cacher keeps a list of the PIDs of every child it forks or spawns, and at shutdown it sends SIGTERM to all of them. Unfortunately, it doesn't catch SIGCHLD, so children that exit normally are never removed from the list. After it has been running for a while, the list becomes long and rather comprehensive: on one server where I recently noticed this, it held 25000 unique PIDs, i.e. about three quarters of the available PID space :). Because the kernel reuses PIDs, many of those entries may by then refer to unrelated processes, which would be sent SIGTERM at shutdown.

The only mitigating factor is that apt-cacher runs as www-data by default, so it can only signal processes owned by that user. That doesn't help if you're running it on the same server as apache2, nginx, or some other process that runs as www-data.

The attached patch fixes this: it uses a hash instead of an array, and traps SIGCHLD to catch when children finish. It seems to work OK after running here for a short time.
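For readers who want to see the idea in isolation, here is a minimal sketch of the same approach in Python rather than Perl. It is not the attached patch: the names `children`, `spawn`, `reap_children` and `shutdown` are made up for illustration. It keeps PIDs in a dict (the analogue of a Perl hash), removes each entry when SIGCHLD reports that the child has exited, and blocks SIGCHLD around the fork so a fast-exiting child cannot be reaped before it is recorded.

```python
import os
import signal
import time

# Hypothetical table of live children: pid -> True (analogous to a Perl hash).
children = {}


def reap_children(signum=None, frame=None):
    """SIGCHLD handler: reap every exited child and drop it from the table."""
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        children.pop(pid, None)


def spawn(work):
    """Fork a child that runs work(), and record its PID in the table."""
    # Block SIGCHLD so the handler cannot run before the PID is recorded.
    signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})
    try:
        pid = os.fork()
        if pid == 0:
            # Child: restore default signal handling, do the work, exit.
            signal.signal(signal.SIGCHLD, signal.SIG_DFL)
            signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGCHLD})
            try:
                work()
            finally:
                os._exit(0)
        children[pid] = True
    finally:
        signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGCHLD})
    return pid


def shutdown():
    """Send SIGTERM only to children that are still in the table."""
    for pid in list(children):
        try:
            os.kill(pid, signal.SIGTERM)
        except ProcessLookupError:
            pass  # already gone


if __name__ == "__main__":
    signal.signal(signal.SIGCHLD, reap_children)
    for _ in range(3):
        spawn(lambda: None)  # children that exit immediately
    deadline = time.monotonic() + 5
    while children and time.monotonic() < deadline:
        time.sleep(0.05)
    print("tracked children after they exit:", len(children))
```

With this bookkeeping the table only ever holds PIDs of children that have not yet been reaped, so the shutdown loop cannot hit a PID that has since been reused by another process.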

Craig Miskell (3-crjig-7) wrote :

The attachment "Patch to fix the problem" of this bug report has been identified as being a patch. The ubuntu-reviewers team has been subscribed to the bug report so that they can review the patch. If this is in fact not a patch, you can resolve the situation by removing the tag 'patch' from the bug report and editing the attachment so that it is not flagged as a patch. Additionally, if you are a member of ubuntu-sponsors, please also unsubscribe the team from this bug report.

[This is an automated message performed by a Launchpad user owned by Brian Murray. Please contact him regarding any issues with the action taken in this bug report.]

tags: added: patch
Craig Miskell (3-crjig-7) wrote :

Slightly modified patch attached; deleting the childPid entry caused a rather odd Perl panic:

Sun Jan 1 19:11:13 2012|error [24236]: panic: attempt to copy value 1 to a freed scalar 107bfa8 at /usr/sbin/apt-cacher line 1912.

This looks like some sort of child/parent race condition that is well beyond my ken. Setting the value to 0 instead of deleting the entry, and checking for "1" in the kill loop, seems to avoid the issue.
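The "flag instead of delete" bookkeeping can be sketched as follows, again in Python and not as the actual modified patch. The names `child_state`, `RUNNING`, `FINISHED`, `spawn`, `mark_finished` and `terminate_running` are hypothetical. The panic itself is specific to Perl and is not reproduced here; the sketch only shows a handler that changes values without ever removing keys, and a kill loop that skips anything not flagged 1.

```python
import os
import signal
import time

RUNNING, FINISHED = 1, 0

# Hypothetical table: pid -> RUNNING (1) or FINISHED (0).
child_state = {}


def mark_finished(signum=None, frame=None):
    """SIGCHLD handler: flag exited children as FINISHED, never delete keys."""
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return  # no children at all
        if pid == 0:
            return  # children exist, but none has exited yet
        if pid in child_state:
            child_state[pid] = FINISHED


def spawn(work):
    """Fork a child that runs work(), and record it as RUNNING."""
    # Block SIGCHLD so the handler cannot run before the PID is recorded.
    signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})
    try:
        pid = os.fork()
        if pid == 0:
            signal.signal(signal.SIGCHLD, signal.SIG_DFL)
            signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGCHLD})
            try:
                work()
            finally:
                os._exit(0)
        child_state[pid] = RUNNING
    finally:
        signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGCHLD})
    return pid


def terminate_running():
    """Send SIGTERM only to children flagged RUNNING; return their PIDs."""
    signalled = []
    for pid, state in list(child_state.items()):
        if state != RUNNING:
            continue  # finished children are skipped, so reused PIDs are safe
        try:
            os.kill(pid, signal.SIGTERM)
            signalled.append(pid)
        except ProcessLookupError:
            pass  # already gone
    return signalled


if __name__ == "__main__":
    signal.signal(signal.SIGCHLD, mark_finished)
    spawn(lambda: None)
    time.sleep(0.5)
    print("states:", list(child_state.values()))
```

One trade-off of this variant is that finished entries are never pruned, so the table still grows over time; what it avoids is sending SIGTERM to PIDs whose children have already exited.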
