fuser forking uncontrollably in cron job

Bug #876387 reported by mrhyde
This bug affects 100 people
Affects            Status        Importance  Assigned to  Milestone
php5 (Ubuntu)      Invalid       Undecided   Unassigned   -
  Oneiric          Invalid       Undecided   Unassigned   -
psmisc (Debian)    Fix Released  Unknown     -            -
psmisc (Ubuntu)    Fix Released  Undecided   Unassigned   -
  Oneiric          Invalid       High        Unassigned   -

Bug Description

The php5 cron job creates several thousand fuser zombie processes, which consume system resources and cause other applications to crash (due to lack of system resources).

ProblemType: Bug
DistroRelease: Ubuntu 11.10
Package: php5-common 5.3.6-13ubuntu3.1
ProcVersionSignature: Ubuntu 3.0.0-12.20-generic 3.0.4
Uname: Linux 3.0.0-12-generic x86_64
NonfreeKernelModules: nvidia wl
ApportVersion: 1.23-0ubuntu3
Architecture: amd64
Date: Mon Oct 17 13:10:50 2011
InstallationMedia: Ubuntu 9.10 "Karmic Koala" - Release amd64 (20091027)
SourcePackage: php5
UpgradeStatus: Upgraded to oneiric on 2011-10-14 (2 days ago)
modified.conffile..etc.cron.d.php5: [modified]
mtime.conffile..etc.cron.d.php5: 2011-10-17T13:06:40.584452

mrhyde (dczech) wrote :
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in php5 (Ubuntu):
status: New → Confirmed
Fabio Alessandrelli (mckenzie) wrote :

I can confirm this bug.
A workaround (NOT for production servers) is to disable the session-cleaning cron job (/etc/cron.d/php5), as reported here: http://ubuntuforums.org/showthread.php?p=11355965

Ubuntu version: 11.10
Upgraded from: 11.04

Graham Poulter (grahampo) wrote :

My workaround is to restore the php5 cron job from 11.04, which does not call fuser:

This is the 11.10 cron job:

09,39 * * * * root [ -x /usr/lib/php5/maxlifetime ] && [ -d /var/lib/php5 ] && find /var/lib/php5/ -depth -mindepth 1 -maxdepth 1 -type f -cmin +$(/usr/lib/php5/maxlifetime) ! -execdir fuser -s {} 2>/dev/null \; -delete

And this is the 11.04 cron job:

09,39 * * * * root [ -x /usr/lib/php5/maxlifetime ] && [ -d /var/lib/php5 ] && find /var/lib/php5/ -depth -mindepth 1 -maxdepth 1 -type f -cmin +$(/usr/lib/php5/maxlifetime) -delete

We think fuser was added to cater for some edge case of process not closing the session file, but was never tested with a large number of sessions.

I posted this solution yesterday on the thread: http://ubuntuforums.org/showthread.php?p=11355965
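
As a sketch of the edit described above: the only difference between the two cron lines is the `! -execdir fuser -s {} 2>/dev/null \;` test, so it can be stripped mechanically. The helper name strip_fuser_check is hypothetical, and the pattern assumes the entry has exactly the 11.10 form quoted in this comment, so check the result by hand before installing it.

```shell
#!/bin/sh
# Hypothetical helper (not part of any package): read a cron line on stdin
# and drop the "! -execdir fuser ... \;" test, leaving everything else as is.
# Assumes the exact spacing of the 11.10 entry quoted above.
strip_fuser_check() {
    sed 's| ! -execdir fuser -s {} 2>/dev/null \\;||'
}

# Preview the change without touching /etc/cron.d/php5.
if [ -r /etc/cron.d/php5 ]; then
    strip_fuser_check < /etc/cron.d/php5
fi
```

This only prints the modified entry; nothing is written back, so the original file stays in place until you replace it yourself.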

Ondřej Surý (ondrej) wrote :

The fuser check was added to avoid deleting session files that are still active (e.g. created a long time ago, but still in use by a php5 process).

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in psmisc (Ubuntu):
status: New → Confirmed
Leo Unglaub (leo-unglaub) wrote :

It may be relevant that I only have this problem when the php-apc package is installed:
ii php-apc 3.1.7-1 APC (Alternative PHP Cache) module for PHP 5

Greetings
Leo

peterh (peter-holik) wrote :

This little one liner stopped zombie flooding for me:

--- fuser.c.orig 2011-10-28 10:50:49.000000000 +0200
+++ fuser.c 2011-10-28 10:50:49.000000000 +0200
@@ -1820,6 +1820,7 @@
                (void) alarm(0);
                (void) signal(SIGALRM, SIG_DFL);
                close(pipes[0]);
+               waitpid(pid, NULL, 0);
                break;
        }
        return ret;

I also reported this upstream: https://sourceforge.net/tracker/?func=detail&atid=115273&aid=3429674&group_id=15273

cu Peter

mama21mama (mama21mama2000) wrote :
klemens_u (klemens) wrote :

I've got the same problem. Reverted to the Ubuntu 10.04 setting as
suggested above.

Bernhard Goetze (wst-lordfish) wrote :

Same here with Ubuntu 11.10 x64

The bug is not active all the time, but about every 30(?) there are lots of zombie processes and fuser shows 9999% CPU. I also never have more than 1 GB of RAM free out of 16 GB, and with that comes the message: Cannot allocate memory

This can't be normal, can it? The white gaps are where munin hit "Cannot allocate memory":
http://www.imagebanana.com/view/5fb8v6fs/memoryday.png

Mikael Nordfeldth (mmn) wrote :

Ubuntu 11.10 with php-cgi (php5) and libapache2-mod-fcgid causes this; however, I only seem to have the problem under heavy load, or at least with rapid page loading. Is it that after a certain number of page loads the cleanup process has too much to clean up?

I only noticed there was a problem when some user was mirroring a php-site.

Workaround with removing the `fuser` part in /etc/cron.d/php5 seems to be working so far.

Would perhaps running the cleanup cron-job more often make the impact smaller?

Rthaduthd Anthnhkrc (nthnuekeu-deactivatedaccount) wrote :

I'm also seeing this using php-cgi with lighttpd as a front-end (reverse proxy)...

Changed in psmisc (Debian):
status: Unknown → New
Akber Choudhry (akberc) wrote :

Did not happen on a fresh 32-bit small EC2 instance.

Happened after an upgrade from 11.04 to 11.10 on a micro EC2 instance.

Thomas Tanghus (tanghus) wrote :

It seems like some 'update' modified my modified cronjob :-/ Just had it happen again.

Gergely Csépány (cheoppy) wrote :

It happened for me today for the first time.
What kind of update might have taken place? I did not find any package update concerning php in the last few days.

Thomas Tanghus (tanghus) wrote :

I had an update today, but I must admit that I didn't notice what it was :-P

Gergely Csépány (cheoppy) wrote :

The update process is usually logged in
  /var/log/apt/history.log
or in
  /var/log/unattended-upgrades/unattended-upgrades.log
if you have automatic updates enabled. You might find the affected packages there.
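
To narrow this down, here is a small sketch that lists the apt transactions touching a given package. The function name apt_history_for is made up for illustration, and it assumes a history.log layout with `Start-Date:` lines followed by `Install:` or `Upgrade:` lines containing `name:arch` entries; other apt versions may log differently.

```shell
#!/bin/sh
# Hypothetical helper: print the Start-Date line of every apt transaction in
# a history log whose Install:/Upgrade: line names the package as "pkg:arch".
# usage: apt_history_for PACKAGE [LOGFILE]
apt_history_for() {
    awk -v pkg="$1" '
        /^Start-Date:/ { start = $0 }
        /^(Install|Upgrade):/ && index($0, " " pkg ":") { print start }
    ' "${2:-/var/log/apt/history.log}"
}

# Example: when were psmisc or php5-common last changed on this machine?
if [ -r /var/log/apt/history.log ]; then
    apt_history_for psmisc
    apt_history_for php5-common
fi
```

No output means no logged transaction mentioned the package in that form, which is a hint (not proof) that an update did not rewrite the cron job.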

Thomas Tanghus (tanghus) wrote :

Ok, that can't be the culprit:

Start-Date: 2011-12-13 12:03:11
Upgrade: kde-zeroconf:amd64 (4.7.3-0ubuntu0.1~ppa1, 4.7.3-0ubuntu0.1), kppp:amd64 (4.7.3-0ubuntu0.1~ppa1, 4.7.3-0ubuntu0.1), libkopete4:amd64 (4.7.3-0ubuntu0.1~ppa1, 4.7.3-0ubuntu0.1), kopete:amd64 (4.7.3-0ubuntu0.1~ppa1, 4.7.3-0ubuntu0.1), kdenetwork-filesharing:amd64 (4.7.3-0ubuntu0.1~ppa1, 4.7.3-0ubuntu0.1), libmsn0.3:amd64 (4.1-2ubuntu1.1, 4.1-2ubuntu1.2)

Thanks for the hint about history.log :-)

Gergely Csépány (cheoppy) wrote :

@tanghus I think it was just a mere coincidence or something much more complicated could have caused this.

I applied the workaround from comment #4 and so far I had no problems with it.

Benpro (benpro82) wrote :

The bug in action, graphed by munin! The value is quite high ...

gregg (greggatghc) wrote :

I am having this same problem. fuser has 23898 defunct processes now on my server. 11.10 x64. nginx, php5-fpm, php-apc.

Simon Hirscher (codethief) wrote :

I can confirm the bug on Ubuntu 11.10 x64, lighttpd, php5-fpm (without php-apc).
Found this older Debian bug via Google: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=633100 . Maybe it provides anybody with more information on how to hunt this bug down?

sergey (ph4nt0m055) wrote :

Confirm. Ubuntu 11.10 x64, apache2, php5.3.6-13ub

Simon Hirscher (codethief) wrote :

Don't know if it's related, but since tampering with /etc/cron.d/php5 (see the workaround Graham described) I (as root) regularly receive the following mail:

Subject: "Cron <root@lvps176-28-19-116> test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily ) (failed)"
Message:
"Unable to run /etc/cron.daily/standard because lockfile /var/lock/cron.daily
acquisition failed. This probably means that the previous day's
instance is still running. Please check and correct if necessary.

lockfile creation failed: cannot create temporary lockfile
run-parts: /etc/cron.daily/standard exited with return code 1"

Taking a look at /var brought:

root@lvps:/home/simon# ls -l /var | grep lock
lrwxrwxrwx 1 root root 9 2011-12-27 22:27 lock -> /run/lock

…whereas /run/lock didn't even exist. So I (re-)created the directory. Let's see whether this fixes it.

Can anyone confirm this?

Vladimir Rutsky (rutsky) wrote :

This bug affects my system in the following way: I have the "cameramonitor" package installed, which periodically calls `fuser /dev/video0`, and I have the "nproc" limit set to 600 in /etc/security/limits.conf. As a result, my process/thread quota is periodically exhausted, which leads to faults in programs that try to create a thread at that moment.

The most noticeable problem is with Pidgin: it quite often hangs in pthread_cond_wait() with a stack trace similar to the one in https://bugzilla.gnome.org/show_bug.cgi?id=666957, apparently because gstreamer fails to fork.

Peter's patch helped, thanks!

MonsieurApple (monsieurapple) wrote :

Confirmed.

Switching to the Ubuntu 11.04 cron script has fixed this problem for me!

Ondřej Surý (ondrej) wrote :

Could you try installing psmisc (just psmisc, there's also latest php5) from https://launchpad.net/~ondrej/+archive/php5? This should fix the fork()ing of fuser.

Changed in psmisc (Debian):
status: New → Fix Released
chrone (chrone81) wrote :

psmisc has been installed, and I noticed there was an update to the php5 package this morning, but fuser still sometimes shows 9999% CPU in top.

I also use the php-apc package.

chrone (chrone81) wrote :

There are 18596 zombie processes. :(

Ondřej Surý (ondrej) wrote :

Could you please do:

apt-cache policy psmisc

Just to be sure you are running correct version?

Paul Beaudoin (nonword) wrote :

Thank you, graham-poulter. This solved our issue. Note to other users affected: If `ps ax | grep fuser | wc` still shows lots of zombie fuser procs: after modifying the cron.d entry, you may need to kill the runaway find processes. In our case we had three overlapping finds running, which caused the number of fuser procs to rise and fall between 5000 and 16,000 long after fully disabling the cron.d entry.
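
A slightly more precise count than `ps ax | grep fuser | wc`, which also counts the grep itself and any fuser processes that are still running. This is only a sketch: count_fuser_zombies is a made-up name, and it assumes procps-style `ps -eo stat=,comm=` output in which zombies have a state starting with Z.

```shell
#!/bin/sh
# Hypothetical helper: read "STAT COMMAND" lines on stdin and count the
# entries that are zombies (state starts with Z) and whose command is fuser.
count_fuser_zombies() {
    awk '$1 ~ /^Z/ && $2 == "fuser" { n++ } END { print n + 0 }'
}

# Count defunct fuser processes on the running system.
if command -v ps >/dev/null 2>&1; then
    ps -eo stat=,comm= | count_fuser_zombies
fi

# Leftover find processes from earlier cron runs can be listed with e.g.
#   pgrep -f 'find /var/lib/php5/'
# and killed once you are sure they belong to the cron job.
```

If the count keeps changing after the cron entry has been edited, that points to find processes from earlier runs that are still working through the session directory.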

chrone (chrone81) wrote :

Dear Ondrej,

apt-cache policy psmisc
psmisc:
  Installed: 22.14-1
  Candidate: 22.14-1
  Version table:
 *** 22.14-1 0
        500 http://127.0.0.1/ubuntu/ oneiric/main amd64 Packages
        100 /var/lib/dpkg/status

This morning I again found too many zombie processes, with fuser processes all over the place taking 9999% CPU. At that time apache2 apparently could not parse index.php and instead made the directory listing viewable to the public. I then restarted the apache2 service and index.php worked again.

Ondřej Surý (ondrej) wrote :

chrone, the PHP5 PPA has psmisc 22.15-2~<dist>+1, which is the version I asked you to test. Hence your testing just confirms what we already know. The bug was fixed in psmisc 22.15.
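
To check which side of that version boundary a machine is on, something like the following can be used. It is a sketch only: psmisc_has_fix is a hypothetical name, and the 22.15 threshold is taken from the comment above rather than verified independently.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given Debian version string is >= 22.15,
# the psmisc release said in this thread to contain the fuser fix.
psmisc_has_fix() {
    dpkg --compare-versions "$1" ge 22.15
}

# Report on the installed package (Debian/Ubuntu only).
if command -v dpkg-query >/dev/null 2>&1; then
    installed=$(dpkg-query -W -f='${Version}' psmisc 2>/dev/null || true)
    if [ -z "$installed" ]; then
        echo "psmisc does not appear to be installed"
    elif psmisc_has_fix "$installed"; then
        echo "psmisc $installed is at or above 22.15"
    else
        echo "psmisc $installed is older than 22.15"
    fi
fi
```

Using `dpkg --compare-versions` rather than a plain string comparison matters here because PPA versions such as 22.15-2~<dist>+1 contain `~`, which Debian version ordering treats specially.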

2GooD (david+launchpad) wrote :
David Jung (djung) wrote :

Installing psmisc from Ondřej's PPA fixed the issue for us (without editing the php5 cron job). Thanks Ondřej!

Clint Byrum (clint-fewbar) wrote :

From the Debian bug discussion, this appears to be a problem with psmisc and not really fixable in PHP5. So I'm going to mark this as Invalid for PHP5, and Triaged for psmisc. It may be worth bringing in 22.15's fuser into 11.10 to fix this, as I can't imagine php is the only thing that regularly makes use of fuser.

Changed in php5 (Ubuntu):
status: Confirmed → Invalid
Changed in php5 (Ubuntu Oneiric):
status: New → Invalid
Changed in psmisc (Ubuntu):
status: Confirmed → Triaged
status: Triaged → Fix Released
Changed in psmisc (Ubuntu Oneiric):
status: New → Triaged
importance: Undecided → High
Hamed Madani (madani)
Changed in php5 (Ubuntu):
status: Invalid → Incomplete
status: Incomplete → Invalid
Chris (cnizzardini) wrote :

Yes confirmed on Ubuntu 11.10 running 3.0.0-17-generic kernel and PHP 5.3.6-13ubuntu3.6 with Suhosin-Patch (cli) (built: Feb 11 2012 03:26:01).

Output from top:
  PID USER  PR NI  VIRT RES SHR S %CPU %MEM   TIME+ COMMAND
29601 root  20  0     0   0   0 Z 9999  0.0 0:00.00 fuser <defunct>
 5268 root  20  0 11252 856 680 R   65  0.0 0:02.08 fuser
12587 root  20  0 11252 860 684 R   56  0.0 0:01.80 fuser

Ondřej Surý (ondrej) wrote :

> Yes confirmed on Ubuntu 11.10 running 3.0.0-17-generic kernel and PHP 5.3.6-13ubuntu3.6 with Suhosin-Patch (cli) (built: Feb 11 2012 03:26:01).

And ... have you tried installing fixed psmisc as told several times in this bug report?

lopho (lopho) wrote :

Is this still present on 12.04?
Asking ahead of a server update, and this is mission critical.

Ondřej Surý (ondrej) wrote :

@Phillip Kleinhenz: It's fixed in psmisc from 22.15-2 (as mentioned above in my posts).

So yes, it is fixed in precise: http://packages.ubuntu.com/psmisc

lopho (lopho) wrote :

@Ondřej Surý
Thank you. I wasn't aware there had been a point release including this fix.

jpatokal (jpatokal-iki) wrote :

@Phillip: Yes, it's still present in vanilla 12.04, just ran into this bug this morning. Grr.

Nonox (nbulian) wrote :

Hi guys!
I'm running ubuntu 12.04 and I have the same problem:

Take a look:

$ grep -i cron /var/log/syslog

Aug 13 16:39:01 one CRON[7303]: (root) CMD ( [ -x /usr/lib/php5/maxlifetime ] && [ -d /var/lib/php5 ] && find /var/lib/php5/ -depth -mindepth 1 -maxdepth 1 -type f -cmin +$(/usr/lib/php5/maxlifetime) ! -execdir fuser -s {} 2>/dev/null \; -delete)

More...

$ ls /usr/lib/php5/
20090626+lfs build libexec maxlifetime

Thanks in advance!
nonox

Alex (alex-1992) wrote :

Same problem here, a lot of fuser processes eating up my cpu. :(
Ubuntu 12.04 64-bit webserver.

Oat (oatcpe) wrote :

Me too.

A lot of "fuser" processes take over 20% of my CPU all the time.

Ondřej Surý (ondrej) wrote :

> Me too.

If you really feel the urge to write "Me too." (instead of just clicking at the top of the page), please at least provide some useful information.

That means the version of Ubuntu, its architecture, the version of the PHP 5 packages (php5-common should suffice) and the version of the psmisc package.

Ben White (ben-white) wrote :

Also happening here.

- Ubuntu 12.04.1 64-bit
- PHP 5.3.10-1ubuntu3.4 with Suhosin-Patch (cli)
- psmisc Installed: 22.15-2ubuntu1.1

Unsure how to list versions of other PHP packages, but happy to provide more information if needed.

As a temporary fix, I have removed the fuser part of the cron job as per the suggestion here: http://ubuntuforums.org/showpost.php?p=11370262&postcount=2

But this feels like a bandaid solution - I'd like to get this sorted out properly.

Ondřej Surý (ondrej) wrote :

Ben,

could you try replacing the standard cron-job with this script:

-- cut here --
#!/bin/sh

# first find all used files and touch them (hope it's not massive amount of files)
lsof -w -l +d /var/lib/php5 | awk -e '{ if (NR > 1) { print $9; } }' | xargs -i touch -c {}

# find all files older than maxlifetime
find /var/lib/php5/ -depth -mindepth 1 -maxdepth 1 -type f -ignore_readdir_race -cmin +$(/usr/lib/php5/maxlifetime) -delete
-- cut here --

It takes a different approach, which should be lighter on the system, but I need confirmation from users before I apply it to the Debian package (and before it gets pulled into Ubuntu).

Ondřej

Ondřej Surý (ondrej) wrote :

This modification to cron job has been uploaded as php5 5.4.9-2~<dist>+1 into my PHP5 PPA.

dino99 (9d9) wrote :
Changed in psmisc (Ubuntu Oneiric):
status: Triaged → Invalid