Comment 7 for bug 802850

Brian Aker (brianaker) wrote : Re: [Bug 802850] Re: Gearman 100% cpu usage, workers in a loop (PHP, 0.22, 0.23)

What happens if you use the default timeout (with the latest version of
libgearman) and start the server with the worker wake-up count set to a
value <4?

Sent from my TI85

On Jul 8, 2011, at 4:38 AM, Artur Bodera <email address hidden> wrote:

> Number of workers is always <20
>
> Each worker listens for at most 2 hours, then shuts itself down and a
> new process is spawned (not a fork, but a completely new PHP shell
> process).
>
> $gmworker = new GearmanWorker();
> $gmworker->addOptions(GEARMAN_WORKER_GRAB_UNIQ);
> $gmworker->setTimeout(5000);
> $gmworker->addFunction($funcName, array($this,'workGearman'));
>
> main loop (with fat removed):
>
> while(
>     !$this->_signal &&
>     (
>         @$gmworker->work() ||
>         $gmworker->returnCode() == GEARMAN_TIMEOUT
>     ) && !$this->_signal
> ){
>     // nothing unusual happened, run another loop iteration
>     $db->closeConnection();
>     $log->debug('Waiting for jobs...');
> }
>
> --
> You received this bug notification because you are subscribed to
> Gearman.
> https://bugs.launchpad.net/bugs/802850
>
> Title:
> Gearman 100% cpu usage, workers in a loop (PHP, 0.22, 0.23)
>
> Status in Gearman Server and Client Libraries:
> New
>
> Bug description:
> After some time, minutes to hours, with a slight load on the gearman
> server (~1 job/min), workers get lost in a loop (per strace) and
> gearmand eats up 100% cpu.
>
> Strace of a worker:
> getsockopt(7, SOL_SOCKET, SO_ERROR, [117528996916232192], [4]) = 0
> sendto(7, "\0REQ\0\0\0\36\0\0\0\0", 12, MSG_NOSIGNAL, NULL, 0) = 12
> recvfrom(7, "\0RES\0\0\0\n\0\0\0\0", 8192, 0, NULL, NULL) = 12
> sendto(7, "\0REQ\0\0\0\4\0\0\0\0", 12, MSG_NOSIGNAL, NULL, 0) = 12
> poll([{fd=7, events=POLLIN}], 1, 5000) = 1 ([{fd=7, revents=POLLIN}])
> getsockopt(7, SOL_SOCKET, SO_ERROR, [117528996916232192], [4]) = 0
> sendto(7, "\0REQ\0\0\0\36\0\0\0\0", 12, MSG_NOSIGNAL, NULL, 0) = 12
> recvfrom(7, "\0RES\0\0\0\n\0\0\0\0", 8192, 0, NULL, NULL) = 12
> sendto(7, "\0REQ\0\0\0\4\0\0\0\0", 12, MSG_NOSIGNAL, NULL, 0) = 12
> poll([{fd=7, events=POLLIN}], 1, 5000) = 1 ([{fd=7, revents=POLLIN}])
> getsockopt(7, SOL_SOCKET, SO_ERROR, [117528996916232192], [4]) = 0
> sendto(7, "\0REQ\0\0\0\36\0\0\0\0", 12, MSG_NOSIGNAL, NULL, 0) = 12
> recvfrom(7, "\0RES\0\0\0\n\0\0\0\0", 8192, 0, NULL, NULL) = 12
> sendto(7, "\0REQ\0\0\0\4\0\0\0\0", 12, MSG_NOSIGNAL, NULL, 0) = 12
> poll([{fd=7, events=POLLIN}], 1, 5000) = 1 ([{fd=7, revents=POLLIN}])
> getsockopt(7, SOL_SOCKET, SO_ERROR, [117528996916232192], [4]) = 0
> sendto(7, "\0REQ\0\0\0\36\0\0\0\0", 12, MSG_NOSIGNAL, NULL, 0) = 12
> recvfrom(7, "\0RES\0\0\0\n\0\0\0\0", 8192, 0, NULL, NULL) = 12
> sendto(7, "\0REQ\0\0\0\4\0\0\0\0", 12, MSG_NOSIGNAL, NULL, 0) = 12
> poll([{fd=7, events=POLLIN}], 1, 5000) = 1 ([{fd=7, revents=POLLIN}])
> getsockopt(7, SOL_SOCKET, SO_ERROR, [117528996916232192], [4]) = 0
> sendto(7, "\0REQ\0\0\0\36\0\0\0\0", 12, MSG_NOSIGNAL, NULL, 0) = 12
> recvfrom(7, "\0RES\0\0\0\n\0\0\0\0", 8192, 0, NULL, NULL) = 12
> sendto(7, "\0REQ\0\0\0\4\0\0\0\0", 12, MSG_NOSIGNAL, NULL, 0) = 12
> poll([{fd=7, events=POLLIN}], 1, 5000) = 1 ([{fd=7, revents=POLLIN}])
> getsockopt(7, SOL_SOCKET, SO_ERROR, [117528996916232192], [4]) = 0
> sendto(7, "\0REQ\0\0\0\36\0\0\0\0", 12, MSG_NOSIGNAL, NULL, 0) = 12
> recvfrom(7, "\0RES\0\0\0\n\0\0\0\0", 8192, 0, NULL, NULL) = 12
> sendto(7, "\0REQ\0\0\0\4\0\0\0\0", 12, MSG_NOSIGNAL, NULL, 0) = 12
> poll([{fd=7, events=POLLIN}], 1, 5000) = 1 ([{fd=7, revents=POLLIN}])
> ....
>
>
> Strace of gearmand:
>
>
> # strace -p 2820
> Process 2820 attached - interrupt to quit
> clock_gettime(CLOCK_MONOTONIC, {3794803, 727265822}) = 0
> epoll_wait(3,
>
> (... and nothing more.... )
>
>
> All workers are PHP based.
>
>
> # php --ri gearman
>
> gearman
>
> gearman support => enabled
> extension version => 0.8.0
> libgearman version => 0.22
> Default TCP Host => 127.0.0.1
> Default TCP Port => 4730
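
For anyone reading the worker strace above, the three 12-byte packets that keep repeating can be decoded with a few lines of PHP. This is only a rough sketch: it assumes the standard Gearman binary header (4-byte magic, 4-byte big-endian type, 4-byte big-endian payload size), it names only the packet types that appear in the trace, and decodeHeader() is a made-up helper, not part of the gearman extension.

```php
<?php
// Packet type numbers as listed in the Gearman protocol description.
// Only the types that show up in the trace are named here.
const PACKET_NAMES = [
    4  => 'PRE_SLEEP',
    10 => 'NO_JOB',
    30 => 'GRAB_JOB_UNIQ',
];

// decodeHeader() is a hypothetical helper for illustration only.
function decodeHeader(string $bytes): string
{
    if (strlen($bytes) < 12) {
        throw new InvalidArgumentException('a packet header needs 12 bytes');
    }
    // Bytes 1..3 of the magic are "REQ" or "RES" (byte 0 is NUL).
    $magic = substr($bytes, 1, 3);
    // Two unsigned 32-bit big-endian integers: packet type, payload size.
    $fields = unpack('Ntype/Nsize', substr($bytes, 4, 8));
    $name = PACKET_NAMES[$fields['type']] ?? 'type ' . $fields['type'];
    return sprintf('%s %s (%d-byte payload)', $magic, $name, $fields['size']);
}

// PHP double-quoted strings accept the same octal escapes strace prints,
// so the buffers can be pasted in unchanged.
echo decodeHeader("\0REQ\0\0\0\36\0\0\0\0"), "\n"; // REQ GRAB_JOB_UNIQ (0-byte payload)
echo decodeHeader("\0RES\0\0\0\n\0\0\0\0"), "\n";  // RES NO_JOB (0-byte payload)
echo decodeHeader("\0REQ\0\0\0\4\0\0\0\0"), "\n";  // REQ PRE_SLEEP (0-byte payload)
```

Read this way, each iteration in the trace is GRAB_JOB_UNIQ, answered by NO_JOB, followed by PRE_SLEEP, after which poll() reports the socket readable right away instead of waiting out the 5000 ms timeout.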