gearmand 0.31 does not exit properly

Bug #977328 reported by Sven Nierlein
This bug affects 1 person
Affects: Gearman
Status: Fix Released
Importance: Low
Assigned to: Brian Aker

Bug Description

Gearmand does not exit properly after a plain kill and can only be stopped with kill -9.

Tested with release 0.31:

%> gearmand --port=54730 --verbose=DEBUG --log-file=/tmp/gearmand.log
%> kill <pid>
%> tail /tmp/gearmand.log
  DEBUG 2012-03-09 16:17:43.864337 [ main ] Received SHUTDOWN wakeup event -> libgearman-server/gearmand.cc:856
   INFO 2012-03-09 16:17:43.864419 [ main ] Clearing event for listening socket (7)
   INFO 2012-03-09 16:17:43.864451 [ main ] Clearing event for listening socket (8)
  DEBUG 2012-03-09 16:17:43.864475 [ main ] Clearing event for wakeup pipe -> libgearman-server/gearmand.cc:790
  DEBUG 2012-03-09 16:17:43.864502 [ main ] Exited main event loop -> libgearman-server/gearmand.cc:348
  DEBUG 2012-03-09 16:17:43.864525 [ main ] shutting down Epoch thread -> libgearman-server/timer.cc:98
  ERROR 2012-03-09 16:17:43.864610 [ main ] poll(Interrupted system call) -> libgearman-server/timer.cc:71

The process is still running and cannot be stopped.

strace output up to the kill -9:
[pid 12536] clock_gettime(CLOCK_MONOTONIC, {4463959, 119626270}) = 0
[pid 12536] epoll_wait(26, <unfinished ...>
[pid 12535] clock_gettime(CLOCK_MONOTONIC, {4463959, 119771609}) = 0
[pid 12535] epoll_wait(21, <unfinished ...>
[pid 12534] clock_gettime(CLOCK_MONOTONIC, {4463959, 119881609}) = 0
[pid 12534] epoll_wait(16, <unfinished ...>
[pid 12533] futex(0x852e478, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid 12532] clock_gettime(CLOCK_MONOTONIC, {4463959, 120060682}) = 0
[pid 12532] epoll_wait(11, <unfinished ...>
[pid 12531] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
[pid 12530] futex(0xb756dbd8, FUTEX_WAIT, 12531, NULL <unfinished ...>
[pid 12531] <... restart_syscall resumed> ) = 0
[pid 12531] gettimeofday({1333988304, 903333}, NULL) = 0
[pid 12531] poll([{fd=0, events=0}], 1, 1000) = 0 (Timeout)
[pid 12531] gettimeofday({1333988305, 904622}, NULL) = 0
[pid 12531] poll([{fd=0, events=0}], 1, 1000) = 0 (Timeout)
[pid 12531] gettimeofday({1333988306, 904898}, NULL) = 0
[pid 12531] poll([{fd=0, events=0}], 1, 1000) = 0 (Timeout)
[pid 12531] gettimeofday({1333988307, 906221}, NULL) = 0
[pid 12531] poll([{fd=0, events=0}], 1, 1000) = 0 (Timeout)
[pid 12531] gettimeofday({1333988308, 907483}, NULL) = 0
[pid 12531] poll([{fd=0, events=0}], 1, 1000) = 0 (Timeout)
[pid 12531] gettimeofday({1333988309, 909993}, NULL) = 0
[pid 12531] poll([{fd=0, events=0}], 1, 1000) = 0 (Timeout)
[pid 12531] gettimeofday({1333988310, 911298}, NULL) = 0
[pid 12531] poll([{fd=0, events=0}], 1, 1000) = 0 (Timeout)
[pid 12531] gettimeofday({1333988311, 912757}, NULL) = 0
[pid 12531] poll([{fd=0, events=0}], 1, 1000) = 0 (Timeout)
[pid 12531] gettimeofday({1333988312, 914078}, NULL) = 0
[pid 12531] poll([{fd=0, events=0}], 1, 1000) = 0 (Timeout)
[pid 12531] gettimeofday({1333988313, 915392}, NULL) = 0
[pid 12531] poll([{fd=0, events=0}], 1, 1000) = 0 (Timeout)
[pid 12531] gettimeofday({1333988314, 916711}, NULL) = 0
[pid 12531] poll([{fd=0, events=0}], 1, 1000) = 0 (Timeout)
[pid 12531] gettimeofday({1333988315, 918042}, NULL) = 0
[pid 12531] poll([{fd=0, events=0}], 1, 1000 <unfinished ...>
[pid 12533] +++ killed by SIGKILL +++
PANIC: handle_group_exit: 12533 leader 12530
[pid 12534] +++ killed by SIGKILL +++
PANIC: handle_group_exit: 12534 leader 12530
[pid 12532] +++ killed by SIGKILL +++
PANIC: handle_group_exit: 12532 leader 12530
[pid 12531] +++ killed by SIGKILL +++
PANIC: handle_group_exit: 12531 leader 12530
[pid 12536] +++ killed by SIGKILL +++
PANIC: handle_group_exit: 12536 leader 12530
[pid 12535] +++ killed by SIGKILL +++
PANIC: handle_group_exit: 12535 leader 12530
+++ killed by SIGKILL +++
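
In the trace above, thread 12531 keeps calling poll() with a 1000 ms timeout while the group leader 12530 sits in a futex wait keyed to 12531. That would be consistent with the main thread joining a timer thread that never notices the shutdown request, although the report does not confirm this. The following is a minimal C++ sketch of a timer loop that can be stopped, assuming a stop flag plus a wakeup pipe. The class and member names (EpochTimer, stop_, wakeup_) are hypothetical and are not taken from gearmand's timer.cc, and the actual patch is not shown in this report.

```cpp
#include <atomic>
#include <cassert>
#include <cerrno>
#include <thread>

#include <poll.h>
#include <unistd.h>

// Hypothetical sketch; names are illustrative and not taken from gearmand.
class EpochTimer {
public:
  explicit EpochTimer(int timeout_ms) : timeout_ms_(timeout_ms) {
    if (pipe(wakeup_) != 0) {
      wakeup_[0] = wakeup_[1] = -1;  // start() will report the failure
    }
  }

  ~EpochTimer() {
    shutdown();
    if (wakeup_[0] != -1) close(wakeup_[0]);
    if (wakeup_[1] != -1) close(wakeup_[1]);
  }

  EpochTimer(const EpochTimer&) = delete;
  EpochTimer& operator=(const EpochTimer&) = delete;

  bool start() {
    assert(!thread_.joinable());  // start() must not be called twice
    if (wakeup_[0] == -1) return false;
    thread_ = std::thread(&EpochTimer::run, this);
    return true;
  }

  // Ask the timer thread to stop, wake it up, and wait for it to exit.
  void shutdown() {
    if (!thread_.joinable()) return;
    stop_.store(true);
    const char byte = 'x';
    // If the write fails, the thread still sees stop_ after one timeout.
    ssize_t written = write(wakeup_[1], &byte, 1);
    (void)written;
    thread_.join();
  }

  bool running() const { return thread_.joinable(); }
  int ticks() const { return ticks_.load(); }

private:
  void run() {
    struct pollfd pfd;
    pfd.fd = wakeup_[0];
    pfd.events = POLLIN;
    pfd.revents = 0;
    while (!stop_.load()) {
      const int rc = poll(&pfd, 1, timeout_ms_);
      if (rc == -1) {
        if (errno == EINTR) continue;  // a signal arrived: re-check stop_
        break;                         // any other error ends the loop
      }
      if (rc > 0) break;    // wakeup byte arrived: shutdown was requested
      ticks_.fetch_add(1);  // timeout: do the periodic work here
    }
  }

  int timeout_ms_;
  int wakeup_[2];
  std::atomic<bool> stop_{false};
  std::atomic<int> ticks_{0};
  std::thread thread_;
};
```

With this shape, a caller would construct the timer, call start(), and later call shutdown() from the main thread. The join returns as soon as the wakeup byte is read instead of waiting on a loop that never ends, and an EINTR from poll() is treated as a reason to re-check the flag rather than as a fatal error.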

Brian Aker (brianaker) wrote : Re: [Bug 977328] [NEW] gearmand 0.31 does not exit properly

What platform is this?

On Apr 9, 2012, at 9:25 AM, Sven Nierlein wrote:

> Public bug reported:
>
> Gearmand does not exit properly and can only be killed with -9

Sven Nierlein (sven-nierlein) wrote :

Ubuntu 10.04.4 (32-bit)

Brian Aker (brianaker) wrote :

Hi,

  ERROR 2012-03-10 16:10:46.627604 [ main ] poll(Interrupted system call) -> libgearman-server/timer.cc:71

Let me work out a patch for this. It is odd that this passed regression testing.

Changed in gearmand:
assignee: nobody → Brian Aker (brianaker)
importance: Undecided → Low
status: New → In Progress
Brian Aker (brianaker)
Changed in gearmand:
milestone: none → 0.32
Brian Aker (brianaker) wrote :

Please pull lp:gearmand and test.

Thanks!

Changed in gearmand:
status: In Progress → Fix Committed
Sven Nierlein (sven-nierlein) wrote :

thanks, that worked.

Brian Aker (brianaker) wrote : Re: [Bug 977328] Re: gearmand 0.31 does not exit properly

Thank you. I will wait a bit and then release 0.32 with this fix.

On Apr 10, 2012, at 3:55 PM, Sven Nierlein wrote:

> thanks, that worked.

Brian Aker (brianaker)
Changed in gearmand:
status: Fix Committed → Fix Released