System Instance of Memcached Causes Check Failure

Bug #850396 reported by Ladar Levison
This bug affects 2 people
Affects: libmemcached
Status: Fix Released
Importance: Medium
Assigned to: Brian Aker

Bug Description

It appears that when Server::Cycle is called, it tries to kill stale instances of the memcached server. Somehow it's detecting my global instance of memcached and trying to kill it. Since the global instance belongs to a different user, the kill attempt fails, along with the unit test. I got a little confused tracing the pid logic around, otherwise I'd submit a patch. My suggestion would be adding a uid check of the memcached process before killing it. Or perhaps wrapping the memcached launch logic in a script or executable and then only killing memcached instances that are children of the magic wrapper?
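A minimal sketch of the uid check suggested above, assuming a Linux system where /proc/<pid> is owned by the effective uid of the process. The helper name owned_by_current_user is hypothetical and is not part of libtest.

```cpp
#include <string>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper (not part of libtest): returns true only when the
// process identified by pid appears to belong to the user running the tests.
// Assumes Linux, where the /proc/<pid> directory carries the process owner.
bool owned_by_current_user(pid_t pid)
{
  if (pid <= 0) {
    return false;  // not a single, real process id
  }

  const std::string path = "/proc/" + std::to_string(pid);
  struct stat st;
  if (::stat(path.c_str(), &st) != 0) {
    return false;  // process is gone, or /proc is not available
  }

  return st.st_uid == ::geteuid();
}
```

A caller could skip the kill attempt whenever this returns false, so that a memcached instance running as another user (such as the memcache account here) is left alone.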

I noticed kill_pid() does test for EPERM; is there a reason those can't silently return true and be ignored? In other words, is there a situation where EPERM could show up, but we'd want a kill attempt to fail?
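To make the EPERM question concrete, here is a rough sketch of a kill wrapper that reports EPERM as its own outcome, so the caller can decide whether "not ours" counts as a failure. The names KillResult and try_kill are hypothetical; this is not the kill_pid() code in libtest/killpid.cc.

```cpp
#include <cerrno>
#include <csignal>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical result type, used only for this illustration.
enum class KillResult { Signaled, NoSuchProcess, NotPermitted, Failed };

// Hypothetical wrapper around kill(2) that separates "process belongs to
// someone else" (EPERM) from "process already gone" (ESRCH) and other errors.
KillResult try_kill(pid_t pid, int sig = SIGTERM)
{
  if (pid <= 1) {
    return KillResult::Failed;  // never signal init, a process group, or "everyone"
  }

  if (::kill(pid, sig) == 0) {
    return KillResult::Signaled;
  }

  switch (errno) {
    case ESRCH: return KillResult::NoSuchProcess;  // nothing left to kill
    case EPERM: return KillResult::NotPermitted;   // owned by another user
    default:    return KillResult::Failed;
  }
}
```

With a result like this, the cycle logic could treat NotPermitted as "this pid is not one of our test servers" and move on, instead of retrying until the limit is reached.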

I'm attaching logs showing builds of 0.51 and 0.52, so you can see that 0.51 worked just fine but 0.52 fails. A sample error is:

libtest/killpid.cc:54: in kill_pid() Does someone else have a process running locally for 11939?
libtest/killpid.cc:54: in kill_pid() Does someone else have a process running locally for 11939?
libtest/server.cc:136: in cycle() Reached limit, could not kill server pid:11939
libtest/server_container.cc:361: in start_socket_server() Could not start up server localhost:11226 Socket:
libtest/test.cc:142: in main() /home/ladar/Desktop/libmemcached-0.52/tests/.libs/lt-cycle failed in Framework::create()

For the record, the shared memcached pid is 11939:

[ladar@magma memcached]$ ps -ef | grep memcached | grep -v grep
memcache 11939 1 0 15:09 ? 00:00:00 /usr/local/bin/memcached -d -p 11211 -I 11211 -u memcache -m 1024 -c 1024 -P /var/run/memcached/memcached.pid -I 8

Ladar Levison (ladar) wrote :

I spent some time studying the code, and realized that the problem is with how the memcached servers are added. When the unit tests create a memcached socket server the function memcached_server_add is called with the hostname = "" and port = 0. Because these values are invalid, the memcached_server_add function substitutes the default values: localhost:11211.

Because I have a global memcached server running at localhost:11211, a valid record is created in the server pool. The absence of a pid file doesn't cause problems because pids are obtained through the libmemcached_util_getpid function, not from pid files.
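The fallback described above can be illustrated with a small stand-in. This is only a sketch of the behaviour as described in this comment, not the libmemcached source; resolve_endpoint and kDefaultPort are hypothetical names.

```cpp
#include <cstdint>
#include <string>
#include <utility>

// Stand-in for MEMCACHED_DEFAULT_PORT, which this report gives as 11211.
constexpr std::uint16_t kDefaultPort = 11211;

// Hypothetical illustration of the substitution: an empty hostname and a
// zero port are replaced by the defaults, so a socket server registered
// with ("", 0) ends up pointing at localhost:11211.
std::pair<std::string, std::uint16_t> resolve_endpoint(const std::string& hostname,
                                                       std::uint16_t port)
{
  std::string host = hostname;
  if (host.empty()) {
    host = "localhost";
  }

  std::uint16_t resolved_port = port;
  if (resolved_port == 0) {
    resolved_port = kDefaultPort;
  }

  return std::make_pair(host, resolved_port);
}
```

Under this reading, resolve_endpoint("", 0) yields the same endpoint as a system-wide memcached on the default port, which is how the shared instance's pid can end up in the test's server pool.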

I haven't tried creating a patch because I'm still not sure where I can add a fix that doesn't break one of the alternative configs (sasl, gearmand, etc.). The fix should add logic to the server startup/shutdown code so that kill attempts are limited to servers spawned by the unit test logic.

The workaround for v0.52 is to kill any system-wide memcached server instances on the default port before running "make check", or to move the system-wide instance to a port that differs from libmemcached's value for MEMCACHED_DEFAULT_PORT (which is 11211 by default).

Brian Aker (brianaker)
Changed in libmemcached:
importance: Undecided → Medium
assignee: nobody → Brian Aker (brianaker)
Brian Aker (brianaker) wrote :

This should be fixed now (ports are tested for, and then assigned).
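One way "ports are tested for, and then assigned" could look in practice is sketched below, assuming POSIX sockets and IPv4 loopback. The helpers port_is_free and pick_port are hypothetical and are not taken from the actual fix.

```cpp
#include <arpa/inet.h>
#include <cstdint>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: reports whether a TCP port on 127.0.0.1 can be bound
// right now. The probe socket is closed again before returning.
bool port_is_free(std::uint16_t port)
{
  const int fd = ::socket(AF_INET, SOCK_STREAM, 0);
  if (fd < 0) {
    return false;
  }

  sockaddr_in addr{};
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  addr.sin_port = htons(port);

  const bool ok =
      ::bind(fd, reinterpret_cast<const sockaddr*>(&addr), sizeof(addr)) == 0;
  ::close(fd);
  return ok;
}

// Hypothetical helper: returns the first free port in [first, last],
// or 0 when every port in the range is taken.
std::uint16_t pick_port(std::uint16_t first, std::uint16_t last)
{
  for (std::uint32_t p = first; p <= last; ++p) {
    if (port_is_free(static_cast<std::uint16_t>(p))) {
      return static_cast<std::uint16_t>(p);
    }
  }
  return 0;
}
```

A probe like this is inherently racy, since another process can claim the port between the check and the server launch, so a test harness would still need to handle a failed startup.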

Changed in libmemcached:
status: New → Fix Committed
milestone: none → 1.0.8
Brian Aker (brianaker)
Changed in libmemcached:
status: Fix Committed → Fix Released