On Jul 12, 2013, at 13:01, Don MacAskill <email address hidden> wrote:
> As noted, this is *not* fixed. We're testing nathanael-foy's fix right
> now, with promising results.
>
> See: https://github.com/onethumb/libmemcached/pull/1
>
> --
> You received this bug notification because you are a bug assignee.
> https://bugs.launchpad.net/bugs/928696
>
> Title:
> incorrect handling of server restart
>
> Status in libmemcached - A C and C++ client library for memcached:
> Fix Released
>
> Bug description:
> libmemcached version: 1.0.3, 1.0.4
>
> we have the following setup:
> - backend (membase/couchbase: several sasl buckets)
> - libmemcached-based client (c++, multi-threaded, uses memcached_pool, single server record, binary protocol with sasl authentication, tcp sockets)
>
> the client is organized to dispatch every single get/set/del request to
> a distinct thread (a thread pool is used). upon receiving a task, the thread
> obtains a connection from the memcached_pool related to the backend bucket
> specified in the request, performs the needed memcached_* calls to process
> the request, returns the backend connection to the memcached_pool, and waits
> for the next request.
>
> everything goes ok until the server is restarted.
> after the server restarts, the next request gets a CONNECTION_FAILURE result
> (which is ok). and then comes the problem: all following memcached_set calls
> (performed at intervals shorter than memcached_st::retry_timeout) get a
> WRITE_FAILURE result. hours of gdb'ing revealed that every such request
> "renews" the memcached_server_write_instance_st::next_retry field. this does
> not cause any problems if requests come less frequently than
> memcached_st::retry_timeout: once
> memcached_server_write_instance_st::next_retry "expires", all functionality
> gets back to normal.
>
> some info on failing requests:
> memcached_set() returns "WRITE FAILURE" (5)
> error stack(root): {
> "(166484608) SERVER HAS FAILED AND IS DISABLED UNTIL TIMED RETRY, host: 192.168.65.3:11211 -> libmemcached/storage.cc:180",
> "(166484608) SERVER HAS FAILED AND IS DISABLED UNTIL TIMED RETRY, host: 192.168.65.3:11211 -> libmemcached/connect.cc:614"
> }
> error stack(server-0): {
> "(166484608) SERVER HAS FAILED AND IS DISABLED UNTIL TIMED RETRY, host: 192.168.65.3:11211 -> libmemcached/storage.cc:180",
> "(166484608) SERVER HAS FAILED AND IS DISABLED UNTIL TIMED RETRY, host: 192.168.65.3:11211 -> libmemcached/connect.cc:614"
> }
>
>
> there was a naive (in terms of overall libmemcached code awareness) attempt to add the following check at the very beginning of <void memcached_quit_server(memcached_server_st *ptr, bool io_death)>:
> --------------------------------------------
> if ( ptr->state == MEMCACHED_SERVER_STATE_IN_TIMEOUT )
>     return;
> --------------------------------------------
> this DID help with the described problem, but at the same time it broke other functionality.
Thanks, I will look at it.
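
For anyone following along, the retry behaviour described in the report can be reproduced in a small standalone model. This is only a sketch and not libmemcached code: `Server`, `try_request`, `replay` and the `renew_while_disabled` switch are hypothetical names made up for illustration, and the time values are arbitrary. It assumes, as the report says, that a request rejected while the server is disabled pushes `next_retry` forward again.

```cpp
// Toy model of the retry gating described in the report. This is NOT
// libmemcached code: Server, try_request, replay and renew_while_disabled
// are hypothetical names used only for illustration.
#include <vector>

enum class Result { Success, ConnectionFailure, WriteFailure };

struct Server {
    bool in_timeout = false;  // stands in for MEMCACHED_SERVER_STATE_IN_TIMEOUT
    long next_retry = 0;      // earliest time (seconds) a reconnect is allowed
};

// One request at time `now`. `server_up` says whether a connect would succeed.
// When `renew_while_disabled` is true, a request rejected inside the timeout
// window pushes next_retry forward again, which is the reported behaviour.
Result try_request(Server& s, long now, long retry_timeout,
                   bool server_up, bool renew_while_disabled) {
    if (s.in_timeout && now < s.next_retry) {
        if (renew_while_disabled) {
            s.next_retry = now + retry_timeout;
        }
        return Result::WriteFailure;
    }
    if (!server_up) {
        s.in_timeout = true;
        s.next_retry = now + retry_timeout;
        return Result::ConnectionFailure;
    }
    s.in_timeout = false;
    return Result::Success;
}

// Replays `count` requests, one every `interval` seconds starting at t = 0.
// The server is unreachable for the first request only (it has just been
// restarted) and reachable for all later ones.
std::vector<Result> replay(long interval, long retry_timeout, int count,
                           bool renew_while_disabled) {
    Server s;
    std::vector<Result> out;
    for (int i = 0; i < count; ++i) {
        out.push_back(try_request(s, i * interval, retry_timeout,
                                  /*server_up=*/i > 0, renew_while_disabled));
    }
    return out;
}
```

With a retry timeout of 5 and one request per second, `replay(1, 5, 10, true)` never recovers after the initial connection failure, while `replay(10, 5, 3, true)` recovers on the second request because the window has already expired. Passing `false` for the last argument only shows the window expiring on its own; it is not meant to represent the fix in the pull request.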