The retry is failing in the libmemcached 0.51 release.
The retry also fails with the branch obtained via bzr branch lp:libmemcached ("Branched 953 revision(s)"), built on SUSE Linux 11 SP1.
The problem is reproducible.
Scenario:
Single Memcached Server (running on localhost).
Application gets a memcached connection (we use
memcached_set : SUCCESS
>> Stop the memcached Server.
memcached_set: UNKNOWN READ FAILURE
memcached_set: CONNECTION FAILURE
memcached_set: SERVER IS MARKED DEAD
memcached_set: SERVER IS MARKED DEAD
memcached_set: SERVER IS MARKED DEAD
>> Next Retry is due.
memcached_set: CONNECTION FAILURE
memcached_set: SERVER IS MARKED DEAD
memcached_set: SERVER IS MARKED DEAD
>> memcached Server is started.
memcached_set: SERVER IS MARKED DEAD
>> Next Retry is due.
memcached_set: CONNECTION FAILURE
memcached_set: SERVER IS MARKED DEAD
memcached_set: SERVER IS MARKED DEAD
etc.
Reason: in connect.cc, the function network_connect loops over all the address_info objects. However, ptr->address_info_next is always NULL after the first failed reconnect attempt (the one that produced the CONNECTION FAILURE error): that attempt iterated through all the available address_info objects and advanced address_info_next to NULL, and it never gets reset.
/* Create the socket */
while (ptr->address_info_next && ptr->fd == INVALID_SOCKET)
To solve the problem, ptr->address_info_next needs to be reset to the first address_info before the next retry:
ptr->address_info_next= ptr->address_info;
ptr->state= MEMCACHED_SERVER_STATE_ADDRINFO;
Then it works fine again.
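The exhausted-cursor behaviour described above can be illustrated with a small standalone simulation. This is only a sketch under stated assumptions, not libmemcached code: the structs addr_node and server, the macro SIM_INVALID_SOCKET, and the functions try_connect, reset_cursor and run_scenario are hypothetical stand-ins. Only the loop condition and the cursor reset mirror the lines quoted in this report.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_INVALID_SOCKET (-1)  /* hypothetical stand-in for INVALID_SOCKET */

/* Hypothetical stand-in for one resolved address entry. */
struct addr_node {
    int reachable;               /* 1 if a connect to this address would succeed */
    struct addr_node *next;
};

/* Hypothetical stand-in for the per-server state discussed in the report. */
struct server {
    struct addr_node *address_info;       /* head of the resolved address list */
    struct addr_node *address_info_next;  /* cursor used by the connect loop */
    int fd;
};

/* Same loop shape as the quoted line: walk the cursor until a socket exists. */
static int try_connect(struct server *s)
{
    assert(s != NULL);
    while (s->address_info_next && s->fd == SIM_INVALID_SOCKET) {
        if (s->address_info_next->reachable) {
            s->fd = 3;  /* pretend a socket was opened */
        } else {
            s->address_info_next = s->address_info_next->next;
        }
    }
    return s->fd != SIM_INVALID_SOCKET;
}

/* The proposed fix: rewind the cursor to the head of the list before retrying. */
static void reset_cursor(struct server *s)
{
    s->address_info_next = s->address_info;
}

/* Server is down for the first attempt and back up for the second.
 * Returns 1 if the second attempt connects, 0 otherwise. */
int run_scenario(int apply_fix)
{
    struct addr_node only = { 0, NULL };
    struct server s = { &only, &only, SIM_INVALID_SOCKET };

    try_connect(&s);      /* fails and leaves the cursor at NULL */
    only.reachable = 1;   /* the server has been restarted */
    if (apply_fix) {
        reset_cursor(&s);
    }
    return try_connect(&s);
}
```

In this simulation, run_scenario(0) returns 0 because the loop body is never entered on the second attempt, while run_scenario(1) returns 1 once the cursor has been rewound.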
Attached is a diff as a suggested patch. However, I only first looked at the libmemcached code yesterday, so someone with more experience may know a much better way to fix this.
I have also attached two test programs that reproduce this: one using a memcached pool and one without.
Matt