Comment 17 for bug 777672

David Matthew Bond (mattbond) wrote :

The retry is failing in the libmemcached 0.51 release.
The retry also fails with the current branch (953 revisions, downloaded with bzr branch lp:libmemcached and built on SUSE Linux 11 SP1).
The problem is reproducible.

Scenario:
Single Memcached Server (running on localhost).
Application gets a memcached connection (we use
memcached_set : SUCCESS
>> Stop the memcached server.
memcached_set: UNKNOWN READ FAILURE
memcached_set: CONNECTION FAILURE
memcached_set: SERVER IS MARKED DEAD
memcached_set: SERVER IS MARKED DEAD
memcached_set: SERVER IS MARKED DEAD
>> Next Retry is due.
memcached_set: CONNECTION FAILURE
memcached_set: SERVER IS MARKED DEAD
memcached_set: SERVER IS MARKED DEAD
>> memcached server is started.
memcached_set: SERVER IS MARKED DEAD
>> Next Retry is due.
memcached_set: CONNECTION FAILURE
memcached_set: SERVER IS MARKED DEAD
memcached_set: SERVER IS MARKED DEAD
etc.

Reason: in connect.cc, the function network_connect loops over all the address_info objects. However, ptr->address_info_next is always NULL after the first failed reconnect: the attempt that produced the CONNECTION FAILURE error iterated through all the available address_info objects and advanced address_info_next to NULL, and it is never reset afterwards. On every later retry the loop condition below is false immediately, so no socket is ever created:

  /* Create the socket */
  while (ptr->address_info_next && ptr->fd == INVALID_SOCKET)

To solve the problem, ptr->address_info_next needs to be reset to the first address_info before the loop runs:
  ptr->address_info_next= ptr->address_info;
  ptr->state= MEMCACHED_SERVER_STATE_ADDRINFO;

Then it works fine again.
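
To make the mechanism easier to follow, here is a minimal, self-contained sketch in plain C. It is not the libmemcached code: fake_addrinfo, fake_server, fake_network_connect and retry_after_restart are hypothetical stand-ins that only mirror the fields named above (address_info, address_info_next, fd), and the state field is left out. It assumes a single resolved address and models "server up" as a boolean flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FAKE_INVALID_SOCKET (-1) /* plays the role of INVALID_SOCKET */

/* Hypothetical stand-in for one resolved address. */
struct fake_addrinfo {
    bool reachable;              /* true while the server is running */
    struct fake_addrinfo *next;
};

/* Hypothetical stand-in for the server instance. */
struct fake_server {
    struct fake_addrinfo *address_info;      /* head of the resolved list */
    struct fake_addrinfo *address_info_next; /* cursor used by the loop */
    int fd;
};

/* Simplified model of the connect loop. With reset_cursor set, the cursor
 * is rewound to the head of the list first, as in the suggested fix. */
bool fake_network_connect(struct fake_server *ptr, bool reset_cursor)
{
    assert(ptr != NULL);

    if (reset_cursor) {
        ptr->address_info_next = ptr->address_info;
    }

    while (ptr->address_info_next && ptr->fd == FAKE_INVALID_SOCKET) {
        if (ptr->address_info_next->reachable) {
            ptr->fd = 3; /* pretend a socket was created and connected */
        }
        /* The cursor advances on every pass and ends up NULL. */
        ptr->address_info_next = ptr->address_info_next->next;
    }

    return ptr->fd != FAKE_INVALID_SOCKET;
}

/* Scenario from the report: connect while the server is down, start the
 * server, then retry. Returns whether the retry connected. */
bool retry_after_restart(bool reset_cursor)
{
    struct fake_addrinfo addr = { false, NULL };
    struct fake_server server = { &addr, &addr, FAKE_INVALID_SOCKET };

    /* Server stopped: the attempt fails and exhausts the cursor. */
    (void)fake_network_connect(&server, false);

    /* Server started again. */
    addr.reachable = true;

    return fake_network_connect(&server, reset_cursor);
}
```

In this model, retry_after_restart(false) never reconnects because the loop body is skipped once the cursor is NULL, while retry_after_restart(true) reconnects on the first retry after the server is back.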

Attached is a diff as a suggested patch. However, I only first looked at the libmemcached code yesterday, so someone with more experience may know a much better way to fix this.
Also attached are two test programs to reproduce this: one using a memcached pool and one without.

Matt