More specific error messages when connect() fails
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
libmemcached | New | Undecided | Unassigned |
Bug Description
We've been logging intermittent connection failures for some time. It took a while to work out that the likely cause is local (ephemeral) port exhaustion, which makes connect() fail with EADDRINUSE. We can easily reproduce the connection failures in libmemcached under realistic connection rates. On Linux, the error occurs once there are more than about 28232 connections from a single client host to a single server within a 60 second period (the TIME_WAIT expiry).
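For reference, the figure above can be reproduced with a little arithmetic. This is only a sketch: the helper names are made up for illustration, and the port range of 32768 to 60999 is an assumption chosen because it yields 28232 ports; check /proc/sys/net/ipv4/ip_local_port_range on the affected host for the real values.

```c
/* Hypothetical helpers for illustration; not part of libmemcached. */

/* Number of local ports in the inclusive range [low, high]. */
int ports_in_range(int low, int high) {
    return high - low + 1;
}

/* Rough ceiling on new connections per second to one server when each
 * closed connection keeps its local port busy for time_wait_secs. */
int max_connects_per_sec(int ports, int time_wait_secs) {
    return ports / time_wait_secs;
}
```

With the assumed range and a 60 second TIME_WAIT, ports_in_range(32768, 60999) is 28232 and max_connects_per_sec(28232, 60) is 470, so a client that opens a fresh connection per request only needs a few hundred requests per second to hit the limit.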
In libmemcached's network_connect(), EADDRINUSE is handled by the "default" case, so it just gives MEMCACHED_
It would be nice if any errno was handled, since EACCES, ENETUNREACH and ENOMEM are probably also possible.
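A minimal sketch of what "handling any errno" could look like is to pass the saved errno through strerror() instead of collapsing it into one generic code. The function name and message format here are hypothetical, not libmemcached's API:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of libmemcached: format a message that
 * names the specific reason connect() failed. `err` is the errno value
 * saved immediately after the failed connect() call. */
int describe_connect_error(int err, char *buf, size_t len) {
    return snprintf(buf, len, "connect() failed: %s (errno %d)",
                    strerror(err), err);
}
```

Called with EADDRINUSE, this produces a message containing the system's text for that error, which would have pointed at port exhaustion straight away.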
Also, according to the Linux connect(2) man page, EAGAIN indicates "no more free local ports or insufficient entries in the routing cache", so you probably shouldn't call poll() on the FD if that happens.
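One way to express this, as a sketch only, is a small predicate that says whether waiting with poll() makes sense after a non-blocking connect() returns -1. The function name is hypothetical, and treating only EINPROGRESS and EALREADY as "still connecting" is my reading of connect(2), not existing libmemcached behaviour:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helper, not part of libmemcached: returns true only when
 * the errno from a failed non-blocking connect() means the attempt is
 * still in progress, so waiting on the socket with poll() is useful. */
bool connect_should_poll(int err) {
    switch (err) {
    case EINPROGRESS: /* connection attempt started, not yet complete */
    case EALREADY:    /* a previous attempt on this socket is pending */
        return true;
    default:          /* EAGAIN, EADDRINUSE, ENOMEM, ...: fail at once */
        return false;
    }
}
```

With this in place, EAGAIN and EADDRINUSE would be reported immediately instead of waiting for a poll() timeout on a socket that will never become writable.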