As for what we could do here:
1) we could tweak $socket_timeout and set it to a lower value in both keystone and nova configs, e.g. 1 second
2) we could increase $dead_retry to a bigger value in both keystone and nova configs, e.g. 300 seconds
optionally:
3) turn off caching of token validation results in keystone_authmiddleware for nova-api
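To get a rough feel for options 1) and 2), here is a toy model of how long a single client sits blocked on one dead memcached server. It assumes the client blocks for $socket_timeout on each failed attempt, then skips the server for $dead_retry seconds, and that a new request hits the server as soon as it becomes eligible again. The function name and the 3 s / 30 s baseline are made up for illustration; this is not the memcache client's actual code.

```python
def blocked_seconds(window, socket_timeout, dead_retry):
    """Seconds spent waiting on a dead server during `window` seconds.

    Simplified model (an assumption, not real client code): each probe of
    the dead server blocks for `socket_timeout`, after which the server is
    skipped for `dead_retry` seconds before it is probed again.
    """
    elapsed = 0.0
    blocked = 0.0
    while elapsed < window:
        blocked += socket_timeout                # probe hangs until timeout
        elapsed += socket_timeout + dead_retry   # then the server is skipped
    return blocked


# Illustrative baseline (3 s timeout, 30 s retry) over a 10 minute outage
print(blocked_seconds(600, 3, 30))    # 57.0
# Proposed values (1 s timeout, 300 s retry) over the same outage
print(blocked_seconds(600, 1, 300))   # 2.0
```

In this model the proposed values cut the blocked time from about a minute to a couple of seconds per client, at the cost of a recovered server staying unused for up to $dead_retry seconds.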
In the future, we could try putting haproxy in front of the memcached backends, so that haproxy runs regular health checks and removes "dead" memcached servers from the list. In this case nova and keystone would only know about the haproxy frontend endpoint. Both active-active and active-backup modes should be OK for us, as long as we use memcached purely as a cache (I may be wrong, but I think Keystone can also use it for lock management).
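The kind of health check meant here can be approximated with a plain TCP connect. This is only a sketch of the idea, not how haproxy implements its checks; the function names and the (host, port) tuple format are hypothetical.

```python
import socket


def is_alive(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host and timeouts.
        return False


def healthy_backends(backends, timeout=1.0):
    """Keep only the (host, port) backends that currently accept connections."""
    return [b for b in backends if is_alive(b[0], b[1], timeout)]
```

Running healthy_backends() periodically over the list of memcached servers would give the "remove dead servers from the list" behaviour, with the difference that haproxy does this outside of nova and keystone, so the services themselves never wait on a dead backend.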