I checked Oleksiy's environment and see the following:
- right after reverting a snapshot, the nova-cert service is marked as down
- the nova-cert process is up and running
- there are no open connections to mysqld - http://paste.openstack.org/show/494792/ - NOTE: there is one broken socket (can't identify protocol)
- service logs contain no errors: http://paste.openstack.org/show/494794/
- when I attached to the process with strace, it detected that its connection to mysqld (haproxy in our case) was broken, then closed and re-opened it properly - http://paste.openstack.org/show/494795/
- all services are up now - http://paste.openstack.org/show/494796/
NOTE: no service restarts were performed
My current understanding is that destroying a VIP leaves a dead TCP peer - nova-cert simply does not know it should close the existing connection and open a new one. Maybe we should take a look at TCP keepalives? (http://tldp.org/HOWTO/TCP-Keepalive-HOWTO/overview.html)
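
For reference, a minimal sketch of what enabling keepalives on a client socket looks like in Python. This is not how nova-cert or its DB driver sets up connections; the helper name enable_keepalive and the timer values are assumptions for illustration only. The TCP_KEEPIDLE / TCP_KEEPINTVL / TCP_KEEPCNT options are not available on every platform, so they are guarded.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive probes for an existing socket.

    idle, interval and count are illustrative values (assumptions),
    not settings taken from any real deployment.
    """
    # Ask the kernel to send keepalive probes on this connection.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Per-socket timers; only set them where the platform exposes them.
    if hasattr(socket, "TCP_KEEPIDLE"):
        # Seconds of idleness before the first probe is sent.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        # Seconds between unanswered probes.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        # Unanswered probes before the connection is reported dead.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
    return sock


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print("keepalive on:", s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

With probes enabled, a peer that disappeared without sending FIN/RST (as the destroyed VIP seems to have done) would eventually cause the kernel to fail the socket, so the next read or write returns an error instead of hanging on a dead connection.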