Comment 12 for bug 444352

Revision history for this message
Thierry Carrez (ttx) wrote : Re: DB deadlock on reboot prevents UEC from working, temporarily

So to summarize:
- To trigger the bug, reboot your cluster and try running euca-describe-availability-zones verbose once eucalyptus is ready. It fails, and makes DEADLOCK errors appear in the logs.
- After a few tries, eucalyptus does a "hard reset" on connections and autofixes itself.
- This occurs for me every time, the first time I start eucalyptus after system startup.
- Stopping and restarting eucalyptus on a running system does not trigger the error. It only happens at first eucalyptus startup after a boot.
- This does not depend on the boot order, since I can reproduce by starting up manually after boot is fully complete. It does not depend on static / DHCP networking either, I can reproduce in both cases.
- It does not depend on the way eucalyptus is stopped, since I reproduce it by manually stopping eucalyptus.

It seems like a condition that is only found after a fresh boot is triggering the issue. Something that starting and stopping eucalyptus once will fix. I tried cleaning up /var/run/eucalyptus and /tmp/ and reproduce, but that doesn't come from there. I also tried cleaning up the 169.254.169.254/32 address, but that doesn't trigger it either.