Comment 2 for bug 1961531

Boris Lukashev (rageltman) wrote :

The lockdep trace ended up too deep to discern (even journald didn't have the top of it, so it likely started during init), and we had to get the node back to a happy state for ops. That said, this seems to be a pretty serious bug: it produces ever longer delays in connecting to a host to reboot it (the only fix I've found so far), and after the reboot, if the compute container crashes like this, the system starts to degrade pretty quickly. Since we resolved the blockers to nova-compute starting up correctly (disabled swtpm, got the libvirt containers up first), things have been running swimmingly. However, we now have to sleep with one eye open for alerts on the containers going down or throwing errors, as what comes after that is very unpleasant.