Comment 13 for bug 1710278

Mike Pontillo (mpontillo) wrote :

Mark, if you observe the deadlock again, can you run "systemctl stop bind9", wait a few minutes (at least 2, but maybe up to 5), and then check whether bind9 has actually stopped? It looks like systemd will, by default, escalate to more aggressive methods (ultimately SIGKILL) if a service hasn't stopped after ~90 seconds.
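In case it helps to script that check, here is a rough Python sketch of the same steps. The names stop_and_wait and unit_state are made up for illustration and are not MAAS code; the only external commands used are "systemctl --no-block stop" and "systemctl is-active". I use --no-block (my choice, not part of the request above) so the script does its own waiting rather than blocking inside systemctl.

```python
import subprocess
import time

# States that `systemctl is-active` prints once a unit is no longer running.
STOPPED_STATES = ("inactive", "failed")


def unit_state(unit):
    """Return the state printed by `systemctl is-active UNIT` (e.g. "active")."""
    result = subprocess.run(
        ["systemctl", "is-active", unit], capture_output=True, text=True
    )
    return result.stdout.strip()


def stop_and_wait(unit="bind9", timeout=300.0, interval=5.0,
                  run=subprocess.run, get_state=unit_state,
                  sleep=time.sleep, clock=time.monotonic):
    """Queue a stop job for `unit`, then poll until it has stopped.

    Returns True if the unit reached a stopped state within `timeout`
    seconds, False otherwise. The default of 300 seconds mirrors the
    "up to 5 minutes" suggested above.
    """
    # --no-block queues the stop job and returns immediately.
    run(["systemctl", "--no-block", "stop", unit])
    deadline = clock() + timeout
    while True:
        # "deactivating" means the stop is still in progress, so keep waiting.
        if get_state(unit) in STOPPED_STATES:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Running stop_and_wait() as root on the affected machine should return True if systemd eventually gets bind9 stopped, and False if it is still hanging after five minutes.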

If the normal method of stopping the bind9 service works, we can avoid adding that scope and risk to MAAS. Instead, if we detect bind9 behaving badly, a stop/start cycle would give bind9 the chance to shut down properly in most cases, and would avoid any other BIND bugs we might trigger as a side effect of a "kill -9 <bind9-pid>" approach. (A human operator could troubleshoot those side effects, but it's much harder for MAAS to anticipate, for example, that BIND might fail to start up because of a lock file that was left on the filesystem when the 'kill -9' occurred.)
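To make the stop/start idea concrete, here is a minimal sketch of what that recovery path might look like. This is only an illustration, not a proposed patch: cycle_service, recover_if_unhealthy, and the is_healthy callback are hypothetical names, and how MAAS would actually detect a misbehaving bind9 is left open.

```python
import subprocess


def cycle_service(unit="bind9", run=subprocess.run):
    """Stop and then start `unit` through systemd.

    Returns True only if both systemctl commands exit with status 0.
    systemd handles any escalation during the stop, so no manual
    `kill -9` is issued here.
    """
    for action in ("stop", "start"):
        result = run(["systemctl", action, unit])
        if result.returncode != 0:
            return False
    return True


def recover_if_unhealthy(is_healthy, unit="bind9", run=subprocess.run):
    """Cycle `unit` only when the (hypothetical) health check fails.

    Returns "healthy", "restarted", or "failed".
    """
    if is_healthy():
        return "healthy"
    return "restarted" if cycle_service(unit, run) else "failed"
```

The point of routing everything through systemctl is that systemd owns the escalation during the stop, so MAAS would not need to track the bind9 PID or send signals itself.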