I've run into this as a side effect of working around https://bugs.launchpad.net/bugs/1492237 (which was my root symptom: controller disk fills up rapidly).
This is a long-running production deployment on Juju 1.25.6. The controllers (3 in HA) bumped up against >98% disk usage, and it became impossible to issue juju commands or even get status.
I stopped the juju services manually on each of the controller units, added storage, moved the contents of /var/lib/juju to the new volume, updated fstab, and rebooted. But then none of the juju-* services would start.
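For reference, the fstab entry for the new volume looked roughly like this (the UUID and filesystem type here are placeholders, not the actual values from our controllers; use blkid to find the real UUID):

```
# /etc/fstab -- hypothetical entry mounting the new volume at /var/lib/juju
UUID=<uuid-of-new-volume>  /var/lib/juju  ext4  defaults  0  2
```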
Systemd unit files are read earlier in the boot process than mounts are handled, and since the juju unit files in /etc/systemd/system were symlinks to files on a separate mount, systemd simply did not load them.
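The symlink-vs-copy distinction can be demonstrated without touching a real controller. This is a rough simulation in a throwaway temp directory (all paths are hypothetical, standing in for /etc/systemd/system and /var/lib/juju): a symlinked unit file dangles when its backing mount is absent, while a plain copy stays readable.

```shell
#!/bin/sh
# Simulate /etc/systemd/system and a separately mounted /var/lib/juju
# inside a temp directory (all paths here are hypothetical).
tmp=$(mktemp -d)
mkdir -p "$tmp/etc-systemd" "$tmp/mnt/juju/init"
printf '[Unit]\nDescription=demo\n' > "$tmp/mnt/juju/init/juju-db.service"

# A symlinked unit file is only valid while the backing mount is present;
# a plain copy is independent of it.
ln -s "$tmp/mnt/juju/init/juju-db.service" "$tmp/etc-systemd/linked.service"
cp "$tmp/mnt/juju/init/juju-db.service" "$tmp/etc-systemd/copied.service"

# "Unmount" by moving the backing directory aside, as if the mount had
# not yet happened at the point in boot where systemd reads unit files.
mv "$tmp/mnt/juju" "$tmp/mnt/juju.unmounted"

link_state=$( [ -r "$tmp/etc-systemd/linked.service" ] && echo readable || echo dangling )
copy_state=$( [ -r "$tmp/etc-systemd/copied.service" ] && echo readable || echo dangling )
echo "symlink: $link_state"   # symlink: dangling
echo "copy:    $copy_state"   # copy:    readable

rm -rf "$tmp"
```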
I removed the symlinks and just copied the systemd unit files in place, and the controllers are happy once again, with a ton of space available. Juju status and other juju commands are back to normal.
Example, on unit 0:
sudo mv -fv /etc/systemd/system/juju-db.service /etc/systemd/system/juju-db.service.hold.$(date +%s)
sudo mv -fv /etc/systemd/system/multi-user.target.wants/juju-db.service /etc/systemd/system/multi-user.target.wants/juju-db.service.hold.$(date +%s)
sudo cp -fvp /var/lib/juju/init/juju-db/juju-db.service /etc/systemd/system/juju-db.service
sudo cp -fvp /var/lib/juju.hold/init/juju-db/juju-db.service /etc/systemd/system/multi-user.target.wants/juju-db.service
sudo mv -fv /etc/systemd/system/jujud-machine-0.service /etc/systemd/system/jujud-machine-0.service.hold.$(date +%s)
sudo mv -fv /etc/systemd/system/multi-user.target.wants/jujud-machine-0.service /etc/systemd/system/multi-user.target.wants/jujud-machine-0.service.hold.$(date +%s)
sudo cp -fvp /var/lib/juju/init/jujud-machine-0/jujud-machine-0.service /etc/systemd/system/jujud-machine-0.service
sudo cp -fvp /var/lib/juju/init/jujud-machine-0/jujud-machine-0.service /etc/systemd/system/multi-user.target.wants/jujud-machine-0.service
That may or may not be the best approach, and will likely require careful attention on upgrades, but it got us back up and out of quite a snag.
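Since the copies in /etc/systemd/system can now drift from what juju writes under /var/lib/juju/init on upgrade, something along these lines could be run periodically to flag stale copies. This is only a sketch: the directory layout is assumed from the commands above, and the paths are parameters so it can be exercised anywhere (on a controller they would be /var/lib/juju/init and /etc/systemd/system).

```shell
#!/bin/sh
# Sketch: report unit files under $2 that differ from the juju-managed
# originals under $1. Returns nonzero if any copy has drifted.
check_drift() {
  juju_init=$1
  systemd_dir=$2
  drift=0
  for src in "$juju_init"/*/*.service; do
    [ -e "$src" ] || continue
    dst=$systemd_dir/$(basename "$src")
    if ! cmp -s "$src" "$dst"; then
      echo "stale copy: $dst (differs from $src)"
      drift=1
    fi
  done
  return $drift
}

# Demo on throwaway directories (hypothetical layout mirroring the real one):
t=$(mktemp -d)
mkdir -p "$t/init/juju-db" "$t/systemd"
printf '[Unit]\n' > "$t/init/juju-db/juju-db.service"
cp "$t/init/juju-db/juju-db.service" "$t/systemd/juju-db.service"
check_drift "$t/init" "$t/systemd" && echo "in sync"
printf '# rewritten by upgrade\n' >> "$t/init/juju-db/juju-db.service"
check_drift "$t/init" "$t/systemd" || echo "drift detected"
```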