Charm should indicate whether replication is paused on slaves
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
PostgreSQL Charm | New | Undecided | Unassigned | | |
Bug Description
We hit a situation where a standby Postgres server accumulated roughly 60 GiB of WAL files in the pg_wal directory.
This appears to have been caused by replication being paused via the replication-pause action and never resumed afterward. The WAL files then built up because they could not be applied to the running database, or at least that's my theory. I concretely verified that pg_is_wal_
We spent hours trying to figure out why this was happening and, not being Postgres experts, also considered running WAL archive trimming tools on the /var/lib/
The charm status very helpfully indicates whether units are masters or standbys. It would be just as helpful if the charm also indicated in some way whether replication on a standby is paused, because a standby left in this state for too long can run out of disk space, as nearly happened in our case.
TL;DR: Please provide a way to indicate via "juju status" if replication is paused on standby servers.
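To illustrate the request, here is a minimal sketch of how such a check might look. It is not taken from the charm's code. It assumes a DB-API style cursor connected to the local PostgreSQL 10 or later, and uses the built-in functions pg_is_in_recovery() and pg_is_wal_replay_paused(). The function name standby_status_message, the status strings, and the StubCursor class are all hypothetical and exist only for this example.

```python
def standby_status_message(cursor):
    """Build a status string from a DB-API cursor (hypothetical helper).

    pg_is_wal_replay_paused() raises an error on a server that is not in
    recovery, so pg_is_in_recovery() is checked first.
    """
    cursor.execute("SELECT pg_is_in_recovery()")
    (in_recovery,) = cursor.fetchone()
    if not in_recovery:
        return "Live master"  # assumed wording, not the charm's actual text

    cursor.execute("SELECT pg_is_wal_replay_paused()")
    (paused,) = cursor.fetchone()
    if paused:
        return "Live standby (replication paused)"
    return "Live standby"


class StubCursor:
    """Stand-in for a real database cursor, for demonstration only."""

    def __init__(self, in_recovery, paused=False):
        self._in_recovery = in_recovery
        self._paused = paused
        self._row = None

    def execute(self, sql):
        if "pg_is_in_recovery" in sql:
            self._row = (self._in_recovery,)
        elif "pg_is_wal_replay_paused" in sql:
            if not self._in_recovery:
                # Mirrors the server's behaviour outside recovery.
                raise RuntimeError("recovery is not in progress")
            self._row = (self._paused,)
        else:
            raise ValueError("unexpected query: " + sql)

    def fetchone(self):
        return self._row


if __name__ == "__main__":
    print(standby_status_message(StubCursor(in_recovery=True, paused=True)))
```

In a real charm the resulting string would presumably be passed to whatever mechanism the charm already uses to set the unit's workload status, so that it shows up in "juju status".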