Canonical Juju

Comment 0 for bug 1772601

Stuart Bishop (stub) wrote on 2018-05-22:

Upgrading a model to 2.3.7 caused an unexpected outage when PostgreSQL was shut down. Thankfully, the charm recovered when the first post-upgrade hook ran, so the outage was limited to 60 seconds.

It seems that upgrading jujud also sends kill signals to all processes spawned by its subprocesses, such as the PostgreSQL daemon started by running 'pg_ctlcluster 9.5 main start' from a hook.

We think everything in the same cgroup is affected, so PostgreSQL is killed even though pg_ctlcluster switches user id and group:

# cat /proc/4776/cgroup # The root PostgreSQL pid
11:cpu,cpuacct:/system.slice/jujud-unit-postgresql-1.service
10:pids:/system.slice/jujud-unit-postgresql-1.service
9:devices:/system.slice/jujud-unit-postgresql-1.service
8:perf_event:/
7:memory:/system.slice/jujud-unit-postgresql-1.service
6:freezer:/
5:net_cls,net_prio:/
4:hugetlb:/
3:blkio:/system.slice/jujud-unit-postgresql-1.service
2:cpuset:/
1:name=systemd:/system.slice/jujud-unit-postgresql-1.service
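
The check above can be scripted for any pid. This is a minimal sketch, assuming the cgroup v1 line format shown in the listing (hierarchy-ID:controllers:path) and assuming unit agents always run under services named jujud-unit-<application>-<number>.service. The function names are hypothetical, not part of juju.

```python
import re
import sys
from pathlib import Path

# Assumption: Juju unit agents run under systemd services named
# jujud-unit-<application>-<number>.service, as in the listing above.
JUJUD_UNIT = re.compile(r"jujud-unit-[\w.-]+\.service$")


def jujud_cgroups(cgroup_text):
    """Map controller list -> cgroup path for entries inside a jujud unit service."""
    matches = {}
    for line in cgroup_text.splitlines():
        # cgroup v1 line format: hierarchy-ID:controller-list:path
        parts = line.strip().split(":", 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        _, controllers, path = parts
        if JUJUD_UNIT.search(path):
            matches[controllers] = path
    return matches


def read_cgroup(pid):
    """Read /proc/<pid>/cgroup for a running process."""
    return Path(f"/proc/{pid}/cgroup").read_text()


if __name__ == "__main__":
    # Usage: python3 check_cgroup.py <pid> [<pid> ...]
    for pid in sys.argv[1:]:
        hits = jujud_cgroups(read_cgroup(pid))
        if hits:
            print(f"{pid}: shares a cgroup with jujud via {sorted(hits)}")
        else:
            print(f"{pid}: not in a jujud unit cgroup")
```

Run against the root PostgreSQL pid and the pgbouncer pid, this should make the difference between the two visible: a daemon started from a hook reports the jujud unit service, while one started through its own systemd unit should not.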

To reproduce, deploy a cs:postgresql unit with an older version of juju, then upgrade to 2.3.7. /var/log/postgresql-9.5-main.log will show the shutdown signal being received, and PostgreSQL starting up again when the next hook is run.

Processes running under their own systemd service seem unaffected, so pgbouncer (started by 'systemctl start pgbouncer') did not get signaled.