Model upgrade terminates daemons spawned by hooks
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Canonical Juju | Triaged | Low | Unassigned | | |
Bug Description
Upgrading a model to 2.3.7 caused an unexpected outage when PostgreSQL was shut down. The charm thankfully recovered when the first hook was run post-upgrade, so the outage was limited to 60 seconds.
It seems that upgrading jujud also sends kill signals to all processes spawned by its subprocesses, such as the PostgreSQL daemon started by running 'pg_ctlcluster 9.5 main start' from a hook.
We think everything linked by the cgroup gets affected, so even though pg_ctlcluster switches user id and group, the daemon is still killed:
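# cat /proc/4776/cgroup # The root PostgreSQL pid
11:cpu,cpuacct:/system.slice/jujud-unit-postgresql-1.service
10:pids:/system.slice/jujud-unit-postgresql-1.service
9:devices:/system.slice/jujud-unit-postgresql-1.service
8:perf_event:/
7:memory:/system.slice/jujud-unit-postgresql-1.service
6:freezer:/
5:net_cls,net_prio:/
4:hugetlb:/
3:blkio:/system.slice/jujud-unit-postgresql-1.service
2:cpuset:/
1:name=systemd:/system.slice/jujud-unit-postgresql-1.service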
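To reproduce, deploy a cs:postgresql unit with an older version of juju, and upgrade to 2.3.7. /var/log/postgresql-9.5-main.log will show the shutdown signal being received, and starting up again when the next hook is run.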
systemd services seem unaffected, so pgbouncer (started by systemctl start pgbouncer) did not get signaled.
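One way to check whether a daemon still shares the unit agent's cgroup is to parse /proc/<pid>/cgroup. The following is a minimal Python sketch, not part of Juju or the charm; the function names and the default unit name are assumptions for illustration. If the daemon does share the agent's cgroup and the upgrade restarts the agent's systemd service, the kill would be consistent with systemd's default KillMode=control-group, which stops every remaining process in a unit's control group when the unit is stopped.

```python
def parse_cgroup(text):
    """Parse the content of /proc/<pid>/cgroup into {controllers: path}.

    Each line has the form hierarchy-ID:controller-list:cgroup-path.
    """
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split at most twice so any ':' inside the path is preserved.
        _hier_id, controllers, path = line.split(":", 2)
        entries[controllers] = path
    return entries


def shares_unit_cgroup(text, unit="jujud-unit-postgresql-1.service"):
    """Return True if any hierarchy places the process in the given unit.

    The default unit name is an assumption; adjust it to the unit agent
    actually running on the machine.
    """
    return any(path.endswith("/" + unit)
               for path in parse_cgroup(text).values())


def read_cgroup(pid):
    """Read the raw cgroup membership of a running process (Linux only)."""
    with open("/proc/{}/cgroup".format(pid)) as f:
        return f.read()
```

For example, shares_unit_cgroup(read_cgroup(4776)) could be run against the root PostgreSQL pid before an upgrade to see whether the daemon is at risk.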
description: updated
tags: added: canonical-is
Changed in juju:
status: New → Triaged
Is there a reason it is spawned directly by the hook rather than something
like creating a systemd job?
Offhand, I didn't think we used a signal to stop, so I'm not sure why it
would be killing other processes, but I suppose you have evidence to the
contrary.
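As a rough illustration of the systemd route, a hook could ask systemd to start the cluster instead of calling pg_ctlcluster directly, so the daemon runs in its own unit rather than under the hook's process tree. This is only a sketch: it assumes the postgresql@<version>-<cluster>.service template unit from Debian/Ubuntu's postgresql-common packaging is installed, and the helper names are hypothetical, not taken from the charm.

```python
import subprocess


def systemd_start_command(version="9.5", cluster="main"):
    """Build the systemctl command that starts one PostgreSQL cluster.

    Assumes the postgresql@.service template unit is available; the
    instance name is "<version>-<cluster>", e.g. "9.5-main".
    """
    unit = "postgresql@{}-{}.service".format(version, cluster)
    return ["systemctl", "start", unit]


def start_cluster(version="9.5", cluster="main", run=subprocess.check_call):
    """Start the cluster through systemd.

    `run` is injectable so the helper can be exercised without root
    privileges or a running systemd.
    """
    run(systemd_start_command(version, cluster))
```

Started this way, the daemon would be handled like the pgbouncer service mentioned in the description, which was not signaled during the upgrade.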
On Tue, May 22, 2018 at 12:29 PM, Stuart Bishop <email address hidden>
wrote:
> ** Description changed:
>
> Upgrading a model to 2.3.7 caused an unexpected outage when PostgreSQL
> was shutdown. The charm thankfully recovered when the first hook was run
> post-upgrade, so the outage was limited to 60 seconds.
>
> - It seems that upgradin jujud also sends kill signals to all processes
> + It seems that upgradeing jujud also sends kill signals to all processes
> spawned by its subprocesses, such as the PostgreSQL daemon started by
> running 'pg_ctlcluster 9.5 main start' from a hook.
>
> We think everything linked by the cgroup gets affected, so even though
> pg_ctlcluster switches user id and group it is killed:
>
> # cat /proc/4776/cgroup # The root PostgreSQL pid
> 11:cpu,cpuacct:/system.slice/jujud-unit-postgresql-1.service
> 10:pids:/system.slice/jujud-unit-postgresql-1.service
> 9:devices:/system.slice/jujud-unit-postgresql-1.service
> 8:perf_event:/
> 7:memory:/system.slice/jujud-unit-postgresql-1.service
> 6:freezer:/
> 5:net_cls,net_prio:/
> 4:hugetlb:/
> 3:blkio:/system.slice/jujud-unit-postgresql-1.service
> 2:cpuset:/
> 1:name=systemd:/system.slice/jujud-unit-postgresql-1.service
>
> -
> - To reproduce, deploy a cs:postgresql unit with an older version of juju,
> and upgrade to 2.3.7. /var/log/postgresql-9.5-main.log will show the
> shutdown signal being received, and starting up again when the next hook is
> run.
> + To reproduce, deploy a cs:postgresql unit with an older version of juju,
> + and upgrade to 2.3.7. /var/log/postgresql-9.5-main.log will show the
> + shutdown signal being received, and starting up again when the next hook
> + is run.
>
> systemd services seem unaffected, so pgbouncer (started by systemctl
> start pgbouncer) did not get signaled.
>
> --
> You received this bug notification because you are subscribed to juju.
> Matching subscriptions: juju bugs
> https://bugs.launchpad.net/bugs/1772601
>
> Title:
> Model upgrade terminates daemons spawned by hooks
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/juju/+bug/1772601/+subscriptions
>