Comment 15 for bug 1740892

Nish Aravamudan (nacc) wrote : Re: [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail

On Mon, Jan 8, 2018 at 9:51 AM, Nish Aravamudan
<email address hidden> wrote:
> On Mon, Jan 8, 2018 at 8:48 AM, Victor Tapia <email address hidden> wrote:
>> As mentioned by Mario @ #10, stopping corosync while pacemaker runs
>> throws the same error as the upgrade. Syslog from Xenial +
>> corosync=2.3.5-3ubuntu1:
>>
>> Jan 8 16:24:37 xenial-corosync systemd[1]: Stopping Pacemaker High Availability Cluster Manager...
>> Jan 8 16:24:37 xenial-corosync pacemakerd[28747]: notice: Invoking handler for signal 15: Terminated
>> Jan 8 16:24:37 xenial-corosync crmd[28753]: notice: Invoking handler for signal 15: Terminated
>> Jan 8 16:24:37 xenial-corosync crmd[28753]: notice: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_SHUTDOWN cause=C_SHUTDOWN origin=crm_shutdown ]
>> Jan 8 16:24:37 xenial-corosync pengine[28752]: notice: Delaying fencing operations until there are resources to manage
>> Jan 8 16:24:37 xenial-corosync pengine[28752]: notice: Scheduling Node xenial-corosync for shutdown
>> Jan 8 16:24:37 xenial-corosync pengine[28752]: notice: Calculated Transition 1: /var/lib/pacemaker/pengine/pe-input-52.bz2
>> Jan 8 16:24:37 xenial-corosync crmd[28753]: notice: Transition 1 (Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-52.bz2): Complete
>> Jan 8 16:24:37 xenial-corosync crmd[28753]: notice: Disconnecting from Corosync
>> Jan 8 16:24:37 xenial-corosync cib[28748]: warning: new_event_notification (28748-28753-12): Broken pipe (32)
>> Jan 8 16:24:37 xenial-corosync pengine[28752]: notice: Invoking handler for signal 15: Terminated
>> Jan 8 16:24:37 xenial-corosync attrd[28751]: notice: Invoking handler for signal 15: Terminated
>> Jan 8 16:24:37 xenial-corosync lrmd[28750]: notice: Invoking handler for signal 15: Terminated
>> Jan 8 16:24:37 xenial-corosync stonith-ng[28749]: notice: Invoking handler for signal 15: Terminated
>> Jan 8 16:24:37 xenial-corosync cib[28748]: notice: Invoking handler for signal 15: Terminated
>> Jan 8 16:24:37 xenial-corosync cib[28748]: notice: Disconnecting from Corosync
>> Jan 8 16:24:37 xenial-corosync cib[28748]: notice: Disconnecting from Corosync
>> Jan 8 16:24:37 xenial-corosync systemd[1]: Stopped Pacemaker High Availability Cluster Manager.
>>
>>
>> Pacemakerd shuts down by sending SIGTERM to its components, but after the install, corosync does not start pacemaker. BTW, "systemctl restart corosync" restarts both services perfectly.
>>
>> I think that the option A from James Page (#11) is the way to go
>
> I took a quick look at a LXD container after seeing Felipe and
> Victor's posts. It seems like this is a bug in the xenial (at least)
> systemd unit files:
>
> # grep pacemaker /lib/systemd/system/corosync.service
> # pacemaker.service, and if you want to exert the watchdog when a
>
> # grep corosync /lib/systemd/system/pacemaker.service
> After=corosync.service
> Requires=corosync.service
> # ExecStopPost=/bin/sh -c 'pidof crmd || killall -TERM corosync'
>
> So, what I see is that corosync.service has no dependency on
> pacemaker.service (in the file).
>
> pacemaker.service will start after corosync.service, and on shutdown
> pacemaker.service will be stopped before corosync.service.
> Additionally, if pacemaker.service is started, then corosync.service
> is started as well.
>
> Note, nothing specifies what Felipe said -- there is no guarantee that
> pacemaker is started, restarted, etc. when corosync is.
>
> I think the next step is to look at Bionic's systemd services
> (probably newer) or upstream's and see if there is a difference, or
> new dependencies added there.

Or perhaps ask upstream what they think is providing this assurance in
their systemd files, because I'm not seeing it.

If we have a hard dependency between pacemaker and corosync, then I
think we might need a PartOf directive, in order to ensure they are
always following the state transitions together.
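
For illustration only, here is a minimal sketch of what such a directive
could look like as a drop-in for pacemaker.service. The drop-in path and
file name are assumptions of mine, not something taken from the xenial
packaging or from upstream. Per systemd.unit(5), PartOf= only propagates
stop and restart of the listed unit to this one. It does not start
pacemaker when corosync is started, so on its own it may not cover a
stop-then-start sequence during a package upgrade.

```ini
# Hypothetical drop-in: /etc/systemd/system/pacemaker.service.d/partof.conf
# Not from the packaged unit files; a sketch of the PartOf idea only.
[Unit]
# Stopping or restarting corosync.service is propagated to pacemaker.service.
# This does not cause pacemaker to be started when corosync is started.
PartOf=corosync.service
```

After adding a drop-in like this, "systemctl daemon-reload" is needed
before systemd picks it up.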