Comment 5 for bug 1740892

Drew Freiberger (afreiberger) wrote :

@nacc:

The error condition was that when corosync restarted, pacemaker disconnected (as is normal) and then tried to reconnect, but on reconnecting it ran into this error:

error: pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)

So pacemaker is trying to re-handshake with the revived corosync, and when it does, the API call fails with a library error. Given that it's the CPG API, and the libcpg4 package was updated, I'd guess that a patch added to the libcpg4 library was incompatible with the previous version of libcpg4 that was still loaded in memory by the running pacemaker binary. Once we restarted the dead pacemaker service, pacemaker loaded the new library and was able to connect to the CPG API as normal.
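
One way to check for this kind of situation on a Linux host is to look for a running process that still has a replaced copy of a library mapped. This is only a rough sketch, not something that was run as part of this report: it assumes the old library file shows up in /proc/<pid>/maps with the kernel's "(deleted)" suffix after the package upgrade, and the function names and the "libcpg" filter string are made up for illustration.

```python
def stale_libs(maps_text, name_fragment="libcpg"):
    """Return sorted paths of mapped files that match name_fragment and are
    marked '(deleted)', i.e. replaced on disk while still mapped in memory.

    maps_text is the content of /proc/<pid>/maps.
    """
    suffix = "(deleted)"
    stale = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, pathname (optional).
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no pathname
        path = fields[5].strip()
        if name_fragment in path and path.endswith(suffix):
            stale.add(path[: -len(suffix)].rstrip())
    return sorted(stale)


def stale_libs_for_pid(pid, name_fragment="libcpg"):
    """Hypothetical helper: read /proc/<pid>/maps for a live process.

    Usually needs root (or the same user) to read another process's maps.
    """
    with open(f"/proc/{pid}/maps") as f:
        return stale_libs(f.read(), name_fragment)
```

If stale_libs_for_pid() returns a non-empty list for the pacemaker PID, the process is still running against the pre-upgrade library, and restarting it would pick up the new one. That would not prove the two versions are incompatible, only that they differ.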

I don't know if that's a library failure or a change to the CPG API that was not version-compatible with the version of libcpg4 that was loaded when the dying pacemaker was originally started.

The issue occurred in trusty and xenial clouds across Mitaka and Ocata cloud archives.