Comment 0 for bug 1644152

Anton Chevychalov (achevychalov) wrote:

During the upgrade of pacemaker from version 1.1.14-2~u14.04+mos1 to version 1.1.14-2~u14.04+mos2, the lrmd process hangs and prevents pacemaker from recovering from a corosync outage.

Long way to reproduce:
~~~~~~~~~~~~~~~~~~~~~
1. Install 9.1 with one controller node in HA mode.
2. Try to upgrade to 9.2.
------------------------------

Expected result:
~~~~~~~~~~~~~~~~
The upgrade finishes without problems.
------------------------------

Result:
~~~~~~
The upgrade fails on the outage of some random component.

There are errors in the log:
error: mainloop_add_ipc_server: Could not start pengine IPC server: Address already in use (-98)
error: main: Failed to create IPC server: shutting down and inhibiting respawn
 info: crm_xml_cleanup: Cleaning up memory from libxml2

The pacemaker process restarts every 2-3 minutes.

For an example, see https://bugs.launchpad.net/fuel/+bug/1641947
------------------------------

Fast way to reproduce:
~~~~~~~~~~~~~~~~~~~~~
1. Install 9.0 or 9.1 with one controller node in HA mode.
2. Log in to the controller node via SSH.
3. service corosync stop
4. Update packages pacemaker-cli-utils, pacemaker-common, pacemaker-resource-agents, pacemaker to 1.1.14-2~u14.04+mos2
5. service corosync start
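
Steps 3-5 above could be scripted roughly as follows. This is only a sketch: it assumes the +mos2 packages are available from an apt repository already configured on the node and that the script is run as root. The DRY_RUN switch and the helper names (pkg_specs, run, reproduce) are made up for illustration. By default the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch of the fast reproduction steps (3-5). Run as root on the controller.
# Assumption: the +mos2 packages are installable from a configured apt repo.
VERSION='1.1.14-2~u14.04+mos2'
PACKAGES='pacemaker-cli-utils pacemaker-common pacemaker-resource-agents pacemaker'

# Print each package pinned to the target version, one "name=version" per line.
pkg_specs() {
    for pkg in $PACKAGES; do
        printf '%s=%s\n' "$pkg" "$VERSION"
    done
}

# Echo the command; execute it only when DRY_RUN=0 (hypothetical safety switch).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

reproduce() {
    run service corosync stop
    # Word splitting of the pkg_specs output is intentional here.
    run apt-get install -y $(pkg_specs)
    run service corosync start
}
```

Calling `reproduce` prints the three commands. Setting DRY_RUN=0 before the call actually runs them.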
------------------------------

Expected result:
~~~~~~~~~~~~~~~~
Pacemaker recovers from corosync outage.
------------------------------

Result:
~~~~~~~
Pacemaker fails to communicate with lrmd and constantly restarts.