Comment 11 for bug 1654116

David Ames (thedac) wrote :

This is a juju is-leader bug.

I have triple-checked that every call to leader-set in our charms is gated by an is-leader check, specifically in rabbitmq-server, percona-cluster and ceilometer.
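For reference, the gating pattern looks roughly like the following. This is a minimal sketch, not the actual charm code: the function names and the injectable run_output/run_call parameters are hypothetical, added only so the logic can be exercised outside a hook context. It assumes the is-leader and leader-set hook tools are available on the PATH, as they normally are inside a hook.

```python
import json
import subprocess


def is_leader(run_output=subprocess.check_output):
    """Return True if the is-leader hook tool reports this unit as leader."""
    out = run_output(['is-leader', '--format=json'])
    return bool(json.loads(out.decode('utf-8')))


def gated_leader_set(settings, run_output=subprocess.check_output,
                     run_call=subprocess.check_call):
    """Call leader-set only when is-leader says this unit is the leader.

    Returns True if leader-set was invoked, False if it was skipped.
    (Hypothetical helper for illustration; not taken from the charms.)
    """
    if not is_leader(run_output):
        return False
    args = ['{}={}'.format(k, v) for k, v in sorted(settings.items())]
    # If this unit is not the leader by the time leader-set runs, the tool
    # exits non-zero and check_call raises CalledProcessError.
    run_call(['leader-set'] + args)
    return True
```

With a gate like this in place, a CalledProcessError from leader-set means is-leader returned true shortly before leader-set was rejected.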

With juju 2.1b3 and rabbitmq-server, leadership bounces around between the three units. Compare the timestamps in the following log excerpts:

rabbitmq-server-0/var/log/juju/unit-rabbitmq-server-0.log:2017-01-04 21:12:16 INFO juju-log Unknown hook leader-elected - skipping.
rabbitmq-server-0/var/log/juju/unit-rabbitmq-server-0.log:2017-01-04 21:47:22 INFO juju-log Unknown hook leader-elected - skipping.
rabbitmq-server-0/var/log/juju/unit-rabbitmq-server-0.log:2017-01-04 22:16:54 INFO juju-log Unknown hook leader-elected - skipping.
rabbitmq-server-0/var/log/juju/unit-rabbitmq-server-0.log:2017-01-04 22:25:38 INFO amqp-relation-changed subprocess.CalledProcessError: Command '['leader-set', 'amqp:62_password=VGYqpSqts4R39S9rcJrSwrB7s9ygd2Xp8cnSwcxbTSRKwBjznhHy7fF6247CCRHC']' returned non-zero exit status 1

rabbitmq-server-1/var/log/juju/unit-rabbitmq-server-1.log:2017-01-04 22:01:25 INFO juju-log Unknown hook leader-elected - skipping.
rabbitmq-server-1/var/log/juju/unit-rabbitmq-server-1.log:2017-01-04 22:13:54 INFO amqp-relation-changed subprocess.CalledProcessError: Command '['leader-set', 'ceilometer.passwd=4rcYrk2FfPNXFVgghdLtpC4VRCyBb4smXKFNHdwFxxdgsfqSrLy85WwW3MCCdPxM']' returned non-zero exit status 1

rabbitmq-server-2/var/log/juju/unit-rabbitmq-server-2.log:2017-01-04 21:39:21 INFO juju-log Unknown hook leader-elected - skipping.
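To make the ordering across units easier to follow, grep output in the format above can be merged into a single timeline. The following is a rough sketch only: leadership_timeline is a hypothetical helper, and the regular expression assumes lines shaped exactly like the excerpts above (unit directory, log path, then "YYYY-MM-DD HH:MM:SS LEVEL message").

```python
import re
from datetime import datetime

# Assumed line shape: <unit-dir>/var/log/juju/<file>:<timestamp> <LEVEL> <message>
LINE_RE = re.compile(
    r'^(?P<unit>[^/]+)/var/log/juju/[^:]+:'
    r'(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) '
    r'(?P<level>\w+) (?P<rest>.*)$'
)


def leadership_timeline(lines):
    """Return (timestamp, unit, event) tuples sorted by time.

    Only leader-elected hook runs and failed leader-set calls are kept;
    everything else (including blank lines) is ignored.
    """
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        rest = m.group('rest')
        if 'leader-elected' in rest:
            event = 'leader-elected'
        elif 'leader-set' in rest and 'non-zero exit status' in rest:
            event = 'leader-set failed'
        else:
            continue
        ts = datetime.strptime(m.group('ts'), '%Y-%m-%d %H:%M:%S')
        events.append((ts, m.group('unit'), event))
    return sorted(events)
```

Feeding it the grep output for all three units lists each leader-elected run and each failed leader-set in chronological order, which shows which unit most recently ran leader-elected before each failure.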

With juju 2.1b4 and percona-cluster, unit 0 is the leader, but some time goes by before it attempts leader-set, which then fails. Less than a minute later, unit 2 takes over leadership.

mysql-0/var/log/juju/unit-mysql-0.log:2017-01-12 06:20:33 INFO juju-log Unknown hook leader-elected - skipping.
mysql-0/var/log/juju/unit-mysql-0.log:2017-01-12 06:35:01 DEBUG juju-log cluster:2: Leader unit - bootstrap required=True
mysql-0/var/log/juju/unit-mysql-0.log:2017-01-12 06:35:28 DEBUG juju-log cluster:2: Leader unit - bootstrap required=False
mysql-0/var/log/juju/unit-mysql-0.log:2017-01-12 06:50:55 INFO shared-db-relation-changed subprocess.CalledProcessError: Command '['leader-set', 'shared-db:54_access-network=']' returned non-zero exit status 1

mysql-2/var/log/juju/unit-mysql-2.log:2017-01-12 06:51:43 INFO juju-log Unknown hook leader-elected - skipping.

As I see it, there are four possible problems:

1) is-leader is giving a false positive
2) is-leader is not in the PATH when the charms call it
3) A race during leader election in which one or more units believe they are the leader
4) leader-set fails during a leader election
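One way to narrow these down would be to record what is-leader reports immediately before and after a leader-set attempt, along with whether the tool is on the PATH at all. The sketch below is an assumption-laden diagnostic, not existing charm code: diagnose_leader_set and its injectable parameters are hypothetical, and it assumes Python 3 (for shutil.which).

```python
import shutil
import subprocess


def diagnose_leader_set(args, which=shutil.which,
                        run_output=subprocess.check_output,
                        run_call=subprocess.check_call):
    """Attempt leader-set and report the is-leader state around it."""
    report = {'is_leader_path': which('is-leader')}
    if report['is_leader_path'] is None:
        # Possibility 2: the hook tool cannot be found on the PATH.
        return report

    def ask():
        out = run_output(['is-leader', '--format=json'])
        return out.decode('utf-8').strip()

    report['leader_before'] = ask()
    try:
        run_call(['leader-set'] + list(args))
        report['leader_set_ok'] = True
    except subprocess.CalledProcessError as exc:
        report['leader_set_ok'] = False
        report['returncode'] = exc.returncode
    report['leader_after'] = ask()
    return report
```

If leader-set fails while is-leader reports true both before and after, that would point towards 1) or 4). If the answer flips to false afterwards, leadership moved during the hook, which fits 3).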