leader-settings-changed error when deploying a 2-unit cluster

Bug #1539466 reported by Björn Tillenius
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
PostgreSQL Charm | Fix Released | High | Stuart Bishop |
postgresql (Juju Charms Collection) | Triaged | High | Unassigned |

Bug Description

When deploying a PostgreSQL cluster with two units (a master and a slave), the deployment failed with a leader-settings-changed error on the non-leader unit:

2016-01-28 13:50:21 DEBUG juju-log Not the master
2016-01-28 13:50:21 INFO juju-log ** Action leader-settings-changed/remount
2016-01-28 13:50:21 INFO leader-settings-changed Traceback (most recent call last):
2016-01-28 13:50:21 INFO leader-settings-changed File "/var/lib/juju/agents/unit-postgresql-0/charm/hooks/leader-settings-changed", line 23, in <module>
2016-01-28 13:50:21 INFO leader-settings-changed bootstrap.default_hook()
2016-01-28 13:50:21 INFO leader-settings-changed File "/var/lib/juju/agents/unit-postgresql-0/charm/hooks/bootstrap.py", line 56, in default_hook
2016-01-28 13:50:21 INFO leader-settings-changed sm.manage()
2016-01-28 13:50:21 INFO leader-settings-changed File "/var/lib/juju/agents/unit-postgresql-0/charm/hooks/charmhelpers/core/services/base.py", line 137, in manage
2016-01-28 13:50:21 INFO leader-settings-changed self.reconfigure_services()
2016-01-28 13:50:21 INFO leader-settings-changed File "/var/lib/juju/agents/unit-postgresql-0/charm/hooks/charmhelpers/core/services/base.py", line 191, in reconfigure_services
2016-01-28 13:50:21 INFO leader-settings-changed self.fire_event('data_ready', service_name)
2016-01-28 13:50:21 INFO leader-settings-changed File "/var/lib/juju/agents/unit-postgresql-0/charm/hooks/charmhelpers/core/services/base.py", line 238, in fire_event
2016-01-28 13:50:21 INFO leader-settings-changed callback(service_name)
2016-01-28 13:50:21 INFO leader-settings-changed File "/var/lib/juju/agents/unit-postgresql-0/charm/hooks/decorators.py", line 123, in wrapper
2016-01-28 13:50:21 INFO leader-settings-changed return func(*args, **kw)
2016-01-28 13:50:21 INFO leader-settings-changed File "/var/lib/juju/agents/unit-postgresql-0/charm/hooks/replication.py", line 65, in wrapper
2016-01-28 13:50:21 INFO leader-settings-changed return func()
2016-01-28 13:50:21 INFO leader-settings-changed File "/var/lib/juju/agents/unit-postgresql-0/charm/hooks/replication.py", line 164, in clone_master
2016-01-28 13:50:21 INFO leader-settings-changed assert not postgresql.is_running()
2016-01-28 13:50:21 INFO leader-settings-changed AssertionError

This was using Juju 1.24.7 and r141 of lp:charms/trusty/postgresql.

Björn Tillenius (bjornt) wrote :
tags: added: landscape
tags: added: kanban-cross-team
Stuart Bishop (stub) wrote :

This is a race: the charm has been granted restart permission but has not yet joined the peer relation with the master. It should wait in this situation, but instead it continues and starts PostgreSQL before the master has been cloned. The clone operation then fails in a later hook, because clone_master asserts that PostgreSQL is not running, being very careful not to risk data.

I suspect this is fixed naturally as part of the charms.reactive rework - lp:~stub/charms/trusty/postgresql/built if you want to test that.

2016-01-28 13:50:13 INFO juju-log ** Action leader-settings-changed/handle_storage_relation
2016-01-28 13:50:13 INFO juju-log ** Action leader-settings-changed/update_wal_e_env_dir
2016-01-28 13:50:13 INFO juju-log ** Action leader-settings-changed/request_restart
2016-01-28 13:50:13 INFO juju-log Restart already requested
2016-01-28 13:50:13 INFO juju-log ** Action leader-settings-changed/wait_for_restart
2016-01-28 13:50:13 DEBUG juju-log Not the master
2016-01-28 13:50:13 INFO juju-log ** Action leader-settings-changed/remount
2016-01-28 13:50:13 INFO juju-log Not yet joined peer relation with postgresql/1 - skipping
2016-01-28 13:50:13 INFO juju-log Not yet joined peer relation with postgresql/1 - skipping
2016-01-28 13:50:13 INFO juju-log ** Action leader-settings-changed/restart_or_reload
2016-01-28 13:50:13 INFO juju-log maintenance: Starting PostgreSQL
2016-01-28 13:50:16 INFO juju-log maintenance: Started
2016-01-28 13:50:17 DEBUG juju-log Not the master
2016-01-28 13:50:17 DEBUG juju-log Not the master
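
The wait condition described above can be sketched roughly as follows. This is only an illustration, not the charm's code: UnitState, its fields, and may_start_postgresql are hypothetical names invented here, and the real charm gathers this state from Juju hook tools and relation data.

```python
from dataclasses import dataclass


@dataclass
class UnitState:
    """Hypothetical snapshot of what a unit knows during a hook."""
    is_master: bool
    restart_granted: bool
    joined_peer_with_master: bool
    cloned_from_master: bool


def may_start_postgresql(state: UnitState) -> bool:
    """Return True only if starting PostgreSQL cannot pre-empt a clone.

    Assumption: a standby must have joined the peer relation with the
    master and finished cloning before it is allowed to start.
    """
    if not state.restart_granted:
        # No restart permission yet, so keep waiting.
        return False
    if state.is_master:
        # The master has nothing to clone from.
        return True
    # Standby: wait until the peer relation is joined and the clone is done.
    return state.joined_peer_with_master and state.cloned_from_master
```

In the log above, the non-leader is in the state UnitState(is_master=False, restart_granted=True, joined_peer_with_master=False, cloned_from_master=False). Under this sketch it would keep waiting rather than start PostgreSQL.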

Changed in postgresql (Juju Charms Collection):
status: New → Triaged
importance: Undecided → High
assignee: nobody → Stuart Bishop (stub)
Stuart Bishop (stub) wrote :

I've managed to reproduce this with the reactive rework branch after a lengthy test run.

Stuart Bishop (stub)
Changed in postgresql-charm:
status: New → Fix Released
importance: Undecided → High
assignee: nobody → Stuart Bishop (stub)
Changed in postgresql (Juju Charms Collection):
assignee: Stuart Bishop (stub) → nobody