KeyError for leader unit during cluster-relation-joined

Bug #1669051 reported by Francis Ginther
Affects: ubuntu-repository-cache (Juju Charms Collection)
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned

Bug Description

This Traceback was found during is-mojo-ci testing of the archive-mirror specs [1].

I also saw it during my own testing and confirmed that the leader unit is indeed missing from the relation. Here's the Traceback:

2017-02-11 02:13:38 INFO juju-log cluster:2: Cluster relation joined for ubuntu-repository-cache
2017-02-11 02:13:38 INFO juju-log cluster:2: Generating new SSH key for user www-sync.
2017-02-11 02:13:38 INFO juju-log cluster:2: Updating metadata on a peer
2017-02-11 02:13:39 INFO cluster-relation-joined Traceback (most recent call last):
2017-02-11 02:13:39 INFO cluster-relation-joined File "/var/lib/juju/agents/unit-ubuntu-repository-cache-2/charm/hooks/cluster-relation-joined", line 245, in <module>
2017-02-11 02:13:39 INFO cluster-relation-joined HOOKS.execute(sys.argv)
2017-02-11 02:13:39 INFO cluster-relation-joined File "/var/lib/juju/agents/unit-ubuntu-repository-cache-2/charm/lib/charmhelpers/core/hookenv.py", line 715, in execute
2017-02-11 02:13:39 INFO cluster-relation-joined self._hooks[hook_name]()
2017-02-11 02:13:39 INFO cluster-relation-joined File "/var/lib/juju/agents/unit-ubuntu-repository-cache-2/charm/hooks/cluster-relation-joined", line 175, in cluster_relation_joined
2017-02-11 02:13:39 INFO cluster-relation-joined mirror.peer_update_metadata()
2017-02-11 02:13:39 INFO cluster-relation-joined File "/var/lib/juju/agents/unit-ubuntu-repository-cache-2/charm/lib/ubuntu_repository_cache/mirror.py", line 296, in peer_update_metadata
2017-02-11 02:13:39 INFO cluster-relation-joined _nonleader_update_metadata()
2017-02-11 02:13:39 INFO cluster-relation-joined File "/var/lib/juju/agents/unit-ubuntu-repository-cache-2/charm/lib/ubuntu_repository_cache/mirror.py", line 202, in _nonleader_update_metadata
2017-02-11 02:13:39 INFO cluster-relation-joined leader_rel = rel[leader_id]
2017-02-11 02:13:39 INFO cluster-relation-joined KeyError: 'ubuntu-repository-cache/1'
2017-02-11 02:13:39 ERROR juju.worker.uniter.operation runhook.go:107 hook "cluster-relation-joined" failed: exit status 1

Debugging this showed that the relation list contained ubuntu-repository-cache/2 and ubuntu-repository-cache/0, but not ubuntu-repository-cache/1, which was the leader.

Thinking about this a bit, units joining relations and leadership elections are all asynchronous events. As a result, there is no guarantee that the leader unit has joined this peer relation yet. I tried the following just prior to line 202 in mirror.py:

    if leader_id not in rel:
        LOG('Leader {} not yet related'.format(leader_id))
        return

After the hook was retried, the exception no longer blocked the hook from completing. The leader unit then joined on a subsequent cluster-relation-joined event, the relation was completed, and the charm completed its mirror sync.
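
For reference, here is a minimal, self-contained sketch of that guard. It assumes the relation data is a plain dict keyed by unit name; the function name `leader_settings_or_none` and the logger setup are hypothetical stand-ins for illustration, not the charm's actual code in mirror.py.

```python
import logging

LOG = logging.getLogger(__name__)


def leader_settings_or_none(rel, leader_id):
    """Return the leader's relation settings, or None if it has not joined.

    `rel` is assumed to map unit names to their relation settings
    (hypothetical shape, chosen to mirror `rel[leader_id]` in the traceback).
    """
    if leader_id not in rel:
        # The leader has not joined the peer relation yet. Bail out quietly
        # instead of raising KeyError; a later hook run can try again.
        LOG.info('Leader %s not yet related', leader_id)
        return None
    return rel[leader_id]


# Situation from the report: units /0 and /2 are related, leader /1 is not.
rel = {
    'ubuntu-repository-cache/0': {},
    'ubuntu-repository-cache/2': {},
}
missing = leader_settings_or_none(rel, 'ubuntu-repository-cache/1')

# Once the leader joins, its settings become available.
rel['ubuntu-repository-cache/1'] = {'private-address': '10.0.0.1'}
present = leader_settings_or_none(rel, 'ubuntu-repository-cache/1')
```

Returning early rather than raising lets the hook exit cleanly, so the metadata update simply waits for a later event in which the leader is present.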

I have some changes in progress and will propose an MP shortly.

[1] https://jenkins.canonical.com/is-mojo-ci/job/live-cdo-archive-mirror/290/console

Chris Glass (tribaal)
Changed in ubuntu-repository-cache (Juju Charms Collection):
status: New → Fix Released