edge: bundle is blocked by relations not allowed to non-leader

Bug #1719328 reported by Greg Lutostanski
Affects: Canonical Juju
Status: Fix Released
Importance: Critical
Assigned to: Ian Booth
Milestone: (not shown)

Bug Description

Not sure exactly what is causing this, but openstack-base deployments are blocked in our CI for jujuedge.

Here are crashdumps of two different bad runs (2.3-alpha1+develop-8657339):
https://10.245.162.101/artifacts/65f2ae63-aae6-4cde-822d-d4d12eb84f36/deploy_prepared_bundle_542/juju-crashdump-13bc6e26-87fd-43b6-ab2d-6b17e2950a3e.tar.xz
https://10.245.162.101/artifacts/27db770f-c343-4d16-a1d8-e41792cf53cd/deploy_prepared_bundle_564/juju-crashdump-0b36ebdf-2f1d-4d08-ac29-2b137b708a18.tar.xz

And for comparison a good one for jujubeta (2.2.4):
https://10.245.162.101/artifacts/5a14a3e5-6255-445f-8c0e-0f1faf9efb43/deploy_prepared_bundle_560/juju-crashdump-f2370f48-9eae-4f48-ad07-53495665846b.tar.xz

There is a significant difference in the logs collected. I am comparing juju_debuglog.txt from the respective tarballs.

What jumps out at me are lines similar to:
manifold worker returned unexpected error: failed to initialize uniter for "unit-ceph-mon-1": cannot create relations: "ceph-mon/1" is not leader of "ceph-mon"
(small snippet here: https://pastebin.canonical.com/199251/).

This looks like a likely cause: some of the units logging this message (the ones asking for the relation) are stuck in a 'blocked' state, which blocks the deployment as a whole.
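To see which units are affected, something like the following could be run over juju_debuglog.txt. This is only a minimal sketch: the regular expression is modelled on the single error line quoted above, and the function name `count_not_leader_errors` is made up for illustration, not part of any Juju tooling.

```python
import re
from collections import Counter

# Pattern modelled on the error text quoted in this report; other Juju
# versions may word the message differently (assumption).
NOT_LEADER = re.compile(
    r'cannot create relations: "(?P<unit>[^"]+)" is not leader of "(?P<app>[^"]+)"'
)


def count_not_leader_errors(lines):
    """Return a Counter mapping unit name -> number of matching log lines."""
    counts = Counter()
    for line in lines:
        match = NOT_LEADER.search(line)
        if match:
            counts[match.group("unit")] += 1
    return counts
```

Passing it an open file handle for juju_debuglog.txt and printing `counts.most_common()` lists the noisiest units first, which can then be checked against the units reported as 'blocked'.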

Last good run is juju_2.3-alpha1+develop-98ba1ff; the earliest run that failed in this way is juju_2.3-alpha1+develop-cd0aea6.

So with git commits 98ba1ff (good) and cd0aea6 (bad), this could be a short bisect.

summary:
- On jujuedge 2.3-alpha1+develop-8657339 bundle is blocked by relations not allowed to non-leader
+ edge: bundle is blocked by relations not allowed to non-leader
tags: added: cdo-qa cdo-qa-blocker
Nicholas Skaggs (nskaggs) wrote :

Greg, can you sanitize and upload those dumps to a readable location? Attaching them to the bug would be ideal.

Greg Lutostanski (lutostag) wrote :
Greg Lutostanski (lutostag) wrote :
Greg Lutostanski (lutostag) wrote :
Ian Booth (wallyworld) wrote :

This issue is due to introducing the concept of relation status and having the unit agents set the relation status to "joined" after running the joined hook. There's a leadership check in the controller but not one in the uniter.
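The mismatch described here can be illustrated with a toy model. This is a sketch only: it is written in Python rather than Juju's Go, every name in it (`ToyController`, `mark_joined_unguarded`, `mark_joined_guarded`, `NotLeaderError`) is hypothetical, and the guarded variant is just one plausible shape for a fix, not necessarily what was committed.

```python
class NotLeaderError(Exception):
    """Raised by the toy controller when a non-leader sets relation status."""


class ToyController:
    """Stand-in for the controller side, which enforces leadership."""

    def __init__(self, leaders):
        self.leaders = leaders          # application name -> leader unit name
        self.relation_status = {}       # relation key -> status string

    def is_leader(self, unit):
        app = unit.rpartition("/")[0]   # "ceph-mon/1" -> "ceph-mon"
        return self.leaders.get(app) == unit

    def set_relation_status(self, unit, relation, status):
        if not self.is_leader(unit):
            app = unit.rpartition("/")[0]
            raise NotLeaderError(f'"{unit}" is not leader of "{app}"')
        self.relation_status[relation] = status


def mark_joined_unguarded(controller, unit, relation):
    # Every unit tries to set the status, so non-leaders hit the
    # controller's check and fail.
    controller.set_relation_status(unit, relation, "joined")


def mark_joined_guarded(controller, unit, relation):
    # The unit checks leadership first and skips the call otherwise.
    if not controller.is_leader(unit):
        return False
    controller.set_relation_status(unit, relation, "joined")
    return True
```

With `ToyController({"ceph-mon": "ceph-mon/0"})`, the unguarded call for "ceph-mon/1" raises `NotLeaderError`, while the guarded call simply returns `False` and leaves the status to the leader.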

Changed in juju:
milestone: none → 2.3-alpha1
importance: Undecided → Critical
assignee: nobody → Ian Booth (wallyworld)
status: New → Triaged
Ian Booth (wallyworld)
Changed in juju:
status: Triaged → In Progress
Ian Booth (wallyworld) wrote :
Ian Booth (wallyworld)
Changed in juju:
status: In Progress → Fix Committed
Greg Lutostanski (lutostag) wrote :

It's working in our CI at the moment. I know it's only in edge, but since that is what was broken, I'm considering it Fix Released for our purposes.

Thanks!

Changed in juju:
status: Fix Committed → Fix Released