scope: global -> scope: container relations with multiple machines: cannot read settings for unit ... settings not found

Bug #1721295 reported by Dmitrii Shcherbakov

Bug Description

When one side of a relation is:
    "scope": "global"

and the other side is:

    "scope": "container"

On a multi-machine environment some units will get this error:

ERROR cannot read settings for unit "postgresql/1" in relation "telegraf:postgresql postgresql:db-admin": settings not found

However, judging by the source code it should be supported properly:
 // If either endpoint has container scope, so must the other

It seems like the logic should be: "if either endpoint is container-scoped, both sides should be treated as container-scoped", but relation settings accesses should then be filtered properly to avoid the above error.
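
To make the rule quoted above concrete, here is a minimal sketch of how the effective scope of a relation could be derived from its two endpoints. It is an illustration only, not Juju's actual code: the function name `effective_scope` is hypothetical, and the only scope values assumed are the two that appear in this report.

```python
VALID_SCOPES = {"global", "container"}


def effective_scope(scope_a: str, scope_b: str) -> str:
    """Return the scope a relation would use given its two endpoint scopes.

    Hypothetical helper: mirrors the comment "If either endpoint has
    container scope, so must the other".
    """
    for scope in (scope_a, scope_b):
        if scope not in VALID_SCOPES:
            raise ValueError(f"unknown scope: {scope!r}")
    # One container-scoped endpoint is enough to make the whole
    # relation container-scoped.
    if "container" in (scope_a, scope_b):
        return "container"
    return "global"
```

Under this reading, the global/container pairing in this bug resolves to "container", so any settings lookup would need to be restricted to units that are actually in scope.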

John A Meinel (jameinel) wrote :

Stub also raised this on the list. My summary comment there is:

If telegraf isn't using the pgsql connection to imply "I want to store my data in postgresql", then it should probably be using a different relationship to postgresql. I imagine pgsql is likely to tell telegraf things like "here is the current master, go send your data there", but telegraf monitoring of postgresql doesn't want to monitor the master; it wants to monitor the exact postgresql unit that is on the same machine as this instance of telegraf.

I don't know the details of how telegraf is going to use its connection to postgresql, but the pgsql interface seems like it would have some very specific semantics around how you should be using the data, which may not fit a monitoring relationship in the same way it fits a 'store my data' relationship.

I could be completely wrong about that, though, and it may be that the needs are sufficiently similar that it should be ok.

I will say that if you did:
 juju deploy postgresql -n3
 juju deploy telegraf
 juju add-relation telegraf:? postgresql:?

Would it be weird that the leader of the postgresql charm *doesn't* see a relation from telegraf/1 and telegraf/2, because it is running on postgresql/0, while postgresql/1 does see telegraf/1, and so on?
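
The visibility question above can be sketched as follows. This is a simplified model, not Juju's implementation: the function `visible_remote_units` and the `placement` mapping are hypothetical, and the model assumes a container-scoped unit sees only remote units placed in the same container, while a global-scoped unit sees all of them.

```python
from typing import Dict, Iterable, List


def visible_remote_units(
    unit: str,
    scope: str,
    placement: Dict[str, str],
    remote_units: Iterable[str],
) -> List[str]:
    """List the remote units that `unit` would see in the relation.

    Hypothetical model: `placement` maps each unit name to the
    container (machine) it runs in.
    """
    if scope == "global":
        # Global scope: every remote unit is visible.
        return sorted(remote_units)
    # Container scope: only remote units sharing this unit's container.
    here = placement[unit]
    return sorted(u for u in remote_units if placement[u] == here)


# Assumed layout: three postgresql units, each with one telegraf unit
# alongside it on the same machine.
placement = {
    "postgresql/0": "0", "telegraf/0": "0",
    "postgresql/1": "1", "telegraf/1": "1",
    "postgresql/2": "2", "telegraf/2": "2",
}
telegraf_units = ["telegraf/0", "telegraf/1", "telegraf/2"]
```

In this model, `visible_remote_units("postgresql/0", "container", placement, telegraf_units)` yields only `telegraf/0`, which is the situation described above: the unit on postgresql/0 never sees telegraf/1 or telegraf/2.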

I don't know how postgresql itself does the charm data, but I would think that it would try to set up a username/password/database tables for the entire application that is connected to the pgsql endpoint, but won't be able to set up pg_hba.conf correctly because each unit of postgresql actually sees a different unit of telegraf.

John A Meinel (jameinel) wrote :

Also, I believe the code in this area has been in flux recently, because of other issues we encountered where container-scoped relations had too many other units in their scope.

John A Meinel (jameinel)
Changed in juju:
status: New → Triaged
importance: Undecided → Medium