telegraf haproxy input broken with Juju >= 2.8.7

Bug #1910974 reported by Thomas Cuthbert
This bug affects 2 people
Affects              Status        Importance  Assigned to      Milestone
Canonical Juju       Fix Released  High        Joseph Phillips  2.8.8
Content Cache Charm  Fix Released  Low         Thomas Cuthbert
Telegraf Charm       Fix Released  Low         Thomas Cuthbert  21.01

Bug Description

# Code that configures the haproxy telegraf plugin.
## Note that the comparison on lines 914-915 hasn't been changed since 2016.
29b50c58 src/reactive/telegraf.py (Xav Paice 2020-08-07 15:04:46 +1200 913) addr = rel["private-address"]
^a7836fc reactive/telegraf.py (Guillermo Gonzalez 2016-04-29 18:24:43 -0300 914) if addr == hookenv.unit_private_ip():
^a7836fc reactive/telegraf.py (Guillermo Gonzalez 2016-04-29 18:24:43 -0300 915) addr = "localhost"

# The problem
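That code would work if we called unit-get public-address instead. As shown below, public-address returns the FQDN, which matches the relation's private-address and would satisfy the comparison on line 914. unit-get private-address returns an IP address, so the comparison fails.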

juju run --unit content-cache-1ss/1 "relation-get -r haproxy-statistics:34 - telegraf-1ss/5"
ingress-address: darkbowser.canonical.com
private-address: darkbowser.canonical.com

juju run --unit telegraf-1ss/5 "relation-get -r haproxy:34 - content-cache-1ss/1"
enabled: "True"
ingress-address: darkbowser.canonical.com

port: "10000"
private-address: darkbowser.canonical.com
user: haproxy

juju run --unit telegraf-1ss/5 "unit-get private-address"
91.189.91.43

juju run --unit telegraf-1ss/5 "unit-get public-address"
darkbowser.canonical.com
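
To make the mismatch concrete, here is a minimal sketch using the values from the output above. `pick_haproxy_host` mirrors the charm's string comparison; `pick_haproxy_host_resolved` is a hypothetical variant (not the charm's code and not necessarily the fix that was adopted) that resolves both sides before comparing. Both function names are made up for illustration.

```python
import socket


def pick_haproxy_host(rel_addr, unit_addr):
    """Mirror of the charm's logic: plain string equality."""
    return "localhost" if rel_addr == unit_addr else rel_addr


def pick_haproxy_host_resolved(rel_addr, unit_addr, resolve=socket.gethostbyname):
    """Hypothetical variant: resolve both values to IPs before comparing."""
    try:
        same = resolve(rel_addr) == resolve(unit_addr)
    except OSError:
        # Resolution failed; fall back to the plain string comparison.
        same = rel_addr == unit_addr
    return "localhost" if same else rel_addr


if __name__ == "__main__":
    # relation-get returns the FQDN, unit-get private-address returns the IP,
    # so the string comparison never selects localhost.
    print(pick_haproxy_host("darkbowser.canonical.com", "91.189.91.43"))
```

With Juju >= 2.8.7 the first function returns the FQDN rather than "localhost", so telegraf is pointed at an address where the content-cache haproxy statistics endpoint may not be listening.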

Haw Loeung (hloeung) wrote :

Not a bug in the content-cache charm.

Changed in content-cache-charm:
status: New → Invalid
Haw Loeung (hloeung) wrote :

I think this is a bug in the telegraf charm, this logic here[1]:

| addr = rel["private-address"]
| if addr == hookenv.unit_private_ip():
| addr = "localhost"

If this is a 'subordinate' charm, would there ever be a case where it's not scraping haproxy stats from a locally running haproxy instance?

[1]https://git.launchpad.net/charm-telegraf/tree/src/reactive/telegraf.py#n913

Haw Loeung (hloeung) wrote :

I think the best solution here is to update the haproxy[1] and content-cache[2] charms to pass through the listen address for haproxy statistics; they both already pass through the username, password, and port. Then have the telegraf charm check for the presence of this address and use it if it's something other than 0.0.0.0. If it doesn't exist, try to work it out with the existing code (rel["private-address"] and hookenv.unit_private_ip()).

The HAProxy charm defaults to serving statistics on 0.0.0.0, which is why it doesn't appear broken with the latest Juju version: whether telegraf uses localhost or the private address, HAProxy statistics will answer on both interfaces. The Content-cache charm, however, configures statistics to listen on 127.0.0.1 only.

[1]https://bazaar.launchpad.net/~haproxy-team/charm-haproxy/trunk/view/head:/hooks/hooks.py#L1437
[2]https://git.launchpad.net/~hloeung/content-cache-charm/tree/reactive/content_cache.py#n483
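
A rough sketch of what that selection logic could look like on the telegraf side. The relation key name "listen-address" and the function name are assumptions for illustration; neither charm defines them in this report. `unit_private_ip` stands in for the value of hookenv.unit_private_ip().

```python
def haproxy_stats_host(rel, unit_private_ip):
    """Choose the host telegraf should scrape haproxy statistics from.

    rel is the relation data as a dict; "listen-address" is a hypothetical key.
    """
    listen = rel.get("listen-address")
    if listen and listen != "0.0.0.0":
        # The providing charm told us exactly where statistics are served.
        return listen
    # Fall back to the existing behaviour.
    addr = rel["private-address"]
    if addr == unit_private_ip:
        addr = "localhost"
    return addr
```

For content-cache, which listens on 127.0.0.1, this would return 127.0.0.1 regardless of what unit-get reports; for haproxy's default of 0.0.0.0 it falls through to the existing comparison.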

Joseph Phillips (manadart) wrote :

The relation settings must have been written/re-written with the FQDN as the private address. The charm logic that does that might be of interest.

Note that for manually provisioned machines, the provider always returns the address used to provision it (presumably, "juju add-machine ssh:<email address hidden>") when queried.

"unit-get" appears to be getting an IP address because it is using the NetworkInfo method, which queries link-layer devices.

Something curious though; can you check your logs for errors? Based on the model.yaml on your private fileshare, all of your link-layer device addresses have an origin of "provider". This is set as an upgrade step and I would expect that many of these would be relinquished to the machine since.

Joseph Phillips (manadart) wrote :

Actually scratch that. Manual machines are not updated by the instance-poller, so the addresses will remain as they are.

Joseph Phillips (manadart) wrote :

This is happening upon (re)entering relation scope. I am looking into it.

This bug is not against Juju, but it looks like a symptom of https://bugs.launchpad.net/bugs/1911135.

Joseph Phillips (manadart) wrote :

I have got to the bottom of this.

Juju changed behaviour for the "network-get" tool, pushing host name resolution from the hook context to be behind the API backing.

What was missed is that for "unit-get private-address", we call the same backing API first before falling back if required to the "private-address" set on the hook context.

So now, "unit-get private-address" will return FQDNs as IPs where it can resolve them.

Although this is an unintended change (2.8.6 -> 2.8.7), using "unit-get" for addresses is deprecated and intended for removal in Juju 3.0. The network-get tool should be used instead.

Will your use-case work if instead of comparing the results from "relation-get" and "unit-get", we use "network-get --ingress-address ..." in each case, using "--relation <rel>" for the first?
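
As a rough illustration of that suggestion, the comparison could be sketched as below. This assumes it runs inside a hook context where the network-get hook tool is on the PATH, that the binding name is "haproxy" (taken from the relation name in the report), and that the option order shown is accepted; the helper names and the injectable `run` parameter are made up for this sketch.

```python
import subprocess


def network_get_cmd(binding, relation_id=None):
    """Build the network-get command line for an ingress address lookup."""
    cmd = ["network-get", "--ingress-address"]
    if relation_id is not None:
        cmd += ["--relation", relation_id]
    cmd.append(binding)
    return cmd


def ingress_address(binding, relation_id=None, run=subprocess.check_output):
    """Run network-get and return its output as a stripped string."""
    out = run(network_get_cmd(binding, relation_id))
    if isinstance(out, bytes):
        out = out.decode()
    return out.strip()


def is_local(binding, relation_id, run=subprocess.check_output):
    """Compare the relation-scoped address with the unit's own address."""
    return ingress_address(binding, relation_id, run) == ingress_address(binding, None, run)
```

Because both values come from the same tool, they should be in the same form (both IPs or both names), which is what the relation-get versus unit-get comparison no longer guarantees.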

Changed in juju:
status: New → In Progress
importance: Undecided → High
assignee: nobody → Joseph Phillips (manadart)
milestone: none → 2.8.8
Joseph Phillips (manadart) wrote :

https://github.com/juju/juju/pull/12528 addresses this for 2.8.8.

Changed in juju:
status: In Progress → Fix Committed
Haw Loeung (hloeung)
Changed in charm-telegraf:
assignee: nobody → Thomas Cuthbert (tcuthbert)
status: New → Fix Committed
Changed in content-cache-charm:
assignee: nobody → Thomas Cuthbert (tcuthbert)
importance: Undecided → Low
Changed in charm-telegraf:
importance: Undecided → Low
Changed in content-cache-charm:
status: Invalid → Fix Committed
Celia Wang (ziyiwang)
Changed in charm-telegraf:
status: Fix Committed → Fix Released
milestone: none → 21.01
Changed in content-cache-charm:
status: Fix Committed → Fix Released
Changed in juju:
status: Fix Committed → Fix Released