Missing network binding after Juju upgrade

Bug #1950835 reported by Nicolas Bock
This bug affects 1 person
Affects: Canonical Juju
Status: Triaged
Importance: High
Assigned to: Unassigned
Milestone: (none)

Bug Description

After upgrading Juju from 2.6 to 2.9, we cannot upgrade the nrpe charm because of a missing network binding:

2021-11-09 13:19:30 INFO unit.nrpe-host/45.juju-log server.go:314 Getting ingress IP address for binding monitors
2021-11-09 13:19:30 WARNING unit.nrpe-host/45.install logger.go:60 Traceback (most recent call last):
2021-11-09 13:19:30 WARNING unit.nrpe-host/45.install logger.go:60 File "/var/lib/juju/agents/unit-nrpe-host-45/charm/hooks/install", line 6, in <module>
2021-11-09 13:19:30 WARNING unit.nrpe-host/45.install logger.go:60 services.manage()
2021-11-09 13:19:30 WARNING unit.nrpe-host/45.install logger.go:60 File "/var/lib/juju/agents/unit-nrpe-host-45/charm/hooks/services.py", line 34, in manage
2021-11-09 13:19:30 WARNING unit.nrpe-host/45.install logger.go:60 nrpe_helpers.NagiosInfo(),
2021-11-09 13:19:30 WARNING unit.nrpe-host/45.install logger.go:60 File "/var/lib/juju/agents/unit-nrpe-host-45/charm/hooks/nrpe_helpers.py", line 296, in __init__
2021-11-09 13:19:30 WARNING unit.nrpe-host/45.install logger.go:60 self["nrpe_ipaddress"] = get_local_ingress_address("monitors")
2021-11-09 13:19:30 WARNING unit.nrpe-host/45.install logger.go:60 File "/var/lib/juju/agents/unit-nrpe-host-45/charm/hooks/nrpe_helpers.py", line 84, in get_local_ingress_address
2021-11-09 13:19:30 WARNING unit.nrpe-host/45.install logger.go:60 network_info = hookenv.network_get(binding)
2021-11-09 13:19:30 WARNING unit.nrpe-host/45.install logger.go:60 File "/var/lib/juju/agents/unit-nrpe-host-45/charm/hooks/charmhelpers/core/hookenv.py", line 1392, in network_get
2021-11-09 13:19:30 WARNING unit.nrpe-host/45.install logger.go:60 stderr=subprocess.STDOUT).decode('UTF-8').strip()
2021-11-09 13:19:30 WARNING unit.nrpe-host/45.install logger.go:60 File "/usr/lib/python3.6/subprocess.py", line 356, in check_output
2021-11-09 13:19:30 WARNING unit.nrpe-host/45.install logger.go:60 **kwargs).stdout
2021-11-09 13:19:30 WARNING unit.nrpe-host/45.install logger.go:60 File "/usr/lib/python3.6/subprocess.py", line 438, in run
2021-11-09 13:19:30 WARNING unit.nrpe-host/45.install logger.go:60 output=stdout, stderr=stderr)
2021-11-09 13:19:30 WARNING unit.nrpe-host/45.install logger.go:60 subprocess.CalledProcessError: Command '['network-get', 'monitors', '--format', 'yaml']' returned non-zero exit status 1.
2021-11-09 13:19:30 ERROR juju.worker.uniter.operation runhook.go:139 hook "install" (via explicit, bespoke hook script) failed: exit status 1
2021-11-09 13:19:30 DEBUG juju.machinelock machinelock.go:186 machine lock released for nrpe-host/45 uniter (run install hook)
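
The traceback above only shows that `network-get monitors --format yaml` exited with status 1, not what the hook tool printed. As a rough sketch (not the charm's or charmhelpers' actual code), a wrapper like the following would capture the tool's output on failure. The function name `network_get_output`, the injectable `runner` parameter, and the fake error text are all assumptions made for illustration; only the command line and `stderr=subprocess.STDOUT` come from the traceback.

```python
import subprocess


def network_get_output(binding, runner=subprocess.check_output):
    """Run the network-get hook tool and return (ok, text).

    `runner` is injectable only so this sketch can be exercised outside a
    hook context, where the network-get tool is not available.
    """
    cmd = ["network-get", binding, "--format", "yaml"]
    try:
        out = runner(cmd, stderr=subprocess.STDOUT)
        return True, out.decode("UTF-8").strip()
    except subprocess.CalledProcessError as err:
        # With stderr=STDOUT, err.output holds whatever the tool printed.
        detail = (err.output or b"").decode("UTF-8", errors="replace").strip()
        return False, detail


def fake_runner(cmd, stderr=None):
    # Hypothetical stand-in for the hook tool; the message text is invented.
    raise subprocess.CalledProcessError(1, cmd, output=b"example failure text")


print(network_get_output("monitors", runner=fake_runner))
```

Logging the second element of the returned tuple from inside the hook would show the hook tool's own message next to the Python traceback.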

The nagios <--> nrpe relation is in place.

Potentially interesting is this information from the Juju database:

[0] $ bsondump machines.bson 2>/dev/null | jq 'select(."model-uuid" == "be9803c0-a874-4e54-8b34-5be074c231e8")|select(.machineid == "0/kvm/0")|.addresses'
[] #<- empty json array.
(The unit is running on 0/kvm/0)

And

[1] $ bsondump machines.bson 2>/dev/null | jq 'select(."model-uuid" == "be9803c0-a874-4e54-8b34-5be074c231e8")|select(.machineid == "0")|.addresses'
[{
"value": "10.150.0.107",
"addresstype": "ipv4",
"networkscope": "local-cloud",
"origin": "provider",
"spaceid": "4"
},{
"value": "10.152.0.61",
"addresstype": "ipv4",
"networkscope": "local-cloud",
"origin": "provider",
"spaceid": "8"
},
...
]
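
To check every machine in the model rather than one at a time, the two jq queries above could be generalized along these lines. This is a sketch under assumptions: it expects one JSON document per line (as bsondump prints by default) and uses only the field names that appear in the queries (`model-uuid`, `machineid`, `addresses`). The function name and the sample documents are made up for illustration.

```python
import json


def machines_without_addresses(lines, model_uuid):
    """Yield machine IDs in `model_uuid` whose `addresses` is empty or missing."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        doc = json.loads(line)
        if doc.get("model-uuid") != model_uuid:
            continue
        if not doc.get("addresses"):
            yield doc.get("machineid")


# Invented sample documents shaped like the bsondump output above.
sample = [
    '{"model-uuid": "m1", "machineid": "0", "addresses": [{"value": "10.150.0.107"}]}',
    '{"model-uuid": "m1", "machineid": "0/kvm/0", "addresses": []}',
    '{"model-uuid": "m2", "machineid": "3", "addresses": []}',
]
print(list(machines_without_addresses(sample, "m1")))
```

Passing `sys.stdin` as `lines` would let the function read the output of `bsondump machines.bson 2>/dev/null` through a pipe.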

description: updated
John A Meinel (jameinel) wrote :

This feels like a case where 2.6 didn't record some sort of information that 2.9 depends on, which we very likely have an upgrade step to sort out for most cases. I have the feeling we missed the case where the underlying 'machine' is a nested KVM guest.

I do believe you can run 'juju refresh --force-units' (juju upgrade-charm) to change the content on disk even if the charm is failing a hook. However, that wouldn't necessarily fix the underlying issue of why Juju doesn't know the network information for that unit.

Changed in juju:
importance: Undecided → High
status: New → Triaged
Nicolas Bock (nicolasbock) wrote :

Thanks for the hint about `refresh`, John.

In this case the upgrade did not go straight to 2.9: the environment was upgraded to 2.7 and then 2.8 in between, and both controller and model were upgraded at each step. The issue with the nrpe charm was only discovered, though, when trying to upgrade the charm using Juju 2.9. In other words, we don't know when the information you mention went missing.

Joseph Phillips (manadart) wrote :

Can you get me a dump of the Mongo DB?

JUJU_DEV_FEATURE_FLAGS=developer-mode juju dump-db > dump.yaml

You might have to redact certificates and such from the output.
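
One possible way to do that redaction, as a sketch only: the following replaces PEM-armored blocks (certificates, private keys) in the dump text with a placeholder. It assumes the sensitive material in dump.yaml appears between `-----BEGIN ...-----` and `-----END ...-----` markers. Passwords and other secrets stored in different forms would still need a manual pass. The function name `redact_pem` and the sample text are invented for illustration.

```python
import re

# Matches any PEM block whose BEGIN and END labels agree.
PEM_BLOCK = re.compile(
    r"-----BEGIN ([A-Z0-9 ]+)-----.*?-----END \1-----",
    re.DOTALL,
)


def redact_pem(text):
    """Replace each PEM block with a short placeholder naming its type."""
    return PEM_BLOCK.sub(lambda m: "<redacted %s>" % m.group(1), text)


# Invented sample; not taken from a real dump.
sample = (
    "cert: |\n"
    "  -----BEGIN CERTIFICATE-----\n"
    "  MIIB\n"
    "  -----END CERTIFICATE-----\n"
    "name: x\n"
)
print(redact_pem(sample))
```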

Nicolas Bock (nicolasbock) wrote :