relation-list missing a unit

Bug #1988728 reported by Liam Young
This bug affects 1 person
Affects         Status  Importance  Assigned to  Milestone
Canonical Juju  New     Undecided   Unassigned

Bug Description

After an OpenStack deployment completed, the hacluster subordinate reported that it was not at the expected scale. Investigation showed that unit hacluster-aodh/1 was missing from the hanode relation, but only from hacluster-aodh/0's point of view. Inspecting the relation from hacluster-aodh/1's point of view correctly shows both peer units (hacluster-aodh/0 and hacluster-aodh/2).

juju version: 2.9.33-ubuntu-amd64

$ juju run --application hacluster-aodh "relation-ids hanode"
- Stdout: |
    hanode:8
  UnitId: hacluster-aodh/1
- Stdout: |
    hanode:8
  UnitId: hacluster-aodh/2
- Stdout: |
    hanode:8
  UnitId: hacluster-aodh/0

$ juju run --application hacluster-aodh "relation-list -r hanode:8"
- Stdout: |
    hacluster-aodh/2
  UnitId: hacluster-aodh/0
- Stdout: |
    hacluster-aodh/0
    hacluster-aodh/2
  UnitId: hacluster-aodh/1
- Stdout: |
    hacluster-aodh/0
    hacluster-aodh/1
  UnitId: hacluster-aodh/2
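
The asymmetry in the output above can be checked mechanically. The following is a minimal sketch, not part of the original report: the `views` mapping is copied by hand from the `relation-list` output, and `missing_peers` is a hypothetical helper name. It assumes every unit of the application should see every other unit in the peer relation.

```python
# Hypothetical helper: reports which peers each unit fails to see.
# Assumes all units in `views` should list every other unit as a peer.
def missing_peers(views):
    """Return {unit: sorted peers absent from that unit's relation-list}."""
    all_units = set(views)
    result = {}
    for unit, seen in views.items():
        absent = all_units - {unit} - set(seen)
        if absent:
            result[unit] = sorted(absent)
    return result


# Peer views transcribed from the relation-list output for hanode:8.
views = {
    "hacluster-aodh/0": ["hacluster-aodh/2"],
    "hacluster-aodh/1": ["hacluster-aodh/0", "hacluster-aodh/2"],
    "hacluster-aodh/2": ["hacluster-aodh/0", "hacluster-aodh/1"],
}

print(missing_peers(views))
```

With the data from this report, only hacluster-aodh/0 is flagged, and the peer it is missing is hacluster-aodh/1.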

$ juju status aodh
Model      Controller        Cloud/Region        Version  SLA          Timestamp
openstack  foundations-maas  maas_cloud/default  2.9.33   unsupported  08:46:42Z

SAAS        Status  Store             URL
grafana     active  foundations-maas  admin/lma-maas.grafana
graylog     active  foundations-maas  admin/lma-maas.graylog
nagios      active  foundations-maas  admin/lma-maas.nagios
prometheus  active  foundations-maas  admin/lma-maas.prometheus

App                       Version  Status       Scale  Charm                     Channel      Rev  Exposed  Message
aodh                      14.0.0   active           3  aodh                      yoga/stable   77  no       Unit is ready
aodh-mysql-router         8.0.30   active           3  mysql-router              8.0/stable    35  no       Unit is ready
filebeat                  6.8.23   active           3  filebeat                  candidate     38  no       Filebeat ready.
hacluster-aodh                     waiting          3  hacluster                 edge         109  no       Resource: res_aodh_272179f_vip not yet configured
logrotated                         active           3  logrotated                candidate      7  no       Unit is ready.
nrpe                               active           3  nrpe                      candidate     94  no       Ready
prometheus-grok-exporter           maintenance      3  prometheus-grok-exporter  candidate      8  no       Installing software
public-policy-routing              active           3  advanced-routing          candidate     11  no       Unit is ready
telegraf                           active           3  telegraf                  candidate     54  no       Monitoring ceph-osd/2 (source version/commit 76901fd)

Unit                           Workload  Agent  Machine  Public address  Ports          Message
aodh/0*                        active    idle   0/lxd/0  10.246.165.92   8042/tcp       Unit is ready
  aodh-mysql-router/0*         active    idle            10.246.165.92                  Unit is ready
  filebeat/30                  active    idle            10.246.165.92                  Filebeat ready.
  hacluster-aodh/0*            waiting   idle            10.246.165.92                  Resource: res_aodh_272179f_vip not yet configured
  logrotated/24                active    idle            10.246.165.92                  Unit is ready.
  nrpe/36                      active    idle            10.246.165.92   icmp,5666/tcp  Ready
  prometheus-grok-exporter/31  active    idle            10.246.165.92   9144/tcp       Unit is ready
  public-policy-routing/13     active    idle            10.246.165.92                  Unit is ready
  telegraf/29                  active    idle            10.246.165.92   9103/tcp       Monitoring aodh/0 (source version/commit 76901fd)
aodh/1                         active    idle   1/lxd/0  10.246.166.208  8042/tcp       Unit is ready
  aodh-mysql-router/1          active    idle            10.246.166.208                 Unit is ready
  filebeat/31                  active    idle            10.246.166.208                 Filebeat ready.
  hacluster-aodh/1             waiting   idle            10.246.166.208                 Resource: res_aodh_272179f_vip not yet configured
  logrotated/25                active    idle            10.246.166.208                 Unit is ready.
  nrpe/37                      active    idle            10.246.166.208  icmp,5666/tcp  Ready
  prometheus-grok-exporter/30  active    idle            10.246.166.208  9144/tcp       Unit is ready
  public-policy-routing/14     active    idle            10.246.166.208                 Unit is ready
  telegraf/31                  active    idle            10.246.166.208  9103/tcp       Monitoring aodh/1 (source version/commit 76901fd)
aodh/2                         active    idle   2/lxd/0  10.246.165.66   8042/tcp       Unit is ready
  aodh-mysql-router/2          active    idle            10.246.165.66                  Unit is ready
  filebeat/64                  active    idle            10.246.165.66                  Filebeat ready.
  hacluster-aodh/2             waiting   idle            10.246.165.66                  Resource: res_aodh_272179f_vip not yet configured
  logrotated/56                active    idle            10.246.165.66                  Unit is ready.
  nrpe/68                      active    idle            10.246.165.66   icmp,5666/tcp  Ready
  prometheus-grok-exporter/62  active    idle            10.246.165.66   9144/tcp       Unit is ready
  public-policy-routing/35     active    idle            10.246.165.66                  Unit is ready
  telegraf/63                  active    idle            10.246.165.66   9103/tcp       Monitoring aodh/2 (source version/commit 76901fd)

Machine  State    Address         Inst id               Series  AZ     Message
0        started  10.246.164.163  solqa-lab1-server-37  jammy   zone1  Deployed
0/lxd/0  started  10.246.165.92   juju-30865a-0-lxd-0   jammy   zone1  Container started
1        started  10.246.166.192  solqa-lab1-server-32  jammy   zone2  Deployed
1/lxd/0  started  10.246.166.208  juju-30865a-1-lxd-0   jammy   zone2  Container started
2        started  10.246.165.238  solqa-lab1-server-33  jammy   zone3  Deployed
2/lxd/0  started  10.246.165.66   juju-30865a-2-lxd-0   jammy   zone3  Container started

Ian Booth (wallyworld) wrote :

Ideally we'd get the output of juju show-unit for the affected units, plus the output of juju dump-db (after exporting JUJU_DEV_FEATURE_FLAGS=developer-mode).

That will show us the internal state of the Juju model, and we can use it together with the logs to try to work out what's going on.
