Autopilot: Nagios uses the wrong subnet IP to reach one host

Bug #1615013 reported by David Coronel
This bug affects 3 people
Affects | Status | Importance | Assigned to | Milestone
Canonical Juju | Expired | Medium | Unassigned |
Landscape Server | Invalid | Undecided | Unassigned |
MAAS | Invalid | Undecided | Unassigned |
nagios (Juju Charms Collection) | New | Undecided | Unassigned |

Bug Description

After deploying OpenStack with Landscape Autopilot, one of the hosts fails every check in Nagios with a "CHECK_NRPE: Error - Could not complete SSL handshake" error. The error occurs in every Autopilot deployment, and always on the same node.

The issue is caused by Nagios having the wrong IP address for that host in /etc/nagios3/conf.d/charm.cfg

The host entry in /etc/nagios3/conf.d/charm.cfg points to the private IP instead of the public IP:

define host {
  use generic-host
  statusmap_image base/ubuntu.gd2
  icon_image_alt Ubuntu Linux
  vrml_image ubuntu.png
  host_name <REMOVED - hostname of the host that fails>
  icon_image base/ubuntu.png
  address <REMOVED - private IP of the host>
}

Here is another entry from a similar host that works fine in Nagios:

define host {
  use generic-host
  statusmap_image base/ubuntu.gd2
  icon_image_alt Ubuntu Linux
  vrml_image ubuntu.png
  host_name <REMOVED - hostname of a host that works>
  icon_image base/ubuntu.png
  address <REMOVED - FQDN of the host in the form hostname.maas>
}

hostname.maas points to the public IP of the host instead of the private IP.
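
To spot entries of the failing kind, one option is to scan the config for host blocks whose address is a bare IP literal rather than a DNS name. The following is only a rough sketch under assumptions: the helper names (parse_host_blocks, hosts_with_ip_address) are hypothetical, and it assumes each "define host { ... }" block is flat with one directive per line, as in the entries above. A bare IP is not necessarily wrong; it is just the pattern seen in the failing entry here.

```python
import ipaddress
import re

# Matches one "define host { ... }" block; assumes no nested braces.
HOST_BLOCK = re.compile(r"define\s+host\s*\{(.*?)\}", re.DOTALL)


def parse_host_blocks(cfg_text):
    """Return one dict per host block, mapping directive name to value."""
    hosts = []
    for match in HOST_BLOCK.finditer(cfg_text):
        directives = {}
        for line in match.group(1).splitlines():
            parts = line.strip().split(None, 1)
            if len(parts) == 2:
                directives[parts[0]] = parts[1].strip()
        hosts.append(directives)
    return hosts


def hosts_with_ip_address(cfg_text):
    """Return host_name values whose address is an IP literal, not a DNS name."""
    flagged = []
    for host in parse_host_blocks(cfg_text):
        try:
            ipaddress.ip_address(host.get("address", ""))
        except ValueError:
            continue  # not an IP literal (for example an FQDN), so skip it
        flagged.append(host.get("host_name"))
    return flagged
```

Reading /etc/nagios3/conf.d/charm.cfg and passing its contents to hosts_with_ip_address would list the candidates to check by hand.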

The workaround is to change the address line of the bad host entry to <hostname>.maas, which points to the public IP, and then restart Nagios.
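
For anyone applying the workaround to several deployments, the edit to the address line could be scripted roughly as follows. This is a sketch only, not part of the charm: the function name set_host_address is hypothetical, it assumes flat "define host { ... }" blocks like the ones above, and it works on the file contents as a string so the result can be reviewed before it is written back. Nagios still has to be restarted afterwards, as described above.

```python
import re

# Captures the opening, the body, and the closing brace of each host block.
HOST_BLOCK = re.compile(r"(define\s+host\s*\{)(.*?)(\})", re.DOTALL)


def set_host_address(cfg_text, host_name, new_address):
    """Return cfg_text with the address of the named host replaced."""

    def fix_block(match):
        body = match.group(2)
        # Only touch the block whose host_name matches exactly.
        name_line = r"^\s*host_name\s+" + re.escape(host_name) + r"\s*$"
        if not re.search(name_line, body, re.MULTILINE):
            return match.group(0)
        # Keep the original indentation and swap only the address value.
        body = re.sub(
            r"^(\s*address\s+)\S+[ \t]*$",
            lambda m: m.group(1) + new_address,
            body,
            flags=re.MULTILINE,
        )
        return match.group(1) + body + match.group(3)

    return HOST_BLOCK.sub(fix_block, cfg_text)
```

Blocks for other hosts are returned unchanged, and an unknown host name leaves the whole text untouched.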

We suspect that MAAS or Juju is not doing the right thing somewhere, which leads to an incorrect entry in the /etc/nagios3/conf.d/charm.cfg file.

David Britton (dpb)
information type: Proprietary → Private
David Britton (dpb)
Changed in landscape:
status: New → Invalid
tags: added: kanban-cross-team landscape
tags: removed: kanban-cross-team
David Britton (dpb)
information type: Private → Public
Andres Rodriguez (andreserl) wrote :

MAAS doesn't really differentiate between public and private networks. We always create a DNS record as <hostname>.<domain> for the PXE interface. Starting with 2.0+, new records will be created as <ethX>.<hostname>.<domain>, so it is Juju that should decide which interface it wants to connect to, and grab the correct hostname/domain/IP for it.

Changed in maas:
status: New → Won't Fix
status: Won't Fix → Invalid
Changed in juju-core:
status: New → Triaged
importance: Undecided → Medium
milestone: none → 2.1.0
affects: juju-core → juju
Changed in juju:
milestone: 2.1.0 → none
milestone: none → 2.1.0
Curtis Hovey (sinzui)
Changed in juju:
milestone: 2.1-rc2 → none
Canonical Juju QA Bot (juju-qa-bot) wrote :

This bug has not been updated in 5 years, so we're marking it Expired. If you believe this is incorrect, please update the status.

Changed in juju:
status: Triaged → Expired
tags: added: expirebugs-bot