Monasca Agent Libvirt issues with unqualified hostname

Bug #1894076 reported by Mariusz Karpiarz
Affects: kolla-ansible
Status: In Progress
Importance: Undecided
Assigned to: Mariusz Karpiarz
Milestone: (none)

Bug Description

In an environment where hypervisors are identified with their FQDNs:

```
$ openstack compute service list | grep compute-intel01
| 34 | nova-compute | compute-intel01.novalocal | nova | enabled | up | 2020-09-03T09:38:23.000000 |
```
and with monasca-agent-collector's config generated from the template at https://opendev.org/openstack/kolla-ansible/blame/commit/295f8d1b43cf2003f66634389fc0615609ee0f86/ansible/roles/monasca/templates/monasca-agent-collector/agent-collector.yml.j2, Monasca Agent cannot find the `nova-compute` service on the hypervisor and ignores all the instances running on it:

```
[centos@compute-intel01 ~]$ sudo grep hostname /etc/kolla/monasca-agent-collector/agent-collector.yml
  hostname: compute-intel01
[centos@compute-intel01 ~]$ sudo cat /var/lib/docker/volumes/kolla_logs/_data/monasca/agent-collector.log
...
2020-09-03 09:17:16 UTC | WARNING | collector | monasca_agent.collector.checks.check.libvirt(libvirt.py:160) | No 'nova-compute' service found on host: compute-intel01
2020-09-03 09:17:16 UTC | ERROR | collector | monasca_agent.collector.checks.check.libvirt(libvirt.py:822) | instance-0000010c is not known to nova after instance cache update -- skipping this ghost VM.
2020-09-03 09:17:16 UTC | ERROR | collector | monasca_agent.collector.checks.check.libvirt(libvirt.py:822) | instance-000000ef is not known to nova after instance cache update -- skipping this ghost VM.
...
```
As a result, the list of instances for this host is empty and the Libvirt plugin emits no instance-related metrics.

Removing the "hostname" config parameter fixes the problem:

```
[centos@compute-intel01 ~]$ sudo grep hostname /etc/kolla/monasca-agent-collector/agent-collector.yml
  #hostname: compute-intel01
[centos@compute-intel01 ~]$ sudo cat /var/lib/docker/volumes/kolla_logs/_data/monasca/agent-collector.log
...
2020-09-03 09:18:53 UTC | INFO | collector | monasca_agent.collector.checks.check.libvirt(libvirt.py:154) | Found 'nova-compute' registered with host: compute-intel01.novalocal
2020-09-03 09:18:54 UTC | INFO | collector | monasca_agent.collector.checks.collector(collector.py:99) | Finished run #1. Collection time: 1.26s.
...
```
This is because the Agent cannot match the unqualified "hostname" parameter against the fully-qualified hypervisor name:
https://opendev.org/openstack/monasca-agent/src/commit/e021416e600f81d5ed544b1c66675719c3a896b6/monasca_agent/collector/checks_d/libvirt.py#L142-L163
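The mismatch can be illustrated with a simplified model. This is a sketch and not the Agent's actual code: the function name `find_compute_service` is hypothetical, and the exact string comparison is an assumption consistent with the log output above.

```python
def find_compute_service(agent_hostname, registered_hosts):
    """Return the nova-compute host that equals agent_hostname, or None.

    Hypothetical helper: assumes the lookup is an exact string match,
    which is what the warning in the log above suggests.
    """
    for host in registered_hosts:
        if host == agent_hostname:
            return host
    return None


# Host name as registered in nova (from `openstack compute service list`).
registered = ["compute-intel01.novalocal"]

# The short name from agent-collector.yml does not match the FQDN.
print(find_compute_service("compute-intel01", registered))

# The fully-qualified name does match.
print(find_compute_service("compute-intel01.novalocal", registered))
```

With the short name the lookup returns nothing, so every instance on the host is later treated as unknown to nova.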

The Agent can work out the hostname of the node it's running on automatically (using `hostname -f`) but the config parameter takes precedence:
https://opendev.org/openstack/monasca-agent/src/commit/e021416e600f81d5ed544b1c66675719c3a896b6/monasca_agent/common/util.py#L395-L404
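The precedence described above can be sketched roughly as follows. The names `resolve_hostname` and `detect_fqdn` are hypothetical and the real logic in `util.py` may differ in detail; the only behaviour taken from this report is that a configured "hostname" wins over the value detected with `hostname -f`.

```python
import subprocess


def detect_fqdn():
    """Ask the OS for the fully-qualified name, as `hostname -f` does."""
    return subprocess.check_output(["hostname", "-f"], text=True).strip()


def resolve_hostname(config, detect=detect_fqdn):
    """Return the configured hostname if set, otherwise the detected one.

    Hypothetical helper: `config` stands in for the parsed agent config.
    """
    configured = config.get("hostname")
    if configured:
        return configured  # config value takes precedence, even if unqualified
    return detect()


# Stub detector so the example does not depend on the local machine.
def fake_detect():
    return "compute-intel01.novalocal"


print(resolve_hostname({"hostname": "compute-intel01"}, fake_detect))
print(resolve_hostname({}, fake_detect))
```

With "hostname" set, the unqualified value is used as-is; with it removed (or commented out), the detected FQDN is used, which matches the behaviour seen in the second log excerpt.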

Therefore I suggest we either remove the "hostname" parameter entirely or use `ansible_fqdn` instead.

Changed in kolla-ansible:
assignee: nobody → Mariusz Karpiarz (mkarpiarz)
status: New → In Progress
Tom Fifield (fifieldt) wrote :

Hi Mariusz, are you still interested in working on this?
