Allocation data should be taken from a placement API

Bug #1938078 reported by Dariusz Smigiel
This bug affects 2 people
Affects: Prometheus Openstack Exporter Charm
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

When we configured ram-allocation-ratio and cpu-allocation-ratio as 0 for nova-cloud-controller and updated prometheus-openstack-exporter with the corresponding values, POE reports negative values such as "-32".

https://github.com/CanonicalLtd/prometheus-openstack-exporter/blob/747a1475a718cd4811fb945d2acd7e3b2e019db7/prometheus-openstack-exporter#L379
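A minimal sketch of the arithmetic behind the negative values (function and parameter names here are hypothetical, not the exporter's actual code): capacity is computed as physical vCPUs times a single cloud-wide ratio, so a ratio of 0 collapses capacity to zero and anything already in use pushes the result negative.

```python
def schedulable_vcpus(host_vcpus, used_vcpus, cpu_allocation_ratio):
    # Capacity = physical vCPUs scaled by one cloud-wide ratio, minus usage.
    return host_vcpus * cpu_allocation_ratio - used_vcpus

# With the ratio configured as 0, capacity is zero and the result goes
# negative by exactly the number of vCPUs in use:
print(schedulable_vcpus(32, 32, 0))  # -32
```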

Adam Dyess (addyess)
Changed in charm-prometheus-openstack-exporter:
status: New → Confirmed
Revision history for this message
Adam Dyess (addyess) wrote (last edit):

In fact, starting with the OpenStack Train release and beyond, all of these allocation ratios (vCPU/memory/disk) are governed by the placement API -- not nova-cloud-controller's relation data.

This is fundamentally broken

According to the placement API, each hypervisor can be scheduled with a different ratio for each resource class: disk, vCPU, and memory.

You can have
 4 Hypervisors which are scheduled 1:1 and 16 vCPUs
 4 Hypervisors which are scheduled 1:4 and 16 vCPUs
 4 Hypervisors which are scheduled 1:8 and 16 vCPUs
 4 Hypervisors which are scheduled 1:16 and 16 vCPUs

If each one of these machines had a single VM that took 16 vCPUs
* the first group would have all of its vCPUs consumed
* the second group would have 1/4th of its vCPUs consumed
* the third group would have 1/8th of its vCPUs consumed
* the fourth group would have 1/16th of its vCPUs consumed
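The per-host arithmetic in the example above can be checked with a few lines of Python (free vCPUs per hypervisor = physical vCPUs * ratio - vCPUs used; numbers taken directly from the example):

```python
# Each group: 4 hypervisors, 16 physical vCPUs each, one 16-vCPU VM per host.
ratios = {"1:1": 1, "1:4": 4, "1:8": 8, "1:16": 16}

for label, ratio in ratios.items():
    free = 16 * ratio - 16  # schedulable vCPUs remaining on one hypervisor
    print(f"{label}: {free} vCPUs free per hypervisor")
```

This prints 0, 48, 112, and 240 free vCPUs for the four groups respectively.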

Imagine the nova-cloud-controller relation said the CPU allocation ratio cloud-wide was 1:2.

p-o-e would take the number of CPUs on each host and multiply by 2 to determine the "max vCPUs available" (nope, this is already wrong).

All the graphs in Grafana would look like there were
16 vCPUs left to schedule per host in the first group - really 0
16 vCPUs left to schedule per host in the second group - really 48
16 vCPUs left to schedule per host in the third group - really 112
16 vCPUs left to schedule per host in the fourth group - really 240

This charm must use its relation to the keystone identity service to query the placement API for EACH resource provider (hypervisor in this case) and read the allocation ratio from each one, using that as the correct per-hypervisor multiplier rather than some explicit

config['openstack_allocation_ratio_vcpu']

the same is true for all `allocation_ratios`
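A sketch of what that per-hypervisor computation could look like, assuming the inventory shape returned by the placement API's GET /resource_providers/{uuid}/inventories call (keystone session setup and the HTTP request itself are elided; this is not the charm's implementation):

```python
def effective_capacity(inventory):
    """Schedulable units for one resource class on one resource provider.

    `inventory` is the per-class dict from the placement API, e.g.
    {"total": 16, "reserved": 0, "allocation_ratio": 4.0, ...} for VCPU.
    """
    return (inventory["total"] - inventory["reserved"]) * inventory["allocation_ratio"]

# Example: a hypervisor with 16 physical vCPUs overcommitted 1:4
vcpu_inventory = {"total": 16, "reserved": 0, "allocation_ratio": 4.0}
print(effective_capacity(vcpu_inventory))  # 64.0
```

The key point is that `allocation_ratio` comes from each resource provider's own inventory record, so the multiplier is correct per hypervisor instead of assumed cloud-wide.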

summary: - Incorrect schedulable_instances value when allocation is set to 0
+ Allocation data should be taken from a placement API
