Add placement audit commands

Bug #1793569 reported by Mohammed Naser on 2018-09-20
This bug affects 3 people
Affects: OpenStack Compute (nova)
Status: In Progress
Importance: Wishlist
Assigned to: sean mooney

Bug Description

It is possible that placement gets out of sync, which can cause scheduling problems that would otherwise go unnoticed. I've built out this script, which would be nice to have as `nova-manage placement audit`:

================================================================================
#!/usr/bin/env python3

import argparse
import sys

from openstack import connection
import openstack.config

config = openstack.config.OpenStackConfig()
parser = argparse.ArgumentParser()
config.register_argparse_arguments(parser, sys.argv)

options = parser.parse_args()

cloud_region = config.get_one(argparse=options)
conn = connection.Connection(config=cloud_region)

# Grab list of all hypervisors and their servers
hypervisors = conn.compute.get('/os-hypervisors?with_servers=true', microversion='2.53').json().get('hypervisors')

# Generate a dictionary mapping of hypervisor => [instances]
hypervisor_mapping = {h['id']: [s['uuid'] for s in h.get('servers', [])] for h in hypervisors}
hypervisor_names = {h['id']: h['hypervisor_hostname'] for h in hypervisors}

# Grab list of all resource providers
resource_providers = conn.placement.get('/resource_providers').json().get('resource_providers')
for rp in resource_providers:
  # Check if RP has VCPU in inventory (i.e. compute node)
  inventories = conn.placement.get('/resource_providers/%s/inventories' % rp['uuid']).json().get('inventories')

  # Skip those without VCPU and MEMORY_MB (non computes)
  if 'MEMORY_MB' not in inventories and 'VCPU' not in inventories:
    continue

  # Get all allocations for RP
  allocations = conn.placement.get('/resource_providers/%s/allocations' % rp['uuid']).json().get('allocations')

  # Is there a compute node for this RP?
  if rp['uuid'] not in hypervisor_mapping:
    print("openstack resource provider delete %s # resource provider has no matching compute node" % rp['uuid'])
    continue

  for allocation_id, info in allocations.items():
    # The instance does not exist where placement says it should be.
    if allocation_id not in hypervisor_mapping[rp['uuid']]:
      hypervisor = None

      # Try to find where it's hiding.
      for hyp, instances in hypervisor_mapping.items():
        if allocation_id in instances:
          hypervisor = hyp
          break

      # We found it.
      if hypervisor:
        classes = ','.join(["%s=%s" % (key, value) for key, value in info.get('resources').items()])
        print("openstack resource provider allocation set --allocation rp=%s,%s %s # instance allocated on wrong rp" % (hypervisor, classes, allocation_id))
        continue

      # We don't know where this is. Let's see if it exists in Nova.
      server = conn.compute.get('/servers/%s' % allocation_id)
      if server.status_code == 404:
        print("openstack resource provider allocation delete %s # instance deleted" % allocation_id)
        continue

      # TODO: unknown state -- the consumer is not on any hypervisor but the
      # server still exists in Nova; needs manual investigation.
      raise Exception('unhandled allocation %s on resource provider %s' % (allocation_id, rp['uuid']))
================================================================================

It would likely need to be rewritten to use the built-in placement HTTP client and objects to avoid extra API calls.
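
For illustration, a minimal sketch of what that could look like, assuming the usual nova-manage config/objects setup is available and that SchedulerReportClient's low-level get() helper can be used for raw placement calls; this is only a sketch, not the eventual implementation:

================================================================================
#!/usr/bin/env python3
# Sketch only: an audit built on nova's own objects and placement client
# instead of external API calls. Setup details here are assumptions.

import sys

from nova import config
from nova import context as nova_context
from nova import objects
from nova.scheduler.client import report


def audit():
    ctxt = nova_context.get_admin_context()
    placement = report.SchedulerReportClient()

    # Map compute node uuid -> instance uuids straight from the database,
    # avoiding the /os-hypervisors round trips of the script above.
    node_to_instances = {}
    for node in objects.ComputeNodeList.get_all(ctxt):
        instances = objects.InstanceList.get_by_host_and_node(
            ctxt, node.host, node.hypervisor_hostname)
        node_to_instances[node.uuid] = [inst.uuid for inst in instances]

    # Compare placement's view against the compute nodes we know about.
    rps = placement.get('/resource_providers').json()['resource_providers']
    for rp in rps:
        allocations = placement.get(
            '/resource_providers/%s/allocations' % rp['uuid']).json().get(
                'allocations', {})
        known = node_to_instances.get(rp['uuid'])
        if known is None:
            print('resource provider %s has no matching compute node'
                  % rp['uuid'])
            continue
        for consumer in allocations:
            if consumer not in known:
                print('allocation for consumer %s is not an instance on %s'
                      % (consumer, rp['uuid']))


if __name__ == '__main__':
    # Assumption: nova.conf is readable so the objects and placement client
    # can be set up the same way nova-manage does it.
    config.parse_args(sys.argv)
    objects.register_all()
    audit()
================================================================================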

Matt Riedemann (mriedem) wrote :

Related tooling for cleaning up orphaned resources and fixing incorrect allocations:

https://github.com/larsks/os-placement-tools

Changed in nova:
importance: Undecided → Wishlist
status: New → Confirmed
tags: added: nova-manage placement
tags: added: ops
Mohammed Naser (mnaser) wrote :

Updated iteration that also deletes allocations against resource providers whose compute node has been deleted.

================
#!/usr/bin/env python3

import argparse
import sys

from openstack import connection
import openstack.config

config = openstack.config.OpenStackConfig()
parser = argparse.ArgumentParser()
config.register_argparse_arguments(parser, sys.argv)

options = parser.parse_args()

cloud_region = config.get_one(argparse=options)
conn = connection.Connection(config=cloud_region)

# Grab list of all hypervisors and their servers
hypervisors = conn.compute.get('/os-hypervisors?with_servers=true', microversion='2.53').json().get('hypervisors')

# Generate a dictionary mapping of hypervisor => [instances]
hypervisor_mapping = {h['id']: [s['uuid'] for s in h.get('servers', [])] for h in hypervisors}
hypervisor_names = {h['id']: h['hypervisor_hostname'] for h in hypervisors}

# Grab list of all resource providers
resource_providers = conn.placement.get('/resource_providers').json().get('resource_providers')
for rp in resource_providers:
  # Check if RP has VCPU in inventory (i.e. compute node)
  inventories = conn.placement.get('/resource_providers/%s/inventories' % rp['uuid']).json().get('inventories')

  # Skip those without VCPU and MEMORY_MB (non computes)
  if 'MEMORY_MB' not in inventories and 'VCPU' not in inventories:
    continue

  # Get all allocations for RP
  allocations = conn.placement.get('/resource_providers/%s/allocations' % rp['uuid']).json().get('allocations')

  for allocation_id, info in allocations.items():
    # No compute node for this RP, so delete the allocation
    if rp['uuid'] not in hypervisor_mapping:
      print("openstack resource provider allocation delete %s # resource provider has no matching compute node" % allocation_id)
      continue

    # The instance does not exist where placement says it should be.
    if allocation_id not in hypervisor_mapping[rp['uuid']]:
      hypervisor = None

      # Try to find where it's hiding.
      for hyp, instances in hypervisor_mapping.items():
        if allocation_id in instances:
          hypervisor = hyp
          break

      # We found it.
      if hypervisor:
        classes = ','.join(["%s=%s" % (key, value) for key, value in info.get('resources').items()])
        print("openstack resource provider allocation set --allocation rp=%s,%s %s # instance allocated on wrong rp" % (hypervisor, classes, allocation_id))
        continue

      # We don't know where this is. Let's see if it exists in Nova.
      server = conn.compute.get('/servers/%s' % allocation_id)
      if server.status_code == 404:
        print("openstack resource provider allocation delete %s # instance deleted" % allocation_id)
        continue

      # TODO: unknown state -- the consumer is not on any hypervisor but the
      # server still exists in Nova; needs manual investigation.
      raise Exception('unhandled allocation %s on resource provider %s' % (allocation_id, rp['uuid']))

  if rp['uuid'] not in hypervisor_mapping:
    print("openstack resource provider delete %s # resource provider has no matching compute node" % rp['uuid'])
    continue
================

Chris Dent (cdent) wrote :

Somewhat related to this, some folks @vmware are making a script that talks to the database directly for some "placement audit" needs. Mostly because that's SOP for them and doesn't require dealing with tokens and such. I include it here as an example in case it is useful. This version is oriented towards a pre-nested world, mostly ocata, but sometimes queens.

-=-=-
#!/bin/bash

rp_count=0

rp_count=$(mysql -N -D nova_api -e "select count(id) from resource_providers;")

i=0
while [ $i -lt $rp_count ]
do

#Resource Provider ID
rp_id=$(mysql -N -D nova_api -e "select id from resource_providers limit $i, 1;")

#Resource Provider Name
rp_name=$(mysql -N -D nova_api -e "select name from resource_providers where id = $rp_id")

#Used Resources
cpu_used=$(mysql -N -D nova_api -e "select sum(used) from allocations where resource_class_id=0 and resource_provider_id=$rp_id;")
mem_used=$(mysql -N -D nova_api -e "select sum(used) from allocations where resource_class_id=1 and resource_provider_id=$rp_id;")
disk_used=$(mysql -N -D nova_api -e "select sum(used) from allocations where resource_class_id=2 and resource_provider_id=$rp_id;")

#Total Resources
cpu_total=$(mysql -N -D nova_api -e "select total from inventories where resource_class_id=0 and resource_provider_id=$rp_id;")
mem_total=$(mysql -N -D nova_api -e "select total from inventories where resource_class_id=1 and resource_provider_id=$rp_id;")
disk_total=$(mysql -N -D nova_api -e "select total from inventories where resource_class_id=2 and resource_provider_id=$rp_id;")

#Minimum Resource Value
cpu_min=$(mysql -N -D nova_api -e "select min_unit from inventories where resource_class_id=0 and resource_provider_id=$rp_id;")
mem_min=$(mysql -N -D nova_api -e "select min_unit from inventories where resource_class_id=1 and resource_provider_id=$rp_id;")
disk_min=$(mysql -N -D nova_api -e "select min_unit from inventories where resource_class_id=2 and resource_provider_id=$rp_id;")

#Maximum Resource Value
cpu_max=$(mysql -N -D nova_api -e "select max_unit from inventories where resource_class_id=0 and resource_provider_id=$rp_id;")
mem_max=$(mysql -N -D nova_api -e "select max_unit from inventories where resource_class_id=1 and resource_provider_id=$rp_id;")
disk_max=$(mysql -N -D nova_api -e "select max_unit from inventories where resource_class_id=2 and resource_provider_id=$rp_id;")

#Resource Step Size
cpu_step=$(mysql -N -D nova_api -e "select step_size from inventories where resource_class_id=0 and resource_provider_id=$rp_id;")
mem_step=$(mysql -N -D nova_api -e "select step_size from inventories where resource_class_id=1 and resource_provider_id=$rp_id;")
disk_step=$(mysql -N -D nova_api -e "select step_size from inventories where resource_class_id=2 and resource_provider_id=$rp_id;")

#Allocation Ratio
cpu_ratio=$(mysql -N -D nova_api -e "select allocation_ratio from inventories where resource_class_id=0 and resource_provider_id=$rp_id;")
mem_ratio=$(mysql -N -D nova_api -e "select allocation_ratio from inventories where resource_class_id=1 and resource_provider_id=$rp_id;")
disk_ratio=$(mysql -N -D nova_api -e "select allocation_ratio from inventories where resource_class_id=2 and resource...


Matt Riedemann (mriedem) wrote :

Paste of mnaser's script: http://paste.openstack.org/show/734146/

Matt Riedemann (mriedem) wrote :

I think one thing that larsks' and mnaser's audit scripts probably aren't handling is allocations held by migration records that never got cleaned up when the migration completed (likely because it failed for some reason); e.g. we have several bugs related to failed migrations not cleaning up allocations properly:

https://review.opendev.org/#/c/654067/

https://review.opendev.org/#/c/663737/

https://review.opendev.org/#/q/topic:bug/1821594+(status:open+OR+status:merged)

The "nova-manage placement heal_allocations" CLI could probably be updated to at least report allocations held by migration records that are not in progress, so operators could delete those allocations, or at least investigate the related instance to make sure its allocations are OK. I.e. the operator could delete the allocations for both the migration consumer and the instance consumer, and then run the heal_allocations CLI again to re-add the allocations for just the instance on its current compute node resource provider.
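
For illustration, a hedged sketch in the same openstacksdk style as the scripts above, showing how an audit could flag allocations held by migration records that are no longer in progress. The set of "still in progress" statuses and the use of compute API microversion 2.59 (which exposes the migration uuid used as the placement consumer) are assumptions here:

================================================================================
# Sketch only: flag allocations still held by migrations that are not running.
# Assumes `conn` is an openstacksdk Connection as in the scripts above, and
# compute microversion >= 2.59 so each migration record carries its uuid (the
# placement consumer id for migration-held allocations).

# Statuses treated as legitimately still owning allocations; the exact set
# needs care (e.g. a resize in 'finished' holds allocations until
# confirm/revert), so this list is illustrative only.
in_progress = {'queued', 'preparing', 'pre-migrating', 'running', 'migrating',
               'post-migrating', 'finished', 'confirming', 'reverting'}

migrations = conn.compute.get(
    '/os-migrations', microversion='2.59').json().get('migrations', [])

for migration in migrations:
    if migration['status'] in in_progress:
        continue
    # A completed, errored or cancelled migration should not hold allocations.
    allocations = conn.placement.get(
        '/allocations/%s' % migration['uuid']).json().get('allocations', {})
    if allocations:
        print('openstack resource provider allocation delete %s '
              '# held by %s migration of instance %s'
              % (migration['uuid'], migration['status'],
                 migration['instance_uuid']))
================================================================================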

Changed in nova:
assignee: nobody → Sylvain Bauza (sylvain-bauza)
Matt Riedemann (mriedem) wrote :

Per comment 5, I think I meant to say:

The "nova-manage placement audit" CLI could probably be updated to deal with at least reporting allocations held by migration records...

since heal_allocations doesn't really report on things, nor does it delete allocations; it only adds allocations for instances (not migrations) that are missing them.

Matt Riedemann (mriedem) wrote :

(10:07:47 AM) mriedem: bauzas: i think what i was thinking of was an audit command could detect that you have orphaned allocations tied to a not-in-progress migration, e.g. a migration that failed but we failed to cleanup the allocations,
(10:08:20 AM) mriedem: bauzas: and then that information could be provided to the admin to then determine what to do, e.g. delete the allocations for the migration record consumer and potentially the related instance,
(10:08:33 AM) bauzas: mriedem: yeah ok
(10:08:35 AM) mriedem: and if they delete the allocations for the instance, then they could run heal_allocations on the instance to fix things up
(10:08:49 AM) mriedem: we could also eventually build on that to make it automatic with options
(10:08:57 AM) mriedem: e.g. nova-manage placement audit --heal
(10:09:00 AM) mriedem: something like that

Fix proposed to branch: master
Review: https://review.opendev.org/670112

Changed in nova:
status: Confirmed → In Progress
Matt Riedemann (mriedem) wrote :

Bug 1836369 gives another scenario where allocations can be leaked and not tracked correctly when the compute service crashes during a resize.

Changed in nova:
assignee: Sylvain Bauza (sylvain-bauza) → sean mooney (sean-k-mooney)