Add placement audit commands
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Fix Released | Wishlist | Sylvain Bauza |
Bug Description
It is possible for placement to get out of sync, which can cause scheduling problems that would otherwise go unnoticed. I've built this script, which would be nice to have as `nova-manage placement audit`:
=======
#!/usr/bin/env python
import argparse
import sys

from openstack import connection
import openstack.config

config = openstack.config.OpenStackConfig()
parser = argparse.ArgumentParser()
config.register_argparse_arguments(parser, sys.argv[1:])
options = parser.parse_args()
cloud_region = config.get_one(argparse=options)
conn = connection.Connection(config=cloud_region)

# Grab the list of all hypervisors and their servers.
hypervisors = conn.compute.get(
    '/os-hypervisors?with_servers=true',
    microversion='2.53').json().get('hypervisors')

# Generate a dictionary mapping of hypervisor => [instances].
hypervisor_mapping = {h['id']: [s['uuid'] for s in h.get('servers', [])]
                      for h in hypervisors}
hypervisor_names = {h['id']: h['hypervisor_hostname'] for h in hypervisors}

# Grab the list of all resource providers.
resource_providers = conn.placement.get(
    '/resource_providers').json().get('resource_providers')

for rp in resource_providers:
    # Check if the RP has VCPU in its inventory (i.e. is a compute node).
    inventories = conn.placement.get(
        '/resource_providers/%s/inventories'
        % rp['uuid']).json().get('inventories')

    # Skip those without VCPU and MEMORY_MB (non-computes).
    if 'MEMORY_MB' not in inventories and 'VCPU' not in inventories:
        continue

    # Get all allocations against this RP.
    allocations = conn.placement.get(
        '/resource_providers/%s/allocations'
        % rp['uuid']).json().get('allocations')

    # Is there a compute node for this RP?
    if rp['uuid'] not in hypervisor_mapping:
        print('openstack resource provider delete %s '
              '# resource provider does not have a matching compute node'
              % rp['uuid'])
        continue

    for allocation_id, info in allocations.items():
        # The instance does not exist where placement says it should be.
        if allocation_id not in hypervisor_mapping[rp['uuid']]:
            hypervisor = None
            # Try to find where it's hiding.
            for hyp, instances in hypervisor_mapping.items():
                if allocation_id in instances:
                    hypervisor = hyp
                    break

            # We found it.
            if hypervisor:
                classes = ','.join('%s=%s' % (key, value) for key, value
                                   in info.get('resources', {}).items())
                print('openstack resource provider allocation set '
                      '--allocation rp=%s,%s %s # instance allocated on '
                      'wrong rp' % (hypervisor, classes, allocation_id))
                continue

            # We don't know where this is. Let's see if it exists in Nova.
            server = conn.compute.get('/servers/%s' % allocation_id)
            if server.status_code == 404:
                print('openstack resource provider allocation delete %s '
                      '# instance deleted' % allocation_id)
                continue

            # TODO: idk? edge cases?
            raise RuntimeError('unexpected allocation %s on %s'
                               % (allocation_id, rp['uuid']))
=======
It would likely need to be rewritten to use the built-in placement HTTP client and objects to avoid extra API calls.
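In the meantime, the reconciliation logic itself is independent of the HTTP plumbing. The sketch below distills it into a pure function so it can be unit tested without a cloud; the function name `audit` and the plain-dict inputs are illustrative assumptions, not a nova-manage interface, though the dict shapes mirror what the script builds from the compute and placement APIs.

```python
def audit(hypervisor_mapping, rp_allocations):
    """Compare placement allocations against hypervisor instance lists.

    hypervisor_mapping: {rp_uuid: [instance_uuid, ...]} from the compute API.
    rp_allocations: {rp_uuid: {consumer_uuid: {'resources': {...}}}} from
        placement. Returns a list of suggested cleanup commands.
    """
    commands = []
    for rp_uuid, allocations in rp_allocations.items():
        # Orphaned resource provider: no matching compute node.
        if rp_uuid not in hypervisor_mapping:
            commands.append('openstack resource provider delete %s' % rp_uuid)
            continue
        for consumer, info in allocations.items():
            if consumer in hypervisor_mapping[rp_uuid]:
                continue  # allocation matches reality
            # The instance lives on a different hypervisor than placement
            # says; find it and suggest moving the allocation there.
            actual = next((h for h, servers in hypervisor_mapping.items()
                           if consumer in servers), None)
            if actual:
                classes = ','.join('%s=%s' % kv for kv
                                   in sorted(info['resources'].items()))
                commands.append(
                    'openstack resource provider allocation set '
                    '--allocation rp=%s,%s %s' % (actual, classes, consumer))
            else:
                # Not on any hypervisor: the instance was likely deleted.
                commands.append(
                    'openstack resource provider allocation delete %s'
                    % consumer)
    return commands
```

A `nova-manage placement audit` built on the internal client could reuse the same three-way split (orphaned provider, misplaced allocation, deleted consumer) and only swap out how the two input mappings are fetched.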
Changed in nova:
  assignee: nobody → Sylvain Bauza (sylvain-bauza)
Changed in nova:
  assignee: Sylvain Bauza (sylvain-bauza) → sean mooney (sean-k-mooney)
Changed in nova:
  assignee: sean mooney (sean-k-mooney) → Sylvain Bauza (sylvain-bauza)
Related tooling for cleaning up orphaned resources and fixing incorrect allocations:
https://github.com/larsks/os-placement-tools