Ironic driver hash ring treats hostnames differing only by case as different hostnames
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Fix Released | Low | melanie witt |
Pike | Fix Released | Low | Elod Illes |
Queens | Fix Released | Low | melanie witt |
Rocky | Fix Released | Low | melanie witt |
Stein | Fix Released | Low | melanie witt |
Train | Fix Released | Low | melanie witt |
Bug Description
Recently we had a customer case where attempts to add new ironic nodes to an existing undercloud resulted in half of the nodes failing to be detected and added to nova. Ironic API returned all of the newly added nodes when called by the driver, but half of the nodes were not returned to the compute manager by the driver.
There was only one nova-compute service managing all of the ironic nodes in this typical all-in-one undercloud deployment.
After days of investigation and examination of a database dump from the customer, we noticed that at some point the customer had changed the hostname of the machine from something containing uppercase letters to the same name but all lowercase. The nova-compute service record had the mixed case name and the CONF.host (socket.
The hash ring logic adds all of the nova-compute service hostnames plus CONF.host to the hash ring. The ironic driver then reports only the nodes it owns, which it determines by retrieving a service hostname from the ring based on a hash of each ironic node UUID.
Because of the machine hostname change, the hash ring contained, for example: {'MachineHostName', 'machinehostname'} when it should have contained only one hostname. And because the hash ring contained two hostnames, the driver was able to retrieve only half of the nodes as nodes that it owned. So half of the new nodes were excluded and not added as new compute nodes.
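The effect described above can be reproduced with a toy hash ring. This is only a minimal sketch: `SimpleHashRing` and `owned_nodes` are hypothetical names, the ring is not the implementation nova uses, and lowercasing the hostnames is shown as one possible normalization rather than as the fix that was merged.

```python
import bisect
import hashlib
import uuid


class SimpleHashRing:
    """Toy consistent hash ring (hypothetical; not nova's implementation)."""

    def __init__(self, hosts, replicas=32):
        # Place several points per host on the ring to even out the split.
        self._ring = sorted(
            (self._hash(f"{host}-{i}"), host)
            for host in set(hosts)
            for i in range(replicas)
        )
        self._keys = [key for key, _ in self._ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def get_host(self, node_uuid):
        # Take the first ring point after the node's hash, wrapping around.
        idx = bisect.bisect(self._keys, self._hash(node_uuid)) % len(self._keys)
        return self._ring[idx][1]


def owned_nodes(ring, my_host, node_uuids):
    """Return the nodes that the ring maps to my_host."""
    return [n for n in node_uuids if ring.get_host(n) == my_host]


# Deterministic stand-ins for ironic node UUIDs.
node_uuids = [str(uuid.uuid5(uuid.NAMESPACE_DNS, f"node-{i}")) for i in range(100)]

# Service record hostname (mixed case) plus CONF.host (lowercase).
hostnames = ("MachineHostName", "machinehostname")

buggy_ring = SimpleHashRing(set(hostnames))
fixed_ring = SimpleHashRing({h.lower() for h in hostnames})

buggy_owned = owned_nodes(buggy_ring, "machinehostname", node_uuids)
fixed_owned = owned_nodes(fixed_ring, "machinehostname", node_uuids)

print(len(buggy_owned), "of", len(node_uuids), "nodes owned with two ring members")
print(len(fixed_owned), "of", len(node_uuids), "nodes owned with one ring member")
```

With two ring members, the single real service owns only a portion of the nodes, and the rest map to a hostname that no running service answers to. With one member, it owns all of them.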
I propose adding some logging to the driver related to the hash ring to help with debugging in the future.
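As an illustration of the kind of logging that might help, the sketch below logs the ring members and warns when two of them differ only by case. The function names and log messages are hypothetical and are not taken from the nova driver.

```python
import logging

LOG = logging.getLogger(__name__)


def case_colliding_hosts(hosts):
    """Group hostnames that become identical once case is ignored."""
    groups = {}
    for host in hosts:
        groups.setdefault(host.lower(), set()).add(host)
    # Keep only the groups that contain more than one spelling.
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


def log_hash_ring_members(hosts):
    """Log ring membership and flag hostnames that differ only by case."""
    LOG.debug("Hash ring members are %s", sorted(hosts))
    collisions = case_colliding_hosts(hosts)
    for names in collisions.values():
        LOG.warning("Hash ring members differ only by case: %s", names)
    return collisions
```

Calling `log_hash_ring_members({'MachineHostName', 'machinehostname'})` would emit one warning naming both spellings, which would have pointed directly at the cause in the case described here.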
summary:
- Difficult to debug unexpected ironic driver behavior related to
- available nodes
+ Ironic driver hash ring treats hostnames differing only by case as
+ different hostnames
Fix proposed to branch: master
Review: https://review.opendev.org/711680