Actions instance-count, remove-from-cloud and register-to-cloud are failing
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
OpenStack Nova Compute Charm | Fix Released | High | Martin Kalcok | | |
Bug Description
The root cause is a change of behavior in Python's `socket.getfqdn()`. Previously this function returned the hostname including the full domain, but now it returns only the hostname without the domain. The full domain name is required to identify hypervisors in the hypervisor list [1]. I don't know yet why this function changed its behavior, but it got me to trace how nova itself comes up with the hostname when registering a hypervisor in nova-cloud-controller. Here's my trace:
* Here's the final call on nova-compute that registers the hypervisor [2]
Working backwards from there and tracing the origins of the `host` variable:
* `ResourceTracker` class receives it as an argument [3]
* The `ResourceTracker` instance is created by `ComputeManager` [4], which also receives it as an argument (in `**kwargs`)
* `ComputeManager` is instantiated in the `Service` class [5], which either gets the `host` from its constructor or, if it's not supplied, from config
* And finally the `Service` class is instantiated in the `nova.cmd.compute:main` [6] function, which does NOT pass the `host` argument explicitly.
So the bottom line is that the `host` should be sourced from config.
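The chain traced above can be sketched in a few lines. This is a simplified illustration, not nova's actual code: the class names follow the trace, but the bodies, the `Conf` stand-in, and the `compute-1` value are assumptions made for the example.

```python
from typing import Optional


class Conf:
    """Hypothetical stand-in for nova's global configuration object."""
    host = "compute-1"  # illustrative value only


CONF = Conf()


class ResourceTracker:
    def __init__(self, host: str):
        # The tracker only ever sees the name it is handed.
        self.host = host


class ComputeManager:
    def __init__(self, **kwargs):
        # `host` arrives via keyword arguments, as in the trace.
        self.host = kwargs["host"]
        self.tracker = ResourceTracker(self.host)


class Service:
    def __init__(self, host: Optional[str] = None):
        # Fall back to the configured value when no host is passed in.
        self.host = host if host is not None else CONF.host
        self.manager = ComputeManager(host=self.host)


def main() -> Service:
    # Mirrors the entry point in the trace: no explicit `host` argument.
    return Service()
```

Because `main()` passes nothing, the name that reaches `ResourceTracker` in this sketch is whatever the config object holds.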
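This works out nicely because it gives us a good place to source this variable ourselves. However, on the bionic-train distribution, there's no `host` variable in the configs. I tried it and `nova.conf.CONF.host` has the value of the hostname (without the domain). Regardless of this, even hypervisors on this distribution are registered with the full hostname+domain.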
That's as far as I got for now, but the bottom line, I think, is that we need to reliably reproduce the behavior of nova, when it's registering hypervisors, in our nova-compute charm.
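To make the failure mode concrete, here is a minimal diagnostic sketch. It assumes the lookup is an exact string comparison against registered hypervisor names; `find_hypervisor` and the `compute-1.example.com` name are hypothetical and are not taken from the charm's code.

```python
import socket
from typing import Iterable, Optional


def local_names() -> dict:
    """Return both names Python reports for this machine."""
    return {"hostname": socket.gethostname(), "fqdn": socket.getfqdn()}


def find_hypervisor(registered: Iterable[str], name: str) -> Optional[str]:
    """Exact-match lookup of `name` among registered hypervisor names."""
    for entry in registered:
        if entry == name:
            return entry
    return None


# Hypothetical hypervisor list, registered with hostname+domain.
registered = ["compute-1.example.com"]

with_domain = find_hypervisor(registered, "compute-1.example.com")
without_domain = find_hypervisor(registered, "compute-1")
```

Under that assumption, `without_domain` is `None`: a name that lacks the domain never matches an entry registered with it. Running `local_names()` on an affected unit shows which form each function returns there.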
mka.
---
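[1] https://github.com/openstack/charm-nova-compute/blob/1d132e68ccabc44bec51d884724c5d2b96700fb3/lib/nova_compute/cloud_utils.py#L113
[2] https://github.com/openstack/nova/blob/6667fcb92bfaf03a8a274dc26806c137aace6b49/nova/compute/resource_tracker.py#L693
[3] https://github.com/openstack/nova/blob/6667fcb92bfaf03a8a274dc26806c137aace6b49/nova/compute/resource_tracker.py#L90
[4] https://github.com/openstack/nova/blob/6667fcb92bfaf03a8a274dc26806c137aace6b49/nova/compute/manager.py#L610
[5] https://github.com/openstack/nova/blob/6667fcb92bfaf03a8a274dc26806c137aace6b49/nova/service.py#L128
[6] https://github.com/openstack/nova/blob/6667fcb92bfaf03a8a274dc26806c137aace6b49/nova/service.py#L128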
Changed in charm-nova-compute:
assignee: nobody → Martin Kalcok (martin-kalcok)
tags: added: unstable-test
Changed in charm-nova-compute:
importance: Undecided → High
Changed in charm-nova-compute:
milestone: none → 22.04
Changed in charm-nova-compute:
status: Fix Committed → Fix Released
Hi Martin,
Thanks for reporting this issue, which may not be a regression in the
tests but rather an infrastructure-related problem.
The CI jobs run on top of a focal-ussuri cloud that was migrated from
OVS to OVN-21.09 as SDN on December 15th.
We were discussing the symptoms described in this bug earlier and it
seems to be because our version of neutron is missing this patch
https://review.opendev.org/c/openstack/neutron/+/822294 which updates
ovn NB entries.
Best,
On Fri, 2021-12-17 at 15:39 +0000, Martin Kalcok wrote:
> Public bug reported:
>
> Root of this cause is a change of behavior in python's
> `socket.getfqdn()`. Previously this function returned hostname
> including
> full domain but now it returns only hostname without the domain. The
> full domain name is required to identify hypervisors in hypervisor list
> [1]. I don't know yet why this function changed its behavior but it got
> me to trace how actual nova comes up with the hostname when registering
> hypervisor in nova-cloud-controller. Here's my trace:
>
> * Here's the final call on the nova-compute that register the
> hypervisor
> [2]
>
> Working backwards from there and tracing origins of `host` variable
>
> * `ResourceTracker` class receives it as an argument [3]
>
> * ResourceTracker instance is created by `ComputeManager`[4] which also
> receives it as an argument (in **kwargs)
>
> * `ComputeManager` is instantiated in `Service` class [5] which, either
> gets the `host` from constructor or, if it's not supplied, from config
>
> * And finally the `Service` class is instantiated in
> `nova.cmd.compute:main` [6] function which does NOT pass the `host`
> argument explicitly.
>
> So the bottom line is that the `host` should be sourced from config.
>
> This works out nice because it gives us a good place to source this
> variable ourselves. However, on bionic-train distribution, there's no
> `host` variable in configs. I tried it and the `nova.conf.CONF.host` has
> value of hostname (without the domain). Regardless of this, even
> hypervisors on this distribution are registered with full
> hostname+domain.
> That's as far as I got for now but bottom line I think is that we need
> to reliably reproduce the behavior of nova, when it's registering
> hypervisors, in our nova-compute charm.
>
>
> mka.
>
> ---
>
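> [1] https://github.com/openstack/charm-nova-compute/blob/1d132e68ccabc44bec51d884724c5d2b96700fb3/lib/nova_compute/cloud_utils.py#L113
>
> [2]
> https://github.com/openstack/nova/blob/6667fcb92bfaf03a8a274dc26806c137aace6b49/nova/compute/resource_tracker.py#L693
>
> [3]
> https://github.com/openstack/nova/blob/6667fcb92bfaf03a8a274dc26806c137aace6b49/nova/compute/resource_tracker.py#L90
>
> [4]
> https://github.com/openstack/nova/blob/6667fcb92bfaf03a8a274dc26806c137aace6b49/nova/compute/manager.py#L610
>
> [5]
> https://github.com/openstack/nova/blob/6667fcb92bfaf03a8a274dc26806c137aace6b49/nova/service.py#L128
>
> [6]
> https://github.com/openstack/nova/blob/6667fcb92bfaf03a8a274dc26806c137aace6b49/nova/service.py#L128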
>
> ** Affects: charm-nova-compute
> Importance: Undecided
> Status: New
>