On Fri, 2021-12-17 at 15:39 +0000, Martin Kalcok wrote:
> Public bug reported:
>
> The root cause of this is a change of behavior in Python's
> `socket.getfqdn()`. Previously this function returned the hostname
> including the full domain, but now it returns only the hostname without
> the domain. The full domain name is required to identify hypervisors in
> the hypervisor list [1]. I don't know yet why this function changed its
> behavior, but it got me to trace how nova itself comes up with the
> hostname when registering a hypervisor in nova-cloud-controller. Here's
> my trace:
>
> * Here's the final call on nova-compute that registers the
> hypervisor [2]
>
> Working backwards from there and tracing the origins of the `host`
> variable:
>
> * `ResourceTracker` class receives it as an argument [3]
>
> * The `ResourceTracker` instance is created by `ComputeManager` [4],
> which also receives it as an argument (in **kwargs)
>
> * `ComputeManager` is instantiated in the `Service` class [5], which
> either gets the `host` from its constructor or, if it's not supplied,
> from config
>
> * And finally the `Service` class is instantiated in
> `nova.cmd.compute:main` [6] function which does NOT pass the `host`
> argument explicitly.
>
> So the bottom line is that the `host` should be sourced from config.
>
> This works out nicely because it gives us a good place to source this
> variable ourselves. However, on the bionic-train distribution, there's
> no `host` variable in the configs. I tried it and `nova.conf.CONF.host`
> has the value of the hostname (without the domain). Regardless of this,
> even hypervisors on this distribution are registered with the full
> hostname+domain.
> That's as far as I got for now, but the bottom line, I think, is that
> we need to reliably reproduce the behavior of nova, when it's
> registering hypervisors, in our nova-compute charm.
>
>
> mka.
>
> ---
>
> [1]
> https://github.com/openstack/charm-nova-compute/blob/1d132e68ccabc44bec51d884724c5d2b96700fb3/lib/nova_compute/cloud_utils.py#L113
>
> [2]
> https://github.com/openstack/nova/blob/6667fcb92bfaf03a8a274dc26806c137aace6b49/nova/compute/resource_tracker.py#L693
>
> [3]
> https://github.com/openstack/nova/blob/6667fcb92bfaf03a8a274dc26806c137aace6b49/nova/compute/resource_tracker.py#L90
>
> [4]
> https://github.com/openstack/nova/blob/6667fcb92bfaf03a8a274dc26806c137aace6b49/nova/compute/manager.py#L610
>
> [5]
> https://github.com/openstack/nova/blob/6667fcb92bfaf03a8a274dc26806c137aace6b49/nova/service.py#L128
>
> [6]
> https://github.com/openstack/nova/blob/6667fcb92bfaf03a8a274dc26806c137aace6b49/nova/service.py#L128
>
> ** Affects: charm-nova-compute
> Importance: Undecided
> Status: New
>
Hi Martin,
Thanks for reporting this issue. It may not be a regression but rather
a problem with the tests and infrastructure.
The CI jobs run on top of a focal-ussuri cloud that was migrated from
OVS to OVN-21.09 as SDN on December 15th.
We were discussing the symptoms described in this bug earlier and it
seems to be because our version of neutron is missing this patch
https://review.opendev.org/c/openstack/neutron/+/822294 which updates
OVN NB entries.
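
In case it helps to rule the hostname side in or out on a given unit,
here is a minimal sketch (not the charm's or nova's actual code) that
prints what `socket.gethostname()` and `socket.getfqdn()` report and
whether each name includes a domain part. The helper names
`hostname_candidates` and `has_domain` are made up for illustration.

```python
import socket


def hostname_candidates():
    """Return the names the standard library reports for this machine."""
    return {
        "gethostname": socket.gethostname(),
        "getfqdn": socket.getfqdn(),
    }


def has_domain(name):
    """True if the name has at least one dot-separated domain part."""
    # Ignore a leading or trailing dot so "node1." does not count as qualified.
    return "." in name.strip(".")


if __name__ == "__main__":
    for source, name in hostname_candidates().items():
        print(f"{source}: {name} (has domain: {has_domain(name)})")
```

If both calls come back without a domain on an affected unit, that would
match the behavior described in the report; if `getfqdn` still returns
the full name, the problem is more likely elsewhere.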
Best,