Magnum -> Octavia, Neutron driver - Not using region or availability zone to search, rather picking a random neutron endpoint causing lookup failures on subnets

Bug #2051604 reported by Noel Ashford
Affects: octavia
Status: Fix Released
Importance: High
Assigned to: Gregory Thiemonge
Milestone: (none)

Bug Description

It is looking up the subnet on the wrong endpoint... I checked the endpoints, no errors there - it seems there is no filter to tell it which one to use.... 2023.2 is my version. I never saw this issue prior to 2023.x & amphorav2.

Trigger = call a Magnum COE create: it creates a LB, and when the LB is being made, Octavia's neutron driver looks up the subnet against the wrong endpoint....

2024-01-29 17:15:55.984 733 ERROR wsme.api [None req-026ef7fd-67aa-412a-89e1-a7529c45e72c - e5b9296fbd9e4d9ea5e925780c64690f - - default default] Server-side error: "subnet not found (subnet id: 1b98d0d6-3eaa-490d-a9a4-e351fb0ebedf).". Detail:
Traceback (most recent call last):

  File "/var/lib/kolla/venv/lib/python3.10/site-packages/octavia/network/drivers/neutron/base.py", line 189, in _get_resource
    resource = getattr(

  File "/var/lib/kolla/venv/lib/python3.10/site-packages/openstack/network/v2/_proxy.py", line 5111, in get_subnet
    return self._get(_subnet.Subnet, subnet)

  File "/var/lib/kolla/venv/lib/python3.10/site-packages/openstack/proxy.py", line 61, in check
    return method(self, expected, actual, *args, **kwargs)

  File "/var/lib/kolla/venv/lib/python3.10/site-packages/openstack/proxy.py", line 665, in _get
    return res.fetch(

  File "/var/lib/kolla/venv/lib/python3.10/site-packages/openstack/resource.py", line 1711, in fetch
    self._translate_response(response, **kwargs)

  File "/var/lib/kolla/venv/lib/python3.10/site-packages/openstack/resource.py", line 1287, in _translate_response
    exceptions.raise_from_response(response, error_message=error_message)

  File "/var/lib/kolla/venv/lib/python3.10/site-packages/openstack/exceptions.py", line 250, in raise_from_response
    raise cls(

openstack.exceptions.ResourceNotFound: No Subnet found for 1b98d0d6-3eaa-490d-a9a4-e351fb0ebedf: Client Error for url: https://int.dave.openstack.tunninet.com:9696/v2.0/subnets/1b98d0d6-3eaa-490d-a9a4-e351fb0ebedf, Subnet 1b98d0d6-3eaa-490d-a9a4-e351fb0ebedf could not be found.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):

  File "/var/lib/kolla/venv/lib/python3.10/site-packages/wsmeext/pecan.py", line 82, in callfunction
    result = f(self, *args, **kwargs)

  File "/var/lib/kolla/venv/lib/python3.10/site-packages/octavia/api/v2/controllers/load_balancer.py", line 453, in post
    self._validate_vip_request_object(load_balancer, context=context)

  File "/var/lib/kolla/venv/lib/python3.10/site-packages/octavia/api/v2/controllers/load_balancer.py", line 308, in _validate_vip_request_object
    self._validate_subnets_share_network_but_no_duplicates(load_balancer)

  File "/var/lib/kolla/venv/lib/python3.10/site-packages/octavia/api/v2/controllers/load_balancer.py", line 243, in _validate_subnets_share_network_but_no_duplicates
    used_subnets[subnet_id] = network_driver.get_subnet(subnet_id)

  File "/var/lib/kolla/venv/lib/python3.10/site-packages/octavia/network/drivers/neutron/base.py", line 250, in get_subnet
    return self._get_resource('subnet', subnet_id, context=context)

  File "/var/lib/kolla/venv/lib/python3.10/site-packages/octavia/network/drivers/neutron/base.py", line 197, in _get_resource
    raise getattr(base, '%sNotFound' % ''.join(

https://docs.openstack.org/octavia/latest/_modules/octavia/network/drivers/neutron/base.html#BaseNeutronDriver.get_subnet

def _get_resources_by_filters(self, resource_type, unique_item=False,
                                  **filters):
        """Retrieves item(s) from filters. By default, a list is returned.

        If unique_item set to True, only the first resource is returned.
        """
        try:
            resources = getattr(
                self.network_proxy, f"{resource_type}s")(**filters)
            conversion_function = getattr(
                utils,
                'convert_%s_to_model' % resource_type)
            try:
                # get first item to see if there is at least one resource
                res_list = [conversion_function(next(resources))]
            except StopIteration:
                # pylint: disable=raise-missing-from
                raise os_exceptions.NotFoundException(
                    f'No resource of type {resource_type} found that matches '
                    f'given filter criteria: {filters}.')

            if unique_item:
                return res_list[0]
            return res_list + [conversion_function(r) for r in resources]

        except os_exceptions.NotFoundException as e:
            message = _('{resource_type} not found '
                        '({resource_type} Filters: {filters}.').format(
                resource_type=resource_type, filters=filters)
            raise getattr(base, '%sNotFound' % ''.join(
                [w.capitalize() for w in resource_type.split('_')]
            ))(message) from e
        except Exception as e:
            message = _('Error retrieving {resource_type} '
                        '({resource_type} Filters: {filters}.').format(
                resource_type=resource_type, filters=filters)
            LOG.exception(message)
            raise base.NetworkException(message) from e

    def get_network(self, network_id, context=None):
        return self._get_resource('network', network_id, context=context)

    def get_subnet(self, subnet_id, context=None):
        return self._get_resource('subnet', subnet_id, context=context)

Somewhere, it is not using the context to build the correct filter.... I am dead in the water on Octavia usage without this, and it is also impacting Magnum since I use Octavia for it. Thoughts?
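For reference, endpoint selection in keystoneauth is normally pinned by region_name and interface on the adapter. A minimal, hypothetical sketch of a correctly scoped lookup (placeholder credentials and values, not Octavia's actual client code):

from keystoneauth1 import adapter, session
from keystoneauth1.identity import v3

# placeholder credentials/URLs for illustration only
auth = v3.Password(auth_url='https://keystone.example.com:5000',
                   username='octavia', password='secret',
                   project_name='service',
                   user_domain_name='Default',
                   project_domain_name='Default')
sess = session.Session(auth=auth)

# region_name and interface pin the client to one service catalog entry;
# if they are not set, keystoneauth may resolve any 'network' endpoint.
neutron = adapter.Adapter(session=sess,
                          service_type='network',
                          region_name='TN_DEV_NY_5_NET',
                          interface='internal')
resp = neutron.get('/v2.0/subnets/1b98d0d6-3eaa-490d-a9a4-e351fb0ebedf')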

cross posting here: https://bugs.launchpad.net/kolla-ansible/+bug/2051602

Revision history for this message
Noel Ashford (nashford77) wrote :

It seems not to be honoring the region etc. at a high level...

Revision history for this message
Gregory Thiemonge (gthiemonge) wrote :

Hi, it looks similar to https://bugs.launchpad.net/octavia/+bug/2049551

there's an open patch that should fix it https://review.opendev.org/c/openstack/octavia/+/905794

do you have a way to test it?

Revision history for this message
Noel Ashford (nashford77) wrote :

Yes - I will test this later today or tomorrow and advise. Many Thanks in advance if this works

Revision history for this message
Noel Ashford (nashford77) wrote :

It does not seem to fix it.... I applied it, to no avail.

openstack.exceptions.ResourceNotFound: No Subnet found for 1b98d0d6-3eaa-490d-a9a4-e351fb0ebedf: Client Error for url: https://int.dave.openstack.tunninet.com:9696/v2.0/subnets/1b98d0d6-3eaa-490d-a9a4-e351fb0ebedf, Subnet 1b98d0d6-3eaa-490d-a9a4-e351fb0ebedf could not be found.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):

  File "/var/lib/kolla/venv/lib/python3.10/site-packages/wsmeext/pecan.py", line 82, in callfunction
    result = f(self, *args, **kwargs)

  File "/var/lib/kolla/venv/lib/python3.10/site-packages/octavia/api/v2/controllers/load_balancer.py", line 453, in post
    self._validate_vip_request_object(load_balancer, context=context)

  File "/var/lib/kolla/venv/lib/python3.10/site-packages/octavia/api/v2/controllers/load_balancer.py", line 308, in _validate_vip_request_object
    self._validate_subnets_share_network_but_no_duplicates(load_balancer)

  File "/var/lib/kolla/venv/lib/python3.10/site-packages/octavia/api/v2/controllers/load_balancer.py", line 243, in _validate_subnets_share_network_but_no_duplicates
    used_subnets[subnet_id] = network_driver.get_subnet(subnet_id)

  File "/var/lib/kolla/venv/lib/python3.10/site-packages/octavia/network/drivers/neutron/base.py", line 250, in get_subnet
    return self._get_resource('subnet', subnet_id, context=context)

  File "/var/lib/kolla/venv/lib/python3.10/site-packages/octavia/network/drivers/neutron/base.py", line 197, in _get_resource
    raise getattr(base, '%sNotFound' % ''.join(

octavia.network.base.SubnetNotFound: subnet not found (subnet id: 1b98d0d6-3eaa-490d-a9a4-e351fb0ebedf).
: octavia.network.base.SubnetNotFound: subnet not found (subnet id: 1b98d0d6-3eaa-490d-a9a4-e351fb0ebedf).

==> /var/log/kolla/octavia/octavia-api-access.log <==
192.168.5.99 - - [31/Jan/2024:00:15:50 -0500] "POST /v2.0/lbaas/loadbalancers HTTP/1.1" 500 128 1485190 "-" "heat-engine keystoneauth1/5.3.0 python-requests/2.28.2 CPython/3.10.12"

==> /var/log/kolla/octavia/octavia-api.log <==
2024-01-31 00:15:51.686 734 ERROR wsme.api [None req-580cb5f0-3c34-4ec6-bf6c-5596590d51b1 - e5b9296fbd9e4d9ea5e925780c64690f - - default default] Server-side error: "subnet not found (subnet id: 1b98d0d6-3eaa-490d-a9a4-e351fb0ebedf).". Detail:
Traceback (most recent call last):

  File "/var/lib/kolla/venv/lib/python3.10/site-packages/octavia/network/drivers/neutron/base.py", line 189, in _get_resource
    resource = getattr(

  File "/var/lib/kolla/venv/lib/python3.10/site-packages/openstack/network/v2/_proxy.py", line 5111, in get_subnet
    return self._get(_subnet.Subnet, subnet)

  File "/var/lib/kolla/venv/lib/python3.10/site-packages/openstack/proxy.py", line 61, in check
    return method(self, expected, actual, *args, **kwargs)

  File "/var/lib/kolla/venv/lib/python3.10/site-packages/openstack/proxy.py", line 665, in _get
    return res.fetch(

  File "/var/lib/kolla/venv/lib/python3.10/site-packages/openstack/resource.py", line 1711, in fetch
    self._translate_response(response, **kwargs)

  File "/var/lib/kolla/venv/lib/python3.10/site-packages/openstack/resource.py", line 1287...


Revision history for this message
Gregory Thiemonge (gthiemonge) wrote :

OK, there's a difference between the two backtraces:

The first one is an issue with the neutron generic client (it uses Octavia's credentials to communicate with neutron).
The second backtrace is an issue with the neutron user client (it uses the request's credentials to ensure that the user has access to the neutron resources).

I think https://review.opendev.org/c/openstack/octavia/+/905794 fixed your first issue, but then you hit a second issue.

Can you share more info on your deployment? What are the settings in the [service_auth] and [neutron] sections (and maybe other services like nova/glance, to compare them) of the octavia config?
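To illustrate the distinction: conceptually the two clients are built from two different keystoneauth sessions. A hypothetical sketch (placeholder values, not Octavia's actual code):

from keystoneauth1 import adapter, session
from keystoneauth1.identity import v3

# generic client: built from Octavia's own [service_auth] credentials
svc_sess = session.Session(auth=v3.Password(
    auth_url='https://keystone.example.com:5000',
    username='octavia', password='secret', project_name='service',
    user_domain_name='Default', project_domain_name='Default'))
generic_neutron = adapter.Adapter(session=svc_sess, service_type='network',
                                  region_name='TN_DEV_NY_5_NET',
                                  interface='internal')

# user client: re-uses the token of the incoming API request, so lookups
# are scoped to the caller's project; it must honor the same
# region/interface selection as the generic client.
user_sess = session.Session(auth=v3.Token(
    auth_url='https://keystone.example.com:5000',
    token='<request token>', project_id='<request project id>'))
user_neutron = adapter.Adapter(session=user_sess, service_type='network',
                               region_name='TN_DEV_NY_5_NET',
                               interface='internal')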

Revision history for this message
Noel Ashford (nashford77) wrote :

Question: regarding the first fix I applied, I also see this patch. Would it have the same effect as the kwargs addition that passes the oslo config to the session?

https://review.opendev.org/c/openstack/octavia/+/905805/1/octavia/common/clients.py

Ref to requested info:

[service_auth]
auth_url = https://int.noel.openstack.tunninet.com:5000
auth_type = password
username = octavia
password = <REDACTED FOR SECURITY REASONS>
user_domain_name = Default
project_name = service
project_domain_name = Default
cafile = /etc/ssl/certs/ca-certificates.crt
memcache_security_strategy = ENCRYPT
memcache_secret_key = <REDACTED FOR SECURITY REASONS>
memcached_servers = 192.168.5.1:11211

[glance]
region_name = TN_DEV_NY_5_NET
endpoint_type = internal
ca_certificates_file = /etc/ssl/certs/ca-certificates.crt

[neutron]
region_name = TN_DEV_NY_5_NET
endpoint_type = internal
ca_certificates_file = /etc/ssl/certs/ca-certificates.crt
endpoint = https://int.noel.openstack.tunninet.com:9696

[nova]
region_name = TN_DEV_NY_5_NET
endpoint_type = internal
ca_certificates_file = /etc/ssl/certs/ca-certificates.crt
availability_zone = 5Net

[cinder]
availability_zone = 5Net

Revision history for this message
Gregory Thiemonge (gthiemonge) wrote :

--

[neutron]
region_name = TN_DEV_NY_5_NET
endpoint_type = internal
ca_certificates_file = /etc/ssl/certs/ca-certificates.crt
endpoint = https://int.noel.openstack.tunninet.com:9696

--

In 2023.2, we deprecated the `endpoint_type` and `endpoint` parameters in favor of `valid_interfaces` and `endpoint_override`.
We added some code to automatically convert the "old" settings to the new ones, but I think there's a bug (I'm reproducing a similar issue in my env).

I found a possible workaround: if you rename the settings to the new names (only in the [neutron] section), it should work correctly:

[neutron]
region_name = TN_DEV_NY_5_NET
valid_interfaces = internal
ca_certificates_file = /etc/ssl/certs/ca-certificates.crt
endpoint_override = https://int.noel.openstack.tunninet.com:9696

can you try that?

On my side, I'm working on a bugfix.
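For context, the kind of compatibility override such a fix has to perform looks roughly like this. A hedged sketch only, not the actual patch (it assumes the deprecated and new [neutron] options are both registered on CONF, as they are in Octavia):

from oslo_config import cfg

CONF = cfg.CONF

def apply_neutron_compat_overrides():
    # If a deprecated [neutron] option is set and the new one is not,
    # copy it over before keystoneauth builds the session from CONF.
    neutron = CONF.neutron
    if neutron.endpoint and not neutron.endpoint_override:
        CONF.set_override('endpoint_override', neutron.endpoint,
                          group='neutron')
    if neutron.endpoint_type and not neutron.valid_interfaces:
        CONF.set_override('valid_interfaces', [neutron.endpoint_type],
                          group='neutron')
    if neutron.ca_certificates_file and not neutron.cafile:
        CONF.set_override('cafile', neutron.ca_certificates_file,
                          group='neutron')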

Changed in octavia:
assignee: nobody → Gregory Thiemonge (gthiemonge)
importance: Undecided → High
status: New → Confirmed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to octavia (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/octavia/+/907426

Changed in octavia:
status: Confirmed → In Progress
Revision history for this message
Noel Ashford (nashford77) wrote : Re: [Bug 2051604] Re: Magnum -> Octavia, Neutron driver - Not using region or availability zone to search, rather picking a random neutron endpoint causing lookup failures on subnets

Did any other sections like [nova] etc. change as well? Should all instances of endpoint_type be removed from the octavia config, or just the [neutron] one?

I updated all sections to use the new names in one attempt, and only [neutron] in the second attempt.

As it complained about the options below, I also tried updating [glance], [nova] and [cinder], and it still gives me the same errors?

It got further, but it seemingly broke on the other services it needs, regardless of the above approaches.

[image: image.png]

==> /var/log/kolla/octavia/octavia-worker.log <==
2024-02-01 12:12:56.958 731 WARNING octavia.common.base_taskflow [-] Task
'STANDALONE-octavia-create-amp-for-lb-subflow-octavia-amp-compute-connectivity-wait'
(1747f037-6b94-448b-98f5-e2fad599f03d) transitioned into state 'REVERTED'
from state 'REVERTING'
2024-02-01 12:12:57.134 731 WARNING
octavia.amphorae.drivers.haproxy.rest_api_driver [-] Could not connect to
instance. Retrying.: requests.exceptions.ConnectionError:
HTTPSConnectionPool(host='172.16.5.129', port=9443): Max retries exceeded
with url: // (Caused by
NewConnectionError('<urllib3.connection.HTTPSConnection object at
0x7f1b9057fa60>: Failed to establish a new connection: [Errno 111]
Connection refused'))

==> /var/log/kolla/octavia/octavia-api-access.log <==
2024-02-01 12:12:32.207 733 WARNING openstack [None
req-1ee9dfa6-a33d-4550-951a-d05260daa944 - e5b9296fbd9e4d9ea5e925780c64690f
- - default default] Disabling service 'block-storage': Encountered an
exception attempting to process config for project 'cinder' (service type
'block-storage'): no such option valid_interfaces in group [cinder]:
oslo_config.cfg.NoSuchOptError: no such option valid_interfaces in group
[cinder]
2024-02-01 12:12:32.208 733 WARNING openstack [None
req-1ee9dfa6-a33d-4550-951a-d05260daa944 - e5b9296fbd9e4d9ea5e925780c64690f
- - default default] Disabling service 'compute': Encountered an exception
attempting to process config for project 'nova' (service type 'compute'):
no such option valid_interfaces in group [nova]:
oslo_config.cfg.NoSuchOptError: no such option valid_interfaces in group
[nova]
2024-02-01 12:12:32.208 733 WARNING openstack [None
req-1ee9dfa6-a33d-4550-951a-d05260daa944 - e5b9296fbd9e4d9ea5e925780c64690f
- - default default] Disabling service 'image': Encountered an exception
attempting to process config for project 'glance' (service type 'image'):
no such option valid_interfaces in group [glance]:
oslo_config.cfg.NoSuchOptError: no such option valid_interfaces in group
[glance]
2024-02-01 12:12:32.242 736 WARNING openstack [None
req-5bb468c8-0f06-4351-ab5f-4b5754d62775 - e5b9296fbd9e4d9ea5e925780c64690f
- - default default] Disabling service 'block-storage': Encountered an
exception attempting to process config for project 'cinder' (service type
'block-storage'): no such option valid_interfaces in group [cinder]:
oslo_config.cfg.NoSuchOptError: no such option valid_interfaces in group
[cinder]
2024-02-01 12:12:32.242 736 WARNING openstack [None
req-5bb468c8-0f06-4351-ab5f-4b5754d62775 - e5b9296fbd9e4d9ea5e925780c64690f
- - default default] Disabling service 'compute': Encountered an exception
attempting to process config for project 'nova' (service type 'compute...

Revision history for this message
Noel Ashford (nashford77) wrote :

OK, an update... Changing everything in the config to the new names worked, though still with the errors above?!

Here is what happens when I do not update to the new names for glance, cinder, nova etc. (which makes no sense to me?):

==> /var/log/kolla/octavia/octavia-api.log <==
2024-02-01 12:36:03.842 733 WARNING openstack [None
req-366da871-ab3b-4e98-9780-0f29a93cfe14 - e5b9296fbd9e4d9ea5e925780c64690f
- - default default] Disabling service 'block-storage': Encountered an
exception attempting to process config for project 'cinder' (service type
'block-storage'): no such option valid_interfaces in group [cinder]:
oslo_config.cfg.NoSuchOptError: no such option valid_interfaces in group
[cinder]
2024-02-01 12:36:03.843 733 WARNING openstack [None
req-366da871-ab3b-4e98-9780-0f29a93cfe14 - e5b9296fbd9e4d9ea5e925780c64690f
- - default default] Disabling service 'compute': Encountered an exception
attempting to process config for project 'nova' (service type 'compute'):
no such option valid_interfaces in group [nova]:
oslo_config.cfg.NoSuchOptError: no such option valid_interfaces in group
[nova]
2024-02-01 12:36:03.843 733 WARNING openstack [None
req-366da871-ab3b-4e98-9780-0f29a93cfe14 - e5b9296fbd9e4d9ea5e925780c64690f
- - default default] Disabling service 'image': Encountered an exception
attempting to process config for project 'glance' (service type 'image'):
no such option valid_interfaces in group [glance]:
oslo_config.cfg.NoSuchOptError: no such option valid_interfaces in group
[glance]
2024-02-01 12:36:03.884 737 WARNING openstack [None
req-4b65bd1b-3e6d-437d-a879-c08a1eaf225a - e5b9296fbd9e4d9ea5e925780c64690f
- - default default] Disabling service 'block-storage': Encountered an
exception attempting to process config for project 'cinder' (service type
'block-storage'): no such option valid_interfaces in group [cinder]:
oslo_config.cfg.NoSuchOptError: no such option valid_interfaces in group
[cinder]
2024-02-01 12:36:03.884 737 WARNING openstack [None
req-4b65bd1b-3e6d-437d-a879-c08a1eaf225a - e5b9296fbd9e4d9ea5e925780c64690f
- - default default] Disabling service 'compute': Encountered an exception
attempting to process config for project 'nova' (service type 'compute'):
no such option valid_interfaces in group [nova]:
oslo_config.cfg.NoSuchOptError: no such option valid_interfaces in group
[nova]
2024-02-01 12:36:03.885 737 WARNING openstack [None
req-4b65bd1b-3e6d-437d-a879-c08a1eaf225a - e5b9296fbd9e4d9ea5e925780c64690f
- - default default] Disabling service 'image': Encountered an exception
attempting to process config for project 'glance' (service type 'image'):
no such option valid_interfaces in group [glance]:
oslo_config.cfg.NoSuchOptError: no such option valid_interfaces in group
[glance]
2024-02-01 12:36:03.994 733 INFO octavia.api.v2.controllers.member [None
req-366da871-ab3b-4e98-9780-0f29a93cfe14 - e5b9296fbd9e4d9ea5e925780c64690f
- - default default] Sending create Member
4a551e43-42b8-49ea-a989-42a28c1466ef to provider amphora

Why is it saying this when I don't provide these options?

Unrelated, I also saw this error in the logs:

==> /var/log/kolla/octavia/octavia-api-error.log <==
2024-02-01 12:14:01.370602
/var/lib/kolla/ven...

Revision history for this message
Gregory Thiemonge (gthiemonge) wrote :

You only have to update the [neutron] section, don't change the other sections.

the messages

WARNING octavia.amphorae.drivers.haproxy.rest_api_driver [-] Could not connect to
instance. Retrying.: requests.exceptions.ConnectionError:
HTTPSConnectionPool(host='172.16.5.129', port=9443): Max retries exceeded
with url: // (Caused by
NewConnectionError('<urllib3.connection.HTTPSConnection object at
0x7f1b9057fa60>: Failed to establish a new connection: [Errno 111]
Connection refused'))

are normal. Octavia tries to reach the VM that it created, but it takes a few seconds to boot it and start the agent in the VM; during that time, those WARNINGs are displayed. If the same message is displayed at ERROR level, it's a real issue: the VM is not reachable.

Your screenshot shows that the load balancer was successfully created/updated; its provisioning_status is ACTIVE, so from the point of view of the Octavia controller, the load balancer is ready.
The ERROR operating_status means that you have a health-monitor in one of your pools, and the health-monitor failed when checking that the backend members are up.

In most cases, it means that there's a problem outside of Octavia:
- the members are down
- the members are not correctly configured (subnet not reachable from the LB)
- the HM is not correctly configured (for instance, an HTTP HM checks GET / for a 200 status but receives a 302 status)

you can get the status of all the resources of a LB with the following command:

openstack loadbalancer status show <lb>

in your case it will probably show that the operating_status of the members is ERROR
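For reference, the command returns the full status tree of the LB's child resources; illustrative output only (not from this deployment):

openstack loadbalancer status show my-lb
{
    "loadbalancer": {
        "id": "...",
        "name": "my-lb",
        "provisioning_status": "ACTIVE",
        "operating_status": "ERROR",
        "listeners": [
            {
                "id": "...",
                "operating_status": "ONLINE",
                "pools": [
                    {
                        "id": "...",
                        "operating_status": "ERROR",
                        "members": [
                            {
                                "id": "...",
                                "address": "10.0.0.12",
                                "protocol_port": 80,
                                "operating_status": "ERROR"
                            }
                        ]
                    }
                ]
            }
        ]
    }
}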

Revision history for this message
Noel Ashford (nashford77) wrote :

It also worked to stand up Magnum (which calls Octavia) fully by using the old names for every section except [neutron] (but with the errors above, which don't make sense to me, as I didn't pass those options).


Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to octavia (master)

Reviewed: https://review.opendev.org/c/openstack/octavia/+/907426
Committed: https://opendev.org/openstack/octavia/commit/7bb6096eccc7966bd963a0529dc7b352246dfdbb
Submitter: "Zuul (22348)"
Branch: master

commit 7bb6096eccc7966bd963a0529dc7b352246dfdbb
Author: Gregory Thiemonge <email address hidden>
Date: Thu Feb 1 14:23:01 2024 +0100

    Fix neutron setting overrides

    Since 2023.2, we deprecated some settings in the [neutron] section
    ('endpoint', 'endpoint_type' and 'ca_certificates_file'), they are
    respectively replaced by 'endpoint_override', 'valid_interfaces' and
    'cafile'. There's some code in Octavia that automatically sets the new
    settings if the user still has the old settings (it is required because
    keystoneauth uses the CONF objects to establish the sessions).
    But some corner cases were not correctly addressed in that patch.

    Now Octavia ensures that the override of the parameters is correctly
    handled.

    Change-Id: Ic37e9f699e32431ae1735ddc9642689967ddc696
    Closes-Bug: 2051604

Changed in octavia:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/octavia 14.0.0.0rc1

This issue was fixed in the openstack/octavia 14.0.0.0rc1 release candidate.

Revision history for this message
Noel Ashford (nashford77) wrote :

Hello, I may be seeing another issue? One of my regions is working fine, but a second one is not. It claims the nova availability zone is not available, which seems more like it is not passing one at all?

2024-03-23 06:33:56.370 734 INFO octavia.certificates.generator.local [-] Using CA Private Key Passphrase from config.
2024-03-23 06:33:57.115 734 ERROR octavia.compute.drivers.nova_driver [-] Nova failed to build the instance due to: The requested availability zone is not available (HTTP 400) (Request-ID: req-2522b80e-8c8a-45f2-94b0-f699183f23ff): novaclient.exceptions.BadRequest: The requested availability zone is not available (HTTP 400) (Request-ID: req-2522b80e-8c8a-45f2-94b0-f699183f23ff)
2024-03-23 06:33:57.115 734 ERROR octavia.compute.drivers.nova_driver Traceback (most recent call last):
2024-03-23 06:33:57.115 734 ERROR octavia.compute.drivers.nova_driver File "/var/lib/kolla/venv/lib/python3.10/site-packages/octavia/compute/drivers/nova_driver.py", line 139, in build
2024-03-23 06:33:57.115 734 ERROR octavia.compute.drivers.nova_driver amphora = self.manager.create(
2024-03-23 06:33:57.115 734 ERROR octavia.compute.drivers.nova_driver File "/var/lib/kolla/venv/lib/python3.10/site-packages/novaclient/v2/servers.py", line 1657, in create
2024-03-23 06:33:57.115 734 ERROR octavia.compute.drivers.nova_driver return self._boot(response_key, *boot_args, **boot_kwargs)
2024-03-23 06:33:57.115 734 ERROR octavia.compute.drivers.nova_driver File "/var/lib/kolla/venv/lib/python3.10/site-packages/novaclient/v2/servers.py", line 966, in _boot
2024-03-23 06:33:57.115 734 ERROR octavia.compute.drivers.nova_driver return self._create(
2024-03-23 06:33:57.115 734 ERROR octavia.compute.drivers.nova_driver File "/var/lib/kolla/venv/lib/python3.10/site-packages/novaclient/base.py", line 363, in _create
2024-03-23 06:33:57.115 734 ERROR octavia.compute.drivers.nova_driver resp, body = self.api.client.post(url, body=body)
2024-03-23 06:33:57.115 734 ERROR octavia.compute.drivers.nova_driver File "/var/lib/kolla/venv/lib/python3.10/site-packages/keystoneauth1/adapter.py", line 401, in post
2024-03-23 06:33:57.115 734 ERROR octavia.compute.drivers.nova_driver return self.request(url, 'POST', **kwargs)
2024-03-23 06:33:57.115 734 ERROR octavia.compute.drivers.nova_driver File "/var/lib/kolla/venv/lib/python3.10/site-packages/novaclient/client.py", line 78, in request
2024-03-23 06:33:57.115 734 ERROR octavia.compute.drivers.nova_driver raise exceptions.from_response(resp, body, url, method)
2024-03-23 06:33:57.115 734 ERROR octavia.compute.drivers.nova_driver novaclient.exceptions.BadRequest: The requested availability zone is not available (HTTP 400) (Request-ID: req-2522b80e-8c8a-45f2-94b0-f699183f23ff)
2024-03-23 06:33:57.115 734 ERROR octavia.compute.drivers.nova_driver
2024-03-23 06:33:57.115 734 ERROR octavia.controller.worker.v2.tasks.compute_tasks [-] Compute create for amphora id: 4696bf7c-9538-47f8-8583-3f7bb4ea7fda failed: octavia.common.exceptions.ComputeBuildException: Failed to build compute instance due to: The requested availability zone is not available (HTTP 400) (Request-ID: req-2522b80e-8c8a-45f2-9...


Revision history for this message
Noel Ashford (nashford77) wrote :

Nova won't tell me WHAT availability zone it tried to pass, but here is the same thing on the nova side...

2024-03-23 06:42:47.885 735 INFO nova.api.openstack.requestlog [None req-5d3c4ee3-a9f6-4ab7-8b0c-5a967c33a1bd 0f28f91c6e9a422dac08846921757bea e5b9296fbd9e4d9ea5e925780c64690f - - default default] 192.168.8.1 "GET /" status: 200 len: 373 microversion: - time: 0.000435
2024-03-23 06:42:55.114 736 INFO nova.api.openstack.wsgi [None req-9397d1bb-5bb5-4749-9e6c-1ef905507e3a 21e97fa85af14618a1f93b1cb1669443 d159d34277044c29b710e20ef51fe0ca - - default default] HTTP exception thrown: The requested availability zone is not available
2024-03-23 06:42:55.115 736 INFO nova.api.openstack.requestlog [None req-9397d1bb-5bb5-4749-9e6c-1ef905507e3a 21e97fa85af14618a1f93b1cb1669443 d159d34277044c29b710e20ef51fe0ca - - default default] 192.168.8.99 "POST /v2.1/servers" status: 400 len: 92 microversion: 2.15 time: 0.112837

Revision history for this message
Noel Ashford (nashford77) wrote :

Actually, I think I failed to set a host aggregate on this particular machine, and I was calling it from Octavia. Never mind.

Revision history for this message
Noel Ashford (nashford77) wrote :

One additional question: I also see the following occurring, and I don't use valid_interfaces in my config. How can this be corrected code-wise? Also, has this code merged? I tried a newer code base recently, hoping the change was in kolla 17.3.0, but without luck. See below.

root@5net:~# tail /var/log/kolla/octavia/octavia-api.log
2024-04-13 05:35:19.931 734 WARNING openstack [None req-67516d28-2888-43c4-85f6-e5e2d122d3b7 - e95c8c84b62e416daf03b91a9648962b - - default default] Disabling service 'image': Encountered an exception attempting to process config for project 'glance' (service type 'image'): no such option valid_interfaces in group [glance]: oslo_config.cfg.NoSuchOptError: no such option valid_interfaces in group [glance]
2024-04-13 05:38:09.270 736 WARNING openstack [None req-bbf790ca-b0cc-4149-a44b-4119972c69ec - e95c8c84b62e416daf03b91a9648962b - - default default] Disabling service 'block-storage': Encountered an exception attempting to process config for project 'cinder' (service type 'block-storage'): no such option valid_interfaces in group [cinder]: oslo_config.cfg.NoSuchOptError: no such option valid_interfaces in group [cinder]
2024-04-13 05:38:09.271 736 WARNING openstack [None req-bbf790ca-b0cc-4149-a44b-4119972c69ec - e95c8c84b62e416daf03b91a9648962b - - default default] Disabling service 'compute': Encountered an exception attempting to process config for project 'nova' (service type 'compute'): no such option valid_interfaces in group [nova]: oslo_config.cfg.NoSuchOptError: no such option valid_interfaces in group [nova]

Also, a more major concern: Octavia loses state when I reboot, and all of the load balancers made by Kubernetes etc. go bad / pending / error after having been running. I tried rebooting their compute instances, to no avail. I thought state persistence was added some time back?

Changed in octavia:
status: Fix Released → Fix Committed
status: Fix Committed → Fix Released
Revision history for this message
Noel Ashford (nashford77) wrote :

Is it correct to assume this would only be in 2024.1 kolla images once kolla has 2024.1 available?

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to octavia (stable/2023.2)

Fix proposed to branch: stable/2023.2
Review: https://review.opendev.org/c/openstack/octavia/+/917673

Revision history for this message
Noel Ashford (nashford77) wrote :

Thank you - will this address the above issue ?

Revision history for this message
Gregory Thiemonge (gthiemonge) wrote :

Hi,

About the first issue, "no such option valid_interfaces in group": I don't know the cause of it, but I think it's harmless; none of those services are used by the octavia-api, they are used by the worker, housekeeping and health-manager services.

For the second issue: are you rebooting the nodes that host the Octavia services?
I think we need the logs to understand what's happening.

Even if the Octavia services are not running, the load balancers should be in a correct state when the services come back.
But it can go wrong if the network is down for a certain period of time: if Octavia detects that the amphora VMs are not reachable, it may trigger a failover of the load balancers, and if one of the required services is not active (nova, neutron, glance, etc.), the failover can fail and leave the load balancer in error.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to octavia (stable/2023.2)

Reviewed: https://review.opendev.org/c/openstack/octavia/+/917673
Committed: https://opendev.org/openstack/octavia/commit/a9a3c64eeb277a9c980ff5b84d0eba41eb40b622
Submitter: "Zuul (22348)"
Branch: stable/2023.2

commit a9a3c64eeb277a9c980ff5b84d0eba41eb40b622
Author: Gregory Thiemonge <email address hidden>
Date: Thu Feb 1 14:23:01 2024 +0100

    Fix neutron setting overrides

    Since 2023.2, we deprecated some settings in the [neutron] section
    ('endpoint', 'endpoint_type' and 'ca_certificates_file'), they are
    respectively replaced by 'endpoint_override', 'valid_interfaces' and
    'cafile'. There's some code in Octavia that automatically sets the new
    settings if the user still has the old settings (it is required because
    keystoneauth uses the CONF objects to establish the sessions).
    But some corner cases were not correctly addressed in that patch.

    Now Octavia ensures that the override of the parameters is correctly
    handled.

    Conflicts:
            octavia/common/config.py

    Change-Id: Ic37e9f699e32431ae1735ddc9642689967ddc696
    Closes-Bug: 2051604
    (cherry picked from commit 7bb6096eccc7966bd963a0529dc7b352246dfdbb)
