[ovn-octavia-provider]: batch update fails when members to remove is empty

Bug #1912779 reported by Michal Nasiadka
This bug affects 1 person
Affects: neutron
Status: Fix Released
Importance: High
Assigned to: Unassigned
Milestone: (none)

Bug Description

Discovered this bug while using the Kubernetes cloud-provider-openstack and trying to expose a service with type: LoadBalancer.

I0122 07:06:34.842456 1 openstack_loadbalancer.go:1122] Updating 1 members for pool fa4db405-2e66-4ce2-a29c-743700e4d53c
I0122 07:06:34.842578 1 loadbalancer.go:371] OpenStack Request URL: PUT https://sparrow.cf.ac.uk:9876/v2.0/lbaas/pools/fa4db405-2e66-4ce2-a29c-743700e4d53c/members
I0122 07:06:34.842607 1 loadbalancer.go:371] OpenStack Request Headers:
I0122 07:06:34.842613 1 loadbalancer.go:371] Accept: application/json
I0122 07:06:34.842619 1 loadbalancer.go:371] Content-Type: application/json
I0122 07:06:34.842631 1 loadbalancer.go:371] User-Agent: openstack-cloud-controller-manager/da85a2c6-dirty gophercloud/2.0.0
I0122 07:06:34.842637 1 loadbalancer.go:371] X-Auth-Token: ***
I0122 07:06:34.842676 1 loadbalancer.go:371] OpenStack Request Body: {
I0122 07:06:34.842686 1 loadbalancer.go:371] "members": [
I0122 07:06:34.842692 1 loadbalancer.go:371] {
I0122 07:06:34.842698 1 loadbalancer.go:371] "address": "10.0.0.75",
I0122 07:06:34.842704 1 loadbalancer.go:371] "name": "demo-k8s-czynmnrzfgtu-node-0",
I0122 07:06:34.842711 1 loadbalancer.go:371] "protocol_port": 31166,
I0122 07:06:34.842717 1 loadbalancer.go:371] "subnet_id": "b1c8ea56-f7d1-4b14-b584-d621d77f88c1"
I0122 07:06:34.842723 1 loadbalancer.go:371] }
I0122 07:06:34.842729 1 loadbalancer.go:371] ]
I0122 07:06:34.842735 1 loadbalancer.go:371] }
I0122 07:06:35.844346 1 loadbalancer.go:371] OpenStack Response Code: 500
I0122 07:06:35.844384 1 loadbalancer.go:371] OpenStack Response Headers:
I0122 07:06:35.844389 1 loadbalancer.go:371] Connection: keep-alive
I0122 07:06:35.844394 1 loadbalancer.go:371] Content-Length: 114
I0122 07:06:35.844399 1 loadbalancer.go:371] Content-Type: application/json
I0122 07:06:35.844403 1 loadbalancer.go:371] Date: Fri, 22 Jan 2021 07:06:35 GMT
I0122 07:06:35.844408 1 loadbalancer.go:371] Server: WSGIServer/0.2 CPython/3.6.8
I0122 07:06:35.844413 1 loadbalancer.go:371] X-Openstack-Request-Id: req-2fff1131-4956-48cf-a018-c0b0f9ab55d7
I0122 07:06:35.844518 1 loadbalancer.go:371] OpenStack Response Body: {
I0122 07:06:35.844530 1 loadbalancer.go:371] "debuginfo": null,
I0122 07:06:35.844535 1 loadbalancer.go:371] "faultcode": "Server",
I0122 07:06:35.844540 1 loadbalancer.go:371] "faultstring": "Provider 'ovn' reports error: list index out of range"
I0122 07:06:35.844545 1 loadbalancer.go:371] }
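
For reproducing this without Kubernetes, the request in the log above can be rebuilt by hand. The following is a minimal sketch using only the Python standard library; the URL, pool ID and member values are copied from the log, while the function name `build_batch_member_update` and the token placeholder are illustrative assumptions. It only constructs the request and does not send it.

```python
import json
import urllib.request

# Values copied from the log above; the token is a placeholder.
ENDPOINT = "https://sparrow.cf.ac.uk:9876"
POOL_ID = "fa4db405-2e66-4ce2-a29c-743700e4d53c"


def build_batch_member_update(endpoint, pool_id, members, token):
    """Build (but do not send) the PUT that replaces a pool's member list."""
    url = f"{endpoint}/v2.0/lbaas/pools/{pool_id}/members"
    body = json.dumps({"members": members}).encode("utf-8")
    headers = {
        "Accept": "application/json",
        "Content-Type": "application/json",
        "X-Auth-Token": token,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


request = build_batch_member_update(
    ENDPOINT,
    POOL_ID,
    [{
        "address": "10.0.0.75",
        "name": "demo-k8s-czynmnrzfgtu-node-0",
        "protocol_port": 31166,
        "subnet_id": "b1c8ea56-f7d1-4b14-b584-d621d77f88c1",
    }],
    token="REPLACE_WITH_TOKEN",
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` with a valid token would send it. Against an affected ovn provider, and with a pool that has no members yet, the 500 response shown above would be the expected result.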

tags: added: ovn-octavia-provider
Revision history for this message
Brian Haley (brian-haley) wrote :

Can you give more info on how to reproduce this outside of kuryr? For example with just some 'openstack loadbalancer...' commands? Thanks.

Brian Haley (brian-haley) wrote :
Changed in neutron:
status: New → In Progress
importance: Undecided → High
Michal Nasiadka (mnasiadka) wrote :

Well, it's not kuryr. It's a Magnum-created cluster. It seems that when occm (cloud-provider-openstack, which uses gophercloud) adds members to a pool, it uses PUT (batch mode), and the for loop that performs the batch additions and removals of members fails when it gets an empty member to delete.

1. cloud-provider-openstack calls:
- https://github.com/kubernetes/cloud-provider-openstack/blob/a2e094b14c40425b6e0c3c7be089e8b75a61b1eb/pkg/cloudprovider/providers/openstack/openstack_loadbalancer.go#L1120
- https://github.com/kubernetes/cloud-provider-openstack/blob/a2e094b14c40425b6e0c3c7be089e8b75a61b1eb/pkg/util/openstack/loadbalancer.go#L355
2. ovn-octavia-provider function that fails:
https://opendev.org/openstack/ovn-octavia-provider/src/branch/master/ovn_octavia_provider/driver.py#L342

Logs from octavia-api:
2021-01-22 09:48:59.279 7 INFO octavia.api.v2.controllers.member [req-8e558ff2-361f-4bd8-adac-bfe21be6c09e - 6732efe112134d5c880f5b95874f271a - default default] Sending Pool 175b48d3-fddb-406c-ae1d-dfa882f146d2 batch member update to provider ovn
2021-01-22 09:48:59.281 7 DEBUG ovn_octavia_provider.driver [req-8e558ff2-361f-4bd8-adac-bfe21be6c09e - 6732efe112134d5c880f5b95874f271a - default default] Members to delete: member_batch_update /var/lib/kolla/venv/lib/python3.6/site-packages/ovn_octavia_provider/driver.py:2246
2021-01-22 09:48:59.281 7 ERROR octavia.api.drivers.utils [req-8e558ff2-361f-4bd8-adac-bfe21be6c09e - 6732efe112134d5c880f5b95874f271a - default default] Provider 'ovn' raised an unknown error: list index out of range: IndexError: list index out of range
2021-01-22 09:48:59.281 7 ERROR octavia.api.drivers.utils Traceback (most recent call last):
2021-01-22 09:48:59.281 7 ERROR octavia.api.drivers.utils File "/var/lib/kolla/venv/lib/python3.6/site-packages/octavia/api/drivers/utils.py", line 52, in call_provider
2021-01-22 09:48:59.281 7 ERROR octavia.api.drivers.utils return driver_method(*args, **kwargs)
2021-01-22 09:48:59.281 7 ERROR octavia.api.drivers.utils File "/var/lib/kolla/venv/lib/python3.6/site-packages/ovn_octavia_provider/driver.py", line 2248, in member_batch_update
2021-01-22 09:48:59.281 7 ERROR octavia.api.drivers.utils request_info = {'id': member_info[1],
2021-01-22 09:48:59.281 7 ERROR octavia.api.drivers.utils IndexError: list index out of range
2021-01-22 09:48:59.281 7 ERROR octavia.api.drivers.utils
2021-01-22 09:48:59.285 7 ERROR wsme.api [req-8e558ff2-361f-4bd8-adac-bfe21be6c09e - 6732efe112134d5c880f5b95874f271a - default default] Server-side error: "Provider 'ovn' reports error: list index out of range". Detail:
Traceback (most recent call last):

  File "/var/lib/kolla/venv/lib/python3.6/site-packages/octavia/api/drivers/utils.py", line 52, in call_provider
    return driver_method(*args, **kwargs)

  File "/var/lib/kolla/venv/lib/python3.6/site-packages/ovn_octavia_provider/driver.py", line 2248, in member_batch_update
    request_info = {'id': member_info[1],

IndexError: list index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):

  File "/var/lib/kolla/venv/lib/python3.6/site-packages/wsmeext/pecan.py", line 84, in...


Michal Nasiadka (mnasiadka) wrote :

It's important to note that the batch member update is performed when the pool has no members defined.
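
One plausible reading of the traceback is sketched below. It assumes the driver keeps a pool's existing members as a comma-separated string and splits each record on `_` before reading `member_info[1]`; that storage format and both function names are assumptions for illustration, not the driver's actual code. What is certain is the Python behavior: splitting an empty string on `,` yields a list containing one empty string, so the loop body still runs once for an empty pool.

```python
def naive_ids_to_delete(existing_members):
    """Mimic the failing pattern: split blindly, then index each record."""
    ids = []
    for member in existing_members.split(","):
        member_info = member.split("_")
        ids.append(member_info[1])  # IndexError when member == ""
    return ids


def guarded_ids_to_delete(existing_members):
    """Skip empty records so an empty pool yields nothing to delete."""
    ids = []
    for member in existing_members.split(","):
        if not member:
            continue
        member_info = member.split("_")
        ids.append(member_info[1])
    return ids


# Hypothetical record format: "member_<id>_<address>:<port>".
populated = "member_aaa_10.0.0.75:31166,member_bbb_10.0.0.76:31166"
print(guarded_ids_to_delete(populated))
print(guarded_ids_to_delete(""))

try:
    naive_ids_to_delete("")
except IndexError as exc:
    print("naive version fails:", exc)
```

The guard is only one way to handle the empty case; returning early when the stored string is empty would work equally well under the same assumptions.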

Brian Haley (brian-haley) wrote :

Since you're able to reproduce this, can you look at my comments in the review? Thanks.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ovn-octavia-provider 1.0.0

This issue was fixed in the openstack/ovn-octavia-provider 1.0.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ovn-octavia-provider 0.4.0

This issue was fixed in the openstack/ovn-octavia-provider 0.4.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/networking-ovn 7.4.1

This issue was fixed in the openstack/networking-ovn 7.4.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ovn-octavia-provider 0.1.3

This issue was fixed in the openstack/ovn-octavia-provider 0.1.3 release.

Changed in neutron:
status: In Progress → Fix Released