[SRU] Loadbalancer is stuck with PENDING_UPDATE state on member update API

Bug #2067441 reported by Hoyoun Lee
This bug affects 1 person
Affects               Status                      Importance  Assigned to        Milestone
Ubuntu Cloud Archive  New                         Undecided   Unassigned
  Antelope            New                         Undecided   Unassigned
  Bobcat              New                         Undecided   Unassigned
  Caracal             New                         Undecided   Unassigned
  Yoga                New                         Undecided   Unassigned
  Zed                 New                         Undecided   Unassigned
octavia               In Progress                 Medium      Gregory Thiemonge
octavia (Ubuntu)      Status tracked in Oracular
  Jammy               New                         Undecided   Unassigned
  Mantic              New                         Undecided   Unassigned
  Noble               New                         Undecided   Unassigned
  Oracular            New                         Undecided   Unassigned

Bug Description

[Impact]

A load balancer gets stuck in the PENDING_UPDATE state when the Batch Update Members API is called with a duplicated member (same IP and port).

[Test Case]

Please refer to the [Test steps] section below.

[Regression Potential]

The fix is already merged upstream in the main, stable/2024.1, stable/2023.2 and stable/2023.1 branches, so this is a clean backport and should help deployments that use octavia.

I also tested this fix and it works well - https://paste.ubuntu.com/p/wPy7pB3SR6/ and https://paste.ubuntu.com/p/zpPDScQCtK/

I also tested the debdiff for this fix and it works well - https://paste.ubuntu.com/p/nS6c3QYRGn/

[Others]

Original Bug Description Below
===========

By mistake, I sent a wrong request with a duplicated IP and port combination through the Batch Update Members API (version 2023.1).
https://docs.openstack.org/api-ref/load-balancer/v2/#batch-update-members

For example:
The member 192.0.2.16:80 already exists, and the request data is as follows:
{
    "members": [
        {
            "subnet_id": "xxxxxxx",
            "address": "192.0.2.16",
            "protocol_port": 80
        }, {
            "subnet_id": "xxxxxxx",
            "address": "192.0.2.16",
            "protocol_port": 80
        }
    ]
}

After the request, the status of the load balancer stays at PENDING_UPDATE and never changes.
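A request like the one above can be checked on the client side before it is sent. This is only a minimal sketch: the helper name duplicate_members is hypothetical and is not part of octavia or of any OpenStack client. It assumes the payload has the same shape as the example request.

```python
from collections import Counter


def duplicate_members(payload):
    """Return the (address, protocol_port) pairs listed more than once.

    Hypothetical helper for illustration; it only inspects the request body.
    """
    counts = Counter(
        (m["address"], m["protocol_port"]) for m in payload["members"]
    )
    return sorted(pair for pair, n in counts.items() if n > 1)


# Same shape as the request in this report.
payload = {
    "members": [
        {"subnet_id": "xxxxxxx", "address": "192.0.2.16", "protocol_port": 80},
        {"subnet_id": "xxxxxxx", "address": "192.0.2.16", "protocol_port": 80},
    ]
}

print(duplicate_members(payload))  # [('192.0.2.16', 80)]
```

A non-empty result means the request repeats a member and should be corrected before calling the API.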

When I checked the source code, I found no logic that checks for duplicates among updated members.

In the controller logic (member.py), members are classified into new_members/updated_members/deleted_members, but the updated_members data is passed on as is, duplicates included, so I suspect this is the cause of the problem.

## log : 33fe25ab-5477-4787-a8e1-f657376b0ead is duplicated
May 29 04:14:32 ubuntu octavia-worker[123317]: INFO octavia.controller.queue.v2.endpoints [-] Batch updating members: old='[]', new='[]', updated='['825dbebc-da79-4f88-bf48-0e3e63a09d90', '33fe25ab-5477-4787-a8e1-f657376b0ead', '33fe25ab-5477-4787-a8e1-f657376b0ead']'...
May 29 04:14:32 ubuntu octavia-worker[123317]: ERROR oslo_messaging.rpc.server [-] Exception during message handling: taskflow.exceptions.Duplicate: Atoms with duplicate names found: ['octavia-mark-member-active-indb-33fe25ab-5477-4787-a8e1-f657376b0ead']

FYI, there is already validation logic for new_members.
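The classification described above, with duplicate updates filtered out, could look roughly like the sketch below. It is not the octavia code: it works on plain dicts, the function name classify_members is hypothetical, and only the set name updated_member_uniques is taken from the discussion in this report. It also assumes that members missing from the request are treated as deleted.

```python
def classify_members(existing, requested):
    """Split a batch request into new, updated and deleted members.

    existing:  dict mapping (address, protocol_port) -> member id
    requested: list of dicts with "address" and "protocol_port" keys
    Illustrative sketch only; duplicates among new members are not
    handled here.
    """
    new_members = []
    updated_members = []
    updated_member_uniques = set()  # keys already queued for update

    for member in requested:
        key = (member["address"], member["protocol_port"])
        if key not in existing:
            new_members.append(member)
        elif key not in updated_member_uniques:
            # First time this existing member appears in the request.
            updated_members.append({**member, "id": existing[key]})
            updated_member_uniques.add(key)
        # Otherwise it repeats an update that is already queued: skip it.

    requested_keys = {(m["address"], m["protocol_port"]) for m in requested}
    deleted_ids = [
        member_id for key, member_id in existing.items()
        if key not in requested_keys
    ]
    return new_members, updated_members, deleted_ids
```

With this filtering each existing member is queued for update at most once, so the member ID that appears twice in the log above would be passed to the worker only once.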

[Test steps]

1, set up an openstack env with an octavia deployment

2, create a test lb

3, add a member into lb pool

$ openstack loadbalancer member create --subnet-id private_subnet --address 192.168.21.226 --protocol-port 80 lb1-pool
$ openstack loadbalancer member list lb1-pool |grep ACTIVE
| b36bb21e-8eed-40bc-a1cb-e69da070c0b9 | | 4f1016d73ae245fe8c5c6a637930f3d2 | ACTIVE | 192.168.21.226 | 80 | ONLINE | 1 |

4, run test.py (https://paste.ubuntu.com/p/38vPW5R5S8/) to call the batch member update API and add the same member again (e.g. 192.168.21.226 above)

5, the problem is then reproduced: the lb gets stuck in the PENDING_UPDATE state.

$ openstack loadbalancer member list lb1-pool |grep 192
| b36bb21e-8eed-40bc-a1cb-e69da070c0b9 | | 4f1016d73ae245fe8c5c6a637930f3d2 | PENDING_UPDATE | 192.168.21.226 | 80 | ONLINE | 40 |

6, this is the error log I saw - https://paste.ubuntu.com/p/K5s7knNmWw/
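For readers who cannot open the paste, the batch update call in the steps above can be sketched as follows. This is not the content of test.py, which is not reproduced here. The helper name, endpoint URL, pool ID and token are placeholders, and the sketch assumes the Batch Update Members call is a PUT on /v2/lbaas/pools/{pool_id}/members as described in the API reference linked earlier.

```python
import json
import urllib.request


def build_batch_update_request(endpoint, pool_id, token, members):
    """Build (but do not send) a Batch Update Members request.

    Hypothetical helper; endpoint, pool_id and token are supplied by the
    caller and are placeholders in the example below.
    """
    url = f"{endpoint.rstrip('/')}/v2/lbaas/pools/{pool_id}/members"
    body = json.dumps({"members": members}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
    )


# The same member listed twice, as in the reproduction steps.
member = {"address": "192.168.21.226", "protocol_port": 80}
req = build_batch_update_request(
    "http://octavia.example:9876", "POOL_ID", "TOKEN", [member, member]
)
# urllib.request.urlopen(req) would send it; this is left out so the
# sketch runs without a deployment.
```

Sending such a request against an unpatched deployment is what is expected to leave the member in PENDING_UPDATE.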

[Some Analyses]

Some analysis is available in the bug I created earlier - https://bugs.launchpad.net/octavia/+bug/2070348

Tags: patch sts
Hoyoun Lee (hoyoun-lee)
description: updated
summary: - Loadbalancers is stuck with PENDING_UPDATE state on member update API
+ Loadbalancer is stuck with PENDING_UPDATE state on member update API
Hoyoun Lee (hoyoun-lee)
description: updated
Gregory Thiemonge (gthiemonge) wrote : Re: Loadbalancer is stuck with PENDING_UPDATE state on member update API

Hi, I have an open patch that fixes this issue:

https://review.opendev.org/c/openstack/octavia/+/864192

I will ask folks to review it so we can backport it down to 2023.1

Changed in octavia:
assignee: nobody → Gregory Thiemonge (gthiemonge)
importance: Undecided → Medium
status: New → In Progress
Hoyoun Lee (hoyoun-lee) wrote :

https://review.opendev.org/c/openstack/octavia/+/921430 (2023.2)
https://review.opendev.org/c/openstack/octavia/+/921429 (2024.1)
https://review.opendev.org/c/openstack/octavia/+/921433 (2023.1)

Thank you for the bug fix.

However, when I recently checked the source code, I couldn't find the updated code on the stable branches (2023.1, 2023.2, 2024.1).
There is no "updated_member_uniques = set()".

I found the updated code only on the master branch.

Has the change been dropped, or is it still going to be applied?

Gregory Thiemonge (gthiemonge) wrote :

Hi, the backports to the stable branches are under review. I believe they will merge this week (I'll ping reviewers about them).

Hoyoun Lee (hoyoun-lee) wrote :

I checked the merged commits on 2023.1, 2023.2 and 2024.1.
I appreciate your concern.

Hua Zhang (zhhuabj) wrote :
description: updated
summary: - Loadbalancer is stuck with PENDING_UPDATE state on member update API
+ [SRU] Loadbalancer is stuck with PENDING_UPDATE state on member update
+ API
tags: added: sts
Hua Zhang (zhhuabj) wrote :

I uploaded 4 debdiffs: noble.debdiff, mantic.debdiff, jammy.debdiff and antelope.debdiff

1, I didn't upload a debdiff for oracular, because it has the same pkg as noble

2, I didn't upload debdiffs for bobcat and caracal, because they have the same pkg as noble and mantic

3, I didn't upload a debdiff for focal-yoga, because it has the same pkg as jammy

I created these debdiffs using pbuilder. I didn't test them with a PPA due to an 'unmatch md5' issue - https://paste.ubuntu.com/p/6TSmYBdrhD/

However, I created jammy.debdiff using debuild locally and it works well, see - https://paste.ubuntu.com/p/nS6c3QYRGn/

Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "noble.debdiff" seems to be a debdiff. The ubuntu-sponsors team has been subscribed to the bug report so that they can review and hopefully sponsor the debdiff. If the attachment isn't a patch, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of ~ubuntu-sponsors, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issue please contact him.]

tags: added: patch