octavia LBs always offline in Xena

Bug #1950350 reported by Andre Ruiz
This bug affects 2 people
Affects: octavia (Ubuntu)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

After switching from Wallaby to Xena, the same procedure to create a simple loadbalancer is no longer working (both from dashboard and from CLI). All backends appear as offline, as if the monitor tests were failing, although the service is accessible.

Andre Ruiz (andre-ruiz) wrote :

Subscribed field-critical.

I'll add more information as soon as possible, but I'm available for debugging if someone can work on this.

Billy Olsen (billy-olsen) wrote :

Can you please provide various "show" outputs for the loadbalancers, members, listeners, etc?

Andre Ruiz (andre-ruiz) wrote :

https://pastebin.canonical.com/p/fBwxNVPVB7/

This is the whole loadbalancer creation log. You can see all backends are "offline" at the end (and they stay offline).

https://pastebin.canonical.com/p/yZVCb2Py63/

These are all the security groups on the system, including the ones created by Octavia.

Andre Ruiz (andre-ruiz) wrote :

I got no errors when installing octavia, and there is nothing unusual in the juju logs.

I'm searching for specific backend health logs.

Andre Ruiz (andre-ruiz) wrote :

I have tried to add ALL possible roles related to octavia to the admin user and it did not improve the situation.

Andre Ruiz (andre-ruiz) wrote :

https://pastebin.canonical.com/p/dnq9sBZMKn/

Found an SQL error about the health monitor in /var/log/octavia/octavia-health-manager.log. I have very similar messages on the 3 octavia units.

Andre Ruiz (andre-ruiz) wrote :

https://pastebin.canonical.com/p/rW8b88MRVj/

Also found these errors in /var/log/octavia/octavia-worker.log

Andre Ruiz (andre-ruiz) wrote :

I'm taking a step back and trying to just create a simple loadbalancer without any resources and get that ONLINE, like you can see here (from coreycb):

https://paste.ubuntu.com/p/NRdsvPrRhB/ and https://paste.ubuntu.com/p/zvN6VMYvpz/

Instead, I get this:

https://pastebin.canonical.com/p/FGcwHkgZ27/

The amphora image in these tests was generated by GSS + retrofit.

This is some extra info about the flavors configured on my system:

https://pastebin.canonical.com/p/PD875kjXw9/

I still get errors in the logs while creating the LB. These are the last 200 lines of each log file in the first unit (other two are very similar).

https://pastebin.canonical.com/p/cdSSfRkj43/

This cloud was totally redeployed from scratch for this test.

The juju-log for the octavia units does not have anything useful (no errors or clues).

Andre Ruiz (andre-ruiz) wrote :

I removed all configs (flavor, flavorprofile, availabilityzone, availabilityzoneprofile) and just created a very simple loadbalancer passing only name and network options.

It still ends up ACTIVE/OFFLINE.

ubuntu@app1maas001p:~/2021-09-20-OP-212891-xxx-Prod1$ openstack loadbalancer list
+--------------------------------------+------+----------------------------------+-------------+---------------------+------------------+----------+
| id                                   | name | project_id                       | vip_address | provisioning_status | operating_status | provider |
+--------------------------------------+------+----------------------------------+-------------+---------------------+------------------+----------+
| 0debcdb0-64a6-47b8-847d-ec6610544711 | lb4  | 4d766f4d974c4419bfe0037f51e0b1aa | 172.16.0.57 | ACTIVE              | OFFLINE          | amphora  |
+--------------------------------------+------+----------------------------------+-------------+---------------------+------------------+----------+
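
A check for this ACTIVE/OFFLINE combination is easy to script. The following is only a minimal sketch, not something taken from Octavia itself: it assumes the list was exported with `openstack loadbalancer list -f json` and that the JSON keys match the column names in the table above. The function name `find_offline` and the second sample entry are made up for illustration.

```python
import json


def find_offline(list_json):
    """Return load balancers that are ACTIVE (provisioned) yet report OFFLINE."""
    rows = json.loads(list_json)
    return [
        row
        for row in rows
        if row.get("provisioning_status") == "ACTIVE"
        and row.get("operating_status") == "OFFLINE"
    ]


# Sample data: the first entry mirrors the table above; the second is a
# made-up healthy load balancer (placeholder ID) added only for contrast.
# The key names are an assumption about the JSON formatter's output.
SAMPLE = json.dumps([
    {
        "id": "0debcdb0-64a6-47b8-847d-ec6610544711",
        "name": "lb4",
        "provisioning_status": "ACTIVE",
        "operating_status": "OFFLINE",
        "provider": "amphora",
    },
    {
        "id": "00000000-0000-0000-0000-000000000000",
        "name": "healthy-example",
        "provisioning_status": "ACTIVE",
        "operating_status": "ONLINE",
        "provider": "amphora",
    },
])

if __name__ == "__main__":
    for lb in find_offline(SAMPLE):
        print(f"{lb['name']} ({lb['id']}): ACTIVE/OFFLINE")
```

In practice the JSON string would come from the CLI output rather than from `SAMPLE`; the filter itself stays the same.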

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in octavia (Ubuntu):
status: New → Confirmed

Marian Gasparovic (marosg) wrote :

We can see the same in the SQA env. I would not normally have noticed it, as rally and tempest passed just fine, but Andre mentioned this error and I can see the same ACTIVE/OFFLINE output.

Andre Ruiz (andre-ruiz) wrote :

OK, I was taking a deeper look at Corey's output in comment #8 (where it shows the LB is active) and wanted to try the "ovn" driver that appears there (not directly related to this problem, but anyway).

Then I saw the documentation saying that if you deploy octavia with OVN you should get that automatically. You can see in my outputs that I did not have it, even using OVN.

On deeper inspection, I noticed I was missing a relation from the example overlay, one that the FCE bundle builder does not add automatically:

- - octavia:ovsdb-cms
  - ovn-central:ovsdb-cms

I'm not sure if this is new in Xena, or was already needed before (Wallaby? maybe even Ussuri?). When I added it manually, I did get the OVN driver to appear in the list.
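
A missing relation like this is easy to overlook in a generated bundle. As a rough sketch (not part of FCE or the charms), the check below looks for the endpoint pair in a bundle's relations list once the YAML has been loaded into plain Python lists. The helper name `has_relation` and the sample entries other than the `ovsdb-cms` pair are illustrative assumptions.

```python
# The relation from the example overlay quoted above.
REQUIRED = ("octavia:ovsdb-cms", "ovn-central:ovsdb-cms")


def has_relation(relations, required=REQUIRED):
    """Return True if the endpoint pair is present, in either order."""
    wanted = frozenset(required)
    return any(frozenset(pair) == wanted for pair in relations)


# Hypothetical relations section of a bundle that lacks the ovsdb-cms pair.
# These entries are placeholders for illustration, not a real deployment.
relations_before = [
    ["octavia:identity-service", "keystone:identity-service"],
    ["octavia:shared-db", "octavia-mysql-router:shared-db"],
]

# The same list after adding the relation by hand.
relations_after = relations_before + [list(REQUIRED)]

if __name__ == "__main__":
    print("before:", has_relation(relations_before))
    print("after: ", has_relation(relations_after))
```

Comparing unordered pairs means the check does not depend on which endpoint the bundle happens to list first.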

And.... to my surprise, my test loadbalancer (the one with octavia driver) went online!!!

This makes it seem like this relation is more important than just the OVN driver itself.

Can someone confirm what it is for?

Anyway, I'm adding FCE to the affected components.

information type: Public → Private
information type: Private → Public

Andre Ruiz (andre-ruiz) wrote :

I created a new bug for FCE --> https://bugs.launchpad.net/cpe-foundation/+bug/1950678

I could not just add it here.

Andre Ruiz (andre-ruiz) wrote :

Confirmed, everything is working as it should. The LB is online/active and all resources are working well (health checks, backends, etc.), even when adding a FIP and accessing it from "outside".

I'm closing as invalid, and keeping the other bug.

Changed in octavia (Ubuntu):
status: Confirmed → Invalid