Do not define/configure lbaasv2 agents when deploying Octavia on Rocky+

Bug #1825906 reported by Drew Freiberger
This bug affects 4 people
Affects: OpenStack Neutron Gateway Charm
Status: Fix Released
Importance: Medium
Assigned to: Unassigned
Milestone: 20.08

Bug Description

In a typical deployment, the neutron-gateway charm installs and configures the neutron-lbaasv2-agent service and registers it as a service in the Neutron database. However, when running Octavia, the haproxy backend in the neutron-lbaasv2-agent configuration is never updated to list octavia or the local haproxy plugin that is configured on the neutron-api units when they are related to octavia.

The neutron-gateway charm should get a relation to octavia as an lbaas-provider so it can update the neutron-lbaas.conf file with an Octavia-appropriate configuration and drop the neutron-lbaasv2-agent.

These installs also build up a very large backlog in the n-lbaasv2-plugin RabbitMQ queue from the lbaasv2-agent trying to call the haproxy backend.

The reference cloud is a Foundation Cloud running bionic-rocky on the 19.04 charms, with octavia enabled in lxd, on juju 2.5.4/MAAS 2.5.x.
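
For illustration, one hedged way to see both halves of this on a live cloud (the unit name and config path are assumptions about a typical charm deployment, not something verified here):

juju run --unit neutron-gateway/0 "systemctl --no-pager status neutron-lbaasv2-agent"
juju run --unit neutron-gateway/0 "grep -A3 service_providers /etc/neutron/neutron_lbaas.conf"

The first command should show the agent still installed and running on the gateway; the second should show that no Octavia or local haproxy service_provider entry was rendered for it.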

Revision history for this message
Frode Nordahl (fnordahl) wrote :

The behavior of keeping the lbaasv2-agent around on the ``neutron-gateway`` units is intentional, to support deployments migrating from the Neutron built-in agent to Octavia.

When you add Octavia to the deployment and relate it to the ``neutron-api`` charm, the Neutron API will be configured with the ``lbaasv2-proxy`` service, which forwards any LBaaS requests to the Octavia API.

So, in short: after Octavia has been deployed, all new load balancers will be created by Octavia, and any existing load balancers already running on the gateways will be left untouched.
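
As a rough check (a sketch only; the unit name is illustrative and the exact plugin alias may vary by release), the proxy plugin should be visible in the Neutron server configuration once the octavia relation is in place:

juju run --unit neutron-api/0 "grep ^service_plugins /etc/neutron/neutron.conf"
# expected to include something like: service_plugins = router,...,lbaasv2-proxy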

I can see how this behavior might be unwanted when deploying a fresh cloud, and we do intend to remove it altogether once the Neutron built-in LBaaS support is removed upstream.

Could you share some more detail about the side effects you get from having it enabled and unused on the gateways? Is the message queue buildup the main issue, or are there others too?

Revision history for this message
Drew Freiberger (afreiberger) wrote :

The message queue buildup, the log file accumulating errors that do not apply to this architecture, and the lbaasv2 agents showing as dead in neutron agent-list, which openstack-service-checks alerts on:

CRITICAL: Neutron agents enabled but down; net1[Loadbalancerv2 agent,48ec21c9-177c-46b3-b42a-e9f5a79d7265], net2[Loadbalancerv2 agent,d5d1ece8-7708-4d60-b435-d7c12c26bdc8]

This masks our ability to monitor other neutron agents that should be alive.

I can appreciate the continued inclusion of the neutron-lbaasv2-agent, but I would think we then need to keep including the haproxy backend as well so the agents can continue to work. Or is the theory that the backend would already be installed if the environment had been upgraded rather than freshly installed, and so would not have been removed by the charm upgrade?
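
For reference, the agents in question can be listed directly; this is just a sketch, and the grep pattern assumes the "Loadbalancerv2 agent" label from the alert above:

neutron agent-list | grep -i loadbalancer
# the Loadbalancerv2 agent rows show as not alive (xxx) even though the service itself keeps running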

Revision history for this message
Drew Freiberger (afreiberger) wrote :

To reduce queue build-up and alerting, you can use the following:

juju run --unit rabbitmq-server/0 "rabbitmqctl set_policy -p openstack event-sample-overflow '^n-lbaasv2-plugin$' '{\"max-length\": 99 }' --apply-to queues --priority 1"
juju run --unit rabbitmq-server/0 "rabbitmqctl -p openstack list_queues name messages consumers state |grep n-lbaasv2-plugin"
juju config rabbitmq-server queue_thresholds="[['openstack', 'n-lbaasv2-plugin', 100, 100], ['\*', '\*', 100, 200]]"
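
Roughly speaking: the set_policy call caps the n-lbaasv2-plugin queue at 99 messages so the backlog cannot grow without bound, the list_queues call just confirms the cap took effect, and the queue_thresholds option appears to raise the nagios warning/critical thresholds (100/100 for this queue, 100/200 as the default for everything else) so the capped queue stops alerting.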

Changed in charm-neutron-gateway:
status: New → Triaged
importance: Undecided → Medium
Revision history for this message
Syed Mohammad Adnan Karim (karimsye) wrote :

Running into this issue in an openstack stein cloud with 39 nodes:
 - 37 nova-compute+ceph-osd
 - 2 neutron-gateway+ceph-osd

I also have octavia in HA deployed in lxd containers.

bundle.yaml can be found here - https://pastebin.canonical.com/p/wwGndCF45n/
I also have 2 overlay bundles that configure hostnames and ssl:
 - overlay-hostnames.yaml: https://pastebin.canonical.com/p/hhBr3WRbFs/
 - overlay-ssl.yaml: https://pastebin.canonical.com/p/hJ5X9RpKyQ/

I ran sudo rabbitmqctl list_queues -p openstack | less to search for the offending queue and found n-lbaasv2-plugin, with 14806 messages, to be the culprit.
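
If the goal is only to clear the existing backlog once (it will grow again while the agent keeps running), something along these lines should work; the unit name is illustrative:

juju run --unit rabbitmq-server/0 "rabbitmqctl purge_queue -p openstack n-lbaasv2-plugin"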

Revision history for this message
Andre Ruiz (andre-ruiz) wrote :

This is also happening to me on a fresh Stein install.

Revision history for this message
Pedro Guimarães (pguimaraes) wrote :

Neutron gateway should have an interface such as: https://github.com/openstack/charm-neutron-api/blob/master/metadata.yaml#L36

If that relation is set, then the neutron-lbaasv2-agent and the corresponding NRPE checks get installed.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-neutron-gateway (master)

Reviewed: https://review.opendev.org/738477
Committed: https://git.openstack.org/cgit/openstack/charm-neutron-gateway/commit/?id=86eb58317fc1d28dccc76e201ddff2c9f5fd1235
Submitter: Zuul
Branch: master

commit 86eb58317fc1d28dccc76e201ddff2c9f5fd1235
Author: Pedro Guimaraes <email address hidden>
Date: Mon Jun 29 18:40:33 2020 +0200

    Add disable-neutron-lbaas option

    Since Rocky, Octavia is a valid alternative for LBaaS.
    If enabled, we should not configure the Neutron LBaaS(v2)
    agent at the same time.

    The fact that we configure both means neutron-lbaas-agent
    will generate messages on rabbitmq which never get consumed,
    creating alarms on NRPE without any actual issue.

    This change introduces an option to disable the neutron LBaaS
    solution. Once activated, it masks the lbaas agent service.

    Change-Id: I10c4cc2983245efb5bef3d7cbc8e3b6963448a7d
    Closes-Bug: #1825906
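
Assuming the resulting config key matches the commit title (not verified here against the charm's config.yaml), enabling the new behaviour would look something like:

juju config neutron-gateway disable-neutron-lbaas=true
# once set, the charm masks the neutron-lbaasv2-agent service on the gateway units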

Changed in charm-neutron-gateway:
status: In Progress → Fix Committed
James Page (james-page)
Changed in charm-neutron-gateway:
milestone: none → 20.08
Changed in charm-neutron-gateway:
status: Fix Committed → Fix Released