[Yoga] Octavia's LB VIPs not working with allow-address-pairs

Bug #1976461 reported by Pedro Guimarães
This bug affects 1 person
Affects                    Status    Importance  Assigned to  Milestone
OpenStack Octavia Charm    Invalid   Undecided   Unassigned
charm-ovn-central          Invalid   Undecided   Unassigned
charm-ovn-chassis          Invalid   Undecided   Unassigned
neutron                    New       Undecided   Unassigned

Bug Description

Hi team,

I am currently deploying with:
juju 2.9.31
MAAS 3.1
openstack/yoga, bundle: https://pastebin.canonical.com/p/Rw376CF4Dw/
Octavia: standalone setup

When I create an LB for my Kubernetes cluster, the LB is unresponsive when I try to reach it from one of my VMs.

I can access the LB and confirm the amphora-haproxy namespace exists, with the network interface attached and both the LB IP and the VIP configured on it.
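
For reference, a minimal sketch of that check, assuming SSH access to the amphora (amphora-haproxy is the Octavia default namespace name):

$ # on the amphora
$ sudo ip netns list
$ sudo ip netns exec amphora-haproxy ip addr show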

Trying to reach the LB from one of the k8s VMs results in a timeout.

The behavior changes depending on which of the LB's IPs I try to connect to.

In scenario (1), from the client VM to the LB IP (not the VIP):
The connection works. This is what ovs-ofctl on the sending machine's hypervisor shows: https://pastebin.canonical.com/p/bZ77hhWgD6/
Traffic gets correctly routed to one of the GENEVE tunnels, since the VM and the LB front-end IP are on the same tenant subnet.
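
(A sketch of how such a flow dump can be gathered on the hypervisor, assuming the default br-int integration bridge; the grep target is a placeholder:)

$ sudo ovs-ofctl dump-flows br-int | grep <LB front-end IP>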

In scenario (2), from the client VM to the LB VIP (the allowed-address-pair):
The connection does not work.
ovs-ofctl from the sending hypervisor shows: https://pastebin.canonical.com/p/SBmW97yHVr/
Traffic is dropped at the sending hypervisor.
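
A hedged way to trace the drop on the sending hypervisor is ofproto/trace; all values below are placeholders for the client's port/MACs/IPs, and br-int is assumed:

$ sudo ovs-appctl ofproto/trace br-int \
    'in_port=<client tap port>,tcp,dl_src=<client MAC>,dl_dst=<VIP MAC>,nw_src=<client IP>,nw_dst=<LB VIP>,tp_dst=443'

The final action in the trace shows whether the packet is sent to a GENEVE tunnel (scenario 1) or dropped (scenario 2).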

**** DETAILS OF MY CURRENT YOGA DEPLOYMENT ****
network openstack: https://pastebin.canonical.com/p/mfrPgVjyMp/
server and LB list: https://pastebin.canonical.com/p/VKjdHzTNvD/
port list: https://pastebin.canonical.com/p/trk2CPhDzf/
ovn-nbctl show: https://pastebin.canonical.com/p/njKjWGX5gX/
ovn-nbctl details of the VIP: https://pastebin.canonical.com/p/wwQy3HH4QR/
***********************************************

**** STEPS TO REPRODUCE ****
1) Deploy OpenStack/Yoga with the bundle above
2) Create 2x backend nodes on a tenant network
3) Create an LB on the same tenant network
4) Access one of the backend nodes (or create a client VM for this test)
5) Try to reach the LB: the connection times out (see the command sketch below)
****************************
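
A command-level sketch of steps 3-5 (names, the subnet, and the backend IPs are placeholders; TCP/443 matches the k8s API case discussed below):

$ openstack loadbalancer create --name test-lb --vip-subnet-id <tenant-subnet>
$ openstack loadbalancer listener create --name test-listener \
    --protocol TCP --protocol-port 443 test-lb
$ openstack loadbalancer pool create --name test-pool \
    --lb-algorithm ROUND_ROBIN --listener test-listener --protocol TCP
$ openstack loadbalancer member create --subnet-id <tenant-subnet> \
    --address <backend IP> --protocol-port 443 test-pool
$ # from the client VM:
$ curl -vm 5 https://<LB VIP>:443/   # times out in the failing scenario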

Revision history for this message
Nobuto Murata (nobuto) wrote (last edit):

I can also confirm that the Octavia Amphora LB doesn't work as expected with a Yoga deployment, but works with Xena or older deployments.

I'm following the OpenStack integration guide for the k8s API endpoint:
https://ubuntu.com/kubernetes/docs/openstack-integration#api-server-load-balancer

For both releases (Yoga and Xena), the Octavia Amphora LB gets ACTIVE and ONLINE.

$ openstack loadbalancer list --format yaml
- id: e6277cf5-216d-4ed2-a285-dd1c73e799d1
  name: openstack-integrator-dfb7e924f5f7-kubernetes-control-plane
  operating_status: ONLINE
  project_id: 8d38f13076ff4c6ea7aafa6cd5948859
  provider: amphora
  provisioning_status: ACTIVE
  vip_address: 10.5.5.216

However, access to the k8s API through Amphora works only with the Xena deployment; with the Yoga deployment the connection times out.

### 192.168.151.52 is a FIP assigned to the LB
$ curl -vm 5 https://192.168.151.52:443/
* Trying 192.168.151.52:443...
* TCP_NODELAY set
* Connection timed out after 5000 milliseconds
* Closing connection 0
curl: (28) Connection timed out after 5000 milliseconds

My test deployment is based on openstack-base + octavia overlay:
https://github.com/nobuto-m/quick-maas/blob/1a2f307c0c1efcca88f0e8988f46967be5e024d0/user-script.sh#L150

Revision history for this message
odie (odiecoranes) wrote (last edit):

I'm trying to deploy the openstack-base-focal-yoga bundle, but I can't create a load balancer; I get the error "can't connect to amphora".

MAAS: 3.0.0
JUJU: 2.9.31

openstack-base-focal-yoga:
https://github.com/openstack-charmers/openstack-bundles/blob/master/development/openstack-base-focal-yoga/bundle.yaml
https://github.com/openstack-charmers/openstack-bundles/blob/master/stable/overlays/loadbalancer-octavia.yaml

ERROR:
2022-06-07 23:05:36.381 366327 ERROR oslo_messaging.rpc.server octavia.amphorae.driver_exceptions.exceptions.AmpConnectionRetry: Could not connect to amphora, exception caught: HTTPSConnectionPool(host='172.31.0.96', port=9443): Max retries exceeded with url: // (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f94ba2b3eb0>: Failed to establish a new connection: [Errno 113] No route to host'))
2022-06-07 23:05:36.381 366327 ERROR oslo_messaging.rpc.server
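
A hedged first check for this "No route to host" error, assuming the charmed deployment: verify that the Octavia unit can reach the amphora's management address at all (the IP and port are taken from the log above):

$ juju ssh octavia/0 -- nc -zvw5 172.31.0.96 9443

If that fails, the problem is lb-mgmt network connectivity to the amphora rather than the VIP issue described in this bug.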

Hi Nobuto Murata,
May I know which versions/revisions are working with Xena? I also tried to deploy the openstack-base-focal-xena stable versions, but I got the same error when creating an LB. I want to test k8s but am stuck on this error.

openstack-base-focal-xena:
https://github.com/openstack-charmers/openstack-bundles/blob/master/stable/openstack-base/bundle.yaml
https://github.com/openstack-charmers/openstack-bundles/blob/master/stable/overlays/loadbalancer-octavia.yaml

Thanks!

Revision history for this message
Nobuto Murata (nobuto) wrote :

Hi odie,

It sounds like your issue is different from this bug report. Just for the record, here are the steps I followed for focal-xena deployment:
https://github.com/nobuto-m/quick-maas/blob/97d83b3d2256f3a525c918c4a61782c7698dc134/user-script.sh#L197

Revision history for this message
Frode Nordahl (fnordahl) wrote :

Adding Neutron to the bug, as it appears Neutron is not filling out the logical switch port (LSP) correctly in the OVN northbound DB for some reason.

For the Octavia VIP I see:
$ sudo ovn-nbctl find logical-switch-port
...
_uuid : 00c81b7f-56cb-44ab-9047-d310180f6a1b
addresses : ["fa:16:3e:8a:0f:41 192.168.0.59"]
dhcpv4_options : c8a7fb25-4c16-4ee0-9c1f-2cccef5576dd
dhcpv6_options : []
dynamic_addresses : []
enabled : false
external_ids : {"neutron:cidrs"="192.168.0.59/24", "neutron:device_id"=lb-6f63ef60-47ef-4a70-a008-016378d6adc2, "neutron:device_owner"=Octavia, "neutron:network_name"=neutron-e95d48e1-b476-49b1-9a58-5b4d5b8762b2, "neutron:port_fip"="10.78.95.100", "neutron:port_name"=octavia-lb-6f63ef60-47ef-4a70-a008-016378d6adc2, "neutron:project_id"="4956c242dd6d481a90d5d61217981e4f", "neutron:revision_number"="10", "neutron:security_group_ids"="a1faabe4-898d-42b9-b26c-d4d6c402d1b3"}
ha_chassis_group : []
name : "ae6c9e3f-5981-4d84-aeea-b71ed7c1f961"
options : {mcast_flood_reports="true", requested-chassis=""}
parent_name : []
port_security : ["fa:16:3e:8a:0f:41 192.168.0.59"]
tag : []
tag_request : []
type : ""
up : false

Here the type is not set to 'virtual', and the 'virtual-ip' and 'virtual-parents' options are also missing.

Pausing Neutron and setting these manually resolves the issue:
$ juju run-action neutron-api/0 pause
$ sudo ovn-nbctl add logical-switch-port 00c81b7f-56cb-44ab-9047-d310180f6a1b \
    options virtual-ip="192.168.0.59"
$ sudo ovn-nbctl add logical-switch-port 00c81b7f-56cb-44ab-9047-d310180f6a1b \
    options virtual-parents=\"bcb6da0d-bae0-48cb-9d4f-bb4676af7db2,34abcd0d-4032-4855-9af5-74125ca1a569\"
$ sudo ovn-nbctl set logical-switch-port 00c81b7f-56cb-44ab-9047-d310180f6a1b \
    type=virtual

$ sudo ovn-sbctl find port-binding type=virtual
_uuid : 5dcc583f-6e2e-4bc4-833e-823d2bfe02e8
chassis : []
datapath : fa349a7e-645b-4e26-8a79-b1866f75088d
encap : []
external_ids : {name=octavia-lb-6f63ef60-47ef-4a70-a008-016378d6adc2, "neutron:cidrs"="192.168.0.59/24", "neutron:device_id"=lb-6f63ef60-47ef-4a70-a008-016378d6adc2, "neutron:device_owner"=Octavia, "neutron:network_name"=neutron-e95d48e1-b476-49b1-9a58-5b4d5b8762b2, "neutron:port_fip"="10.78.95.100", "neutron:port_name"=octavia-lb-6f63ef60-47ef-4a70-a008-016378d6adc2, "neutron:project_id"="4956c242dd6d481a90d5d61217981e4f", "neutron:revision_number"="10", "neutron:security_group_ids"="a1faabe4-898d-42b9-b26c-d4d6c402d1b3"}
gateway_chassis : []
ha_chassis_group : []
logical_port : "ae6c9e3f-5981-4d84-aeea-b71ed7c1f961"
mac : ["fa:16:3e:8a:0f:41 192.168.0.59"]
nat_addresses : []
options : {mcast_flood_reports="true", requested-chassis="", virtual-ip="192.168.0.59", virtual-parents="bcb6da0d-bae0-48cb-9d4f-bb4676af7db2,34abcd0d-4032-4855-9af5-74125ca1a569"}
parent_port : []
requested_chassis : []
tag : []
tunnel_key : 5
type : virtual
up : true
virtual_parent : []

$ ...


Changed in charm-ovn-chassis:
status: New → Invalid
Changed in charm-ovn-central:
status: New → Invalid
Changed in charm-octavia:
status: New → Invalid
Revision history for this message
Nobuto Murata (nobuto) wrote :

https://bugs.launchpad.net/neutron/+bug/1973276
> tags: added: in-stable-yoga
>
> This issue was fixed in the openstack/neutron 20.1.0 release.

https://bugs.launchpad.net/cloud-archive/+bug/1975632
> The update contains the following package updates:
>
> * neutron 20.1.0

So from Ubuntu's point of view for Yoga, it's tracked as an SRU in LP: #1975632.

And I've confirmed that my test case in https://bugs.launchpad.net/charm-ovn-chassis/+bug/1976461/comments/1 works after using cloud:focal-yoga/proposed.
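
For anyone repeating that verification in a charmed deployment, a minimal sketch (openstack-origin is the standard charm option; adjust unit names to your bundle):

$ juju config neutron-api openstack-origin=cloud:focal-yoga/proposed
$ juju status neutron-api   # wait for the package upgrade to settle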
