Generated amphora images don't support UDP load balancing

Bug #1910369 reported by Vern Hart
This bug affects 2 people
Affects                           Status      Importance  Assigned to  Milestone
charm-octavia-diskimage-retrofit  Incomplete  Undecided   Unassigned
snap-octavia-diskimage-retrofit   Incomplete  Undecided   Unassigned

Bug Description

In a current customer deployment we have attempted to create an Octavia load balancer instance for DNS (UDP port 53).

  $ openstack loadbalancer create --vip-network-id lb-test-net2 --name vern-dns-lb \
    --vip-address 10.10.58.6
  $ openstack loadbalancer listener create --protocol UDP --protocol-port 53 vern-dns-lb
  $ openstack loadbalancer pool create --protocol UDP --lb-algorithm ROUND_ROBIN \
    --loadbalancer vern-dns-lb --name vern-dns-pool
  $ openstack loadbalancer member create --name dns-server --disable-backup \
    --address 10.10.58.5 --protocol-port 53 vern-dns-pool
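
Before testing, it may be worth confirming that the load balancer actually reached ACTIVE/ONLINE (standard client output columns; names as above):

  $ openstack loadbalancer show vern-dns-lb -c provisioning_status -c operating_status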

When I do a DNS lookup against the member (10.10.58.5) it works, but against the LB VIP (10.10.58.6) it fails.

  $ nslookup www.udptesttt.com 10.10.58.5
  Server: 10.10.58.5
  Address: 10.10.58.5:53

  Name: www.udptesttt.com
  Address: 10.10.57.5

  $ nslookup www.udptesttt.com 10.10.58.6
  ;; connection timed out; no servers could be reached
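
The same negative result can be reproduced with dig and an explicit timeout, which makes the UDP failure mode more obvious (standard dig options; same test name as above):

  $ dig @10.10.58.6 www.udptesttt.com +time=2 +tries=1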

Environment:
octavia-diskimage-retrofit charm: cs:~openstack-charmers-next/octavia-diskimage-retrofit-33
octavia-diskimage-retrofit snap: version: 0.9.10 revision: 179 channel: latest/beta
openstack release: train
juju: 2.8.7

We got another test amphora from a network vendor and it works -- I can do DNS lookups against the VIP. When I asked how this test amphora was created, they stated it was created using the diskimage-create.sh script from upstream Octavia (the stable/train release, I believe).

When I compare the diskimage-create script inside the octavia-diskimage-retrofit snap with the upstream one, it seems to be old. https://pastebin.ubuntu.com/p/gvS37zhr69/ I'm not sure which differences are significant, but the newer one defaults to ubuntu-minimal instead of ubuntu, which seems important.
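
One way to do that comparison locally is sketched below; the in-snap path is located with find rather than assumed, and the upstream copy comes from a stable/train checkout:

  $ git clone -b stable/train https://opendev.org/openstack/octavia
  $ diff "$(find /snap/octavia-diskimage-retrofit -name diskimage-create.sh)" \
      octavia/diskimage-create/diskimage-create.sh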

Is there a reason the diskimage-create inside the snap is old?

Revision history for this message
Vern Hart (vern) wrote :

Subscribing field-medium since we have a test amphora that works, but this is only a temporary fix.

I realize that I didn't specifically mention that the amphora that doesn't work was created by the octavia-diskimage-retrofit charm from Ubuntu images synced using glance-simplestreams-sync. I tried with both bionic- and focal-based images.

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

haproxy does not support UDP load-balancing
https://github.com/haproxy/haproxy/issues/62

I think we need images that support the use of LVS & keepalived to do that (I would expect that keepalived and ipvsadm need to be installed):

https://storyboard.openstack.org/#!/story/1657091
https://github.com/openstack/octavia/tree/stable/train/octavia/common/jinja/lvs

I am going to check if anything is missing in order to make it work.
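
As a quick sanity check on a running amphora, the LVS virtual servers can be listed with ipvsadm; I believe the amphora keeps its VIP plumbing in a network namespace named amphora-haproxy, so the command likely needs to run there:

# namespace name assumed; a UDP virtual service on the VIP should appear once a UDP listener exists
sudo ip netns exec amphora-haproxy ipvsadm -Ln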

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

Sources of elements for disk image builder (dib):

https://github.com/openstack-charmers/octavia-diskimage-retrofit/tree/master/src/elements (openstack-charmers-specific elements shipped with the snap)
https://github.com/openstack/diskimage-builder/tree/master/diskimage_builder/elements (disk-image-builder-specific)
https://github.com/openstack/octavia/tree/master/elements (Octavia-specific: includes ipvsadmin and keepalived-octavia)

The right elements seem to be included in the virt-dib invocation:

https://github.com/openstack-charmers/octavia-diskimage-retrofit/blob/2dad8d8d2b6615f406ad7339d92c4c1a3a6557d0/src/retrofit/retrofit.sh#L106-L108
virt-dib ${DEBUG} \
# ...
    ubuntu-cloud-archive ubuntu-ppa \
    haproxy-octavia rebind-sshd no-resolvconf amphora-agent \
    sos keepalived-octavia ipvsadmin pip-cache certs-ramfs \
    ubuntu-amphora-agent tuning bug1895835
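
A quick way to confirm that the installed snap revision passes those same elements (the retrofit.sh path inside the snap is located with find rather than assumed):

grep -n 'ipvsadmin\|keepalived-octavia' "$(find /snap/octavia-diskimage-retrofit/current -name retrofit.sh)"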

Will look further.

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

The diskimage-create.sh script from the Octavia repo builds the list of elements to include in an image by appending element names to the AMP_element_sequence variable:

https://opendev.org/openstack/octavia/src/commit/bd01ad94dcea3eca7636f1286ceed0753b563a28/diskimage-create/diskimage-create.sh#L459-L461
# Add keepalived-octavia element
AMP_element_sequence="$AMP_element_sequence keepalived-octavia"
AMP_element_sequence="$AMP_element_sequence ipvsadmin"

but then invokes diskimage-builder directly, as opposed to the virt-dib used by octavia-diskimage-retrofit.

https://libguestfs.org/virt-dib.1.html
"Virt-dib is intended as safe replacement for diskimage-builder and its ramdisk-image-create mode"

This should not matter much as long as the right elements make it into the built image.
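
A simple cross-check, assuming a local octavia checkout alongside the snap (bash process substitution), is to diff the element names shipped in the snap against the upstream tree:

diff <(ls /snap/octavia-diskimage-retrofit/current/usr/local/lib/elements) <(ls octavia/elements)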

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

# deployed bionic-train with octavia `tox -e func-target bionic-train-ha`

juju ssh octavia-diskimage-retrofit/0
# the right elements are present on the octavia-diskimage-retrofit unit
$ sudo ls -1 /snap/octavia-diskimage-retrofit/current/usr/local/lib/elements/
amphora-agent
bug1895835
certs-ramfs
debian-networking
growrootfs
haproxy-octavia
ipvsadmin
keepalived-octavia
no-resolvconf
rebind-sshd
remove-sshd
root-passwd
sos
tuning
ubuntu-amphora-agent
ubuntu-archive
ubuntu-cloud-archive
ubuntu-ppa

# saved the image generated by the functional test
openstack image save --file /tmp/amphora-haproxy-x86_64-ubuntu-18.04-20210325 amphora-haproxy-x86_64-ubuntu-18.04-20210325
sudo mkdir -p /mnt/amphora
sudo guestmount -a /tmp/amphora-haproxy-x86_64-ubuntu-18.04-20210325 -i --ro /mnt/amphora

# ipvsadm and keepalived are present on the filesystem of the retrofitted image

sudo find /mnt/amphora -name '*ipvsadm*'
/mnt/amphora/etc/rc2.d/S01ipvsadm
/mnt/amphora/etc/rc5.d/S01ipvsadm
/mnt/amphora/etc/rc0.d/K01ipvsadm
/mnt/amphora/etc/rc4.d/S01ipvsadm
/mnt/amphora/etc/rc3.d/S01ipvsadm
/mnt/amphora/etc/rc1.d/K01ipvsadm
/mnt/amphora/etc/default/ipvsadm
/mnt/amphora/etc/rc6.d/K01ipvsadm
/mnt/amphora/etc/ipvsadm.rules
/mnt/amphora/etc/init.d/ipvsadm
/mnt/amphora/var/lib/dpkg/info/ipvsadm.md5sums
/mnt/amphora/var/lib/dpkg/info/ipvsadm.postrm
/mnt/amphora/var/lib/dpkg/info/ipvsadm.list
/mnt/amphora/var/lib/dpkg/info/ipvsadm.conffiles
/mnt/amphora/var/lib/dpkg/info/ipvsadm.prerm
/mnt/amphora/var/lib/dpkg/info/ipvsadm.postinst
/mnt/amphora/sbin/ipvsadm-restore
/mnt/amphora/sbin/ipvsadm-save
/mnt/amphora/sbin/ipvsadm

sudo find /mnt/amphora -name '*keepalived*'
/mnt/amphora/etc/rc2.d/S01keepalived
/mnt/amphora/etc/rc5.d/S01keepalived
/mnt/amphora/etc/rc0.d/K01keepalived
/mnt/amphora/etc/rc4.d/S01keepalived
/mnt/amphora/etc/rc3.d/S01keepalived
/mnt/amphora/etc/rc1.d/K01keepalived
/mnt/amphora/etc/systemd/system/multi-user.target.wants/keepalived.service
/mnt/amphora/etc/dbus-1/system.d/org.keepalived.Vrrp1.conf
/mnt/amphora/etc/default/keepalived
/mnt/amphora/etc/rc6.d/K01keepalived
/mnt/amphora/etc/keepalived
/mnt/amphora/etc/init.d/keepalived
/mnt/amphora/var/lib/dpkg/info/keepalived.postrm
/mnt/amphora/var/lib/dpkg/info/keepalived.conffiles
/mnt/amphora/var/lib/dpkg/info/keepalived.md5sums
/mnt/amphora/var/lib/dpkg/info/keepalived.list
/mnt/amphora/var/lib/dpkg/info/keepalived.prerm
/mnt/amphora/var/lib/dpkg/info/keepalived.postinst
/mnt/amphora/var/lib/systemd/deb-systemd-helper-enabled/keepalived.service.dsh-also
/mnt/amphora/var/lib/systemd/deb-systemd-helper-enabled/multi-user.target.wants/keepalived.service
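
# when done inspecting, the image can be unmounted with guestunmount
# (shipped alongside guestmount in libguestfs)
sudo guestunmount /mnt/amphora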

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

So I tried to replicate the scenario from the bug description; however, I was not able to reproduce the issue: the UDP traffic reached a backend successfully (see below). The created amphora has the right set of packages and the configuration generated by the amphora-agent.

The scenario I tested with bionic-train (ML2/OVS):

external machine -> provider network -> qrouter ns (non-distributed, non-L3HA router) -> DNAT from LB FIP to LB VIP -> amphora with keepalived+LVS -> backend member IP -> bind9

It would be good to have more information on the networking setup used when the issue was encountered. Specifically: whether the LBs were attached directly to a provider network or reached via a router and a FIP (and, if the latter, whether that router was distributed, which would be a problem, or L3HA/legacy). Likewise, it would be good to know whether OVN or ML2/OVS was in use.
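
For example (the router name is a placeholder), the relevant router attributes can be checked with:

openstack router show <router> -c distributed -c ha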

https://paste.ubuntu.com/p/BmcVxjMcxs/ (same as below for readability)

# deployed via `tox -e func-target bionic-train-ha`

ubuntu@dmitriis-bastion:~$ openstack loadbalancer create --name testlb --vip-network-id private_lb_fip_network
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| admin_state_up      | True                                 |
| availability_zone   |                                      |
| created_at          | 2021-04-06T21:51:05                  |
| description         |                                      |
| flavor_id           | None                                 |
| id                  | 4273ab40-b760-43b5-9a40-19f7490e3bbd |
| listeners           |                                      |
| name                | testlb                               |
| operating_status    | OFFLINE                              |
| pools               |                                      |
| project_id          | bd5208e7ae354f339567d71e49cf544b     |
| provider            | amphora                              |
| provisioning_status | PENDING_CREATE                       |
| updated_at          | None                                 |
| vip_address         | 10.42.0.216                          |
| vip_network_id      | 06ab6573-55df-43ad-ace3-0a4afc1964b9 |
| vip_port_id         | 7453d6da-f0f0-4f1a-bc5f-6258e40a2182 |
| vip_qos_policy_id   | None                                 |
| vip_subnet_id       | 4d011e74-cbda-42ab-b58c-7a5aef4cb5fd |
+---------------------+--------------------------------------+

ubuntu@dmitriis-bastion:~$ openstack loadbalancer listener create --name test-listener --protocol UDP --protocol-port 53 testlb
+-----------------------------+--------------------------------------+
| Field                       | Value                                |
+-----------------------------+--------------------------------------+
| admin_state_up              | True                                 |
| connection_limit            | -1                                   |
| created_at                  | 2021-04-06T21:51:58                  |
| default_pool_id             | None                                 |
| default_tls_conta...

Changed in charm-octavia-diskimage-retrofit:
status: New → Incomplete
Changed in snap-octavia-diskimage-retrofit:
status: New → Incomplete