CDK Distribution Does Not Have Redundant Load Balancer

Bug #1836885 reported by Lorenzo Cavassa
This bug affects 1 person
Affects                        Status        Importance  Assigned to  Milestone
Kubernetes API Load Balancer   Fix Released  High        Cory Johns
Openstack Integrator Charm     Fix Released  High        Cory Johns

Bug Description

A CDK deployment consists of a single Nginx load balancer (kubeapi-load-balancer) in front of 3 masters, which means this single LB is a SPOF.

A possible solution here would be to use hacluster/haproxy with 3 Nginx LB units and thereby solve the issue, as documented on the hacluster page:

https://www.ubuntu.com/kubernetes/docs/hacluster

From that page, when configuring the service, the floating IPs have to be defined in the configuration:

juju config kubeapi-load-balancer ha-cluster-vips="192.168.0.1 192.168.0.2"

We should have a way to control the IP assignment process, i.e. a way to ask OpenStack to allocate an IP to be used as the floating IP for the VIP.
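
For illustration, something along these lines could do it (a rough sketch only; the external network name "ext_net" is an assumption):

# Allocate a floating IP from the external network and hand it to hacluster as the VIP
FIP=$(openstack floating ip create ext_net -f value -c floating_ip_address)
juju config kubeapi-load-balancer ha-cluster-vips="$FIP"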

A possible solution would be to create a subordinate charm that creates a new Octavia LB per cluster/application, adds a listener and a pool, and registers each unit of the application as a member.
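
Roughly what such a charm would automate, expressed as Octavia CLI calls (a hedged sketch; the names, subnet ID and ports are placeholders, not the final implementation):

openstack loadbalancer create --name k8s-api-lb --vip-subnet-id <subnet-id>
openstack loadbalancer listener create --name k8s-api-listener \
    --protocol TCP --protocol-port 443 k8s-api-lb
openstack loadbalancer pool create --name k8s-api-pool \
    --listener k8s-api-listener --protocol TCP --lb-algorithm ROUND_ROBIN
# one member per kubernetes-master unit
openstack loadbalancer member create --subnet-id <subnet-id> \
    --address <master-ip> --protocol-port 6443 k8s-api-pool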

Tags: atos sts
Revision history for this message
Tim Van Steenburgh (tvansteenburgh) wrote :

Proposal for how we could solve this:

1. Add a new relation to the openstack-integrator charm that looks like this:

provides:
  loadbalancer:
      interface: public-address

(Notice that this matches the loadbalancer relation provided by the kubeapi-load-balancer.)

2. Deploy CDK without the kubeapi-load-balancer, and instead relate the masters to the new loadbalancer relation on the openstack-integrator (see the sketch after this list).

3. In the openstack-integrator, add handler code for the loadbalancer relation that will create a new Octavia LB that will balance across the k8s masters.
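
In Juju terms, step 2 would amount to something like the following (a sketch only; endpoint names are taken from the proposal above):

juju remove-application kubeapi-load-balancer
juju add-relation kubernetes-master:loadbalancer openstack-integrator:loadbalancer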

Is this an acceptable approach?

Changed in charm-kubeapi-load-balancer:
status: New → Triaged
Revision history for this message
João Pedro Seara (jpseara) wrote :

Hello,

Forwarding concerns on this one from Edward Stewart from Atos:

"1. The biggest concern is that we will hit the issue witih other charms, eg we want to run percona clusters on top of OpenStack - The percona charm also uses the ha cluster charm to provide HA and so we will hit exactly the same issue here. And any other charms that use hacluster, eg rabbitmq, vault, mariadb, galera, etc. These don't support the load-balancer interface so would still have the issue.

2. The openstack-integrator charm (IIRC) requires you to specify a single load balancer network from which the load balancers are created. From the docs, it also expects this to be a floating IP network, whereas currently it is possible to create an HA Kubernetes cluster with haproxy on a tenant network without forcing the API to be exposed as a floating IP.

So I'm wondering whether a more generic solution would be better: one that extends haproxy, or provides a subordinate to haproxy, to allocate and release the port while leaving haproxy to do the proxying. This would still need to support the use case of VIPs on either a tenant network or a floating network (depending on whether use-floating-ip is set in the model or not)."

JP

Cory Johns (johnsca)
Changed in charm-kubeapi-load-balancer:
assignee: nobody → Cory Johns (johnsca)
status: Triaged → In Progress
Changed in charm-kubeapi-load-balancer:
importance: Undecided → Wishlist
milestone: none → 1.16
Revision history for this message
Cory Johns (johnsca) wrote :

The edge version of the OpenStack Integrator charm (cs:~containers/openstack-integrator-27) now has a loadbalancer endpoint and can be used to replace the kubeapi-load-balancer with a native OpenStack load balancer using the following overlay:

applications:
  kubeapi-load-balancer: null
  openstack-integrator:
    charm: cs:~containers/openstack-integrator
    channel: edge
    num_units: 1

relations:
  - ['kubernetes-master:openstack', 'openstack-integrator']
  - ['kubernetes-worker:openstack', 'openstack-integrator']
  - ['kubernetes-master:loadbalancer', 'openstack-integrator']
  - ['kubernetes-master:kube-api-endpoint', 'kubernetes-worker:kube-api-endpoint']
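
To try it, the overlay would be applied at deploy time, e.g. (the bundle name and overlay filename below are assumptions):

juju deploy charmed-kubernetes --overlay ./openstack-lb-overlay.yaml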

I'm still working on testing this on Octavia, but I have run into some environmental issues which have been holding me up. So if someone wants to test this as well, that would be much appreciated.

Revision history for this message
Cory Johns (johnsca) wrote :

Testing on Octavia-enabled OpenStack is now complete as well.

Cory Johns (johnsca)
Changed in charm-kubeapi-load-balancer:
status: In Progress → Fix Committed
Changed in charm-kubeapi-load-balancer:
status: Fix Committed → Fix Released
Revision history for this message
Tim Van Steenburgh (tvansteenburgh) wrote :

From Ed @ Atos:

ok thanks - I've tested this and have an issue. It looks like some of the code within the openstack-integrator is trying to use the admin endpoints of OpenStack instead of the public ones. I get this error in the openstack-integrator logs when it tries to create the k8s master loadbalancer:

2019-10-04 20:12:29 DEBUG juju.worker.uniter.remotestate watcher.go:525 got a relation units change: {3 {map[kubernetes-worker/0:{0}] []}}
2019-10-04 20:13:01 DEBUG juju.worker.uniter.remotestate watcher.go:525 got a relation units change: {1 {map[kubernetes-master/1:{0}] []}}
2019-10-04 20:13:01 DEBUG juju.worker.uniter.remotestate watcher.go:525 got a relation units change: {2 {map[kubernetes-master/1:{0}] []}}
2019-10-04 20:15:42 DEBUG juju.worker.uniter.remotestate watcher.go:531 update status timer triggered
2019-10-04 20:21:18 DEBUG juju.worker.uniter.remotestate watcher.go:531 update status timer triggered
2019-10-04 20:21:52 DEBUG loadbalancer-relation-joined Unable to establish connection to https://192.168.10.200:35357/v3/domains?: HTTPSConnectionPool(host='192.168.10.200', port=35357): Max retries exceeded with url: /v3/domains (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fcdbe655160>: Failed to establish a new connection: [Errno 110] Connection timed out',))
2019-10-04 20:21:52 DEBUG worker.uniter.jujuc server.go:182 running hook tool "juju-log"
2019-10-04 20:21:52 ERROR juju-log loadbalancer:1: Error creating loadbalancer
Traceback (most recent call last):
  File "lib/charms/layer/openstack.py", line 325, in get_or_create
    lb.create()
  File "lib/charms/layer/openstack.py", line 388, in create
    sg_id = self._impl.find_secgrp(self.name)
  File "lib/charms/layer/openstack.py", line 584, in find_secgrp
    '--project', self.project_id)}
  File "lib/charms/layer/openstack.py", line 577, in project_id
    project)['id']
  File "lib/charms/layer/openstack.py", line 267, in _openstack
    output = _run_with_creds('openstack', *args, '--format=yaml')
  File "lib/charms/layer/openstack.py", line 262, in _run_with_creds
    stdout=subprocess.PIPE)
  File "/usr/lib/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '('openstack', 'project', 'show', '--domain', 'admin_domain', 'dpcop_mgmt', '--format=yaml')' returned non-zero exit status 1.

The issue is the endpoint being used: 192.168.10.200:35357.
The Keystone endpoints are as follows:
ubuntu@juju-4a107b-0-lxd-1:~$ openstack endpoint list | grep keystone
| 86307cea7d984b48871c043186393c63 | RegionOne | keystone | identity | True | public   | https://auth.ohc01.customerb.internal:5000/v3 |
| b877879836c0490695d345459367e34b | RegionOne | keystone | identity | True | internal | https://192.168.10.200:5000/v3                |
| f37fb0f679654f53bf845dd738491aeb | RegionOne | keystone | identity | True | admin    | https://192.168.10.200:35357/v3               |

The OpenStack network hosting the services can route/access the public URL but not the internal or admin ones. Shouldn't the integrator only access the public API endpoints?
thanks
Ed

Changed in charm-kubeapi-load-balancer:
status: Fix Released → Triaged
importance: Wishlist → High
milestone: 1.16 → 1.16+ck2
Changed in charm-openstack-integrator:
status: New → Triaged
importance: Undecided → High
milestone: none → 1.16+ck2
Revision history for this message
Ed Stewart (emcs2) wrote :

FYI this was using cs:~containers/openstack-integrator-28

Changed in charm-openstack-integrator:
assignee: nobody → Cory Johns (johnsca)
Revision history for this message
Cory Johns (johnsca) wrote :

The openstack-integrator charm uses the endpoint URL provided in the credentials it is given[1]. If using `juju trust`, this would be the same URL given to Juju. If you want to use `juju trust` but have the integrator charm use a different endpoint URL, you can override it by using the `auth_url` config option in addition to `juju trust`. Or, of course, if you want to use an entirely different credential for the integrator charm, you can provide that via either the `credentials` config option, or the individual config options for each field.

[1]: https://github.com/juju-solutions/charm-openstack-integrator/blob/master/lib/charms/layer/openstack.py#L240
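
For example, keeping `juju trust` but pointing the charm at the public endpoint would look roughly like this (the URL is a placeholder):

juju trust openstack-integrator
juju config openstack-integrator auth_url=https://auth.example.internal:5000/v3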

Revision history for this message
João Pedro Seara (jpseara) wrote :

@Cory Johns, I had already instructed the customer to set the auth_url to a different URL this afternoon. I am now waiting for their input. Thank you all very much for your help.

Revision history for this message
Ed Stewart (emcs2) wrote :

@Cory Johns - aware of this, and we had already passed the correct public API endpoint in our auth_url setting within the openstack-integrator juju config (and that endpoint is accessible). However, the openstack integrator still tried to call the internal admin API endpoint.

Changed in charm-kubeapi-load-balancer:
milestone: 1.16+ck2 → 1.16+ck3
Changed in charm-openstack-integrator:
milestone: 1.16+ck2 → 1.16+ck3
Revision history for this message
Edward Hope-Morley (hopem) wrote :

I have tested this for myself and can confirm that it does indeed not work, but the cause is not quite as previously suspected. In short, the problem is that the openstack-integrator performs some commands that are admin-only; therefore, regardless of what auth_url is set to in the charm/application, when it performs these API requests the openstack client will switch to the admin URL as provided by the catalog. We can see this here:

root@juju-5727c3-k8s-3:~# sqlite3 /var/lib/juju/agents/unit-openstack-integrator-0/charm/.unit-state.db 'select data from kv where key="charm.openstack.full-creds"'| jq -r '.| to_entries[]| "export os_\(.key)=\(.value)"'| sed -r 's/(export )(.+)(=.*)/\1\U\2\E\3/g' > novarc
root@juju-5727c3-k8s-3:~# sed -i -r -e 's/OS_VERSION/OS_IDENTITY_API_VERSION/g' novarc
root@juju-5727c3-k8s-3:~# cat novarc
export OS_AUTH_URL=http://10.100.0.74:5000/v3
export OS_REGION=RegionOne
export OS_USERNAME=admin
export OS_PASSWORD=openstack
export OS_USER_DOMAIN_NAME=admin_domain
export OS_PROJECT_DOMAIN_NAME=admin_domain
export OS_PROJECT_NAME=admin
export OS_ENDPOINT_TLS_CA=
export OS_IDENTITY_API_VERSION=3
root@juju-5727c3-k8s-3:~# source novarc
root@juju-5727c3-k8s-3:~# openstack project show admin --domain admin_domain
Unable to establish connection to http://10.0.0.183:35357/v3/domains?: HTTPConnectionPool(host='10.0.0.183', port=35357): Max retries exceeded with url: /v3/domains (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f3344a84748>: Failed to establish a new connection: [Errno 111] Connection refused',))

So the openstack-integrator needs a way to stop relying on admin-only operations.
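
As a quick way to confirm which identity endpoints those credentials resolve to (assuming the same novarc as above), something like:

source novarc
openstack catalog show identity

should list the public, internal and admin (port 35357) URLs that the client can fall back to.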

Changed in charm-kubeapi-load-balancer:
status: Triaged → In Progress
Changed in charm-openstack-integrator:
status: Triaged → In Progress
Revision history for this message
Edward Hope-Morley (hopem) wrote :

If I inject my project ID:

project_id=a51de420811c4a40aa76c6fd8719ba5d
application=openstack-integrator
sqlite3 /var/lib/juju/agents/unit-${application}-*/charm/.unit-state.db "update kv set data = '\"$project_id\"' where key='project_id';"

and make the following change to the code:

diff --git a/lib/charms/layer/openstack.py b/lib/charms/layer/openstack.py
index 8520013..17aebea 100644
--- a/lib/charms/layer/openstack.py
+++ b/lib/charms/layer/openstack.py
@@ -580,8 +580,7 @@ class BaseLBImpl:
 
     def find_secgrp(self, name):
         secgrps = {sg['Name']: sg
-                   for sg in _openstack('security', 'group', 'list',
-                                        '--project', self.project_id)}
+                   for sg in _openstack('security', 'group', 'list')}
         return secgrps.get(name, {}).get('ID')
 
     def create_secgrp(self, name):

Then everything works fine.

Revision history for this message
Edward Hope-Morley (hopem) wrote :

I think the fact that the charm assumes it will run as admin is an error in the first place. As far as I understand, the universal use case for this charm is as a non-admin, i.e. many non-admin tenants, each with their own k8s, using the openstack-integrator to drive their own tenant's account/resources.

Revision history for this message
Edward Hope-Morley (hopem) wrote :

I have tested cs:~containers/openstack-integrator-32 from channel:edge, which contains the above patch, and it works for me, i.e. the charm is able to set up load balancers as a non-admin tenant.

Revision history for this message
Kevin W Monroe (kwmonroe) wrote :

This is in the stable channel at rev 36:

https://jaas.ai/u/containers/openstack-integrator/36

Changed in charm-openstack-integrator:
status: In Progress → Fix Released
Changed in charm-kubeapi-load-balancer:
status: In Progress → Fix Released
Changed in charm-kubeapi-load-balancer:
milestone: 1.16+ck3 → 1.16+ck2
Changed in charm-openstack-integrator:
milestone: 1.16+ck3 → 1.16+ck2