CDK Distribution Does Not Have Redundant Load Balancer

Bug #1836885 reported by Lorenzo Cavassa
This bug affects 1 person
Affects                        Status        Importance  Assigned to  Milestone
Kubernetes API Load Balancer   Fix Released  High        Cory Johns
Openstack Integrator Charm     Fix Released  High        Cory Johns

Bug Description

A CDK deployment consists of a single Nginx load balancer (kubeapi-load-balancer) in front of 3 masters, which means this single LB is a SPOF.

A possible solution here would be to use hacluster/haproxy with 3 Nginx LB units and thereby solve the issue, as documented on the hacluster page:

https://www.ubuntu.com/kubernetes/docs/hacluster

From that page, when configuring the service, the floating IPs have to be defined in the configuration:

juju config kubeapi-load-balancer ha-cluster-vips="192.168.0.1 192.168.0.2"

We should have a way to control the IP assignment process, i.e. a way to ask OpenStack to allocate an IP to be used as the floating IP for the VIP.
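
For illustration, something along these lines could do it (a rough sketch only; the external network name "ext_net" is an assumption):

# Allocate a floating IP from the external network and hand it to hacluster as the VIP
FIP=$(openstack floating ip create ext_net -f value -c floating_ip_address)
juju config kubeapi-load-balancer ha-cluster-vips="$FIP"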

A possible solution would be to create a subordinate charm that creates a new Octavia LB per cluster/application, adds a listener and a pool, and registers each unit of the application as a member.
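
Roughly what such a charm would automate, expressed as Octavia CLI calls (a hedged sketch; the names, subnet ID and ports are placeholders, not the final implementation):

openstack loadbalancer create --name k8s-api-lb --vip-subnet-id <subnet-id>
openstack loadbalancer listener create --name k8s-api-listener \
    --protocol TCP --protocol-port 443 k8s-api-lb
openstack loadbalancer pool create --name k8s-api-pool \
    --listener k8s-api-listener --protocol TCP --lb-algorithm ROUND_ROBIN
# one member per kubernetes-master unit
openstack loadbalancer member create --subnet-id <subnet-id> \
    --address <master-ip> --protocol-port 6443 k8s-api-pool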

Tags: atos sts
Revision history for this message
Tim Van Steenburgh (tvansteenburgh) wrote :

Proposal for how we could solve this:

1. Add a new relation to the openstack-integrator charm that looks like this:

provides:
  loadbalancer:
      interface: public-address

(Notice that this matches the loadbalancer relation provided by the kubeapi-load-balancer.)

2. Deploy CDK without the kubeapi-load-balancer, and instead relate the masters to the new loadbalancer relation on the openstack-integrator (see the sketch after this list).

3. In the openstack-integrator, add handler code for the loadbalancer relation that will create a new Octavia LB that will balance across the k8s masters.
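
In Juju terms, step 2 would amount to something like the following (a sketch only; endpoint names are taken from the proposal above):

juju remove-application kubeapi-load-balancer
juju add-relation kubernetes-master:loadbalancer openstack-integrator:loadbalancer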

Is this an acceptable approach?

Changed in charm-kubeapi-load-balancer:
status: New → Triaged
Revision history for this message
João Pedro Seara (jpseara) wrote :

Hello,

Forwarding concerns on this one from Edward Stewart from Atos:

"1. The biggest concern is that we will hit the issue witih other charms, eg we want to run percona clusters on top of OpenStack - The percona charm also uses the ha cluster charm to provide HA and so we will hit exactly the same issue here. And any other charms that use hacluster, eg rabbitmq, vault, mariadb, galera, etc. These don't support the load-balancer interface so would still have the issue.

2. The openstack-integrator charm (IIRC) requires you to specify a single load balancer network from which the load balancers are created. From the docs, it also expects this to be a floating IP network, whereas currently it is possible to create an HA Kubernetes cluster with haproxy on a tenant network without forcing the API to be exposed as a floating IP.

So I'm wondering whether a more generic solution would be better: one that extends haproxy, or provides a subordinate to haproxy, to allocate and release the port while leaving haproxy to do the proxying. This would still need to support the use case of VIPs on either a tenant network or a floating network (depending on whether use-floating-ip is set in the model or not)."

JP

Cory Johns (johnsca)
Changed in charm-kubeapi-load-balancer:
assignee: nobody → Cory Johns (johnsca)
status: Triaged → In Progress
Changed in charm-kubeapi-load-balancer:
importance: Undecided → Wishlist
milestone: none → 1.16
Revision history for this message
Cory Johns (johnsca) wrote :

The edge version of the OpenStack Integrator charm (cs:~containers/openstack-integrator-27) now has a loadbalancer endpoint and can be used to replace the kubeapi-load-balancer with a native OpenStack load balancer using the following overlay:

applications:
  kubeapi-load-balancer: null
  openstack-integrator:
    charm: cs:~containers/openstack-integrator
    channel: edge
    num_units: 1

relations:
  - ['kubernetes-master:openstack', 'openstack-integrator']
  - ['kubernetes-worker:openstack', 'openstack-integrator']
  - ['kubernetes-master:loadbalancer', 'openstack-integrator']
  - ['kubernetes-master:kube-api-endpoint', 'kubernetes-worker:kube-api-endpoint']
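
To try it, the overlay would be applied at deploy time, e.g. (the bundle name and overlay filename below are assumptions):

juju deploy charmed-kubernetes --overlay ./openstack-lb-overlay.yaml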

I'm still working on testing this on Octavia, but I have run into some environmental issues which have been holding me up. So if someone wants to test this as well, that would be much appreciated.

Revision history for this message
Cory Johns (johnsca) wrote :

Testing on Octavia-enabled OpenStack is now complete as well.

Cory Johns (johnsca)
Changed in charm-kubeapi-load-balancer:
status: In Progress → Fix Committed
Changed in charm-kubeapi-load-balancer:
status: Fix Committed → Fix Released
Revision history for this message
Tim Van Steenburgh (tvansteenburgh) wrote :

From Ed @ Atos:

ok thanks - I've tested this and have an issue. It looks like some of the code within the openstack-integrator is trying to use the admin endpoints of OpenStack instead of the public ones. I get this error in the openstack-integrator logs when it tries to create the k8s master loadbalancer:

2019-10-04 20:12:29 DEBUG juju.worker.uniter.remotestate watcher.go:525 got a relation units change: {3 {map[kubernetes-worker/0:{0}] []}}
2019-10-04 20:13:01 DEBUG juju.worker.uniter.remotestate watcher.go:525 got a relation units change: {1 {map[kubernetes-master/1:{0}] []}}
2019-10-04 20:13:01 DEBUG juju.worker.uniter.remotestate watcher.go:525 got a relation units change: {2 {map[kubernetes-master/1:{0}] []}}
2019-10-04 20:15:42 DEBUG juju.worker.uniter.remotestate watcher.go:531 update status timer triggered
2019-10-04 20:21:18 DEBUG juju.worker.uniter.remotestate watcher.go:531 update status timer triggered
2019-10-04 20:21:52 DEBUG loadbalancer-relation-joined Unable to establish connection to https://192.168.10.200:35357/v3/domains?: HTTPSConnectionPool(host='192.168.10.200', port=35357): Max retries exceeded with url: /v3/domains (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fcdbe655160>: Failed to establish a new connection: [Errno 110] Connection timed out',))
2019-10-04 20:21:52 DEBUG worker.uniter.jujuc server.go:182 running hook tool "juju-log"
2019-10-04 20:21:52 ERROR juju-log loadbalancer:1: Error creating loadbalancer
Traceback (most recent call last):
  File "lib/charms/layer/openstack.py", line 325, in get_or_create
    lb.create()
  File "lib/charms/layer/openstack.py", line 388, in create
    sg_id = self._impl.find_secgrp(self.name)
  File "lib/charms/layer/openstack.py", line 584, in find_secgrp
    '--project', self.project_id)}
  File "lib/charms/layer/openstack.py", line 577, in project_id
    project)['id']
  File "lib/charms/layer/openstack.py", line 267, in _openstack
    output = _run_with_creds('openstack', *args, '--format=yaml')
  File "lib/charms/layer/openstack.py", line 262, in _run_with_creds
    stdout=subprocess.PIPE)
  File "/usr/lib/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '('openstack', 'project', 'show', '--domain', 'admin_domain', 'dpcop_mgmt', '--format=yaml')' returned non-zero exit status 1.

The issue is the endpoint being used: 192.168.10.200:35357.
The Keystone endpoints are as follows:
ubuntu@juju-4a107b-0-lxd-1:~$ openstack endpoint list | grep keystone
| 86307cea7d984b48871c043186393c63 | RegionOne | keystone | identity | True | public   | https://auth.ohc01.customerb.internal:5000/v3 |
| b877879836c0490695d345459367e34b | RegionOne | keystone | identity | True | internal | https://192.168.10.200:5000/v3                |
| f37fb0f679654f53bf845dd738491aeb | RegionOne | keystone | identity | True | admin    | https://192.168.10.200:35357/v3               |

The OpenStack network hosting the services can route/access the public URL but not the internal or admin ones. Shouldn't the integrator only access the public API endpoints?
thanks
Ed

Changed in charm-kubeapi-load-balancer:
status: Fix Released → Triaged
importance: Wishlist → High
milestone: 1.16 → 1.16+ck2
Changed in charm-openstack-integrator:
status: New → Triaged
importance: Undecided → High
milestone: none → 1.16+ck2
Revision history for this message
Ed Stewart (emcs2) wrote :

FYI this was using cs:~containers/openstack-integrator-28

Changed in charm-openstack-integrator:
assignee: nobody → Cory Johns (johnsca)
Revision history for this message
Cory Johns (johnsca) wrote :

The openstack-integrator charm uses the endpoint URL provided in the credentials it is given[1]. If using `juju trust`, this would be the same URL given to Juju. If you want to use `juju trust` but have the integrator charm use a different endpoint URL, you can override it by using the `auth_url` config option in addition to `juju trust`. Or, of course, if you want to use an entirely different credential for the integrator charm, you can provide that via either the `credentials` config option, or the individual config options for each field.

[1]: https://github.com/juju-solutions/charm-openstack-integrator/blob/master/lib/charms/layer/openstack.py#L240
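
For example, keeping `juju trust` but pointing the charm at the public endpoint would look roughly like this (the URL is a placeholder):

juju trust openstack-integrator
juju config openstack-integrator auth_url=https://auth.example.internal:5000/v3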

Revision history for this message
João Pedro Seara (jpseara) wrote :

@Cory Johns, I had already instructed the customer to set the auth_url to a different URL this afternoon. I am now waiting for their input. Thank you all very much for your help.

Revision history for this message
Ed Stewart (emcs2) wrote :

@Cory Johns - aware of this, and we had already passed the correct public API endpoint in our auth_url setting within the openstack-integrator juju config (and that endpoint is accessible). However, the openstack integrator still tried to call the internal admin API endpoint.

Changed in charm-kubeapi-load-balancer:
milestone: 1.16+ck2 → 1.16+ck3
Changed in charm-openstack-integrator:
milestone: 1.16+ck2 → 1.16+ck3
Revision history for this message
Edward Hope-Morley (hopem) wrote :

I have tested this for myself and can confirm that it does indeed not work, but the cause is not quite as previously suspected. In short, the problem is that the openstack-integrator performs some commands that are admin-only; therefore, regardless of what auth_url is set to in the charm/application, when it performs these API requests the openstack client will switch to the admin URL as provided by the catalog. We can see this here:

root@juju-5727c3-k8s-3:~# sqlite3 /var/lib/juju/agents/unit-openstack-integrator-0/charm/.unit-state.db 'select data from kv where key="charm.openstack.full-creds"'| jq -r '.| to_entries[]| "export os_\(.key)=\(.value)"'| sed -r 's/(export )(.+)(=.*)/\1\U\2\E\3/g' > novarc
root@juju-5727c3-k8s-3:~# sed -i -r -e 's/OS_VERSION/OS_IDENTITY_API_VERSION/g' novarc
root@juju-5727c3-k8s-3:~# cat novarc
export OS_AUTH_URL=http://10.100.0.74:5000/v3
export OS_REGION=RegionOne
export OS_USERNAME=admin
export OS_PASSWORD=openstack
export OS_USER_DOMAIN_NAME=admin_domain
export OS_PROJECT_DOMAIN_NAME=admin_domain
export OS_PROJECT_NAME=admin
export OS_ENDPOINT_TLS_CA=
export OS_IDENTITY_API_VERSION=3
root@juju-5727c3-k8s-3:~# source novarc
root@juju-5727c3-k8s-3:~# openstack project show admin --domain admin_domain
Unable to establish connection to http://10.0.0.183:35357/v3/domains?: HTTPConnectionPool(host='10.0.0.183', port=35357): Max retries exceeded with url: /v3/domains (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f3344a84748>: Failed to establish a new connection: [Errno 111] Connection refused',))

So the openstack-integrator needs a way to stop relying on admin-only operations.
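
As a quick way to confirm which identity endpoints those credentials resolve to (assuming the same novarc as above), something like:

source novarc
openstack catalog show identity

should list the public, internal and admin (port 35357) URLs that the client can fall back to.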

Changed in charm-kubeapi-load-balancer:
status: Triaged → In Progress
Changed in charm-openstack-integrator:
status: Triaged → In Progress
Revision history for this message
Edward Hope-Morley (hopem) wrote :

If I inject my project ID:

project_id=a51de420811c4a40aa76c6fd8719ba5d
application=openstack-integrator
sqlite3 /var/lib/juju/agents/unit-${application}-*/charm/.unit-state.db "update kv set data = '\"$project_id\"' where key='project_id';"

and make the following change to the code:

diff --git a/lib/charms/layer/openstack.py b/lib/charms/layer/openstack.py
index 8520013..17aebea 100644
--- a/lib/charms/layer/openstack.py
+++ b/lib/charms/layer/openstack.py
@@ -580,8 +580,7 @@ class BaseLBImpl:
 
     def find_secgrp(self, name):
         secgrps = {sg['Name']: sg
-                   for sg in _openstack('security', 'group', 'list',
-                                        '--project', self.project_id)}
+                   for sg in _openstack('security', 'group', 'list')}
         return secgrps.get(name, {}).get('ID')
 
     def create_secgrp(self, name):

Then everything works fine.

Revision history for this message
Edward Hope-Morley (hopem) wrote :

I think the fact that the charm assumes it will run as admin is an error in the first place. As far as I understand, the universal use case for this charm is as a non-admin, i.e. many non-admin tenants, each with their own k8s, using the openstack-integrator to drive their own tenant's account/resources.

Revision history for this message
Edward Hope-Morley (hopem) wrote :

I have tested cs:~containers/openstack-integrator-32 from channel:edge, which contains the above patch, and it works for me, i.e. the charm is able to set up load balancers as a non-admin tenant.

Revision history for this message
Kevin W Monroe (kwmonroe) wrote :

This is in the stable channel at rev 36:

https://jaas.ai/u/containers/openstack-integrator/36

Changed in charm-openstack-integrator:
status: In Progress → Fix Released
Changed in charm-kubeapi-load-balancer:
status: In Progress → Fix Released
Changed in charm-kubeapi-load-balancer:
milestone: 1.16+ck3 → 1.16+ck2
Changed in charm-openstack-integrator:
milestone: 1.16+ck3 → 1.16+ck2