Unable to add a gke or eks cluster to Juju > 3.0

Bug #2007575 reported by Guillaume Belanger
This bug affects 6 people
Affects: Canonical Juju
Status: Incomplete
Importance: Medium
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

## Description

I am unable to add a GKE cluster to Juju 3.1; `juju add-k8s` fails with the following error.

```
juju add-k8s gke-feb-15
ERROR making juju admin credentials in cluster: ensuring cluster role "juju-credential-4da1a681" in namespace "kube-system": Get "https://35.238.75.225/apis/rbac.authorization.k8s.io/v1/clusterroles/juju-credential-4da1a681": getting credentials: exec: executable gke-gcloud-auth-plugin not found

It looks like you are trying to use a client-go credential plugin that is not installed.

To learn more about this feature, consult the documentation available at:
      https://kubernetes.io/docs/reference/access-authn-authz/authentication/#client-go-credential-plugins

Install gke-gcloud-auth-plugin for use with kubectl by following https://cloud.google.com/blog/products/containers-kubernetes/kubectl-auth-changes-in-gke
```

I do have the `gke-gcloud-auth-plugin` installed
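
One quick way to tell "plugin not installed" apart from "plugin not visible to Juju" is to check where the plugin resolves on `PATH`. The sketch below is only an illustration: `find_auth_plugin` is a hypothetical helper name, and the default plugin name is taken from the error message above.

```shell
# Hypothetical helper: report where (or whether) a client-go credential
# plugin resolves on PATH in the current environment.
# The default name is the executable named in the error message.
find_auth_plugin() {
  plugin="${1:-gke-gcloud-auth-plugin}"
  if plugin_path="$(command -v "$plugin")"; then
    printf 'found: %s\n' "$plugin_path"
  else
    printf 'not found on PATH: %s\n' "$plugin" >&2
    return 1
  fi
}
```

Running `find_auth_plugin` in a normal shell shows what kubectl would see. Later comments in this report indicate that the Juju 3.x snap runs under confinement, so the same check run from inside the snap's environment (for example in a shell opened with `snap run --shell juju`, assuming that is available on your system) may give a different answer.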

Notes:
- I just tried with Juju 2.9 and it worked fine.

## Environment
- Juju snap: 3.1/stable
- GKE Version: 1.24.9-gke.2000
- gke-gcloud-auth-plugin: Kubernetes v1.25.2-alpha+ae91c1fc0c443c464a4c878ffa2a4544483c6d1f

Juan M. Tirado (tiradojm) wrote :

We're already aware of this and working to fix it. Thanks for reporting.

Changed in juju:
assignee: nobody → Vitaly Antonenko (anvial)
importance: Undecided → Medium
status: New → Triaged
tags: added: gke k8s
Changed in juju:
milestone: none → 3.1.2
Changed in juju:
milestone: 3.1.2 → none
Vitaly Antonenko (anvial) wrote :

The problem runs deeper than expected.

To resolve it, our cloud provider partners would need to publish strictly confined snaps of their CLI tools, because the Juju team currently has no plans to maintain snaps of cloud CLI tools ourselves.

One possible solution is to introduce "mini-binaries" limited to the functionality needed to operate on the k8s cluster (a GKE cluster in this particular case). Either way, that would need to be a roadmap item at a minimum.

So I'll leave this bug without a milestone.

John A Meinel (jameinel) wrote :

Since access to the cloud is, in the long run, driven by the controller and not by the client, we should dig a little deeper to understand what we need, and whether users can simply trigger the request themselves and then reconfigure the cloud definition to do the right thing. (We don't install the aks/eks tooling on the controllers, so any tokens we use to control the k8s cloud can be handled controller-side once it is set up.)

Changed in juju:
assignee: Vitaly Antonenko (anvial) → nobody
John A Meinel (jameinel) wrote :

IOW, there may be a bootstrap/add-cloud issue, but once that has been completed, the expectation is that a normal 'juju' client can interoperate with the cloud just fine from then onward.

Paulo Machado (paulomachado) wrote :

Rebroadcast of the current workaround from public MM, by jameinel:

You should be able to download a juju binary directly from:
https://launchpad.net/juju/+download

(e.g. https://launchpad.net/juju/3.3/3.3.0/+download/juju-3.3.0-linux-amd64.tar.xz)

and run juju bootstrap using that binary.
Alternatively, you can likely access the binary inside the snap (without confinement) using:

/snap/juju/current/bin/juju bootstrap
(note that /snap/bin/juju is the wrapper that sets up confinement and then executes the above binary)

Once bootstrapped, the controller should have a service account and be able to control the k8s cluster, and your juju client should be able to talk to that api.
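
The snap-based option above can be wrapped in a small shell function, so that only the commands that need host tools bypass the confinement wrapper. This is a sketch and not something Juju ships: `juju_unconfined` and the `JUJU_UNCONFINED_BIN` override are made-up names, and the default path is the one quoted above.

```shell
# Hypothetical wrapper: run the Juju binary that sits behind the snap's
# confinement wrapper. JUJU_UNCONFINED_BIN is an invented override (not a
# Juju setting), e.g. for pointing at a binary downloaded from Launchpad.
juju_unconfined() {
  bin="${JUJU_UNCONFINED_BIN:-/snap/juju/current/bin/juju}"
  if [ ! -x "$bin" ]; then
    printf 'no executable juju binary at %s\n' "$bin" >&2
    return 127
  fi
  # Pass all arguments through unchanged.
  "$bin" "$@"
}
```

With this defined, `juju_unconfined add-k8s gke-feb-15` would run add-k8s through the unconfined binary, while plain `juju` remains the confined snap command for everything else.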

John A Meinel (jameinel)
summary: - Unable to add a gke cluster to Juju 3.1
+ Unable to add a gke or eks cluster to Juju > 3.0
Peter Jose De Sousa (pjds) wrote :

Further to this issue, @Barteus and team have reported other problems:

barteus@barteus-xps:~$ /snap/juju/current/bin/juju add-k8s --cloud google hackaton-cluster1
ERROR making juju admin credentials in cluster: ensuring service account "juju-credential-64f26484" in namespace "kube-system": serviceaccounts is forbidden: User "<email address hidden>" cannot create resource "serviceaccounts" in API group "" in the namespace "kube-system": GKE Warden authz [denied by managed-namespaces-limitation]: the namespace "kube-system" is managed and the request's verb "create" is denied
barteus@barteus-xps:~$ kubectl get service account -A
error: a resource cannot be retrieved by name across all namespaces
barteus@barteus-xps:~$ kubectl get serviceaccount -A
NAMESPACE                  NAME                                 SECRETS   AGE
default                    default                              0         43h
gke-gmp-system             collector                            0         43h
gke-gmp-system             default                              0         43h
gke-gmp-system             operator                             0         43h
gke-managed-filestorecsi   default                              0         43h
gmp-public                 default                              0         43h
kube-node-lease            default                              0         43h
kube-public                default                              0         43h
kube-system                antrea-agent                         0         43h
kube-system                antrea-controller                    0         43h
kube-system                antrea-cpha                          0         43h
kube-system                attachdetach-controller              0         43h
kube-system                certificate-controller               0         43h
kube-system                cilium-win                           0         43h
kube-system                cloud-provider                       0         43h
kube-system                clouddns                             0         43h
kube-system                clusterrole-aggregation-controller   0         43h
kube-system                cronjob-controller                   0         43h
kube-system                daemon-set-controller                0         43h
kube-system                default                              0         43h
kube-system                deployment-controller                0         43h
kube-system                disruption-controller                0         43h
kube-system                egress-nat-controller                0         43h
kube-system                endpoint-controller                  0         43h
kube-system                endpointslice-controller             0         43h
kube-system                endpointslicemirroring-controller    0         43h
kube-system                ephemeral-volume-controller          0         43h
kube-system                event-exporter-sa                    0         43h
kube-system                expand-controller                    0         43h
kube-system                filestorecsi-node-sa ...


tags: added: aiml-hackathon
tags: added: canonical-data-platform-eng
Ian Booth (wallyworld) wrote :

I tested this using a locally built version of the juju CLI. I was using a colleague's account to debug another issue, and my kubeconfig was set up to point to his GKE cluster.

I could use add-k8s to register access to that cluster.

$ kubectl config get-contexts
* gke_neppel-k8s-dev_europe-west1-c_ubuntu-21713 gke_neppel-k8s-dev_europe-west1-c_ubuntu-21713 gke_neppel-k8s-dev_europe-west1-c_ubuntu-21713

$ juju add-k8s gketest
This operation can be applied to both a copy on this client and to the one on a controller.
No current controller was detected and there are no registered controllers on this client: either bootstrap one or register one.

k8s substrate "gce/europe-west1" added as cloud "gketest".
You can now bootstrap to this cloud by running 'juju bootstrap gketest'.

--

This looks like a GKE cluster setup issue, related to Autopilot being enabled. People seem to be complaining about this issue independently of Juju, e.g.

https://github.com/argoproj/argo-cd/issues/13054

It seems plausible to me that if you have configured your GKE cluster to let Google manage your configuration for you by enabling Autopilot, it may well deny external parties permission to perform certain operations. Juju needs to create a system service account to use to delegate access to the cluster. Can you retry with Autopilot turned off?
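
Before changing the cluster configuration, it may help to ask the API server directly whether the current user is allowed to create service accounts in kube-system, since that is the operation denied in the error above. A minimal sketch, assuming kubectl points at the same context Juju uses: `can_create_kube_system_sa` and the `KUBECTL` override are hypothetical names, and it is an assumption that the GKE Warden denial shows up in this check.

```shell
# Hypothetical helper: ask the cluster whether the current user may create
# service accounts in kube-system. `kubectl auth can-i` prints yes or no and
# sets its exit status accordingly.
# KUBECTL is an invented override for selecting a different kubectl binary.
can_create_kube_system_sa() {
  "${KUBECTL:-kubectl}" auth can-i create serviceaccounts --namespace kube-system
}
```

If `can_create_kube_system_sa` reports no, add-k8s would be expected to fail in the same way regardless of which juju binary is used.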

Changed in juju:
status: Triaged → Incomplete
Alex Lutay (taurus) wrote :

Ian, I am happy to troubleshoot this together with you, as it is still an issue:

> juju add-k8s gke
ERROR making juju admin credentials in cluster: ensuring cluster role "juju-credential-1234567" in namespace "kube-system": Get "https://x.x.x.x/apis/rbac.authorization.k8s.io/v1/clusterroles/juju-credential-1234567": getting credentials: exec: executable gke-gcloud-auth-plugin not found

It looks like you are trying to use a client-go credential plugin that is not installed.

To learn more about this feature, consult the documentation available at:
      https://kubernetes.io/docs/reference/access-authn-authz/authentication/#client-go-credential-plugins

Install gke-gcloud-auth-plugin for use with kubectl by following https://cloud.google.com/kubernetes-engine/docs/how-to/cluster-access-for-kubectl#install_plugin

I believe it is not related to Autopilot.
My steps to reproduce are here: https://charmhub.io/postgresql-k8s/docs/h-deploy-gke

Alex Lutay (taurus) wrote :

As Paulo wrote above, the workaround is to run Juju outside snap confinement to add the cloud:

> /snap/juju/current/bin/juju add-k8s ...

Tried it with Ian in Madrid and it worked well.

Adding the Go binary gke-gcloud-auth-plugin to the Juju snap could be a final solution that closes this ticket.
Thank you!

Ian Booth (wallyworld) wrote :

I looked at using the gke-gcloud-auth-plugin (https://github.com/kubernetes/cloud-provider-gcp/tree/master/cmd/gke-gcloud-auth-plugin), which is a static Go binary and could therefore, in theory, be included in the snap to do the work of creating the access token. Sadly, it turns out that this binary is just a fancy wrapper which still calls out to gcloud to do the actual work (which is what juju itself does in 2.9 in the classic snap). I built a strict snap with the gke plugin included and confirmed that it fails because it tries to call out to gcloud.

There's also a third-party standalone binary that someone has written:
https://github.com/traviswt/gke-auth-plugin

Sadly, the initial experiment with this plugin failed:
executable gke-gcloud-auth-plugin failed with exit code 1

More work would be needed to get debug output and see what's wrong. Third-party libraries like this also often need continual updates to keep up with upstream API changes. The root cause might be that the plugin tries to access credential files outside the sandbox.

BTW, the lack of a standalone auth plugin is a much-complained-about issue on various forums, so it's not just us.

For now, we are probably just going to have to ensure the Juju CLI emits a very clear message to instruct users about running the unconfined juju cli when running add-k8s.
