Juju secret doesn't exist in cross-cloud relation

Bug #2046484 reported by Marcelo Henrique Neppel
Affects         Status        Importance  Assigned to  Milestone
Canonical Juju  Fix Released  High        Ian Booth
3.3             Fix Released  High        Ian Booth

Bug Description

Hi Juju team!

When testing a cross-cloud relation between two charms, I receive an error telling me that the secret shared by one side doesn't exist for the other side to consume.

The following setup uses Istio to connect two K8S clusters (one in AKS and the other in GKE): https://docs.google.com/document/d/13IopQX1YdlzY-tF-cMSOusMzuNkx5p65Tds4b1qturA/edit#bookmark=id.j1vl4tq0koep

Locally, with two K8S clusters spawned by Kind (Kubernetes in Docker), I could use a hacky workaround: create some roles and grant access to the secret that was offloaded to a namespace in the consumer cluster with the same name as the namespace where the secret was created in the first cluster. However, in the AKS/GKE case, even with a workaround I still get an error telling me that the secret doesn't exist.

I could also test a cross-cloud relation between VM controllers bootstrapped in EC2 and GCE; secrets worked there.

Using the above document, you can bootstrap the environment and deploy the secrets demo charms from https://github.com/PietroPasotti/secrets-demo-charms, then create a cross-cloud relation between them.

juju switch aks:dev1
juju deploy ./owner_ubuntu-20.04-amd64.charm
juju offer owner:secret_id secret-id

juju switch gke:dev
juju deploy ./holder_ubuntu-20.04-amd64.charm
juju consume aks:admin/dev1.secret-id
juju relate holder secret-id

Then you can see the following errors:

unit-holder-0: 16:24:39 INFO juju.worker.uniter.operation ran "secret_id-relation-joined" hook (via hook dispatching script: dispatch)
unit-holder-0: 16:24:40 ERROR unit.holder/0.juju-log secret_id:6: Uncaught exception while in charm code:
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/model.py", line 2564, in _run
    result = run(args, **kwargs)
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '('/var/lib/juju/tools/unit-holder-0/secret-get', 'secret://548fc04a-1364-4f8f-875c-f9f39c7bed38/cltlctmrg8jc758htor0', '--format=json')' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/model.py", line 282, in get_secret
    content = self._backend.secret_get(id=id, label=label)
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/model.py", line 2920, in secret_get
    result = self._run('secret-get', *args, return_output=True, use_json=True)
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/model.py", line 2570, in _run
    raise ModelError(e.stderr)
ops.model.ModelError: ERROR getting cluster client: unable to determine legacy status for namespace "dev1": namespaces "dev1" is forbidden: User "system:serviceaccount:dev:holder" cannot get resource "namespaces" in API group "" in the namespace "dev1"

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/model.py", line 2564, in _run
    result = run(args, **kwargs)
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '('/var/lib/juju/tools/unit-holder-0/secret-info-get', 'secret://548fc04a-1364-4f8f-875c-f9f39c7bed38/cltlctmrg8jc758htor0', '--format=json')' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/model.py", line 2930, in _run_for_secret
    return self._run(*args, return_output=return_output, use_json=use_json)
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/model.py", line 2570, in _run
    raise ModelError(e.stderr)
ops.model.ModelError: ERROR secret "cltlctmrg8jc758htor0" not found

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "./src/charm.py", line 83, in <module>
    main(ConsumerCharm)
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/main.py", line 435, in main
    _emit_charm_event(charm, dispatcher.event_name)
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/main.py", line 144, in _emit_charm_event
    event_to_emit.emit(*args, **kwargs)
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/framework.py", line 355, in emit
    framework._emit(event) # noqa
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/framework.py", line 824, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/framework.py", line 899, in _reemit
    custom_handler(event)
  File "./src/charm.py", line 62, in _on_update_status
    secret = self._obtain_secret()
  File "./src/charm.py", line 35, in _obtain_secret
    return self.model.get_secret(id=secret_id)
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/model.py", line 287, in get_secret
    info = self._backend.secret_info_get(id=id, label=label)
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/model.py", line 2944, in secret_info_get
    result = self._run_for_secret('secret-info-get', *args, return_output=True, use_json=True)
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/model.py", line 2933, in _run_for_secret
    raise SecretNotFoundError() from e
ops.model.SecretNotFoundError
unit-holder-0: 16:24:40 ERROR juju.worker.uniter.operation hook "secret_id-relation-changed" (via hook dispatching script: dispatch) failed: exit status 1
unit-holder-0: 16:24:40 INFO juju.worker.uniter awaiting error resolution for "relation-changed" hook

Is something missing in the cross-cloud configuration to make secrets work?

Revision history for this message
Ian Booth (wallyworld) wrote :

What version of juju are you using?

The root cause looks to be this error:

ops.model.ModelError: ERROR getting cluster client: unable to determine legacy status for namespace "dev1": namespaces "dev1" is forbidden: User "system:serviceaccount:dev:holder" cannot get resource "namespaces" in API group "" in the namespace "dev1"

From memory, at one point there was a missing role binding, but that should have been fixed.
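
A quick way to check whether the consuming unit's service account has the access the error complains about is something like the following, run against the consumer cluster (names taken from the error message above; adjust for your setup):

kubectl auth can-i get namespaces --namespace dev1 --as=system:serviceaccount:dev:holder
kubectl get rolebindings,clusterrolebindings --all-namespaces | grep holder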

Changed in juju:
status: New → Incomplete
Revision history for this message
Marcelo Henrique Neppel (neppel) wrote :

I'm currently using 3.1.6.

Revision history for this message
Marcelo Henrique Neppel (neppel) wrote :

I tried to hack a bit with the rolebinding using the following commands (the CTX_CLUSTER2 environment variable points to the cluster with the model that consumes the secret).

kubectl --context="${CTX_CLUSTER2}" create namespace dev1

kubectl --context="${CTX_CLUSTER2}" create clusterrole holder --verb get,list --resource secret --resource namespace

kubectl --context="${CTX_CLUSTER2}" create rolebinding holder -n dev1 --serviceaccount dev:holder --clusterrole holder

But I still got some errors about the secret not being found.

unit-holder-0: 10:34:43 ERROR unit.holder/0.juju-log secret_id:10: Uncaught exception while in charm code:
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/model.py", line 2564, in _run
    result = run(args, **kwargs)
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '('/var/lib/juju/tools/unit-holder-0/secret-get', 'secret://548fc04a-1364-4f8f-875c-f9f39c7bed38/clu5bqurg8jc758htov0', '--format=json')' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/model.py", line 2920, in secret_get
    result = self._run('secret-get', *args, return_output=True, use_json=True)
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/model.py", line 2570, in _run
    raise ModelError(e.stderr)
ops.model.ModelError: ERROR secret "clu5bqurg8jc758htov0-1" not found

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/model.py", line 282, in get_secret
    content = self._backend.secret_get(id=id, label=label)
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/model.py", line 2923, in secret_get
    raise SecretNotFoundError() from e
ops.model.SecretNotFoundError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/model.py", line 2564, in _run
    result = run(args, **kwargs)
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '('/var/lib/juju/tools/unit-holder-0/secret-info-get', 'secret://548fc04a-1364-4f8f-875c-f9f39c7bed38/clu5bqurg8jc758htov0', '--format=json')' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/model.py", line 2930, in _run_for_secret
    return self._run(*args, return_output=return_output, use_json=use_json)
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/model.py", line 2570, in _run
    raise ModelError(e.stderr)
ops.model.ModelError: ERROR secret "clu5bqurg8jc758htov0" not found

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "./src/charm.py", line 83, in <module>
    main(Cons...


Revision history for this message
Paulo Machado (paulomachado) wrote :

It's possible to reproduce this on the same k8s controller, through a cross-model relation.

Revision history for this message
Pedro Guimarães (pguimaraes) wrote :

I've moved this back to "New" since Marcelo and Paulo provided additional information.

Changed in juju:
status: Incomplete → New
Revision history for this message
Paulo Machado (paulomachado) wrote (last edit ):

Retested with juju-3.1.7/microk8s-1.28.3 and Pietro's secrets demo charms, with the same outcome.

For quick reproduction in a bootstrapped controller, verbatim steps are:

# Owner
git clone https://github.com/PietroPasotti/secrets-demo-charms
cd secrets-demo-charms/owner
charmcraft pack
juju add-model owner
juju deploy ./*.charm
juju offer owner:secret_id

# Holder
cd ../holder
charmcraft pack
juju add-model holder
juju deploy ./*.charm
juju consume owner.owner
juju relate owner holder:secret_id

# edited to fix charmcraft call

Revision history for this message
Ian Booth (wallyworld) wrote :

I tried the above steps and they worked perfectly.

As expected, the holder unit is updated with the secret content:

App     Version  Status  Scale  Charm   Channel  Rev  Address         Exposed  Message
holder           active  1      holder           0    10.152.183.222  no       admin/admin

And I can manually get the secret:

$ juju exec --unit holder/0 -- secret-get secret://535ceafd-33e0-418d-8525-a28ec72949a5/cmflh89bauss7b4klmcg
password: admin
username: admin

No errors in the logs.

This is on microk8s 1.28.3.

The only thing I can think of is that maybe the owner charm's "grant" config has been set to false?
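
If so, something like this should show and reset it (assuming the demo owner charm really does expose a boolean option literally named "grant"; adjust if the option name differs):

juju config owner grant
juju config owner grant=true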

Changed in juju:
status: New → Incomplete
Revision history for this message
Ian Booth (wallyworld) wrote :

Something to check: can you access the secret using juju show-secret? You could also double-check that the permission is granted - in 3.1 you'd need to look at the permissions collection.
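
For example, run against the model that owns the secret (the ID here is the one from the original traceback; substitute the relevant secret ID):

juju show-secret cltlctmrg8jc758htor0 --reveal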

Revision history for this message
Paulo Machado (paulomachado) wrote (last edit ):

Hi Ian, I was puzzled by this, but after talking with Marcelo, I think he nailed it.

Is your microk8s setup running with rbac enabled?

I just retested with rbac disabled and it indeed works perfectly.
Unfortunately, we cannot disable rbac in our deployments.
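
For reference, this is roughly how we checked and toggled it (microk8s addon commands; other distributions will differ):

microk8s status | grep rbac
microk8s disable rbac    # confirms the behaviour, but not an option for us
microk8s enable rbac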

Changed in juju:
status: Incomplete → New
Revision history for this message
Ian Booth (wallyworld) wrote (last edit ):

Ah thank you, that was the missing piece.

The root cause might not be related to secrets as such: there's a role/role binding that's needed but missing. This will probably affect other parts of Juju as well.

--

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/model.py", line 282, in get_secret
    content = self._backend.secret_get(id=id, label=label)
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/model.py", line 2920, in secret_get
    result = self._run('secret-get', *args, return_output=True, use_json=True)
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/model.py", line 2570, in _run
    raise ModelError(e.stderr)
ops.model.ModelError: ERROR getting cluster client: finding controller namespace with non legacy labels: namespaces is forbidden: User "system:serviceaccount:holder:holder" cannot list resource "namespaces" in API group "" at the cluster scope

Changed in juju:
milestone: none → 3.1.8
importance: Undecided → High
status: New → Triaged
Revision history for this message
Ian Booth (wallyworld) wrote :

The error in the previous comment was due to a different problem with the app's cluster role rules; this still needs to be fixed.

What looks to be happening is that the controller service which handles cross-model secrets is failing to create the cluster role attenuated to the viewable secrets. If this is done by hand, things start to work, so fixing that should address the problem.
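
Roughly, doing it by hand amounts to creating a role scoped to the backing secret object and binding it to the consuming unit's service account, along these lines in the single-controller repro (the role name is illustrative, the service account is the one from the error above, and the backing object name, apparently of the form <secret-id>-<revision>, needs to be substituted; the real fix is for the controller to create this itself):

kubectl create clusterrole holder-xmodel-secrets --verb=get,list --resource=secrets --resource-name=<secret-id>-<revision>
kubectl create clusterrolebinding holder-xmodel-secrets --clusterrole=holder-xmodel-secrets --serviceaccount=holder:holder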

Ian Booth (wallyworld)
Changed in juju:
assignee: nobody → Ian Booth (wallyworld)
Revision history for this message
Ian Booth (wallyworld) wrote :
Changed in juju:
status: Triaged → In Progress
Ian Booth (wallyworld)
Changed in juju:
status: In Progress → Fix Committed
Revision history for this message
Paulo Machado (paulomachado) wrote :

Thank you!

Revision history for this message
Marcelo Henrique Neppel (neppel) wrote :

Thank you, Ian!

Revision history for this message
Marcelo Henrique Neppel (neppel) wrote :

Hi Ian, I tested the fix using the 3.1/edge channel version and an OCI image for jujud-operator built locally from the 3.1 branch.

It worked for the cross-model relation between the owner and the holder charms residing in two different models in the same controller. I also tested with two models in two separate controllers from the same microk8s installation.

However, when testing the setup from the description of this bug (a cross-cloud relation using AKS and GKE, which were connected through Istio), I got the following error just after relating the charms:

unit-holder-0: 17:24:44 ERROR unit.holder/0.juju-log secret_id:0: Uncaught exception while in charm code:
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/model.py", line 2564, in _run
    result = run(args, **kwargs)
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '('/var/lib/juju/tools/unit-holder-0/secret-get', 'secret://f05df6b7-5a33-492e-8b4f-ffac782ed231/cmkoh6qn4cfs75fdl500', '--format=json')' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/model.py", line 282, in get_secret
    content = self._backend.secret_get(id=id, label=label)
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/model.py", line 2920, in secret_get
    result = self._run('secret-get', *args, return_output=True, use_json=True)
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/model.py", line 2570, in _run
    raise ModelError(e.stderr)
ops.model.ModelError: ERROR getting cluster client: annotations map[controller.juju.is/id:68f9f46a-1325-4b01-821e-b604eb5e9a7e model.juju.is/id:674895e7-0593-4e33-8487-560b02640a63] for namespace "dev" not valid must include map[controller.juju.is/id:604d769c-fc90-4556-84d2-7d2a96064094 model.juju.is/id:f05df6b7-5a33-492e-8b4f-ffac782ed231]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/model.py", line 2564, in _run
    result = run(args, **kwargs)
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '('/var/lib/juju/tools/unit-holder-0/secret-info-get', 'secret://f05df6b7-5a33-492e-8b4f-ffac782ed231/cmkoh6qn4cfs75fdl500', '--format=json')' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/model.py", line 2930, in _run_for_secret
    return self._run(*args, return_output=return_output, use_json=use_json)
  File "/var/lib/juju/agents/unit-holder-0/charm/venv/ops/model.py", line 2570, in _run
    raise ModelError(e.stderr)
ops.model.ModelError: ERROR secret "cmkoh6qn4cfs75fdl500" not found

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "./src/charm.py", lin...


Revision history for this message
Ian Booth (wallyworld) wrote (last edit ):

Interesting. I tested on microk8s and, like for you, it seemed to be OK. Specifically, the root cause of the issue (missing role rules to allow a service account token handed to the unit agent of the consuming unit to read the secret) was addressed in the patch which landed.

We can take another look and see what else might need fixing. Since the root cause was fairly obvious, I was expecting the fix to work everywhere.

Revision history for this message
Marcelo Henrique Neppel (neppel) wrote :

I have an environment where you can test it if you want (with Istio connecting both clusters). Please let me know if you want access to it.

tags: added: canonical-data-platform-eng
Revision history for this message
Ian Booth (wallyworld) wrote (last edit ):

I have found the root cause. Because it's a different issue, I've raised a new bug for it.

https://bugs.launchpad.net/juju/+bug/2051109

Note that I was not able to reproduce this error at all:

ops.model.ModelError: ERROR getting cluster client: annotations map[controller.juju.is/id:68f9f46a-1325-4b01-821e-b604eb5e9a7e model.juju.is/id:674895e7-0593-4e33-8487-560b02640a63] for namespace "dev" not valid must include map[controller.juju.is/id:604d769c-fc90-4556-84d2-7d2a96064094 model.juju.is/id:f05df6b7-5a33-492e-8b4f-ffac782ed231]

But I was able to make the demo "owner" and "holder" charms work properly in a cross-model deployment on separate GKE and AKS clusters by fixing the k8s API address issue described in the new bug.

Changed in juju:
status: Fix Committed → Fix Released