juju azure auth stopped working

Bug #1735402 reported by Jacek Nykis
This bug affects 9 people
Affects: Canonical Juju
Status: Fix Released
Importance: High
Assigned to: Anastasia

Bug Description

I have an environment where I wanted to add new unit but I'm getting 401 errors:

$ juju add-unit --debug myservice
11:19:12 INFO juju.cmd supercommand.go:63 running juju [2.2.6 gc go1.8]
11:19:12 DEBUG juju.cmd supercommand.go:64 args: []string{"juju", "add-unit", "--debug", "myservice"}
11:19:12 INFO juju.juju api.go:67 connecting to API addresses: [1.2.3.4:17070 192.168.16.4:17070]
11:19:13 DEBUG juju.api apiclient.go:863 successfully dialed "wss://1.2.3.4:17070/model/<uuid>/api"
11:19:13 INFO juju.api apiclient.go:617 connection established to "wss://1.2.3.4:17070/model/<uuid>/api"
11:19:14 DEBUG juju.api monitor.go:35 RPC connection died
ERROR cannot add unit 1/1 to application "myservice": cannot add unit to application "myservice": getting instance types: listing VM sizes: autorest#WithErrorUnlessStatusCode: POST https://login.windows.net/<uuid2>/oauth2/token?api-version=1.0 failed with 401 Unauthorized: StatusCode=401
11:19:14 DEBUG cmd supercommand.go:459 error stack:
cannot add unit 1/1 to application "myservice": cannot add unit to application "myservice": getting instance types: listing VM sizes: autorest#WithErrorUnlessStatusCode: POST https://login.windows.net/<uuid2>/oauth2/token?api-version=1.0 failed with 401 Unauthorized: StatusCode=401
github.com/juju/juju/rpc/client.go:149:
github.com/juju/juju/api/apiclient.go:944:

I reran "juju add-credential azure" in the controller and workload models, but that did not help.

I also wanted to upgrade juju since I'm not on the latest version but that's also blocked by the same problem:

$ juju upgrade-juju
no prepackaged tools available, using local agent binary 2.2.6.1
CRITICAL ********** SetModelAgentVersion: 2.2.6.1 false
ERROR cannot make API call to provider: autorest#WithErrorUnlessStatusCode: POST https://login.windows.net/<uuid>/oauth2/token?api-version=1.0 failed with 401 Unauthorized: StatusCode=401

My juju CLI version is 2.2.6-trusty-amd64 and the controller model is on 2.2.3

Tags: canonical-is
Jacek Nykis (jacekn)
tags: added: canonical-is
Nicholas Skaggs (nskaggs) wrote :

Juju is returning the cloud provider error. Given this worked in the past, I would suspect an issue with azure or your credentials.

Changed in juju:
status: New → Invalid
status: Invalid → Incomplete
Laurent Sesquès (sajoupa) wrote :

I had the same issue on the same environment, but with add-machine.

This is a bit off-topic, but is it normal that juju would try to contact the provider to manually add a machine?

After Jacek's actions, I did a 'juju update-credential' just in case.
Then tried another add-machine, and it worked.
I compared the credentials before and after, they were still the same.

So, either Azure had an issue, or juju did and update-credential solved it.

Anastasia (anastasia-macmood) wrote :

@Laurent Sesques (sajoupa),

How did you 'compare' credentials?

'juju update-credential' updates the credential stored on the controller. However, we do not have the command for you to see it. The commands that we do have only look at your local credential store.

In other words, the only way to see the effect of 'update-credential' is to get a working Juju back - like the ability to add machines, etc. :D

Changed in juju:
status: Incomplete → Invalid
Paul Gear (paulgear) wrote :

I'm seeing this on attempting a juju upgrade-juju also:

$ juju upgrade-juju --agent-version=2.2.6
CRITICAL ********** SetModelAgentVersion: 2.2.6 false
ERROR cannot make API call to provider: autorest#WithErrorUnlessStatusCode: POST https://login.windows.net/ab6fb1fe-3542-44aa-97bf-5e7040a00b90/oauth2/token?api-version=1.0 failed with 401 Unauthorized: StatusCode=401

This is an environment which has been working fine for many months. Is it possible that Azure credentials can expire in ways which don't affect credentials for other providers?

Changed in juju:
status: Invalid → New
description: updated
John A Meinel (jameinel) wrote :

I wonder if something else changed in this process.
I know there have been changes to things like Azure Resource Manager support vs the old Azure APIs. It's also possible that we're now using a new API that we didn't use as much in the past (storage comes to mind), which means that a token that used to have enough permissions to do everything now also needs permissions to access a different API.
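
One rough way to tell a permissions problem from a rejected credential is to look at which request failed. In the errors quoted in this report, the failing request is the POST to the oauth2/token endpoint, which suggests the token request itself was refused rather than a later API call. A minimal Python sketch of that check follows; the function name and the category labels are made up for illustration and are not part of Juju.

```python
import re

# Matches the shape of the error text quoted in this report, e.g.
# "POST https://login.windows.net/<id>/oauth2/token?api-version=1.0 failed with 401 Unauthorized"
_FAILED_CALL = re.compile(
    r"(?P<method>[A-Z]+) (?P<url>https://\S+) failed with (?P<status>\d{3})"
)


def classify_provider_error(message):
    """Return a rough, hypothetical classification of a provider error message.

    "token-request-rejected": the OAuth token request itself got a 401,
        which points at the credential rather than at API permissions.
    "api-call-rejected": some other call returned 401 or 403.
    "unknown": anything else.
    """
    match = _FAILED_CALL.search(message)
    if match is None:
        return "unknown"
    status = int(match.group("status"))
    url = match.group("url")
    if status == 401 and "/oauth2/token" in url:
        return "token-request-rejected"
    if status in (401, 403):
        return "api-call-rejected"
    return "unknown"
```

This only inspects the text of the message, so it is a hint for triage and not a diagnosis.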

Changed in juju:
importance: Undecided → High
status: New → Incomplete
John A Meinel (jameinel) wrote :

@axwalk any ideas here?

David Lawson (deej) wrote :

I'm seeing this while trying to upgrade juju in the Azure JaaS environments, which are otherwise fully functional. Is there debugging we can collect to help diagnose this? Obviously it's blocking a juju upgrade we'd like to perform. I've tried with the latest published client, 2.3.1.

Changed in juju:
status: Incomplete → New
Anastasia (anastasia-macmood) wrote :

How did you determine that this environment is "fully functional"?

If the problem is indeed with this credential, you would not know that it has expired until you do something that requires it like upgrade-juju or add-machine or similar...

So if you cannot upgrade juju or add a machine, you'd need to:
(1) get a new credential (they expire after some period of time, say a year... you'd know best when this one does);
(2) update the controller's copy of the credential using the update-credential command (at the moment we have no means to see what credential the controller has, unless you want to examine the database. Most of the CRUD credential commands look at your local copy. 'update-credential' is the ONLY one that interacts with the controller's copy... It's a gap we have wanted to fill for a while);
(3) then you should be able to do your desired operation - upgrade-juju, etc...

Changed in juju:
status: New → In Progress
assignee: nobody → Anastasia (anastasia-macmood)
Paul Gear (paulgear) wrote :

I suspect @anastasia-macmood is right and this is a problem with invalid credentials; I tested deploying a new unit in the existing environment and also bootstrapping a new controller using the same credentials, and both failed.

Changed in juju:
status: In Progress → Incomplete
assignee: Anastasia (anastasia-macmood) → nobody
David Lawson (deej) wrote :

Okay, we really need to get this sorted out: we've got an haproxy unit we can't manage (it changed IPs) in jaas-west-europe and we can't add a new one there. Here's what I did in jaas-azure-west-europe, which is demonstrating this issue:

1. juju update-credential azure jaas-prodstack-cdo-azure: went through the portal login and OAuth process; add-unit still fails. This is all using the correct Azure subscription ID; I can see all the JaaS resources under this subscription.
2. juju add-credential azure jaas-test-credentials: went through the portal login and OAuth process.
3. juju set-default-credential azure jaas-test-credentials: add-unit still fails.
4. Locally, juju add-credential azure jaas-test-credentials: went through the process; juju bootstrap works.

To my mind that "proves" the credentials work, at least in a clean environment. Is it possible the controller is stuck using an old URL to do the VM sizes query, which is what's (apparently) failing? Are there controller logs or information on the controller that I can compare against my locally generated secrets to make sure it's using the same set?

Changed in juju:
status: Incomplete → New
Anastasia (anastasia-macmood) wrote :

@David Lawson (deej),

To clarify...

When we bootstrap a controller, a local credential is uploaded and cached there. This credential would be used when you create models, add units/machines, etc... You can specify a different credential for a model at 'add-model' by using '--credential' option and referring to your local credential.

The *only* command that affects that cached credential is 'update-credential'. ALL other credential commands work on your locally stored credentials.

So...

(4) will work because you have updated your local credential and it was used in bootstrap;
(3) will fail because 'set-default-credential' works on your local credentials cache, the one on your client machine. It does not change anything in the controller's credentials cache.

(1) is a different case... When credentials are cached on the controller they are specific to a user. So the same user that added a model (or bootstrapped a controller) would be able to update the credential that 'controller' model uses.

I would have expected the following workflow to work:

i. Add a new azure credential to your local cache using 'add-credential';
ii. From that client, update the credential on the controller using 'update-credential' (you may need to be switched to the model which needs this new credential).
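
The distinction between the two stores can be pictured with a toy model. This is only an illustrative Python sketch, not Juju code: the class, its method names and the "prod" credential are invented here, and it ignores the per-user ownership of controller credentials mentioned above.

```python
class ToyJuju:
    """Toy model of the local and controller credential stores (not Juju code)."""

    def __init__(self):
        self.local = {}         # credentials on the client machine
        self.controller = {}    # credentials cached on the controller
        self.local_default = None

    def add_credential(self, name, secret):
        # Touches only the local store.
        self.local[name] = secret

    def set_default_credential(self, name):
        # Also local only: the controller never hears about it.
        self.local_default = name

    def bootstrap(self, name):
        # A new controller starts with a copy of the chosen local credential.
        self.controller = {name: self.local[name]}

    def update_credential(self, name):
        # The only operation here that changes the controller's copy.
        self.controller[name] = self.local[name]


juju = ToyJuju()
juju.add_credential("prod", "old-secret")
juju.bootstrap("prod")

juju.add_credential("prod", "new-secret")   # local copy refreshed
stale = juju.controller["prod"]             # controller still holds the old secret
juju.update_credential("prod")              # now the controller is refreshed too
fresh = juju.controller["prod"]
```

In this model, refreshing the local credential alone leaves the controller with the stale value, which matches why a fresh bootstrap can succeed while add-unit on an existing controller keeps failing.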

If this is not happening, please ping me on irc. We may need to work f-2-f to figure it out.

Changed in juju:
status: New → Incomplete
David Lawson (deej) wrote :

I've worked through this with Anastasia on IRC. The process to fix the situation we're in is:

1. juju add-credential azure
2. juju credentials azure --format yaml --show-secrets
3. Take the output of the above, paste it into azure-creds.yaml, and update the values of application-id and application-password for your old credential to match those of the one you just added.
4. juju add-credential azure -f azure-creds.yaml --replace
5. juju update-credential azure <name of your old creds>
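
The hand edit in step 3 can be sketched in Python, assuming the YAML has already been parsed into a mapping of cloud name to credential name to attributes. That layout, the credential names "old-cred" and "new-cred", and the helper name are assumptions for illustration; loading and saving the YAML itself needs a third-party parser and is left out.

```python
import copy

# The two attributes that step 3 says to carry over.
SECRET_FIELDS = ("application-id", "application-password")


def copy_azure_secrets(creds, cloud, old_name, new_name):
    """Return a copy of creds where old_name carries new_name's secret fields."""
    updated = copy.deepcopy(creds)   # leave the caller's data untouched
    cloud_creds = updated[cloud]
    for field in SECRET_FIELDS:
        cloud_creds[old_name][field] = cloud_creds[new_name][field]
    return updated


# Illustrative data only; real values come from the output of step 2.
creds = {
    "azure": {
        "old-cred": {
            "application-id": "old-id",
            "application-password": "old-pw",
            "subscription-id": "sub-1",
        },
        "new-cred": {
            "application-id": "new-id",
            "application-password": "new-pw",
            "subscription-id": "sub-1",
        },
    }
}
result = copy_azure_secrets(creds, "azure", "old-cred", "new-cred")
```

Only the two secret fields change; every other attribute of the old credential, including its name, stays as it was, so step 5 can still refer to it by the old name.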

Changed in juju:
status: Incomplete → Invalid
Junien F (axino) wrote :

Hi,

I wouldn't consider this bug invalid. As far as I know, we're not using Azure in a special way or anything, so this is bound to hit other juju users.

So either:
a) the juju Azure documentation must be updated to provide a way to create credentials that don't expire
b) the juju UI should trivialize a credentials update (and it should be documented as well)

Thanks :)

Changed in juju:
status: Invalid → New
Anastasia (anastasia-macmood) wrote :

@Junien Fridrick (axino),

Cloud credential that never expires seems like something the cloud provider should handle :) I cannot see how our documentation can be concerned with that :D

As for the fact that credentials UX needs improvement - we know \o/ There is work scheduled in other teams to handle this and I will keep the conversation alive. This is not a simple fix that can be isolated to one PR, this is a re-work/renovation...

Since we have more generic bugs and work items to track the actual improvement in credential UX, this particular report, pertaining to difficulties for a particular production environment has been resolved. I agree with the assessment that it is Invalid.

Changed in juju:
status: New → Invalid
Junien F (axino) wrote :

> Cloud credential that never expires seems like something the cloud provider should handle :) I cannot see how our documentation can be concerned with that :D

You do have pretty detailed documentation on how to set up Azure at https://jujucharms.com/docs/stable/help-azure. I feel like it'd be worth mentioning something there. But I guess these pages would be changed during the creds UX rework? :)

Dean Henrichsmeyer (dean) wrote :

No, we can do better. This is a real issue and therefore not invalid. If it requires significant refactoring outside the context of a single bug that's fine but we shouldn't mark this as invalid and move on.

At the very least it should be covered in the documentation properly as Junien mentions unless and until it's addressed properly.

Changed in juju:
status: Invalid → Confirmed
Anastasia (anastasia-macmood) wrote :

@Dean Henrichsmeyer (dean),

I completely agree that the work on cloud credential UX improvement must be done. The gap analysis and the improved design were done months ago \o/
Priorities have shifted since then and, in addition, the responsibility for implementing these improvements moved outside of the team that is looking after the 'juju' project on launchpad.

This report was to help out with a particular production environment and as such does not cover the extent of changes that need to take place to "do better" :D Hence, it was marked as Invalid.

I will bring this topic up with the team again and will keep the report open for now, although not much can be done to improve user experience immediately and the issue on the production environment in question here has been resolved.

FWIW - doc issue https://github.com/juju/docs/issues/2441

Changed in juju:
status: Confirmed → In Progress
assignee: nobody → Anastasia (anastasia-macmood)
Anastasia (anastasia-macmood) wrote :

Since this is the latest report, I will update here and will clean up other bugs related to credential management confusion once the work is completed.

Here is what I am doing to improve the user experience around credentials:

1. [DONE] *Add more clarity around credentials commands*, i.e. which credential is being added/updated/deleted. This should help users to realise that there is a distinction between credentials stored locally on a client and credentials stored remotely on the controller. PR against develop that deals with locally stored credentials and their commands - https://github.com/juju/juju/pull/8363

2. [DOING] *Be clear what credential a model uses*. This involves showing users (with controller or model admin access) which credential is currently in use in 'show-model' output. As a drive-by, this work corrects api output for ModelInfo to filter out this information for non-authorised users.

3. [TODO] Add show-credential command to allow the owner of the credential to see the contents for it stored on the controller (secrets will be omitted). This command will provide clear messaging that it deals with *controller* stored credentials (not to be confused with locally stored).

4. [TODO] Renovate 'update-credential' command to operate at the model-scope, allowing authorised users to update/replace model credential without knowing its name. This command will provide clear messaging that it deals with *controller* stored credentials (not to be confused with locally stored).
In addition, this work may require a check of suitability for the new credential - will it work for this model, i.e. can Juju still see existing machines? As a consequence of this exercise, there is a potential to add a new command to check validity of a current model credential - IS operators have mentioned the need for the command to do this check on a few occasions :D

Anastasia (anastasia-macmood) wrote :

Part 2 Update...

There is a debate around whether it's ok to show the model credential name to non-admin users. Meanwhile, the PR that returns cloud credentials to admins only on ModelManager v3 has landed in develop, https://github.com/juju/juju/pull/8364, and there is a bug for ModelManager v4 - https://bugs.launchpad.net/juju/+bug/1749348

The PR that makes the model credential visible in 'show-model' and 'models --format=yaml/json' output - https://github.com/juju/juju/pull/8372

Anastasia (anastasia-macmood) wrote :

Another Part 2 update - we've decided not to be too restrictive and to return the model credential name to any model user. I've 'Invalid'ated the bug for MMv4 and reverted the change for MMv3: https://github.com/juju/juju/pull/8373

Anastasia (anastasia-macmood) wrote :

Part 2 is done and has landed in develop.

Anastasia (anastasia-macmood) wrote :

Part 3 (update):

The 'show-credential(s)' command is currently being proposed [1]. It allows the owner of the credential to inspect its content and discover what models use it. If the credential name and cloud are not specified, the command will show all controller-stored credentials for the logged on user. It has been decided that secrets can be shown here as well, if asked, since the asking user is the credential owner that gave Juju these details in the first place.

[1]

* [Proposed] 'show-credential(s)', CLI command and api layer: https://github.com/juju/juju/pull/8429;

* [Landed in develop] list user credential(s) and its usage, apiserver layer: https://github.com/juju/juju/pull/8422;

* [Landed in develop] credential(s) models and credential owner model access, state call: https://github.com/juju/juju/pull/8418;

* [Landed in develop] list user credential(s) content, state call: https://github.com/juju/juju/pull/8399.

Anastasia (anastasia-macmood) wrote :

The most recent work that is going into 2.5 has been to detect that a cloud credential that a model uses has become invalid, i.e. cloud rejects calls with auth errors, and to stop communicating with the cloud until the credential is changed.

This should minimise the amount of logging that is happening on a model with an invalid credential - most of the logging was related to the model making and failing a cloud call.
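
The behaviour described above, stopping cloud calls after an auth failure until the credential is changed, can be pictured with a small Python sketch. It is not how Juju implements it: the class, the exception and the fake cloud call are all hypothetical names used only to show the idea.

```python
class AuthError(Exception):
    """Stands in for a cloud call rejected with an authentication error."""


class CredentialGuard:
    """Stops issuing cloud calls once one has failed with an auth error."""

    def __init__(self):
        self.valid = True
        self.skipped = 0  # calls suppressed while the credential is invalid

    def call(self, cloud_call):
        if not self.valid:
            # Credential already known to be bad: do not contact the cloud.
            self.skipped += 1
            return None
        try:
            return cloud_call()
        except AuthError:
            # First auth failure marks the credential invalid.
            self.valid = False
            return None

    def credential_changed(self):
        # Called once the credential is updated or replaced.
        self.valid = True


attempts = []


def list_vm_sizes():
    # Fake cloud call that always fails authentication.
    attempts.append("list_vm_sizes")
    raise AuthError("401 Unauthorized")


guard = CredentialGuard()
for _ in range(3):
    guard.call(list_vm_sizes)
# Only the first call reached the fake cloud; the other two were skipped.
```

Suppressing the repeated calls is what cuts down the log noise: each skipped call is one less failed request to report.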

In addition to allowing model and controller administrators to update a cloud credential as per the existing process described in this bug, Juju 2.5 allows replacing a model's credential with a different credential; see 'set-credential' help for more details.

The fact that a cloud credential has become invalid is discoverable too - there are additional stanzas in 'show-model' command as well as the commands to 'show-credential <name>' [shows specified credential] and 'show-credentials' [shows all credentials stored on the controller for the currently logged on user].

I am closing this bug as 'Fix Committed' and targeting 2.5 milestone.

Changed in juju:
status: In Progress → Fix Committed
milestone: none → 2.5-rc1
Changed in juju:
status: Fix Committed → Fix Released