juju credentials stuck as invalid for vsphere cloud

Bug #2049917 reported by Nishant Dash
This bug affects 1 person

Affects: Canonical Juju
Status: Fix Released
Importance: Medium
Assigned to: Caner Derici

Bug Description

We have credentials loaded into Juju for a vSphere cloud.

Adding machines works one day, but some time later (I'm not sure how long) the credentials get marked as invalid in Juju, possibly because of something in the vSphere endpoint's reply.

```
$ juju show-credential
controller-credentials:
  vsphere:
    vsphere:
      content:
        auth-type: userpass
        validity-check: invalid
        user: <REDACTED>
        vmfolder: <REDACTED>
      models:
        controller: admin
        default: admin
        <REDACTED>: admin
        <REDACTED>: admin
client-credentials:
  vsphere:
    vsphere:
      content:
        auth-type: userpass
        user: <REDACTED>
        vmfolder: K8S-PRD_VMs
```

Notice this line:
```
validity-check: invalid
```
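
Since nothing else flags this state, a script can scan the `juju show-credential` output for it. The following is a minimal sketch in Python, assuming the indentation-based layout shown above (section, cloud, credential name, then `content`). The function name `find_invalid_credentials` and the `SAMPLE` text are made up for illustration and are not part of Juju.

```python
# Sketch only: a tiny indentation-aware scanner, not a full YAML parser.
# It assumes the nesting seen in the report:
#   <section> / <cloud> / <credential> / content / validity-check
SAMPLE = """\
controller-credentials:
  vsphere:
    vsphere:
      content:
        auth-type: userpass
        validity-check: invalid
      models:
        controller: admin
client-credentials:
  vsphere:
    vsphere:
      content:
        auth-type: userpass
"""


def find_invalid_credentials(show_credential_output):
    """Return (section, cloud, credential) tuples marked 'validity-check: invalid'."""
    path = []      # stack of (indent, key) for the enclosing mappings
    invalid = []
    for raw in show_credential_output.splitlines():
        if not raw.strip():
            continue
        indent = len(raw) - len(raw.lstrip(" "))
        key, _, value = raw.strip().partition(":")
        value = value.strip()
        # Drop stack entries that are not ancestors of the current line.
        while path and path[-1][0] >= indent:
            path.pop()
        if value == "":
            # A key with no value opens a nested mapping.
            path.append((indent, key))
        elif key == "validity-check" and value == "invalid":
            keys = [k for _, k in path]
            invalid.append(tuple(keys[:3]))
    return invalid


if __name__ == "__main__":
    for section, cloud, name in find_invalid_credentials(SAMPLE):
        print(f"{section}: credential {name!r} for cloud {cloud!r} is invalid")
```

Feeding the function real output (for example captured from `juju show-credential`) is left to the caller; the sketch only handles the text.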

`juju add-machine` fails silently, and enabling DEBUG logging for the `juju.worker.provisioner` module via the `logging-config` model config does not yield any output, implying that the provisioner worker does not even get started.

`juju_engine_report` shows:
```
              environ-tracker:
                error: '"valid-credential-flag" not set: dependency not available'
                inputs:
                - api-caller
                - is-responsible-flag
                - valid-credential-flag
                state: stopped
```

And when I look in MongoDB, I can see:
```
juju:PRIMARY> db.cloudCredentials.find().pretty()
{
        "_id" : "vsphere#admin#vsphere",
        "owner" : "admin",
        "cloud" : "vsphere",
        "name" : "vsphere",
        "revoked" : false,
        "auth-type" : "userpass",
        "attributes" : {
                "password" : <REDACTED>,
                "user" : <REDACTED>,
                "vmfolder" : "K8S-PRD_VMs"
        },
        "invalid" : true,
        "txn-revno" : NumberLong(25),
        "txn-queue" : [ ],
        "invalid-reason" : "cloud denied access: ServerFaultCode: Cannot complete login due to an incorrect user name or password."
}
```

This invalid reason is never mentioned in any logs or in the output of any credential-related command.

NOTE: the credentials are actually valid (tested with the govc CLI and the vSphere UI).

Some troubleshooting:
- Tried `juju update-credential vsphere vsphere`, but that does not help.
- Restarted all Juju controller agents after the credential update, but that does not help.
- Updated the credential again after the Juju controller was bounced; that still did not help.

Even after the steps above, the credential still shows up as invalid.

Workaround:

Stop the Juju controllers (here, with `systemctl stop jujud-machine-*.service`) and update the cloudCredentials document as follows:
```
db.cloudCredentials.updateOne({"_id": "vsphere#admin#vsphere"}, {$set: {"invalid": false}})
db.cloudCredentials.updateOne({"_id": "vsphere#admin#vsphere"}, {$unset: {"invalid-reason": ""}})
```

After doing so, `juju add-machine` works perfectly fine.
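
The two updates in the workaround can also be expressed as a single update document, so the flag and its reason are cleared together. The sketch below only builds the filter and update as plain Python dictionaries. The helper name `build_credential_reset` is hypothetical, and the `cloud#owner#name` ordering of the `_id` is an assumption inferred from the document shown in this report (where cloud and name are both `vsphere`).

```python
def build_credential_reset(cloud, owner, name):
    """Build a (filter, update) pair that clears the invalid flag on one credential.

    Hypothetical helper, not part of Juju. The "_id" layout is assumed to be
    "<cloud>#<owner>#<name>", matching "vsphere#admin#vsphere" in the report.
    """
    doc_filter = {"_id": f"{cloud}#{owner}#{name}"}
    update = {
        "$set": {"invalid": False},        # mark the credential as valid again
        "$unset": {"invalid-reason": ""},  # drop the stale failure message
    }
    return doc_filter, update


if __name__ == "__main__":
    doc_filter, update = build_credential_reset("vsphere", "admin", "vsphere")
    print(doc_filter)
    print(update)
```

With a MongoDB driver such as pymongo, the pair could be passed to `update_one(doc_filter, update)` on the `cloudCredentials` collection; connection details for the Juju database are deliberately left out here.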

Nishant Dash (dash3) wrote :

Also, this is with Juju 2.9.44.

Changed in juju:
status: New → Triaged
importance: Undecided → Medium
Joseph Phillips (manadart) wrote :

When attempting to re-validate using `update-credential`, validation succeeded on the client but failed on the controller with the message:

couldn't find instance "juju-72e7e0-2" for machine 2

There happened to be an odd issue with this machine: as far as Juju was aware it was powered off, but it was actually live. This is a chicken-and-egg problem, because the instance-poller can't poll and update machine status without a valid credential for the cloud.

If we look in domain/credential/service/modelcredential.go, we can see where this error comes from.

There is no need to cross-reference all machine instances to confirm credential validity. The fact that we can make any call at all to the cloud is sufficient verification. That should be all we try to do.
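
To make the difference concrete, here is a small Python model of the two approaches. It is a sketch only, not the Go code in `modelcredential.go`: `FakeCloud`, `CloudAuthError`, `check_with_machine_cross_reference` and `check_by_reachability` are all hypothetical names invented for this illustration.

```python
class CloudAuthError(Exception):
    """Hypothetical: raised when the cloud rejects the credential."""


class FakeCloud:
    """Hypothetical stand-in for a cloud API client."""

    def __init__(self, instance_ids, reject_login=False):
        self._instance_ids = list(instance_ids)
        self._reject_login = reject_login

    def list_instance_ids(self):
        if self._reject_login:
            raise CloudAuthError("cloud denied access")
        return list(self._instance_ids)


def check_with_machine_cross_reference(cloud, machine_instances):
    """Strict check: every machine in the model must have a visible instance.

    Returns a list of error strings; a single stale machine record is enough
    to fail validation even though the credential itself works.
    """
    visible = set(cloud.list_instance_ids())
    return [
        f"couldn't find instance \"{instance_id}\" for machine {machine}"
        for machine, instance_id in machine_instances.items()
        if instance_id not in visible
    ]


def check_by_reachability(cloud):
    """Lenient check: the credential is valid if any authenticated call succeeds."""
    try:
        cloud.list_instance_ids()
    except CloudAuthError:
        return False
    return True


if __name__ == "__main__":
    cloud = FakeCloud(["juju-72e7e0-0", "juju-72e7e0-1"])
    machines = {"0": "juju-72e7e0-0", "2": "juju-72e7e0-2"}
    print(check_with_machine_cross_reference(cloud, machines))
    print(check_by_reachability(cloud))
```

In this model the strict check reports machine 2 as missing while the reachability check accepts the same credential, which mirrors the situation described above.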

Changed in juju:
milestone: none → 3.1.8
assignee: nobody → Caner Derici (cderici)
Nishant Dash (dash3)
description: updated
Harry Pidcock (hpidcock)
Changed in juju:
milestone: 3.1.8 → 3.3.3
Ian Booth (wallyworld)
Changed in juju:
milestone: 3.3.3 → 3.3.4
Caner Derici (cderici)
Changed in juju:
status: Triaged → In Progress
Caner Derici (cderici) wrote :
Changed in juju:
status: In Progress → Fix Committed
Changed in juju:
status: Fix Committed → Fix Released