After keystone admin password changed, user account locked

Bug #1853017 reported by Peng Peng on 2019-11-18
This bug affects 1 person
Affects: StarlingX
Importance: High
Assigned to: yong hu

Bug Description

Brief Description
-----------------
After changing the keystone admin password, all system commands fail about 180 seconds later because the user account is locked.

Severity
--------
Major

Steps to Reproduce
------------------
As described above.

TC-name: security/test_keystone_admin_psswd_change.py::test_admin_password

Expected Behavior
------------------
All system commands work with the new password.

Actual Behavior
----------------
The user account is locked.

Reproducibility
---------------
Reproducible

System Configuration
--------------------
Multi-node system

Lab-name: WCP_71-75

Branch/Pull Time/Commit
-----------------------
2019-11-15_20-00-00

Last Pass
---------
2019-11-08_20-00-00

Timestamp/Logs
--------------
[2019-11-17 04:38:28,728] 311 DEBUG MainThread ssh.send :: Send 'openstack --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[face::1]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-identity-api-version 3 --os-interface internal --os-region-name RegionOne user set --password '!Li69nux*9' admin'

[2019-11-17 04:41:31,564] 311 DEBUG MainThread ssh.send :: Send 'keyring get CGCS admin'
[2019-11-17 04:41:32,142] 433 DEBUG MainThread ssh.expect :: Output:
!Li69nux*9

[2019-11-17 04:41:32,246] 311 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password '!Li69nux*9' --os-project-name admin --os-auth-url http://[face::1]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne servicegroup-list'
[2019-11-17 04:41:33,067] 433 DEBUG MainThread ssh.expect :: Output:
The account is locked for user: dcd90f0c830f467b92ff4cf3e6c4bb5a. (HTTP 401) (Request-ID: req-1b6dc762-f417-4801-9900-040b0e5e39e7)

Test Activity
-------------
Regression Testing

Peng Peng (ppeng) wrote :
Ghada Khalil (gkhalil) wrote :

Marking as stx.3.0 / high priority - appears to have been broken in the last week.

Changed in starlingx:
importance: Undecided → High
status: New → Triaged
tags: added: stx.3.0 stx.security
Ghada Khalil (gkhalil) on 2019-11-18
Changed in starlingx:
assignee: nobody → yong hu (yhu6)
tags: added: stx.distro.openstack
Yang Liu (yliu12) on 2019-11-19
tags: added: stx.retestneeded
yong hu (yhu6) wrote :

The issue was reproduced at:

The password for admin was indeed changed by the following command: openstack user set --password 'newpassword' admin, and it was also updated in the keyring. However, the change was not reflected in sysinv in a timely manner, so auth for "system" commands would fail if the user name and password were not *explicitly* set by:
--os-username 'admin' --os-password 'newpassword'

Will look into the cause and what recent change led to this issue.

yong hu (yhu6) wrote :

The root cause was found:
After changing the password for "admin", it took effect in keyring. That's why "keyring get CGCS admin" returns the correct password.
However, the local environment OS_PASSWORD (which was set by "source /etc/platform/openrc") still held the old password.

The solution is to re-apply "source /etc/platform/openrc", which should update OS_PASSWORD by
```
export OS_PASSWORD=`TERM=linux /opt/platform/.keyring/19.09/.CREDENTIAL 2>/dev/null`
```

In addition, I checked STX.2.0 and saw the same behavior there as we are seeing now.

So, this won't be an issue.

Changed in starlingx:
assignee: yong hu (yhu6) → Peng Peng (ppeng)
Peng Peng (ppeng) wrote :

The TC does not run "source /etc/platform/openrc" before running system commands, and it used the new password, as the log shows:
system --os-username 'admin' --os-password '!Li69nux*9' --os-project-name admin --os-auth-url http://[face::1]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne servicegroup-list

Ghada Khalil (gkhalil) wrote :

Marking as Invalid based on Yong's investigation.

Changed in starlingx:
status: Triaged → Invalid
assignee: Peng Peng (ppeng) → yong hu (yhu6)
Ghada Khalil (gkhalil) wrote :

Assigning back to Yong since our policy is to keep the bug assigned to the development prime.

yong hu (yhu6) wrote :

@peng, by specifying the updated password in commands explicitly, did your TCs work or not?

I tried this way as well on my side and the command worked.

In addition, in the bash history from the log tarball you attached, I saw the new password was "xxxxxx". Was it expected?

Peng Peng (ppeng) wrote :

Reproduced on 2019-11-19_20-00-00 (wcp_63-66)

[sysadmin@controller-1 ~(keystone_admin)]$ openstack --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[face::1]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-identity-api-version 3 --os-interface internal --os-region-name RegionOne user set --password '!Li69nux*9' admin
[sysadmin@controller-1 ~(keystone_admin)]$ keyring get CGCS admin
!Li69nux*9
[sysadmin@controller-1 ~(keystone_admin)]$ sudo vi /var/log/bash.log
[sysadmin@controller-1 ~(keystone_admin)]$ openstack user list
The request you have made requires authentication. (HTTP 401) (Request-ID: req-59069eea-2bf5-43db-88f8-1bc6e08277a6)
[sysadmin@controller-1 ~(keystone_admin)]$ system --os-username 'admin' --os-password '!Li69nux*9' --os-project-name admin --os-auth-url http://[face::1]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne servicegroup-list
The account is locked for user: 3480356374d4409bab26d72d1fdf4bee. (HTTP 401) (Request-ID: req-1ff0cc2c-c4ef-4c61-af75-f5bef496e62b)

And did not see "xxxxx" in bash.log

2019-11-20T19:00:22.000 controller-1 -sh: info HISTORY: PID=231945 UID=42425 openstack --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[face::1]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-identity-api-version 3 --os-interface internal --os-region-name RegionOne user set --password '!Li69nux*9' admin
2019-11-20T19:00:34.000 controller-1 -sh: info HISTORY: PID=231945 UID=42425 keyring get CGCS admin
2019-11-20T19:00:58.000 controller-1 -sh: info HISTORY: PID=231945 UID=42425 sudo vi /var/log/bash.log
2019-11-20T19:01:22.000 controller-1 -sh: info HISTORY: PID=231945 UID=42425 openstack user list
2019-11-20T19:01:37.000 controller-1 -sh: info HISTORY: PID=231945 UID=42425 system --os-username 'admin' --os-password '!Li69nux*9' --os-project-name admin --os-auth-url http://[face::1]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne servicegroup-list
2019-11-20T19:01:51.000 controller-1 -sh: info HISTORY: PID=231945 UID=42425 sudo vi /var/log/bash.log

Peng Peng (ppeng) on 2019-11-20
Changed in starlingx:
status: Invalid → Confirmed
yong hu (yhu6) wrote :

It turned out this is a security enhancement done by this patch (merged on Sept 18):
https://review.opendev.org/#/c/682137

After more than 5 attempts with the incorrect (old) password, the account is locked for 1800 seconds.

+ keystone_config {
+ 'security_compliance/lockout_duration': value => 1800;
+ 'security_compliance/lockout_failure_attempts': value => 5;
+ }

Inside your log tarball, keystone-all.log indicated there were 6 authorization failures before the account locked. See the attachment.
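
The effect of the two settings above can be illustrated with a small model. This is only a sketch of the general idea (count consecutive failures, then reject everything for a fixed window); it is not keystone's implementation, and the `Account`, `is_locked` and `authenticate` names are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional

LOCKOUT_FAILURE_ATTEMPTS = 5  # security_compliance/lockout_failure_attempts
LOCKOUT_DURATION = 1800       # security_compliance/lockout_duration, in seconds


@dataclass
class Account:
    """Hypothetical stand-in for a user record; not a keystone class."""
    password: str
    failed_count: int = 0
    last_failure: Optional[float] = None


def is_locked(acct: Account, now: float) -> bool:
    """True while the failure limit has been reached and the window is still open."""
    if acct.failed_count < LOCKOUT_FAILURE_ATTEMPTS or acct.last_failure is None:
        return False
    if now - acct.last_failure < LOCKOUT_DURATION:
        return True
    acct.failed_count = 0  # window has passed: start counting afresh
    return False


def authenticate(acct: Account, password: str, now: float) -> str:
    """Return 'ok', 'unauthorized' or 'locked' for one attempt."""
    if is_locked(acct, now):
        return "locked"  # rejected even when the password is correct
    if password == acct.password:
        acct.failed_count = 0
        return "ok"
    acct.failed_count += 1
    acct.last_failure = now
    return "unauthorized"


admin = Account(password="new-password")
# A stale client retries with the old password once per second.
results = [authenticate(admin, "old-password", now=float(t)) for t in range(6)]
# The operator then uses the correct password while the window is open.
during = authenticate(admin, "new-password", now=10.0)
# Once the window has passed, with no further stale attempts, it works again.
after = authenticate(admin, "new-password", now=5.0 + LOCKOUT_DURATION)
```

In this model the correct password is also rejected while the window is open, which matches the symptom in the report: the TC passed the new password explicitly and still received "The account is locked".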

To avoid the issue, apply the new password in your TC right after the password is changed:
export OS_PASSWORD=`TERM=linux /opt/platform/.keyring/19.09/.CREDENTIAL 2>/dev/null`

Alternatively, explicitly pass the updated password in all subsequent test commands.
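
For a Python TC, the same idea could look roughly like the sketch below. It assumes `keyring get CGCS admin` prints the current password, as in the logs above; the helper names and the injectable `run` parameter are hypothetical and exist only so the helper can be exercised without a real system.

```python
import os
import subprocess
from typing import Callable, Dict, Optional


def read_keyring_password(run: Callable = subprocess.run) -> str:
    """Return the admin password as printed by `keyring get CGCS admin`."""
    proc = run(
        ["keyring", "get", "CGCS", "admin"],
        capture_output=True, text=True, check=True,
    )
    # Drop only the trailing newline; the password itself may contain spaces.
    return proc.stdout.rstrip("\n")


def env_with_current_password(
    base_env: Optional[Dict[str, str]] = None,
    run: Callable = subprocess.run,
) -> Dict[str, str]:
    """Copy the environment and overwrite OS_PASSWORD with the keyring value."""
    env = dict(os.environ if base_env is None else base_env)
    env["OS_PASSWORD"] = read_keyring_password(run)
    return env
```

A TC would then pass `env=env_with_current_password()` to each later `subprocess.run([...])` call. This only protects the TC's own commands; it does nothing about other components that may still hold the old password.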

=================================================================================
BTW: the reason I didn't reproduce the account lockout a few days ago is that I did not run commands more than 5 times with the obsolete password. At that time, I only tried 1~2 times.

=================================================================================

So in summary, this is not an issue, but an enhanced security feature.

ANIRUDH GUPTA (anyrude10) wrote :

I am facing the account-locked issue on the StarlingX 2.0 release branch, even though I have not used an incorrect password.

Can someone please update how to disable this feature?

Currently my account is locked, how can I unlock it?

yong hu (yhu6) wrote :

With root permission, you can remove these 2 lines from /etc/keystone/keystone.conf:

lockout_failure_attempts = 5
lockout_duration = 1800

After that, restart the keystone services by killing the first process found by the following grep:
$ ps aux | grep keystone-public
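
The edit to keystone.conf can be sketched as a small text filter. This is an illustration only: `strip_lockout_options` is a hypothetical helper, and it leaves writing the file and restarting keystone to the manual steps above.

```python
LOCKOUT_OPTIONS = ("lockout_failure_attempts", "lockout_duration")


def strip_lockout_options(conf_text: str) -> str:
    """Return conf_text without the two lockout option lines."""
    kept = []
    for line in conf_text.splitlines(keepends=True):
        # Compare only the option name, i.e. the text before the first "=".
        key = line.split("=", 1)[0].strip()
        if key in LOCKOUT_OPTIONS:
            continue
        kept.append(line)  # comments and all other options are kept as-is
    return "".join(kept)
```

Commented-out lines such as `# lockout_duration = 1800` are left untouched, because their key starts with `#` and does not match.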

Yang Liu (yliu12) wrote :

Hi Yong,

The problem is that after the admin password change, the account got locked by itself, without any user operations.

Yes, something was trying to use the old password, and that caused the account lockout. Investigation is needed to find which stx component is doing that.

Jerry Sun (jerry-sun-u) wrote :

Looking at the tarball of logs attached by Peng, it looks like there is no more activity from registry-token-server in daemon.log after the password change recorded in bash.log. This leads me to believe that something else must be triggering the locking of the account. There is some activity from the token server in daemon.log, but that was before the password change.

I also tried authenticating to the token server with incorrect credentials on a system without changing the password. This is to try and create an environment where the registry/token server holds incorrect keystone credentials. The admin account did not get locked which means token server does not spam requests at keystone with incorrect credentials until it locks.

Yang Liu (yliu12) wrote :

Note that this issue seems to be only happening on the first admin password change.
Account will be locked for some time and then unlock itself.

Workaround is just to wait...

After that, the subsequent admin password changes are working as expected.

yong hu (yhu6) wrote :

Thanks for update, @Yang.
When the first admin password change was made, had "system application-apply stx-openstack" already been run in the background?

In addition, the lockout period should be 30 minutes, shouldn't it?

yong hu (yhu6) wrote :

The issue was root-caused.
In short, the password for "admin" in 2 k8s secrets ("default-registry-key" and "registry-local-secret") was not updated after the operator "sysadmin" changed the password for the "admin" user with the "openstack" client.

Although the password was updated in keyring and keystone (:5000), these 2 secrets were never refreshed, and they kept using the default password set in the ansible playbook (for example, localhost.yml).
So, whenever the docker client pulls an image and needs to authenticate via "registry-token-server", which in turn goes to keystone (:5000), the old/default password for "admin" triggers an authentication failure.

Attachment #1 is the packet capture I took with tcpdump when the failures happened. "GopherCloud" inside "registry-token-server/keystone/access.go" failed to get auth from keystone because it was using the default (and obsolete) password "Local.123" (set from the Ansible playbook).

Attachment #2 shows the code in "~/containers/registry-token-server/src/keystone/access.go" that was using the obsolete password from the request (from k8s secret "default-registry-key").

After updating these passwords in 2 secrets above, the authentication went on correctly.
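
As a rough illustration of what refreshing such a secret involves, the sketch below rebuilds the payload of a docker-registry style secret. It assumes the two secrets are of type `kubernetes.io/dockerconfigjson`, which this thread does not confirm; the registry address in the usage note and the helper names are placeholders.

```python
import base64
import json


def docker_config_json(registry: str, username: str, password: str) -> str:
    """Build the JSON document stored in a dockerconfigjson secret."""
    auth = base64.b64encode(f"{username}:{password}".encode()).decode()
    return json.dumps(
        {"auths": {registry: {"username": username, "password": password, "auth": auth}}}
    )


def secret_data(registry: str, username: str, password: str) -> dict:
    """Return the `data` mapping of the secret, base64-encoded as k8s expects."""
    payload = docker_config_json(registry, username, password)
    return {".dockerconfigjson": base64.b64encode(payload.encode()).decode()}
```

For example, `secret_data("registry.example:9001", "admin", new_password)` yields the mapping that would replace the `data` field of each affected secret. How the secrets should actually be updated on a running system is not covered here.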

yong hu (yhu6) wrote :
yong hu (yhu6) wrote :

If the password for "admin" is changed, any deployment with "default-registry-key" secret or "registry-local-secret" will fail to authenticate.

For example, in "charts/ingress/charts/helm-toolkit/templates/snippets/_kubernetes_pod_rbac_serviceaccount.tpl", line 47:
imagePullSecrets:
  - name: default-registry-key
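
To get a feel for which workloads are exposed, a rendered pod spec can be checked for either secret name. This is a minimal sketch over a plain dict (for example, a manifest loaded from YAML or JSON); `affected_pull_secrets` is a hypothetical helper.

```python
AFFECTED_SECRETS = {"default-registry-key", "registry-local-secret"}


def affected_pull_secrets(pod_spec: dict) -> list:
    """Return the affected secret names referenced by imagePullSecrets."""
    refs = pod_spec.get("imagePullSecrets") or []
    names = [ref.get("name") for ref in refs]
    return sorted(name for name in names if name in AFFECTED_SECRETS)
```

A non-empty result means the pod pulls images with one of the two secrets named above.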

Peng Peng (ppeng) wrote :

Issue reproduced on DC labs at load: 2019-11-21_20-00-00
After the admin password was changed:

openstack user set --password '!Li69nux*9' admin
[sysadmin@controller-1 ~(keystone_admin)]$ keyring get CGCS admin
!Li69nux*9

[sysadmin@controller-1 ~(keystone_admin)]$ system --os-username 'admin' --os-password '!Li69nux*9' --os-project-name admin --os-auth-url http://[fd01:1::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne servicegroup-list
The account is locked for user: 27596cc96f034c34b5632c8d8fa52837. (HTTP 401) (Request-ID: req-c70ebb25-8cf7-4756-94ac-8fbcf555c6e9)
[sysadmin@controller-1 ~(keystone_admin)]$ date
Sat Nov 30 00:22:10 UTC 2019

After more than 2 days, the account still showed as locked.
[sysadmin@controller-1 ~(keystone_admin)]$ system --os-username 'admin' --os-password '!Li69nux*9' --os-project-name admin --os-auth-url http://[fd01:1::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne servicegroup-list
The account is locked for user: 27596cc96f034c34b5632c8d8fa52837. (HTTP 401) (Request-ID: req-afbc26ea-16ac-4123-8753-bd284e2fbdbf)
[sysadmin@controller-1 ~(keystone_admin)]$ date
Mon Dec 2 16:14:58 UTC 2019

yong hu (yhu6) on 2019-12-03
tags: added: stx.config
Yang Liu (yliu12) wrote :

In the Distributed Cloud environment mentioned in Peng's comments, the account was never unlocked. Perhaps it behaves differently from standalone systems.

To answer Yong's earlier question in #16: stx-openstack was not applied when this was seen.
