After keystone admin password changed, user account locked
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
StarlingX | Fix Released | High | Lin Shuicheng | | |
Bug Description
Brief Description
-----------------
After the keystone admin password is changed and 180 seconds have passed, all system commands fail because the user account is locked.
Severity
--------
Major
Steps to Reproduce
------------------
As described above.
TC-name: security/
Expected Behavior
------------------
All system commands work with the new password.
Actual Behavior
----------------
user account locked
Reproducibility
---------------
Reproducible
System Configuration
-------
Multi-node system
Lab-name: WCP_71-75
Branch/Pull Time/Commit
-------
2019-11-15_20-00-00
Last Pass
---------
2019-11-08_20-00-00
Timestamp/Logs
--------------
[2019-11-17 04:38:28,728] 311 DEBUG MainThread ssh.send :: Send 'openstack --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[face::1]:5000/v3 --os-user-
[2019-11-17 04:41:31,564] 311 DEBUG MainThread ssh.send :: Send 'keyring get CGCS admin'
[2019-11-17 04:41:32,142] 433 DEBUG MainThread ssh.expect :: Output:
!Li69nux*9
[2019-11-17 04:41:32,246] 311 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password '!Li69nux*9' --os-project-name admin --os-auth-url http://[face::1]:5000/v3 --os-user-
[2019-11-17 04:41:33,067] 433 DEBUG MainThread ssh.expect :: Output:
The account is locked for user: dcd90f0c830f467
Test Activity
-------------
Regression Testing
Peng Peng (ppeng) wrote : | #1 |
Ghada Khalil (gkhalil) wrote : | #2 |
Changed in starlingx: | |
importance: | Undecided → High |
status: | New → Triaged |
tags: | added: stx.3.0 stx.security |
Changed in starlingx: | |
assignee: | nobody → yong hu (yhu6) |
tags: | added: stx.distro.openstack |
tags: | added: stx.retestneeded |
yong hu (yhu6) wrote : | #3 |
The issue was reproduced at:
The password for admin was indeed changed by the following command: openstack user set --password 'newpassword' admin, and it was also updated in "keyring". However, the change was not reflected in sysinv in time, so authentication for "system" commands would fail unless the user name and password were *explicitly* set with:
--os-username 'admin' --os-password 'newpassword'
Will look into the cause and what recent change led to this issue.
yong hu (yhu6) wrote : | #4 |
The root cause was found:
After changing the password for "admin", it took effect in keyring. That's why "keyring get CGCS admin" returns the correct password.
However, the local environment OS_PASSWORD (which was set by "source /etc/platform/
The solution is to re-apply "source /etc/platform/
```
export OS_PASSWORD=
```
In addition, I checked STX.2.0 and saw the same behavior there as we are seeing now.
So, this won't be an issue.
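As a minimal sketch of the workaround described above: re-read the password from keyring after each change and export it, rather than relying on the value the shell picked up earlier. The `keyring get CGCS admin` lookup is the one shown in the logs in this report; the function name `refresh_os_password` is made up for illustration, and this has not been verified on a StarlingX system.

```shell
# Hypothetical helper: refresh OS_PASSWORD from keyring after a password change.
# Assumes the `keyring get CGCS admin` lookup seen in the logs of this report.
refresh_os_password() {
    local pw
    # Read the current admin password; give up if the lookup fails.
    pw=$(keyring get CGCS admin) || return 1
    # Refuse to export an empty value.
    [ -n "$pw" ] || return 1
    export OS_PASSWORD="$pw"
}
```

Calling `refresh_os_password` right after `openstack user set --password ... admin` would keep later commands in the same shell from sending the obsolete password.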
Changed in starlingx: | |
assignee: | yong hu (yhu6) → Peng Peng (ppeng) |
Peng Peng (ppeng) wrote : | #5 |
The TC does not use "source /etc/platform/
system --os-username 'admin' --os-password '!Li69nux*9' --os-project-name admin --os-auth-url http://[face::1]:5000/v3 --os-user-
Ghada Khalil (gkhalil) wrote : | #6 |
Marking as Invalid based on Yong's investigation.
Changed in starlingx: | |
status: | Triaged → Invalid |
assignee: | Peng Peng (ppeng) → yong hu (yhu6) |
Ghada Khalil (gkhalil) wrote : | #7 |
Assigning back to Yong since our policy is to keep the bug assigned to the development prime.
yong hu (yhu6) wrote : | #8 |
- collected-bash-history.png (390.2 KiB, image/png)
@peng, by specifying the updated password in commands explicitly, did your TCs work or not?
I tried this way as well on my side and the command worked.
In addition, in the bash history from the log tarball you attached, I saw the new password was "xxxxxx". Was it expected?
Peng Peng (ppeng) wrote : | #9 |
- ALL_NODES_20191120.190415.tar (22.7 MiB, application/x-tar)
Reproduced on 2019-11-19_20-00-00 (wcp_63-66)
[sysadmin@
[sysadmin@
!Li69nux*9
[sysadmin@
[sysadmin@
The request you have made requires authentication. (HTTP 401) (Request-ID: req-59069eea-
[sysadmin@
The account is locked for user: 3480356374d4409
I did not see "xxxxx" in bash.log:
2019-11-
2019-11-
2019-11-
2019-11-
2019-11-
2019-11-
Changed in starlingx: | |
status: | Invalid → Confirmed |
yong hu (yhu6) wrote : | #10 |
- 6_auth_failures.png (576.4 KiB, image/png)
It turned out this is a security enhancement done by this patch (merged on Sept 18):
https:/
After more than 5 attempts with an incorrect (old) password, the account is locked for 1800 seconds.
+ keystone_config {
+ 'security_
+ 'security_
+ }
Inside your log tarball, keystone-all.log indicated there were 6 authorization failures before the account locked. See the attachment.
To avoid the issue, you can apply the new password in your TC right after the password is changed by:
export OS_PASSWORD=
or explicitly put the updated password in all following test commands.
=======
BTW: the reason I didn't reproduce this behavior (the account being locked) a few days ago was that I did not run commands more than 5 times with the obsolete password. At that time, I only tried 1~2 times.
=======
So in summary, this is not an issue, but an enhanced security feature.
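To make the effect easier to see, here is a toy model of such a lockout policy. It is only a sketch and not keystone's implementation: the threshold of 5 and the 1800 second duration come from this thread, while the function and variable names, and the choice to lock exactly when the failure count reaches the threshold, are assumptions made for illustration.

```shell
# Toy model of an account lockout policy (not keystone's actual code).
MAX_FAILURES=5      # failed attempts before the account locks (value from the thread)
LOCK_SECONDS=1800   # lock duration in seconds (value from the thread)

failures=0
locked_until=0

# usage: attempt <now-in-seconds> <ok|bad>
# Sets RESULT to "ok", "rejected" or "locked".
attempt() {
    local now=$1 outcome=$2
    # While locked, every attempt fails, even one with the correct password.
    if [ "$now" -lt "$locked_until" ]; then
        RESULT=locked
        return
    fi
    if [ "$outcome" = ok ]; then
        failures=0
        RESULT=ok
        return
    fi
    failures=$((failures + 1))
    RESULT=rejected
    # Reaching the threshold starts the lock window.
    if [ "$failures" -ge "$MAX_FAILURES" ]; then
        locked_until=$((now + LOCK_SECONDS))
        failures=0
    fi
}
```

In this model, any client that keeps retrying with the old password locks the account for everyone, which is why a command with the correct new password can still report "The account is locked".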
ANIRUDH GUPTA (anyrude10) wrote : | #11 |
I am facing the account locked issue on the StarlingX 2.0 release branch, even though I have not used any incorrect password.
Can someone please explain how to disable this feature?
Currently my account is locked, how can I unlock it?
yong hu (yhu6) wrote : | #12 |
with root permission, you can remove these 2 lines in /etc/keystone/
lockout_
lockout_duration = 1800
After that, restart keystone services by killing the first process searched by the following grep.
$ ps aux | grep keystone-public
Yang Liu (yliu12) wrote : | #13 |
Hi Yong,
The problem is that after the admin password change, the account got locked by itself without any user operations.
Yes, something was trying to use the old password, which caused the account lockout, and investigation is needed into which stx component is doing that.
Jerry Sun (jerry-sun-u) wrote : | #14 |
Looking at the tarball for the logs attached by Peng, it looks like after the password was changed in bash.log, there is no more activity from registry-
I also tried authenticating to the token server with incorrect credentials on a system without changing the password. This is to try and create an environment where the registry/token server holds incorrect keystone credentials. The admin account did not get locked which means token server does not spam requests at keystone with incorrect credentials until it locks.
Yang Liu (yliu12) wrote : | #15 |
Note that this issue seems to be only happening on the first admin password change.
Account will be locked for some time and then unlock itself.
Workaround is just to wait...
After that, the subsequent admin password changes are working as expected.
yong hu (yhu6) wrote : | #16 |
Thanks for the update, @Yang.
When the admin password was changed for the first time, had "system application-apply stx-openstack" already been run in the background?
In addition, the lock period should be 30 mins, shouldn't it?
yong hu (yhu6) wrote : | #17 |
The issue was root-caused.
In short, password for "admin" in 2 k8s secrets ("default-
Though the password was updated in keyring and keystone (:5000), there was never a chance to refresh these 2 secrets, and they kept using the default password set in the ansible playbook (say, localhost.yml).
So, whenever docker client pulls image and requires authentication via "registry-
The attachment #1 is the packet I captured by TCPDUMP when the failures happened. "GopherCloud" inside "registry-
The attachment #2 is the code pieces in "~/containers/
After updating the passwords in the 2 secrets above, authentication succeeded.
yong hu (yhu6) wrote : | #18 |
yong hu (yhu6) wrote : | #19 |
yong hu (yhu6) wrote : | #20 |
If the password for "admin" is changed, any deployment with "default-
for example, in "charts/
imagePullSecrets:
- name: default-
Peng Peng (ppeng) wrote : | #21 |
- ALL_NODES_20191202.155620.tar (111.3 MiB, application/x-tar)
Issue reproduced on DC labs at load: 2019-11-21_20-00-00
After admin pw changed,
openstack user set --password '!Li69nux*9' admin
[sysadmin@
!Li69nux*9
[sysadmin@
The account is locked for user: 27596cc96f034c3
[sysadmin@
Sat Nov 30 00:22:10 UTC 2019
After more than 2 days, the account still showed as locked.
[sysadmin@
The account is locked for user: 27596cc96f034c3
[sysadmin@
Mon Dec 2 16:14:58 UTC 2019
tags: | added: stx.config |
Yang Liu (yliu12) wrote : | #22 |
In Distributed Cloud environment mentioned in Peng's comments, the account was never unlocked. Perhaps it's doing something differently than standalone systems.
To answer previous question from Yong in #16, stx-openstack was not applied when this was seen.
yong hu (yhu6) wrote : | #23 |
As mentioned, local registry key was not updated after admin's password was changed.
In this case, whoever tried to pull docker image with "imagePullSecrets" would trigger the authentication failure.
In the attached log ~/var/log/
@Yang and Peng, while we are working on the fixing patch, if you want, you can take following steps to update k8s secrets for local registry: default-
#1. list out secrets for local registry.
kubectl -n kube-system get secrets | grep registry
#2. Encode your new user name and password with the command below; for example, my new password is !Li69nux*9
echo -n 'admin:!Li69nux*9' | base64
#3. for updating default-
echo -n '{"auths": {"registry.
#4.Use step#3 encoded auth_data to replace value of ".dockerconfigj
kubectl -n kube-system edit secret default-
#5.for updating registry-
echo -n '{"auths"
#6. Use the step #5 encoded auth_data to replace the value of ".dockerconfigj
kubectl -n kube-system edit secret registry-
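The two encoding steps above (base64 of `user:password`, then base64 of the surrounding `auths` JSON document) can be sketched as one small helper. This is an illustration only: the function name `make_dockerconfig_b64` and the registry host `registry.example:9001` used in the usage note are placeholders, not values from this system, and the real registry names in the steps above are truncated in this report.

```shell
# Hypothetical helper: build the base64 value to paste into the secret.
# usage: make_dockerconfig_b64 <registry-host> <user> <password>
make_dockerconfig_b64() {
    local registry=$1 user=$2 password=$3 auth
    # Step #2: base64 of "user:password". printf avoids a trailing newline,
    # and tr strips line wraps that some base64 builds insert.
    auth=$(printf '%s' "$user:$password" | base64 | tr -d '\n')
    # Steps #3 and #5: wrap the auth in the "auths" JSON and encode the whole document.
    printf '{"auths": {"%s": {"auth": "%s"}}}' "$registry" "$auth" | base64 | tr -d '\n'
}
```

For example, `make_dockerconfig_b64 registry.example:9001 admin '!Li69nux*9'` prints a single-line value suitable for pasting during `kubectl edit`. Stripping newlines matters because a wrapped base64 string would not be valid inside the secret.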
Changed in starlingx: | |
status: | Confirmed → In Progress |
yong hu (yhu6) wrote : | #24 |
OpenStack Infra (hudson-openstack) wrote : Change abandoned on config (master) | #25 |
Change abandoned by Lin Shuicheng (<email address hidden>) on branch: master
Review: https:/
Reason: New patch is uploaded: https:/
OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-puppet (master) | #26 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: master
commit a36b4823b7dbacd
Author: Shuicheng Lin <email address hidden>
Date: Fri Dec 27 11:52:05 2019 +0800
Enable keystone to send out event notification
notification driver need be set for keystone, in order to send out
notification. The driver value could be "messaging, messagingv2,
routing, log, test, noop (multi valued)".
This is in order to monitor admin password change in sysinv.
Partial-Bug: 1853017
Change-Id: Ie55a16723e92ea
Signed-off-by: Shuicheng Lin <email address hidden>
OpenStack Infra (hudson-openstack) wrote : Fix merged to upstream (master) | #27 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: master
commit d1294d7e6794606
Author: Shuicheng Lin <email address hidden>
Date: Wed Dec 18 12:47:23 2019 +0800
Update Keyring password info before sending out notification
Need update password before send out notification. Otherwise, any
process which monitors the "updated" notification will still get old
password from Keyring.
Partial-Bug: 1853017
Change-Id: Id1c94fedca41ab
Signed-off-by: Shuicheng Lin <email address hidden>
OpenStack Infra (hudson-openstack) wrote : Fix proposed to upstream (f/centos8) | #28 |
Fix proposed to branch: f/centos8
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master) | #29 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: master
commit 8ab1e2d7c624f83
Author: Shuicheng Lin <email address hidden>
Date: Wed Dec 11 16:37:03 2019 +0800
Audit local registry secret info when there is user update in keystone
local registry uses admin's username&password for authentication.
And admin's password could be changed by openstack client cmd. It will
cause auth info in secrets obsolete, and lead to invalid authentication
in keystone.
To keep secrets info updated, keystone event notification is enabled.
And event notification listener is added in sysinv. So when there is
user password change, a user update event will be sent out by keystone.
And sysinv will call function audit_local_
whether kubernetes secret info need be updated or not.
A periodic task is added also to ensure secrets are always synced, in
case notification is missed or there is failure in handle notification.
oslo_messaging is added to tox's requirements.txt to avoid tox failure.
The version is based on global-
Test:
Pass deployment and secrets could be updated automatically with new auth
info.
Pass host-swact in duplex mode.
Closes-Bug: 1853017
Depends-On: https:/
Depends-On: https:/
Change-Id: I959b65288e0834
Signed-off-by: Shuicheng Lin <email address hidden>
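The comparison at the heart of the audit described in this commit message can be pictured roughly as follows. This is a shell sketch under assumptions, not the sysinv code (which is Python and whose function name is truncated in this report): `secret_auth_is_stale` is a made-up name, and it assumes the secret stores the base64 of `user:password` as in comment #23.

```shell
# Hypothetical check: does the auth stored in a registry secret still match
# the current credentials? Returns 0 if stale (update needed), 1 if in sync.
# usage: secret_auth_is_stale <stored-base64-auth> <user> <current-password>
secret_auth_is_stale() {
    local stored=$1 user=$2 password=$3 expected
    # Recompute what the secret should contain for the current password.
    expected=$(printf '%s' "$user:$password" | base64 | tr -d '\n')
    [ "$stored" != "$expected" ]
}
```

Running the same check both on a user update notification and from a periodic task, as the commit describes, means a missed notification only delays the update rather than leaving the secret stale.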
Changed in starlingx: | |
status: | In Progress → Fix Released |
tags: | added: in-f-centos8 |
OpenStack Infra (hudson-openstack) wrote : Fix merged to upstream (f/centos8) | #30 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: f/centos8
commit 333380daef7623e
Author: Kristal Dale <email address hidden>
Date: Fri Jan 17 13:30:49 2020 -0800
Update landing pages for docs and release notes:
- Use updated project name in titles/text
- Correct text for link to Storyboard (docs)
- Correct capitalization in section headings
- Correct formatting for section headings
- Update project name in link to release notes, api-ref
- Update project name in config for docs/releasenot
Story:2007193
Task:38347
Change-Id: I52a53260042e69
Signed-off-by: Kristal Dale <email address hidden>
commit 8c7def7074be1a5
Author: Don Penney <email address hidden>
Date: Wed Jan 1 18:38:19 2020 -0500
Skip UT in python-
The python-
2020, which causes a failure as of that date. Skip running the tests
as part of the build to avoid this issue.
Change-Id: I85e780c6f40beb
Closes-Bug: 1858049
Signed-off-by: Don Penney <email address hidden>
commit d1294d7e6794606
Author: Shuicheng Lin <email address hidden>
Date: Wed Dec 18 12:47:23 2019 +0800
Update Keyring password info before sending out notification
Need update password before send out notification. Otherwise, any
process which monitors the "updated" notification will still get old
password from Keyring.
Partial-Bug: 1853017
Change-Id: Id1c94fedca41ab
Signed-off-by: Shuicheng Lin <email address hidden>
OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-puppet (r/stx.3.0) | #31 |
Fix proposed to branch: r/stx.3.0
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix proposed to upstream (r/stx.3.0) | #32 |
Fix proposed to branch: r/stx.3.0
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (r/stx.3.0) | #33 |
Fix proposed to branch: r/stx.3.0
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-puppet (r/stx.3.0) | #34 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: r/stx.3.0
commit f26899071befc63
Author: Shuicheng Lin <email address hidden>
Date: Fri Dec 27 11:52:05 2019 +0800
Enable keystone to send out event notification
notification driver need be set for keystone, in order to send out
notification. The driver value could be "messaging, messagingv2,
routing, log, test, noop (multi valued)".
This is in order to monitor admin password change in sysinv.
Partial-Bug: 1853017
Change-Id: Ie55a16723e92ea
Signed-off-by: Shuicheng Lin <email address hidden>
(cherry picked from commit a36b4823b7dbacd
OpenStack Infra (hudson-openstack) wrote : Fix merged to upstream (r/stx.3.0) | #35 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: r/stx.3.0
commit 52d7be2f5947d67
Author: Shuicheng Lin <email address hidden>
Date: Wed Dec 18 12:47:23 2019 +0800
Update Keyring password info before sending out notification
Need update password before send out notification. Otherwise, any
process which monitors the "updated" notification will still get old
password from Keyring.
Partial-Bug: 1853017
Change-Id: Id1c94fedca41ab
Signed-off-by: Shuicheng Lin <email address hidden>
(cherry picked from commit d1294d7e6794606
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (r/stx.3.0) | #36 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: r/stx.3.0
commit 1c3ba7706559eb9
Author: Shuicheng Lin <email address hidden>
Date: Wed Dec 11 16:37:03 2019 +0800
Audit local registry secret info when there is user update in keystone
local registry uses admin's username&password for authentication.
And admin's password could be changed by openstack client cmd. It will
cause auth info in secrets obsolete, and lead to invalid authentication
in keystone.
To keep secrets info updated, keystone event notification is enabled.
And event notification listener is added in sysinv. So when there is
user password change, a user update event will be sent out by keystone.
And sysinv will call function audit_local_
whether kubernetes secret info need be updated or not.
A periodic task is added also to ensure secrets are always synced, in
case notification is missed or there is failure in handle notification.
oslo_messaging is added to tox's requirements.txt to avoid tox failure.
The version is based on global-
Test:
Pass deployment and secrets could be updated automatically with new auth
info.
Pass host-swact in duplex mode.
Closes-Bug: 1853017
Depends-On: https:/
Depends-On: https:/
Change-Id: I959b65288e0834
Signed-off-by: Shuicheng Lin <email address hidden>
(cherry picked from commit 8ab1e2d7c624f83
OpenStack Infra (hudson-openstack) wrote : Fix proposed to upstream (r/stx.2.0) | #37 |
Fix proposed to branch: r/stx.2.0
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (r/stx.2.0) | #38 |
Fix proposed to branch: r/stx.2.0
Review: https:/
OpenStack Infra (hudson-openstack) wrote : | #39 |
Fix proposed to branch: r/stx.2.0
Review: https:/
tags: | added: in-r-stx30 stx.4.0 |
Ghada Khalil (gkhalil) wrote : | #40 |
As recommended by Yong Hu in https:/
tags: | added: stx.2.0 |
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (r/stx.2.0) | #41 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: r/stx.2.0
commit 35d8ccb8a7adc9b
Author: Shuicheng Lin <email address hidden>
Date: Thu Feb 13 10:58:21 2020 +0800
Enable keystone to send out event notification
notification driver need be set for keystone, in order to send out
notification. The driver value could be "messaging, messagingv2,
routing, log, test, noop (multi valued)".
This is in order to monitor admin password change in sysinv.
Partial-Bug: 1853017
Partial-Bug: 1853093
Signed-off-by: Shuicheng Lin <email address hidden>
(cherry picked from commit a36b4823b7dbacd
Change-Id: Ia6661eaf294f97
OpenStack Infra (hudson-openstack) wrote : Fix merged to upstream (r/stx.2.0) | #42 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: r/stx.2.0
commit dfe155136d3337a
Author: Shuicheng Lin <email address hidden>
Date: Wed Dec 18 12:47:23 2019 +0800
Update Keyring password info before sending out notification
Need update password before send out notification. Otherwise, any
process which monitors the "updated" notification will still get old
password from Keyring.
Partial-Bug: 1853017
Partial-Bug: 1853093
Change-Id: Id1c94fedca41ab
Signed-off-by: Shuicheng Lin <email address hidden>
(cherry picked from commit d1294d7e6794606
Peng Peng (ppeng) wrote : | #43 |
Verified on
Lab: WCP_112
Load: 2020-02-20_20-00-00
[sysadmin@
Li69nux*
[sysadmin@
[sysadmin@
[sysadmin@
[sysadmin@
[sysadmin@
[sysadmin@
!Li69nux*9
[sysadmin@
+------
| uuid | service_group_name | hostname | state |
+------
| d14f859a-
| 3cc235b9-
| 2c70019c-
| 3ce317e2-
| 76b5f3c0-
| 3d10d9ef-
| 67f02d0c-
| ca8beaad-
| 1b893b0b-
+------
[sysadmin@
tags: | removed: stx.retestneeded |
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (r/stx.2.0) | #44 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: r/stx.2.0
commit 7e5e887eb38042a
Author: Shuicheng Lin <email address hidden>
Date: Wed Dec 11 16:37:03 2019 +0800
Audit local registry secret info when there is user update in keystone
local registry uses admin's username&password for authentication.
And admin's password could be changed by openstack client cmd. It will
cause auth info in secrets obsolete, and lead to invalid authentication
in keystone.
To keep secrets info updated, keystone event notification is enabled.
And event notification listener is added in sysinv. So when there is
user password change, a user update event will be sent out by keystone.
And sysinv will call function audit_local_
whether kubernetes secret info need be updated or not.
A periodic task is added also to ensure secrets are always synced, in
case notification is missed or there is failure in handle notification.
oslo_messaging is added to tox's requirements.txt to avoid tox failure.
The version is based on global-
Test:
Pass deployment and secrets could be updated automatically with new auth
info.
Pass host-swact in duplex mode.
We lack of info how LP1853093 was triggered by the user, but this patch
can address the issue that local registry secrets are not updated
accordingly after the password of "admin" is changed.
And this fix will help technically.
Closes-Bug: 1853017
Closes-Bug: 1853093
Depends-On: https:/
Depends-On: https:/
Change-Id: I959b65288e0834
Signed-off-by: Shuicheng Lin <email address hidden>
(cherry picked from commit 8ab1e2d7c624f83
Peng Peng (ppeng) wrote : | #45 |
- ALL_NODES_20200223.025201.tar (20.5 MiB, application/x-tar)
Issue reproduced on
Lab: WCP_3_6
Load: 2020-02-22_04-10-00
Log: attached
[2020-02-23 02:36:27,094] 314 DEBUG MainThread ssh.send :: Send 'openstack --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2020-02-23 02:39:29,512] 314 DEBUG MainThread ssh.send :: Send 'keyring get CGCS admin'
[2020-02-23 02:39:30,118] 436 DEBUG MainThread ssh.expect :: Output:
!Li69nux*9
[2020-02-23 02:39:31,833] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password '!Li69nux*9' --os-project-name admin --os-auth-url http://
[2020-02-23 02:40:58,798] 314 DEBUG MainThread ssh.send :: Send 'openstack --os-username 'admin' --os-password '!Li69nux*9' --os-project-name admin --os-auth-url http://
[2020-02-23 02:44:01,811] 314 DEBUG MainThread ssh.send :: Send 'keyring get CGCS admin'
[2020-02-23 02:44:02,399] 436 DEBUG MainThread ssh.expect :: Output:
Li69nux*
fm --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2020-02-23 02:44:03,214] 436 DEBUG MainThread ssh.expect :: Output:
Must provide Keystone credentials or user-defined endpoint and token, error was: The account is locked for user: c52f573e07d24a3
Changed in starlingx: | |
status: | Fix Released → Confirmed |
tags: | added: stx.retestneeded |
Changed in starlingx: | |
assignee: | yong hu (yhu6) → Lin Shuicheng (shuicheng) |
Lin Shuicheng (shuicheng) wrote : | #46 |
Hi Peng,
The log tarball contains only controller-1; controller-0 is missing.
Could you share the detailed steps to reproduce the issue?
When did you change the password, and what operations ran before and after the password change?
From the log, the failure is still due to authentication failure with registry-
I can see that when the password is changed, the secrets are updated as well, and no application is in the applying stage.
I need to reproduce the issue to check where does the registry-
Here is some log from controller-1:
Password change cmd at 2:40:58:
2020-02-
Secrets update at 2:41:01:
sysinv 2020-02-23 02:41:01.613 238507 INFO sysinv.
sysinv 2020-02-23 02:41:01.645 238507 INFO sysinv.
Authentication failure at 2:41:04:
./var/log/
Then the account lock happens at 2:41:06 after 5 invalid authentication attempts:
2020-02-23 02:41:06.091 239370 WARNING keystone.
Hrishit Mazumder (hmazumde) wrote : | #47 |
- ALL_NODES_20200310.185444.tar (34.9 MiB, application/x-tar)
Issue reproduced on lab wcp_76_77 at load: StarlingX_
After admin pw changed,
Timestamp of password change:
2020-03-
Details: CLI 'fm --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
I have attached logs for your perusal.
Best regards,
Hrishit Mazumder
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master) | #48 |
Fix proposed to branch: master
Review: https:/
Changed in starlingx: | |
status: | Confirmed → In Progress |
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ansible-playbooks (master) | #49 |
Fix proposed to branch: master
Review: https:/
Lin Shuicheng (shuicheng) wrote : | #50 |
Hi Peng,
Could you help update the test case to avoid changing the password immediately after host-swact? Please try waiting 3 minutes after host-swact before changing the password.
The issue is that, after swact, sysinv on the active controller checks for a k8s network upgrade and needs to pull images from the registry.
I have submitted a patch to avoid the authentication failure caused by the cached password, but it cannot fix the issue completely. The issue will still occur if the password change happens just after sysinv gets the password, but before keystone authentication.
Here is the sysinv/sm log from ALL_NODES_
host-swact starts at 17:09:53 and finishes at 17:10:23
2020-03-
2020-03-
sysinv tries to do the k8s network upgrade at 17:10:25
2020-03-10 17:10:25.854 681994 INFO sysinv.
k8s secret is already updated with new password at 17:11:15
2020-03-10 17:11:15.367 681994 INFO sysinv.
Keystone reports an authentication failure due to receiving the old password at 17:11:19:
2020-03-10 17:11:19.438 682342 WARNING keystone.
sysinv reports an ansible failure due to failing to download an image at 17:11:19:
"stderr": "time=\
sysinv 2020-03-10 17:17:07.929 681994 ERROR sysinv.
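The difference between a cached password and one that is re-read on every access, as discussed in this comment, can be shown with a toy model. Everything here is an assumption for illustration: a temp file stands in for keyring, and `get_password`, `cred_cached` and `cred_fresh` are made-up names, not sysinv functions.

```shell
# Toy model: a temp file stands in for the password store (keyring in this thread).
store=$(mktemp)
get_password() { cat "$store"; }

cached_password=""

# Cached lookup: reads the store once and reuses the value afterwards,
# so a later password change is never seen.
cred_cached() {
    if [ -z "$cached_password" ]; then
        cached_password=$(get_password)
    fi
    CRED=$cached_password
}

# Fresh lookup: reads the store on every call. A password change between
# this read and the actual authentication can still slip through.
cred_fresh() {
    CRED=$(get_password)
}
```

The fresh lookup narrows the window in which an old password can be sent, but it does not close it, which matches the caveat above about a change landing between the read and the keystone authentication.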
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (f/centos8) | #51 |
Fix proposed to branch: f/centos8
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (f/centos8) | #52 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: f/centos8
commit 16477935845e1c2
Author: Steven Webster <email address hidden>
Date: Sat Mar 28 17:19:30 2020 -0400
Fix SR-IOV runtime manifest apply
When an SR-IOV interface is configured, the platform's
network runtime manifest is applied in order to apply the virtual
function (VF) config and restart the interface. This results in
sysinv being able to determine and populate the puppet hieradata
with the virtual function PCI addresses.
A side effect of the network manifest apply is that potentially
all platform interfaces may be brought down/up if it is determined
that their configuration has changed. This will likely be the case
for a system which configures SR-IOV interfaces before initial
unlock.
A few issues have been encountered because of this, with some
services not behaving well when the interface they are communicating
over suddenly goes down.
This commit makes the SR-IOV VF configuration much more targeted
so that only the operation of setting the desired number of VFs
is performed.
Closes-Bug: #1868584
Depends-On: https:/
Change-Id: Ie162380d3732eb
Signed-off-by: Steven Webster <email address hidden>
commit 45c9fe2d3571574
Author: Zhipeng Liu <email address hidden>
Date: Thu Mar 26 01:58:34 2020 +0800
Add ipv6 support for novncproxy_
For ipv6 address, we need url with below format
[ip]:port
Partial-Bug: 1859641
Change-Id: I01a5cd92deb9e8
Signed-off-by: Zhipeng Liu <email address hidden>
commit d119336b3a3b24d
Author: Andy Ning <email address hidden>
Date: Mon Mar 23 16:26:21 2020 -0400
Fix timeout waiting for CA cert install during ansible replay
During ansible bootstrap replay, the ssl_ca_
removed. It expects puppet platform:
during system CA certificate install to re-generate it. So this commit
updated conductor manager to run that puppet manifest even if the CA cert
has already installed so that the ssl_ca_
and makes ansible replay to continue.
Change-Id: Ic9051fba9afe5d
Closes-Bug: 1868585
Signed-off-by: Andy Ning <email address hidden>
commit 24a533d800b2c57
Author: Zhipeng Liu <email address hidden>
Date: Fri Mar 20 23:10:31 2020 +0800
Fix rabbitmq could not bind port to ipv6 address issue
When we use Armada to deploy openstack service for ipv6, rabbitmq
pod could not start listen on [::]:5672 and [::]:15672.
For ipv6, we need an override for configuration file.
Upstream patch link is:
https:/
Test pass for deploying rabbitmq service on both ipv...
OpenStack Infra (hudson-openstack) wrote : Fix merged to ansible-playbooks (master) | #53 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: master
commit d6cff0496dcf526
Author: Shuicheng Lin <email address hidden>
Date: Thu Mar 12 14:34:09 2020 +0800
Refresh local registry auth info each time when access local registry
The local registry uses the admin account password as authentication info,
and this password may be changed by the openstack client at any time.
When downloading images from the local registry, auth info must not be
cached; otherwise it may lead to authentication failures in keystone
and, eventually, to the account being locked.
For this specific case, there is host-swact first, then function
"_upgrade_
And upgrade-
kube network images from local registry. During this period, the admin
account password is changed, which leads to the account being locked due
to authentication failures in keystone.
With this update, there is still a possibility that the password is changed
just after the get operation. Because the image downloads run in parallel
across multiple threads, the account lock may still occur. This change
reduces the occurrence rate but cannot eliminate the issue.
Closes-Bug: 1853017
Change-Id: I686616937031a3
Signed-off-by: Shuicheng Lin <email address hidden>
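The failure mode described in this commit message can be sketched with a toy model. Everything below is an assumption for illustration: `FakeKeystone`, `CredentialStore`, the two download functions, and the three-failure threshold are hypothetical stand-ins, not the sysinv, ansible-playbooks, or keystone code.

```python
class FakeKeystone:
    """Toy identity service that locks the account after repeated failures."""

    def __init__(self, password: str, max_failures: int = 3):
        self.password = password
        self.max_failures = max_failures  # assumed threshold, illustrative
        self.failures = 0

    @property
    def locked(self) -> bool:
        return self.failures >= self.max_failures

    def authenticate(self, password: str) -> bool:
        if self.locked:
            return False  # a locked account rejects even the right password
        if password == self.password:
            self.failures = 0
            return True
        self.failures += 1
        return False


class CredentialStore:
    """Toy stand-in for wherever the current admin password is stored."""

    def __init__(self, password: str):
        self.password = password

    def get(self) -> str:
        return self.password


def download_with_cached_auth(keystone, cached_password, images):
    # The password was read once, earlier, and is reused for every image.
    return [keystone.authenticate(cached_password) for _ in images]


def download_with_fresh_auth(keystone, store, images):
    # The password is read from the store immediately before each access.
    return [keystone.authenticate(store.get()) for _ in images]
```

If the password changes after the cached copy was taken, every request in the cached variant fails and the toy account ends up locked, while the fresh variant keeps working. The remaining window between `store.get()` and `authenticate()` corresponds to the residual race the commit message acknowledges.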
Changed in starlingx: | |
status: | In Progress → Fix Released |
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master) | #54 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: master
commit 423a475aff4f9ea
Author: Shuicheng Lin <email address hidden>
Date: Thu Mar 12 14:06:08 2020 +0800
Refresh local registry auth info each time when access local registry
The local registry uses the admin account password as authentication info,
and this password may be changed by the openstack client at any time.
When sysinv downloads images from the local registry, it must not cache
the auth info; otherwise it may lead to authentication failures in
keystone and, eventually, to the account being locked.
Partial-Bug: 1853017
Change-Id: I07f273a05a1bc3
Signed-off-by: Shuicheng Lin <email address hidden>
Ghada Khalil (gkhalil) wrote : | #55 |
Shuicheng, there are recent commits in master related to this fix that haven't been cherry-picked to the stx.2.0 & stx.3.0 branches. Are these commits applicable to those releases?
tags: | removed: in-r-stx30 |
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (r/stx.3.0) | #56 |
Fix proposed to branch: r/stx.3.0
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ansible-playbooks (r/stx.3.0) | #57 |
Fix proposed to branch: r/stx.3.0
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (r/stx.2.0) | #58 |
Fix proposed to branch: r/stx.2.0
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (r/stx.3.0) | #59 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: r/stx.3.0
commit 9bcd1b066bff4b5
Author: Shuicheng Lin <email address hidden>
Date: Thu Mar 12 14:06:08 2020 +0800
Refresh local registry auth info each time when access local registry
(cherry picked from commit 423a475aff4f9ea
The local registry uses the admin account password as authentication info,
and this password may be changed by the openstack client at any time.
When sysinv downloads images from the local registry, it must not cache
the auth info; otherwise it may lead to authentication failures in
keystone and, eventually, to the account being locked.
Partial-Bug: 1853017
Change-Id: I07f273a05a1bc3
Signed-off-by: Shuicheng Lin <email address hidden>
OpenStack Infra (hudson-openstack) wrote : Fix merged to ansible-playbooks (r/stx.3.0) | #60 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: r/stx.3.0
commit 75b5edfa6ce1ea3
Author: Shuicheng Lin <email address hidden>
Date: Thu Mar 12 14:34:09 2020 +0800
Refresh local registry auth info each time when access local registry
(cherry picked from commit d6cff0496dcf526
(cherry picked from commit 1b50022d55a9da2
The local registry uses the admin account password as authentication info,
and this password may be changed by the openstack client at any time.
When downloading images from the local registry, auth info must not be
cached; otherwise it may lead to authentication failures in keystone
and, eventually, to the account being locked.
For this specific case, there is host-swact first, then function
"_upgrade_
And upgrade-
kube network images from local registry. During this period, the admin
account password is changed, which leads to the account being locked due
to authentication failures in keystone.
With this update, there is still a possibility that the password is changed
just after the get operation. Because the image downloads run in parallel
across multiple threads, the account lock may still occur. This change
reduces the occurrence rate but cannot eliminate the issue.
Closes-Bug: 1853017
Change-Id: I686616937031a3
Signed-off-by: Shuicheng Lin <email address hidden>
Peng Peng (ppeng) wrote : | #61 |
Issue was reproduced on
Lab: WCP_71_75
Load: 2020-04-28_20-00-00
Collect logs from all nodes have been added.
test log:
=======
[2020-04-29 18:07:44,424] 314 DEBUG MainThread ssh.send :: Send 'openstack --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[face::1]:5000/v3 --os-user-
=======
=======
[2020-04-29 18:10:46,870] 314 DEBUG MainThread ssh.send :: Send 'keyring get CGCS admin'
[2020-04-29 18:10:47,477] 436 DEBUG MainThread ssh.expect :: Output:
!Li69nux*9
=======
[2020-04-29 18:10:47,583] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password '!Li69nux*9' --os-project-name admin --os-auth-url http://[face::1]:5000/v3 --os-user-
[2020-04-29 18:10:48,435] 436 DEBUG MainThread ssh.expect :: Output:
The account is locked for user: 7fb2fa710fca4ff
[sysadmin@
Changed in starlingx: | |
status: | Fix Released → Confirmed |
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (r/stx.2.0) | #62 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: r/stx.2.0
commit a70ecf4baa3809c
Author: Shuicheng Lin <email address hidden>
Date: Thu Mar 12 14:06:08 2020 +0800
Refresh local registry auth info each time when access local registry
(cherry picked from commit 423a475aff4f9ea
The local registry uses the admin account password as authentication info,
and this password may be changed by the openstack client at any time.
When sysinv downloads images from the local registry, it must not cache
the auth info; otherwise it may lead to authentication failures in
keystone and, eventually, to the account being locked.
Partial-Bug: 1853017
Change-Id: I07f273a05a1bc3
Signed-off-by: Shuicheng Lin <email address hidden>
Lin Shuicheng (shuicheng) wrote : | #63 |
Hi Peng,
Could you share the collected logs with me?
Thanks.
Peng Peng (ppeng) wrote : | #64 |
Lin Shuicheng (shuicheng) wrote : | #65 |
Hi Peng,
The cause is different from the previous one; it is not caused by registry-
In the logs, I found errors in pod platform-
"tis-lab-
controller-
"
2020-04-
2020-04-
...
"
Please ask the WR team to confirm whether the admin password is used in "cloud-
Thanks.
Lin Shuicheng (shuicheng) wrote : | #66 |
@yong, please assign the issue to WR, since it is caused by a WR-specific image.
Ghada Khalil (gkhalil) wrote : | #67 |
@Lin Shuicheng, I followed up on this and you are correct. The most recent issue reported by Peng Peng is tied to a WR-lab-specific pod that continues to use the old password to access the config REST API, resulting in the admin account getting locked after a password change. Therefore, we should consider this Launchpad as fixed. I'm putting it back to "Fix Released".
@Peng Peng, Please do not re-open this Launchpad again. Please also note that there are issues with admin password changes for Distributed Cloud. These are unrelated to this original issue and will be tracked separately. Please do not test admin password changes on Distributed Cloud.
Changed in starlingx: | |
status: | Confirmed → Fix Released |
tags: | removed: stx.retestneeded |
Peng Peng (ppeng) wrote : | #68 |
Verified on
Lab: WP_8_12
Load: 2020-05-19_20-00-00
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ansible-playbooks (f/centos8) | #69 |
Fix proposed to branch: f/centos8
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (f/centos8) | #70 |
Fix proposed to branch: f/centos8
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (f/centos8) | #71 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: f/centos8
commit 320cc40de851878
Author: Don Penney <email address hidden>
Date: Wed May 13 13:06:11 2020 -0400
Add auto-versioning to starlingx/config packages
This update makes use of the PKG_GITREVCOUNT variable to auto-version
the packages in this repo.
Change-Id: I3a2c8caeb4b464
Depends-On: https:/
Story: 2006166
Task: 39766
Signed-off-by: Don Penney <email address hidden>
commit d9f2aea0fb228ed
Author: Sharath Kumar K <email address hidden>
Date: Wed Apr 22 16:22:22 2020 +0200
De-branding in starlingx/config: CGCS -> StarlingX
1. Rename CGCS to StarlingX for .spec files
Test:
After the de-brand change, bootimage.iso has been built in the flock
Layer and installed on the dev machine to validate the changes.
Please note that the de-brand changes are being made in batches; this is batch 9.
Story: 2006387
Task: 39524
Change-Id: Ia1fe0f2baafb78
Signed-off-by: Sharath Kumar K <email address hidden>
De-branding in starlingx/config: CGCS -> StarlingX
1. Rename CGCS to StarlingX for .spec file
2. Rename TIS to StarlingX for .service files
Test:
After the de-brand change, bootimage.iso has been built in the flock
Layer and installed on the dev machine to validate the changes.
Please note that the de-brand changes are being made in batches; this is batch 10.
Story: 2006387
Task: 36202
Change-Id: I404ce0da262149
Signed-off-by: Sharath Kumar K <email address hidden>
commit d141e954fa6bbf6
Author: Teresa Ho <email address hidden>
Date: Tue Mar 31 10:08:57 2020 -0400
Sysinv extensions for FPGA support
This update adds cli and restapi to support FPGA device
programming.
CLI commands:
system device-image-apply
system device-image-create
system device-image-delete
system device-image-list
system device-image-remove
system device-image-show
system device-
system device-label-list
system host-device-
system host-device-
system host-device-
system host-device-
system host-device-
Story: 2006740
Task: 39498
Change-Id: I556c2e7a51b393
Signed-off-by: Teresa Ho <email address hidden>
commit 491cca42ed854d2
Author: Elena Taivan <email address hidden>
Date: Wed Apr 29 11:25:26 2020 +0000
Qcow2 conversion to raw can be done using 'image-conversion' filesystem
1. Conversion filesystem can be added before/after
2. If conversion filesystem is added after stx-openstack
is applied, changes to stx-openstack will only take effec...
OpenStack Infra (hudson-openstack) wrote : Fix merged to ansible-playbooks (f/centos8) | #72 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: f/centos8
commit 55c9afd075194f7
Author: Dan Voiculeasa <email address hidden>
Date: Wed May 13 14:19:52 2020 +0300
Restore: disconnect etcd from ceph
At the moment etcd is restored only if ceph data is kept.
Etcd should be restored regardless of whether ceph data is kept or wiped.
Story: 2006770
Task: 39751
Change-Id: I9dfb1be0a83c3f
Signed-off-by: Dan Voiculeasa <email address hidden>
commit 003ddff574c74ad
Author: Don Penney <email address hidden>
Date: Fri May 8 11:35:58 2020 -0400
Add playbook for updating static images
This commit introduces a new playbook, upgrade-
for downloading updated images and pushing them to the local registry.
Change-Id: I8884440261a5a4
Story: 2006781
Task: 39706
Signed-off-by: Don Penney <email address hidden>
commit 26fd273cf5175ba
Author: Matt Peters <email address hidden>
Date: Thu May 7 14:29:02 2020 -0500
Add kube-apiserver port to calico failsafe rules
An invalid GlobalNetworkPolicy or NetworkPolicy may prevent
calico-node from communicating with the kube-apiserver.
Once the communication is broken, calico-node is no longer
able to update the policies since it cannot communicate to
read the updated policies. It can also prevent the pod
from starting since the policies will prevent it from
reading the configuration.
To ensure that this scenario does not happen, the kube-apiserver
port is being added to the failsafe rules to ensure communication
is always possible, regardless of the network policy configuration.
Change-Id: I1b065a74e7ad0b
Closes-Bug: 1877166
Related-Bug: 1877383
Signed-off-by: Matt Peters <email address hidden>
commit bd0f14a7dfb206c
Author: Robert Church <email address hidden>
Date: Tue May 5 15:11:15 2020 -0400
Provide an update strategy for Tiller deployment
In the case of a simplex controller configuration the current patching
strategy for the Tiller environment will fail as the tiller ports will
be in use when the new deployment is attempted to be applied. The
resulting tiller pod will be stuck in a Pending state.
This will be observed if the node becomes ready after 'helm init'
installs the initial deployment and before the deployment is patched for
environment checks.
The deployment strategy provided by 'helm init' is unspecified. This
change will allow one additional pod (current + new) and one unavailable
pod (current) during an update. The maxUnavailable setting allows the
tiller pod to be deleted which will release its ports, thus allowing the
patch deployment to spin up a new pod to a Running state.
Change-Id: I83c43c52a77...
tags: | added: in-r-stx30 |
Marking as stx.3.0 / high priority - appears to have been broken in the last week.