[master] devstack setup fails randomly running osc commands

Bug #2042943 reported by yatin
This bug affects 1 person
Affects        Status        Importance  Assigned to  Milestone
devstack       Fix Released  Undecided   Unassigned
openstacksdk   New           Undecided   Unassigned

Bug Description

Failing since https://review.opendev.org/c/openstack/devstack/+/886339

Fails like:
From: https://6c40da3fb44111d52eec-5a0483cb68bfeeccc28b5042e69d7a01.ssl.cf1.rackcdn.com/periodic/opendev.org/openstack/neutron/master/neutron-ovs-tempest-with-os-ken-master/c3aaeb6/job-output.txt
2023-11-07 02:36:54.532637 | controller | ++ functions-common:get_or_add_group_project_role:1030 : oscwrap --os-cloud devstack-system-admin role assignment list --role anotherrole --group c3ffedfd42df4012a40f7d571754799a --project 6d75ee64565e452aaf96c29e048cefdb -f value -c Role
2023-11-07 02:36:54.532697 | controller | Could not clean up: While reading from '/opt/stack/.local/share/python_keyring/keyring_pass.cfg' [line 27]: option 'kiqsaq5ha0h8n8eemdb36soduzlgd51hfbl_2fxpk_2btre_3d' in section 'openstacksdk' already exists
2023-11-07 02:36:54.532747 | controller | Exception ignored in atexit callback: <bound method Connection.close of <openstack.connection.Connection object at 0x7fdd6d5b4820>>
2023-11-07 02:36:54.532797 | controller | Traceback (most recent call last):
2023-11-07 02:36:54.532828 | controller | File "/opt/stack/data/venv/lib/python3.10/site-packages/openstack/connection.py", line 577, in close
2023-11-07 02:36:54.532855 | controller | self.config.set_auth_cache()
2023-11-07 02:36:54.532881 | controller | File "/opt/stack/data/venv/lib/python3.10/site-packages/openstack/config/cloud_region.py", line 666, in set_auth_cache
2023-11-07 02:36:54.532937 | controller | keyring.set_password('openstacksdk', cache_id, state)
2023-11-07 02:36:54.532971 | controller | File "/opt/stack/data/venv/lib/python3.10/site-packages/keyring/core.py", line 61, in set_password
2023-11-07 02:36:54.532998 | controller | get_keyring().set_password(service_name, username, password)
2023-11-07 02:36:54.533023 | controller | File "/opt/stack/data/venv/lib/python3.10/site-packages/keyrings/alt/file_base.py", line 129, in set_password
2023-11-07 02:36:54.533051 | controller | self._write_config_value(service, username, password_base64)
2023-11-07 02:36:54.533073 | controller | File "/opt/stack/data/venv/lib/python3.10/site-packages/keyrings/alt/file_base.py", line 141, in _write_config_value
2023-11-07 02:36:54.533097 | controller | config.read(self.file_path)
2023-11-07 02:36:54.533290 | controller | File "/usr/lib/python3.10/configparser.py", line 699, in read
2023-11-07 02:36:54.533318 | controller | self._read(fp, filename)
2023-11-07 02:36:54.580061 | controller | File "/usr/lib/python3.10/configparser.py", line 1098, in _read
2023-11-07 02:36:54.580104 | controller | raise DuplicateOptionError(sectname, optname,
2023-11-07 02:36:54.580116 | controller | configparser.DuplicateOptionError: While reading from '/opt/stack/.local/share/python_keyring/keyring_pass.cfg' [line 27]: option 'kiqsaq5ha0h8n8eemdb36soduzlgd51hfbl_2fxpk_2btre_3d' in section 'openstacksdk' already exists
2023-11-07 02:36:54.580126 | controller | ++ functions-common:oscwrap:2443 : return 0
....
2023-11-07 02:37:22.064737 | controller | ++ functions-common:get_or_create_service:1053 : oscwrap --os-cloud devstack-system-admin service create image --name glance '--description=Glance Image Service' -f value -c id
2023-11-07 02:37:22.064770 | controller | While reading from '/opt/stack/.local/share/python_keyring/keyring_pass.cfg' [line 27]: option 'kiqsaq5ha0h8n8eemdb36soduzlgd51hfbl_2fxpk_2btre_3d' in section 'openstacksdk' already exists
2023-11-07 02:37:22.064798 | controller | ++ functions-common:oscwrap:2443 : return 1
2023-11-07 02:37:22.064823 | controller | + functions-common:get_or_create_service:1049 : service_id=
2023-11-07 02:37:22.064849 | controller | + functions-common:get_or_create_service:1051 : remove_token_cache
2023-11-07 02:37:22.064874 | controller | + functions-common:remove_token_cache:2525 : rm /opt/stack/.local/share/python_keyring/keyring_pass.cfg
.....
2023-11-07 02:37:22.120403 | controller | ++ lib/glance:configure_glance_quotas:312 : oscwrap --os-cloud devstack-system-admin endpoint list --service glance -f value -c ID
2023-11-07 02:37:22.120425 | controller | No service with a type, name or ID of 'glance' exists.

From:
https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_d2a/periodic/opendev.org/openstack/neutron/master/devstack-enforce-scope/d2a5e4a/job-output.txt
2023-11-05 03:04:26.307892 | controller | +++ functions-common:_get_or_create_endpoint_with_interface:1067 : oscwrap --os-cloud devstack-system-admin endpoint create network public https://217.182.142.230:9696/networking --region RegionOne -f value -c id
2023-11-05 03:04:26.307944 | controller | Source contains parsing errors: '/opt/stack/.local/share/python_keyring/keyring_pass.cfg'
2023-11-05 03:04:26.307982 | controller | [line 24]: 'ZXJmYWNlIjogInB1YmxpYyIsICJy\n'
2023-11-05 03:04:26.308016 | controller | +++ functions-common:oscwrap:2443 : return 1
2023-11-05 03:04:26.308050 | controller | ++ functions-common:_get_or_create_endpoint_with_interface:1067 : endpoint_id=
2023-11-05 03:04:26.308084 | controller | ++ functions-common:_get_or_create_endpoint_with_interface:1069 : echo
...
2023-11-05 03:06:56.149993 | controller | ++ lib/neutron_plugins/services/l3:create_neutron_initial_network:202 : oscwrap --os-cloud devstack --os-region RegionOne network create private -f value -c id
2023-11-05 03:06:57.714129 | controller | public endpoint for network service in RegionOne region not found
2023-11-05 03:06:57.903165 | controller | ++ functions-common:oscwrap:2443 : return 1
2023-11-05 03:06:57.906626 | controller | + lib/neutron_plugins/services/l3:create_neutron_initial_network:202 : NET_ID=
2023-11-05 03:06:57.910304 | controller | + lib/neutron_plugins/services/l3:create_neutron_initial_network:1 : exit_trap

As per opensearch there are multiple hits across jobs since that patch merged:-
https://opensearch.logs.openstack.org/_dashboards/app/discover/?security_tenant=global#/?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-30d,to:now))&_a=(columns:!(_source),filters:!(('$state':(store:appState),meta:(alias:!n,disabled:!f,index:'94869730-aea8-11ec-9e6a-83741af3fdcd',key:build_status,negate:!f,params:(query:FAILURE),type:phrase),query:(match_phrase:(build_status:FAILURE)))),index:'94869730-aea8-11ec-9e6a-83741af3fdcd',interval:auto,query:(language:kuery,query:'message:%22Source%20contains%20parsing%20errors:%22%20OR%20message:%22While%20reading%20from%20!'%2Fopt%2Fstack%2F.local%2Fshare%2Fpython_keyring%2Fkeyring_pass.cfg!'%22'),sort:!())

Revision history for this message
Artem Goncharov (gtema) wrote :

- I looked into the logs of those failures: they are not related to that change
- there are indeed bad messages written by the keyring lib from time to time, but they are written to stderr, which should not influence output parsing
- in every failure I found there were other issues (e.g. from keystone)
   - https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_23d/899157/1/check/cinder-tempest-lvm-multibackend/23d8799/controller/logs/screen-keystone.txt
   - https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_d2a/periodic/opendev.org/openstack/neutron/master/devstack-enforce-scope/d2a5e4a/controller/logs/screen-keystone.txt
   - https://6c40da3fb44111d52eec-5a0483cb68bfeeccc28b5042e69d7a01.ssl.cf1.rackcdn.com/periodic/opendev.org/openstack/neutron/master/neutron-ovs-tempest-with-os-ken-master/c3aaeb6/controller/logs/screen-keystone.txt
- the opensearch query is a bit misleading: it selects occurrences where a certain string is present in the log and the build status is failure (independent of whether that string caused the failure in this job or not)

So far I have not seen definitive proof that the token caching change caused those failures.

Revision history for this message
yatin (yatinkarel) wrote :

@Artem hi, thanks for checking.

Yes, there are keystone errors, but those look unrelated to the issue we are seeing in those jobs. For example, in keystone we see that a glance/nova user (or endpoint, service, etc.) was not found, but that is normal: before creating a user/service it checks whether it exists, and keystone returns a 404 plus a trace.
What I see in those failures is that whenever those keyring warnings/errors appear, there are no requests (PUT/POST) on the keystone side. And all of these have only been seen since the patch merged, which made me feel it is related. I am not aware of how all this works with the keyring/cache bits, so I am not sure if anything other than a revert and retest could confirm it.

Also, in successful runs those keyring warnings/errors are not seen, at least in the bits which were checked.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to devstack (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/devstack/+/900395

yatin (yatinkarel)
description: updated
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on devstack (master)

Change abandoned by "yatin <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/devstack/+/900395
Reason: restore

Revision history for this message
sean mooney (sean-k-mooney) wrote :

this sounds like an OSC, SDK or python-keystoneclient bug.

it sounds like one of the layers there is not correctly handling parallel invocation, which is entirely valid usage. this should be addressed not in devstack but in that chain, as it is a real bug.
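To illustrate the kind of fix the comment above is asking for, here is a minimal sketch (an assumption for illustration, not the actual SDK/keyring code) of serializing the read-modify-write of a shared token-cache file with an advisory lock, so that parallel CLI invocations cannot interleave their writes. The function name `set_cached_token` is hypothetical.

```python
import configparser
import fcntl
import os


def set_cached_token(path, section, key, value):
    """Write one cached entry under an exclusive lock (POSIX only).

    Hypothetical sketch: opens the cache file read/write (creating it if
    needed), holds fcntl.flock for the whole read-modify-write cycle, and
    rewrites the file in place. Without such locking, two concurrent
    invocations can interleave and corrupt the file.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    with os.fdopen(fd, "r+") as fh:
        fcntl.flock(fh, fcntl.LOCK_EX)  # blocks other lock holders
        cp = configparser.ConfigParser()
        cp.read_file(fh)                # read current state under the lock
        if not cp.has_section(section):
            cp.add_section(section)
        cp.set(section, key, value)     # duplicate keys are impossible here
        fh.seek(0)
        fh.truncate()
        cp.write(fh)
        # lock is released when fh is closed
```

Since `ConfigParser` stores options in a dict, rewriting through it can never emit the same option twice; the lock only has to prevent torn or interleaved writes between processes.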

Revision history for this message
sean mooney (sean-k-mooney) wrote :

speaking to artom downstream, they noticed
"2023-11-07 02:36:54.532747 | controller | Exception ignored in atexit callback: <bound method Connection.close of <openstack.connection.Connection object at 0x7fdd6d5b4820>>"
so that points to the sdk or something it is calling

no longer affects: python-openstackclient
Revision history for this message
yatin (yatinkarel) wrote :

Ok, finally reproduced[1] it, with the contents of keyring_pass.cfg captured[2]. This confirms that the issue is triggered by https://review.opendev.org/c/openstack/devstack/+/886339. Something is causing such a state in keyring_pass.cfg, and with the file in that state all openstack commands fail with the above-mentioned errors until the file/content is removed/fixed.

[1] https://zuul.opendev.org/t/openstack/build/c1cf568a9fae4759889361b9a56f4146
[2] https://d2c96a7362be34407a1e-6910015cdc6b96c34eca0ab65a68e7f2.ssl.cf1.rackcdn.com/900574/2/check/cinder-tempest-plugin-lvm-lio-barbican-10/c1cf568/controller/logs/lvm/debug.txt
Issue time content of /opt/stack/.local/share/python_keyring/keyring_pass.cfg
[openstacksdk]
5yvvlz7c_2fla9ppy35m7vgfd47zt92nnolvq2_2f_2f4muhc_3d =
 eyJhdXRoX3Rva2VuIjogImdBQUFBQUJsVklRRG5DN0JONlBtUnN5dGF4aDlRQzVhYXZIVmc3NVFC
 VFZ5SmZJSkRuclBZamlZYmhPSWhFdXNCYWE3ME1DeGtUQkZyWHZmeE9VZkxodFBDZU1ROFh0ejc2
 THBlTlFUTF9mUmVpVUR1dmxWUHJTcnk0eXR4bUtoUEoyOXdDbXliZUhib3N1MUR5anlURUx2M3Nx
 S3pQY1A0USIsICJib2R5IjogeyJ0b2tlbiI6IHsibWV0aG9kcyI6IFsicGFzc3dvcmQiXSwgInVz
 ZXIiOiB7ImRvbWFpbiI6IHsiaWQiOiAiZGVmYXVsdCIsICJuYW1lIjogIkRlZmF1bHQifSwgImlk
 IjogIjYzYWE4NGM2ZGMxNTQzOWZiMzdmNTk0YTM0MjhmNjA5IiwgIm5hbWUiOiAiYWRtaW4iLCAi
 cGFzc3dvcmRfZXhwaXJlc19hdCI6IG51bGx9LCAiYXVkaXRfaWRzIjogWyJ2OWdyWEY4dVJTZW8w
 eW1oNk52OG93Il0sICJleHBpcmVzX2F0IjogIjIwMjMtMTEtMTVUMDk6NDA6MzUuMDAwMDAwWiIs
 ICJpc3N1ZWRfYXQiOiAiMjAyMy0xMS0xNVQwODo0MDozNS4wMDAwMDBaIiwgInJvbGVzIjogW3si
 aWQiOiAiNGUxNjkxY2I0YTU4NDZhMzhlNWQ4ZTU5YmE2YjAzZWEiLCAibmFtZSI6ICJhZG1pbiJ9
 LCB7ImlkIjogIjk0NDhjOGQxZmVlMjQwNGFiMWNiMWI0MWZjZjU4MzdkIiwgIm5hbWUiOiAibWFu
 YWdlciJ9LCB7ImlkIjogImEyOWIxNzUyZWZkNzRlYzc4OWRlNDA3Y2NhYTk3ZDNiIiwgIm5hbWUi
 OiAibWVtYmVyIn0sIHsiaWQiOiAiZGNjMTE2MWYzNTcwNGM4N2JlNTViZjUyNTIxN2UyNTAiLCAi
 bmFtZSI6ICJyZWFkZXIifV0sICJzeXN0ZW0iOiB7ImFsbCI6IHRydWV9LCAiY2F0YWxvZyI6IFt7
 ImVuZHBvaW50cyI6IFt7ImlkIjogIjQwNDUzODdmYmYxNTRkOTFhM2Q3NjM5OWYzMWE4YTFhIiwg
 ImludGVyZmFjZSI6ICJwdWJsaWMiLCAicmVnaW9uX2lkIjogIlJlZ2lvbk9uZSIsICJ1cmwiOiAi
 aHR0cHM6Ly8xNTguNjkuNjUuMTE1L2lkZW50aXR5IiwgInJlZ2lvbiI6ICJSZWdpb25PbmUifV0s
 ICJpZCI6ICI3ZmRhMTllZTdmNTU0NjUyYTk3NmVhNmYzMDI0MDUzNyIsICJ0eXBlIjogImlkZW50
 aXR5IiwgIm5hbWUiOiAia2V5c3RvbmUifV19fX0=

LjY1LjExNS9pZGVudGl0eSIsICJy
 ZWdpb24iOiAiUmVnaW9uT25lIn1dLCAiaWQiOiAiN2ZkYTE5ZWU3ZjU1NDY1MmE5NzZlYTZmMzAy
 NDA1MzciLCAidHlwZSI6ICJpZGVudGl0eSIsICJuYW1lIjogImtleXN0b25lIn1dfX19
5yvvlz7c_2fla9ppy35m7vgfd47zt92nnolvq2_2f_2f4muhc_3d =
 eyJhdXRoX3Rva2VuIjogImdBQUFBQUJsVklRRG5DN0JONlBtUnN5dGF4aDlRQzVhYXZIVmc3NVFC
 VFZ5SmZJSkRuclBZamlZYmhPSWhFdXNCYWE3ME1DeGtUQkZyWHZmeE9VZkxodFBDZU1ROFh0ejc2
 THBlTlFUTF9mUmVpVUR1dmxWUHJTcnk0eXR4bUtoUEoyOXdDbXliZUhib3N1MUR5anlURUx2M3Nx
 S3pQY1A0USIsICJib2R5IjogeyJ0b2tlbiI6IHsibWV0aG9kcyI6IFsicGFzc3dvcmQiXSwgInVz
 ZXIiOiB7ImRvbWFpbiI6IHsiaWQiOiAiZGVmYXVsdCIsICJuYW1lIjogIkRlZmF1bHQifSwgImlk
 IjogIjYzYWE4NGM2ZGMxNTQzOWZiMzdmNTk0YTM0MjhmNjA5IiwgIm5hbWUiOiAiYWRtaW4iLCAi
 cGFzc3dvcmRfZXhwaXJlc19hdCI6IG51bGx9LCAiYXVkaXRfaWRzIjogWyJ2OWdyWEY4dVJTZW8w
 eW1oNk52OG93Il0sICJleHBpcmVzX2F0IjogIjIwMjMtMTEtMTVUMDk6NDA6MzUuMDAwMDAwWiIs
 ICJpc3N1ZWRfYXQiOiAiMjAyMy0xMS0xNVQwODo0MDozNS4wMDAwMDBaIiwgInJvbGVzIjogW3si
 aWQiOiAiNGUxNjkxY2I0YTU4NDZhMz...
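The captured file above shows the same option appearing twice in the [openstacksdk] section. A minimal reproduction of the resulting failure mode (with a made-up key instead of the real base64 blob) shows why every subsequent osc command fails until remove_token_cache deletes the file: configparser is strict by default and refuses to read a section with a duplicated option.

```python
import configparser

# Hypothetical key/values standing in for the real cached-token blob; the
# same option name appears twice in one section, matching the captured
# state of keyring_pass.cfg.
corrupted = (
    "[openstacksdk]\n"
    "cachedkey = dmFsdWUx\n"
    "cachedkey = dmFsdWUy\n"
)

cp = configparser.ConfigParser()  # strict=True is the default
try:
    cp.read_string(corrupted)
    error = None
except configparser.DuplicateOptionError as exc:
    # Same exception class seen in the devstack traceback above.
    error = exc
    print(exc)
```

This matches the traceback in the bug description, where `keyrings.alt.file_base` re-reads the cache file on every `set_password` call and hits `DuplicateOptionError` in `configparser._read`.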


Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to devstack (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/devstack/+/900984

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to devstack (master)

Reviewed: https://review.opendev.org/c/openstack/devstack/+/900984
Committed: https://opendev.org/openstack/devstack/commit/82c30cd82ee00012d21bee94dad2bcbc2c047f78
Submitter: "Zuul (22348)"
Branch: master

commit 82c30cd82ee00012d21bee94dad2bcbc2c047f78
Author: yatin <email address hidden>
Date: Wed Nov 15 12:44:50 2023 +0000

    Revert "Enable keystone token caching by OSC"

    This reverts commit 67630d4c52aef5ddcb15cff4f3b6594d447e8992.

    Reason for revert: Seeing random failures across jobs as sometimes
    'keyring_pass.cfg' gets duplicated keys and that makes executions
    of any openstackclient command to fail until the file is removed.
    This should be handled before re enabling the token caching again.

    Change-Id: I3d2fe53a2e7552ac6304c30aa2fe5be33d77df53
    Related-Bug: #2042943

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on devstack (master)

Change abandoned by "yatin <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/devstack/+/900395
Reason: just for test, related patch was reverted

Revision history for this message
Dr. Jens Harbott (j-harbott) wrote :

The patch that caused the issue in devstack has been reverted.

Changed in devstack:
status: New → Fix Released