Shaker on STEIN keystone authorization failures due to the new policy

Bug #1885724 reported by SK
This bug affects 1 person
Affects: shaker
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Environment

RHOSP STEIN in HA environment
- 3x controllers
- 4x computes
- 5x ceph

While running the OpenStack scenario full_l2.yaml, Shaker failed here:

Jun 30 17:00:34 u20 shaker[65181]: 2020-06-30 17:00:34.679 65181 WARNING shaker.engine.quorum [-] Lost agents: {'shaker_dtfirk_slave_1', 'shaker_dtfirk_slave_0', 'shaker_dtfirk_master_1', 'shaker_dtfirk_master_0'}
Jun 30 17:00:34 u20 shaker[65181]: 2020-06-30 17:00:34.679 65181 INFO shaker.engine.quorum [-] Finished processing operation: <shaker.engine.quorum.JoinOperation object at 0x7f91ffc16e80>
Jun 30 17:00:34 u20 shaker[65185]: 2020-06-30 17:00:34.682 65185 DEBUG shaker.agent.agent [-] Received: {'operation': 'none'} poll_task /home/skhan/.local/lib/python3.8/site-packages/shaker/agent/agent.py:43
Jun 30 17:00:34 u20 shaker[65181]: 2020-06-30 17:00:34.680 65181 ERROR shaker.engine.server [-] Agents failed to join: {'shaker_dtfirk_slave_1': 'lost', 'shaker_dtfirk_slave_0': 'lost', 'shaker_dtfirk_master_0': 'lost', 'shaker_dtfirk_master_1': 'lost'}: Exception: Agents failed to join: {'shaker_dtfirk_slave_1': 'lost', 'shaker_dtfirk_slave_0': 'lost', 'shaker_dtfirk_master_0': 'lost', 'shaker_dtfirk_master_1': 'lost'}
2020-06-30 17:00:34.680 65181 ERROR shaker.engine.server Traceback (most recent call last):
2020-06-30 17:00:34.680 65181 ERROR shaker.engine.server File "/home/skhan/.local/lib/python3.8/site-packages/shaker/engine/server.py", line 188, in play_scenario
2020-06-30 17:00:34.680 65181 ERROR shaker.engine.server quorum = quorum_pkg.make_quorum(
2020-06-30 17:00:34.680 65181 ERROR shaker.engine.server File "/home/skhan/.local/lib/python3.8/site-packages/shaker/engine/quorum.py", line 244, in make_quorum
2020-06-30 17:00:34.680 65181 ERROR shaker.engine.server raise Exception('Agents failed to join: %s' % failed)
2020-06-30 17:00:34.680 65181 ERROR shaker.engine.server Exception: Agents failed to join: {'shaker_dtfirk_slave_1': 'lost', 'shaker_dtfirk_slave_0': 'lost', 'shaker_dtfirk_master_0': 'lost', 'shaker_dtfirk_master_1': 'lost'}
2020-06-30 17:00:34.680 65181 ERROR shaker.engine.server

Meanwhile, the keystone and heat-engine logs show what looks like a policy-related exception:

keystone.log
------------

2020-06-30 16:48:51.849 30 WARNING keystone.auth.plugins.core [req-a7c5e591-5a10-4587-9061-e51e5f4ca07d - - - - -] User is disabled: 01c67033fdee4073a0432c5c007264b5: AssertionError: User is disabled: 01c67033fdee4073a0432c5c007264b5
2020-06-30 16:48:51.851 30 WARNING keystone.server.flask.application [req-a7c5e591-5a10-4587-9061-e51e5f4ca07d - - - - -] Authorization failed. The request you have made requires authentication. from 10.133.128.143: keystone.exception.Unauthorized: The request you have made requires authentication.

heat-engine.log
---------------

2020-06-30 16:48:51.853 23 ERROR heat.engine.clients.keystoneclient [req-a7e4c70a-8732-4a85-9782-e913e4f2d7fc - sProject - default default] Domain admin client authentication failed: keystoneauth1.exceptions.http.Unauthorized: The request you have made requires authentication. (HTTP 401) (Request-ID: req-a7c5e591-5a10-4587-9061-e51e5f4ca07d)
2020-06-30 16:48:51.854 23 INFO heat.engine.stack [req-a7e4c70a-8732-4a85-9782-e913e4f2d7fc - sProject - default default] Stack CREATE FAILED (shaker_dtfirk): Authorization failed.

It seems the Shaker code is not ready for the changes made in the new identity policy.

It would be nice to get a workaround from the experts here. Cheers :-)

Tags: stein
SK (msalmanmasood)
description: updated
Ilya Shakhat (shakhat) wrote :

A couple of hints to proceed:
1. It's strange that authentication fails so late, when the Heat stack is being created. During initialization Shaker verifies credentials by requesting a token from Keystone (https://opendev.org/performa/shaker/src/branch/master/shaker/openstack/clients/openstack.py#L67). Is there a message 'Connection to OpenStack is initialized' in the Shaker logs?
2. There is a message in the keystone log '2020-06-30 16:48:51.849 30 WARNING keystone.auth.plugins.core [req-a7c5e591-5a10-4587-9061-e51e5f4ca07d - - - - -] User is disabled:' -- is it expected that the user is disabled? Could it be that the user lacks some permissions (e.g. to create a Heat stack)?
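
To reproduce the credential check from hint 1 outside Shaker, a token can be requested from Keystone directly. The sketch below is not Shaker's own code; it only builds the standard Keystone v3 password-auth request with the Python standard library. The function names and the example URL, user, and project are hypothetical; substitute the values from the sourced rc file (OS_AUTH_URL, OS_USERNAME, and so on).

```python
import json
import urllib.request


def build_token_request(auth_url, username, password, project_name,
                        user_domain="Default", project_domain="Default"):
    """Build (but do not send) a Keystone v3 password-auth token request.

    auth_url is expected to end in /v3, as OS_AUTH_URL usually does.
    """
    body = {
        "auth": {
            "identity": {
                "methods": ["password"],
                "password": {
                    "user": {
                        "name": username,
                        "domain": {"name": user_domain},
                        "password": password,
                    }
                },
            },
            # Project scope, matching what an admin rc file normally requests.
            "scope": {
                "project": {
                    "name": project_name,
                    "domain": {"name": project_domain},
                }
            },
        }
    }
    return urllib.request.Request(
        auth_url.rstrip("/") + "/auth/tokens",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def request_token(req, timeout=10):
    """Send the request; Keystone returns the token in X-Subject-Token.

    A disabled user or bad credentials surface as urllib.error.HTTPError (401).
    """
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.headers["X-Subject-Token"]


# Hypothetical values for illustration only.
example = build_token_request(
    "http://keystone.example:5000/v3/", "admin", "secret", "admin")
```

If request_token(example) succeeds with the rc-file credentials but Heat still reports 401, the failing login likely belongs to a different user than the one Shaker authenticates as.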

SK (msalmanmasood) wrote :

No, and this is the strange part: the sourced rc file belongs to the admin user, and I even tried with the admin project, yet the result is the same. As for the "User is disabled" message, it refers to a user that is enabled, not disabled.

Moreover, another keystone log message I found concerns the identity policy, and it is what caught my eye in the first place:

WARNING py.warnings [req-cb32dbe9-1264-4552-9f9c-77e815958758 8f5644850ee64d7a96f751b64a49a416 a7878ba89d7a498abd3998860819b644 - default default] /usr/lib/python3.6/site-packages/oslo_policy/policy.py:964: UserWarning: Policy identity:list_groups failed scope check. The token used to make the request was project scoped but the policy requires ['system', 'domain'] scope. This behavior may change in the future where using the intended scope is required
  warnings.warn(msg)
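
For context, the warning above comes from a scope check: the policy lists the token scopes it accepts, and the token used was scoped to a project. The following is a simplified illustration of that comparison, not oslo.policy's implementation; the function names are hypothetical, and the dictionaries mimic the "token" body of a Keystone v3 token response. While the oslo.policy option enforce_scope is off, a mismatch only produces this warning rather than a rejection.

```python
def token_scope(token_data):
    """Return "system", "domain", or "project" for a Keystone v3 token body,
    or None when the token is unscoped."""
    for scope in ("system", "domain", "project"):
        if scope in token_data:
            return scope
    return None


def scope_check(token_data, scope_types):
    """True when the token's scope is one the policy accepts."""
    return token_scope(token_data) in scope_types


# identity:list_groups accepts ['system', 'domain'] according to the warning.
required = ["system", "domain"]
project_token = {"project": {"id": "a7878ba89d7a498abd3998860819b644"}}
system_token = {"system": {"all": True}}
```

Here scope_check(project_token, required) is False, which corresponds to the "failed scope check" in the warning, while scope_check(system_token, required) is True.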

As I remember, Stein introduced changes that were not available in earlier versions, so I thought this might be the main reason. Somehow, when I was updating the report, I didn't notice that those messages were not included here. I'm sharing them now in case they give you a clue.

let me know where to look further.

Thanks.

SK (msalmanmasood) wrote :

I also noticed that the authorization failure appears in the stack event list, but the output of the stack creation did not report the error and showed the stack completing successfully.

SK (msalmanmasood) wrote :

Yes, the message is there in the Shaker log:

2020-06-30 17:41:53.376 65321 INFO shaker.openstack.clients.openstack [-] Connection to OpenStack is initialized

SK (msalmanmasood) wrote :

Please ignore the message pointed out in comment #2; it is irrelevant.

SK (msalmanmasood) wrote :

Hi Shakhat,

Kindly help.

Thanks.

Ilya Shakhat (shakhat) wrote :

So it looks like the stack is created successfully, which means that the user is authenticated in Heat and has enough privileges. But then the user is not authenticated when Heat tries to create certain resources.

Could you check some more things:
 * Can Heat create any stacks when the command is issued from command line?
 * At which resource does stack creation fail? What if you try to create Heat stack with that resource from the command line? (you can refer to the template of Shaker's stack at https://opendev.org/performa/shaker/src/branch/master/shaker/scenarios/openstack/l2.hot or you can keep the stack created by Shaker by providing --nocleanup-on-exit parameter)

SK (msalmanmasood) wrote :

Hi Shakhat,

Yes, a Heat stack is created successfully from the command line.

However, no resource in the Shaker stack shows any failure, yet the event list shows the following failure:

2020-07-16 05:31:33Z [shaker_krpfoz]: CREATE_FAILED Authorization failed.

Any clue what could be the reason for this failure?

Thanks.
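
When the event list is long, the failing entries can be filtered out mechanically. The sketch below assumes events are printed in the one-line form quoted above (timestamp, bracketed name, status, reason); the regular expression and function name are hypothetical helpers, and the first sample line is invented for illustration.

```python
import re

# Matches lines such as:
#   2020-07-16 05:31:33Z [shaker_krpfoz]: CREATE_FAILED Authorization failed.
EVENT_RE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}Z) "
    r"\[(?P<resource>[^\]]+)\]: "
    r"(?P<status>[A-Z_]+)\s*(?P<reason>.*)$"
)


def failed_events(lines):
    """Return the parsed events whose status ends in _FAILED."""
    failures = []
    for line in lines:
        match = EVENT_RE.match(line.strip())
        if match and match.group("status").endswith("_FAILED"):
            failures.append(match.groupdict())
    return failures


sample_events = [
    # Hypothetical in-progress line, added only to show the filtering.
    "2020-07-16 05:31:20Z [shaker_krpfoz]: CREATE_IN_PROGRESS Stack CREATE started",
    # Line quoted in the comment above.
    "2020-07-16 05:31:33Z [shaker_krpfoz]: CREATE_FAILED Authorization failed.",
]
failures = failed_events(sample_events)
```

In this case the only failing entry is attached to the stack name itself (shaker_krpfoz) rather than to a named resource, which matches the observation that no individual resource reports a failure.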

Ilya Shakhat (shakhat) wrote :

I would look towards Heat bugs or Heat service user configuration.

I don't see any recent bugs but, for instance, https://bugs.launchpad.net/charm-heat/+bug/1715465 has similar symptoms. The solution there was to fix the role/domain assignment for the system 'heat' user.

What if you try to create any heat stack from command line with the same credentials as you use in Shaker? Does it work?
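
One way to start on the Heat service user configuration is to confirm that the stack domain admin options are present in heat.conf, since the heat-engine log above reports "Domain admin client authentication failed". The sketch below is an assumption about where to look, not a confirmed diagnosis. It only checks that the options stack_domain_admin, stack_domain_admin_password, and one of stack_user_domain_id or stack_user_domain_name are set in [DEFAULT]; it cannot tell whether that user is enabled in Keystone or has the right role. The function name and the sample values are hypothetical.

```python
import configparser

REQUIRED = ("stack_domain_admin", "stack_domain_admin_password")
DOMAIN_KEYS = ("stack_user_domain_id", "stack_user_domain_name")


def check_heat_domain_config(conf_text):
    """Return a list of problems found in the [DEFAULT] section of heat.conf."""
    # interpolation=None keeps '%' in passwords literal;
    # strict=False tolerates repeated options.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    defaults = parser.defaults()

    problems = [f"{key} is not set" for key in REQUIRED if not defaults.get(key)]
    if not any(defaults.get(key) for key in DOMAIN_KEYS):
        problems.append(
            "neither stack_user_domain_id nor stack_user_domain_name is set")
    return problems


# Sample values are illustrative, not taken from the reporter's environment.
sample_conf = """\
[DEFAULT]
stack_domain_admin = heat_stack_domain_admin
stack_user_domain_name = heat_stack
"""
problems = check_heat_domain_config(sample_conf)
```

If the options are all present, the next step would be to check in Keystone that the configured domain admin user is enabled and holds the admin role on the stack user domain.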
