PY3 unicode text values change in ldap not taking care of py2 special character unicode strings

Bug #1825867 reported by Abhishek Sharma M
This bug affects 2 people
Affects: OpenStack Identity (keystone)
Importance: Undecided
Assigned to: Abhishek Sharma M

Bug Description

With regard to the recent community change https://github.com/openstack/keystone/commit/eca0829c4c65e6b64f08023ce2d5a55dc329248f, which relates to py3 support in ldap, we now use bytes_mode=False to support py2 ldap.
However, when configuring ldap with a user whose name contains a special character (say flügel), we get the error below.

[Mon Apr 22 08:04:36.723781 2019] [:error] [pid 46754] [remote ] ref = driver.authenticate(entity_id, password)
[Mon Apr 22 08:04:36.723802 2019] [:error] [pid 46754] [remote ] File "/usr/lib/python2.7/site-packages/keystone/identity/backends/ldap/core.py", line 62, in authenticate
[Mon Apr 22 08:04:36.724054 2019] [:error] [pid 46754] [remote ] user_ref = self._get_user(user_id)
[Mon Apr 22 08:04:36.724082 2019] [:error] [pid 46754] [remote ] File "/usr/lib/python2.7/site-packages/keystone/identity/backends/ldap/core.py", line 81, in _get_user
[Mon Apr 22 08:04:36.724122 2019] [:error] [pid 46754] [remote ] return self.user.get(user_id)
[Mon Apr 22 08:04:36.724145 2019] [:error] [pid 46754] [remote ] File "/usr/lib/python2.7/site-packages/keystone/identity/backends/ldap/core.py", line 309, in get
[Mon Apr 22 08:04:36.724184 2019] [:error] [pid 46754] [remote ] obj = super(UserApi, self).get(user_id, ldap_filter=ldap_filter)
[Mon Apr 22 08:04:36.724208 2019] [:error] [pid 46754] [remote ] File "/usr/lib/python2.7/site-packages/keystone/identity/backends/ldap/common.py", line 1843, in get
[Mon Apr 22 08:04:36.724958 2019] [:error] [pid 46754] [remote ] ref = super(EnabledEmuMixIn, self).get(object_id, ldap_filter)
[Mon Apr 22 08:04:36.724989 2019] [:error] [pid 46754] [remote ] File "/usr/lib/python2.7/site-packages/keystone/identity/backends/ldap/common.py", line 1545, in get
[Mon Apr 22 08:04:36.725031 2019] [:error] [pid 46754] [remote ] res = self._ldap_get(object_id, ldap_filter)
[Mon Apr 22 08:04:36.725055 2019] [:error] [pid 46754] [remote ] File "/usr/lib/python2.7/site-packages/keystone/identity/backends/ldap/common.py", line 1453, in _ldap_get
[Mon Apr 22 08:04:36.725108 2019] [:error] [pid 46754] [remote ] six.text_type(object_id)),
[Mon Apr 22 08:04:36.725163 2019] [:error] [pid 46754] [remote ] UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2: ordinal not in range(128)

It fails at six.text_type, which tries to convert the string to unicode. On py2, six.text_type is unicode, and calling it on a byte string decodes with the default ascii codec. Plain ascii therefore converts fine, but a byte string containing special characters (such as the UTF-8 bytes for ü) fails with UnicodeDecodeError.

IMO we should rather be using utf8_decode() from /usr/lib/python2.7/site-packages/keystone/identity/backends/ldap/common.py.
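
As a rough illustration of the failure and of an explicit decode, here is a minimal sketch. The helper to_text() is hypothetical and written only for this example; it is not keystone's utf8_decode(), whose exact behaviour is not shown in this report. The ascii decode below mimics what py2 does implicitly when unicode() is called on a byte string.

```python
# Hypothetical helper for illustration only; not keystone's utf8_decode().
def to_text(value, encoding="utf-8"):
    """Return value as a text (unicode) string, decoding bytes explicitly."""
    text_type = type(u"")  # unicode on py2, str on py3
    if isinstance(value, bytes):
        # Decode explicitly instead of relying on the implicit ascii default.
        return value.decode(encoding)
    if isinstance(value, text_type):
        return value
    return text_type(value)


raw_id = b"fl\xc3\xbcgel"  # UTF-8 bytes for "flügel"

try:
    # Equivalent to what py2's unicode(raw_id) attempts implicitly.
    raw_id.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

decoded = to_text(raw_id)
```

With an explicit UTF-8 decode the byte 0xc3 no longer trips the ascii codec, and values that are already text pass through unchanged.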

Once the above is fixed, we get a similar error:

[Mon Apr 22 05:00:30.699425 2019] [:error] [pid 121709] [remote ] File "/usr/lib/python2.7/site-packages/keystone/identity/backends/ldap/core.py", line 471, in get_all_filtered
[Mon Apr 22 05:00:30.699444 2019] [:error] [pid 121709] [remote ] for group in self.get_all(query, hints)]
[Mon Apr 22 05:00:30.699455 2019] [:error] [pid 121709] [remote ] File "/usr/lib/python2.7/site-packages/keystone/identity/backends/ldap/common.py", line 1564, in get_all
[Mon Apr 22 05:00:30.711297 2019] [:error] [pid 121709] [remote ] for x in self._ldap_get_all(hints, ldap_filter)]
[Mon Apr 22 05:00:30.711359 2019] [:error] [pid 121709] [remote ] File "/usr/lib/python2.7/site-packages/keystone/common/driver_hints.py", line 42, in wrapper
[Mon Apr 22 05:00:30.720098 2019] [:error] [pid 121709] [remote ] return f(self, hints, *args, **kwargs)
[Mon Apr 22 05:00:30.720134 2019] [:error] [pid 121709] [remote ] File "/usr/lib/python2.7/site-packages/keystone/identity/backends/ldap/common.py", line 1499, in _ldap_get_all
[Mon Apr 22 05:00:30.720188 2019] [:error] [pid 121709] [remote ] self.id_attr)
[Mon Apr 22 05:00:30.720259 2019] [:error] [pid 121709] [remote ] UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 13: ordinal not in range(128)

Here, the line

query = u'(&%s(objectClass=%s)(%s=*))' % (ldap_filter or self.ldap_filter or '',
            self.object_class, self.id_attr)

fails because ldap_filter is not a unicode string. It is a byte string containing special characters, so formatting it into the unicode template triggers an implicit ascii decode on py2. We need to decode the string in cases like these.
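
A minimal sketch of decoding the filter before formatting is given below. Both to_text() and build_query() are hypothetical names introduced for this example, and the object class and attribute values are made up; this is not the change proposed in the review.

```python
# Hypothetical helpers for illustration; names are not from keystone.
def to_text(value, encoding="utf-8"):
    """Decode byte strings as UTF-8; return text values unchanged."""
    return value.decode(encoding) if isinstance(value, bytes) else value


def build_query(ldap_filter, object_class, id_attr):
    """Build the search filter from text values only."""
    return u"(&%s(objectClass=%s)(%s=*))" % (
        to_text(ldap_filter or u""),
        to_text(object_class),
        to_text(id_attr),
    )


# A filter that arrives as UTF-8 bytes, as in the failing case.
query = build_query(b"(cn=fl\xc3\xbcgel)", "groupOfNames", "cn")
```

Normalising every argument to text first means the unicode template never has to coerce a non-ascii byte string on its own.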

After resolving the above, we get the error below when trying to get the role assignments for a group, after configuring an ldap group with a special character (flügel).

[Mon Apr 22 10:49:26.263074 2019] [:error] [pid 99824] [remote ] File "/usr/lib/python2.7/site-packages/keystone/identity/backends/ldap/core.py", line 129, in list_users_in_group
[Mon Apr 22 10:49:26.263097 2019] [:error] [pid 99824] [remote ] for user_id in self._transform_group_member_ids(group_members):
[Mon Apr 22 10:49:26.263108 2019] [:error] [pid 99824] [remote ] File "/usr/lib/python2.7/site-packages/keystone/identity/backends/ldap/core.py", line 123, in _transform_group_member_ids
[Mon Apr 22 10:49:26.263125 2019] [:error] [pid 99824] [remote ] user_id = self.user._dn_to_id(user_key)
[Mon Apr 22 10:49:26.263136 2019] [:error] [pid 99824] [remote ] File "/usr/lib/python2.7/site-packages/keystone/identity/backends/ldap/common.py", line 1298, in _dn_to_id
[Mon Apr 22 10:49:26.263153 2019] [:error] [pid 99824] [remote ] return ldap.dn.str2dn(dn)[0][0][1]
[Mon Apr 22 10:49:26.263163 2019] [:error] [pid 99824] [remote ] File "/usr/lib64/python2.7/site-packages/ldap/dn.py", line 53, in str2dn
[Mon Apr 22 10:49:26.263205 2019] [:error] [pid 99824] [remote ] return ldap.functions._ldap_function_call(None,_ldap.str2dn,dn,flags)
[Mon Apr 22 10:49:26.263236 2019] [:error] [pid 99824] [remote ] File "/usr/lib64/python2.7/site-packages/ldap/functions.py", line 66, in _ldap_function_call
[Mon Apr 22 10:49:26.263257 2019] [:error] [pid 99824] [remote ] result = func(*args,**kwargs)
[Mon Apr 22 10:49:26.263283 2019] [:error] [pid 99824] [remote ] UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 5: ordinal not in range(128)

Here, while listing the users (with special characters) in a group (with special characters), the user dn is unicode and we try to convert it into a str (via _dn_to_id() in common.py). We first need to make sure that the dn is a str and not a unicode value, which would cause UnicodeEncodeError.
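
One way to make sure the dn is a str before it is handed on could look like the sketch below. to_native_str() is a hypothetical helper and the dn is an invented example. The sketch deliberately does not call ldap.dn.str2dn; whether such a conversion is appropriate for a given python-ldap version and bytes_mode setting is an assumption that would need checking.

```python
import sys

PY2 = sys.version_info[0] == 2


# Hypothetical helper for illustration; not part of keystone or python-ldap.
def to_native_str(value, encoding="utf-8"):
    """Return value as the interpreter's native str type."""
    if PY2:
        # Native str is bytes on py2: encode unicode explicitly as UTF-8
        # rather than letting an implicit ascii encode happen later.
        return value.encode(encoding) if isinstance(value, type(u"")) else value
    # Native str is text on py3: decode bytes explicitly as UTF-8.
    return value.decode(encoding) if isinstance(value, bytes) else value


# Invented example dn containing a special character.
dn = to_native_str(u"cn=fl\xfcgel,ou=Users,dc=example,dc=org")
```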

Changed in keystone:
assignee: nobody → Abhishek Sharma M (abhi.sharma)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to keystone (master)

Fix proposed to branch: master
Review: https://review.opendev.org/654601

Changed in keystone:
status: New → In Progress
description: updated
Revision history for this message
Abhishek Sharma M (abhi.sharma) wrote :

The py3 test case fails because of the proposed change set.
Consider keystone.tests.unit.identity.backends.test_sql.TestIdentityDriver.test_authenticate_for_user_with_special_char with user_id='fl\xc3\xbcgel' (which is flügel; fl\xc3\xbcgel is its byte string representation in py2).
In py3, where str is always unicode text, fl\xc3\xbcgel as a user_id means flÃ¼gel, not flügel.

py3 console:
print ('fl\xc3\xbcgel') -> flÃ¼gel
print (u'fl\xfcgel') -> flügel
bytes(u'fl\xfcgel', 'utf-8') = b'fl\xc3\xbcgel'
bytes('fl\xc3\xbcgel', 'utf-8') = b'fl\xc3\x83\xc2\xbcgel'

The py2 string representation of flügel actually means flÃ¼gel when read as a py3 string. So this unit test case might not be correct in the first place: when the user is flügel, user_id will not be 'fl\xc3\xbcgel' (though it can be b'fl\xc3\xbcgel').
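
The console session above can be condensed into a small sketch. The variable names are illustrative only. The u prefix makes the literals text on both py2 and py3, so the escapes \xc3 and \xbc are read as two separate code points rather than as UTF-8 bytes.

```python
# Text literal whose escapes are code points U+00C3 and U+00BC, not bytes.
mojibake = u"fl\xc3\xbcgel"
intended = u"fl\xfcgel"  # "flügel" with a single code point for ü

# The two strings differ: one has seven characters, the other six.
lengths = (len(mojibake), len(intended))

# Encoding the mojibake text as UTF-8 double-encodes the special character.
double_encoded = mojibake.encode("utf-8")

# Mapping each code point back to one byte (latin-1) and decoding those
# bytes as UTF-8 recovers the intended text.
recovered = mojibake.encode("latin-1").decode("utf-8")
```

This is why a test fixture written as a py2 byte-string literal changes meaning on py3 unless it is written as a bytes literal or as the intended text.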

Revision history for this message
Abhishek Sharma M (abhi.sharma) wrote :

Another solution to this bug would be to make sure that only unicode values, rather than string values, reach the point of failure. In my investigation I found that these string values come from the database.

Changed in keystone:
status: In Progress → New
Revision history for this message
Abhishek Sharma M (abhi.sharma) wrote :

In the Queens release, we used to get unicode output from the database for ldap (keystone/identity/mapping_backends/sql.py:get_id_mapping()). In Stein, we are getting string values. We need to check whether this change was introduced to support py3, since it uses text/string.

Revision history for this message
Colleen Murphy (krinkle) wrote :

The patch to fix this has not had any updates and the failing tests make it seem inconclusive as to whether this is actually a problem. Marking this as incomplete in order to wait for confirmation from another user or more evidence from the reporter or an update to the proposed bugfix.

Changed in keystone:
status: New → Incomplete
Revision history for this message
Gauvain Pocentek (gpocentek) wrote :

I'm hitting a similar problem in Keystone Rocky, but in a different function. The ldap_filter attribute is a <str>. Casting this attribute to <unicode> fixes the problem (see the attached test patch).

Revision history for this message
Gauvain Pocentek (gpocentek) wrote :
Changed in keystone:
status: Incomplete → New