UnicodeDecodeError using ldap backend

Bug #1368097 reported by Alvaro Lopez
This bug affects 1 person
Affects: OpenStack Identity (keystone)
Status: Incomplete
Importance: Low
Assigned to: Unassigned

Bug Description

When using the LDAP backend, if any attribute contains accented characters, the ldap2py function in keystone/common/ldap/core.py fails with a UnicodeDecodeError.

https://github.com/openstack/keystone/blob/1e204483e5feebe489ecca409509ae31bacb0ce2/keystone/common/ldap/core.py#L110-L129

This function was introduced by commit cbf805161b84f13f459a19bfd46220c4f298b264 (https://review.openstack.org/#/c/82398/). That commit encodes strings to UTF-8 and decodes them from UTF-8.

Peter Razumovsky (prazumovsky) wrote :

Can you give more details about your situation?

tags: added: juno-rc-potential
Changed in keystone:
importance: Undecided → Low
description: updated
Alvaro Lopez (aloga) wrote :

@prazumovsky I am running Keystone 2014.1 (Icehouse). As you can see, the culprit is commit cbf805161b84f13f459a19bfd46220c4f298b264, which is still in master.

We have several attributes in our LDAP backend that contain accents (such as names and surnames). Loading those attributes makes Keystone fail when looking up the user. You can reproduce the failure like this (this is the code being used in keystone.common.ldap.core):

    >>> import codecs
    >>> codecs.getencoder('utf-8')('Álvaro')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)
    >>> import six
    >>> six.text_type('Álvaro')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)

David Stanek (dstanek) wrote :

What encoding is your data stored in? Also, can you paste the full traceback that you get?

Changed in keystone:
status: New → Incomplete
David Stanek (dstanek) wrote :

Also I think your examples are flawed. I think:

    codecs.getencoder('utf-8')('Álvaro')

should really be:

    codecs.getencoder('utf-8')(u'Álvaro')
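The difference between the two calls can be sketched as follows. This is only an illustration, not Keystone code: the variable names are made up for the example, and the explicit ASCII decode stands in for the implicit one that Python 2 performs when a byte string is passed to a UTF-8 encoder.

```python
import codecs

# Text (unicode) form of the name 'Álvaro'; the escape keeps the source ASCII.
name = u'\u00c1lvaro'

# Encoding a text string to UTF-8 succeeds and reports how many
# characters were consumed.
encoded, consumed = codecs.getencoder('utf-8')(name)

# Python 2 implicitly decodes a byte string as ASCII before encoding it.
# Doing that decode explicitly reproduces the failure from the report.
try:
    encoded.decode('ascii')
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Decoding the bytes with the codec they were written in round-trips the name.
round_trip = encoded.decode('utf-8')
```

So the accents themselves are not a problem as long as the value is decoded with the right codec before it is treated as text.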

Alvaro Lopez (aloga) wrote :

To be honest I am not really into encodings, so it is true that the examples were flawed, sorry for that. At first glance I thought that the culprit was the accents, but it is not.

I've narrowed down the problem, and it comes from some LDAP attributes containing binary data (for example kerberos principals or attributes containing a JPEG photo), that is, with characters that cannot be decoded.

I've changed the behavior of the decode function to ignore [1] those characters instead of failing (maybe replace is better?). Anyway, if it fails, a better error should be raised instead of "invalid user and password". I can contribute the code, but as I said I am not really an expert on this and maybe it is not the best solution...

[1] https://docs.python.org/2/library/codecs.html#codec-base-classes
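A minimal sketch of the three error-handling modes, assuming attribute values arrive as raw bytes. The helper decode_ldap_value and the sample values are hypothetical and are not the actual ldap2py code in Keystone; only the standard bytes.decode error handlers are used.

```python
def decode_ldap_value(value, errors='strict'):
    """Hypothetical helper: decode one raw LDAP attribute value as UTF-8."""
    return value.decode('utf-8', errors)


# A textual attribute (an accented name) and a made-up binary-looking one.
text_value = u'\u00c1lvaro'.encode('utf-8')
binary_value = b'photo\xff\xfe'  # 0xff and 0xfe are never valid in UTF-8

# Valid UTF-8 decodes cleanly in strict mode.
decoded_text = decode_ldap_value(text_value)

# Strict mode fails on the binary value, which is the reported symptom.
try:
    decode_ldap_value(binary_value)
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# 'ignore' silently drops the undecodable bytes.
ignored = decode_ldap_value(binary_value, errors='ignore')

# 'replace' substitutes U+FFFD, so the damage stays visible.
replaced = decode_ldap_value(binary_value, errors='replace')
```

With 'ignore' the binary bytes vanish without a trace, while 'replace' leaves a marker that something could not be decoded. Neither recovers the original binary data, so both only make sense for attributes that are meant to be read as text.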

David Stanek (dstanek) wrote :
Revision history for this message
Dolph Mathews (dolph) wrote :

Agreed, and the fix has been backported to stable/icehouse and should be included in 2014.1.3.
