client does not handle unicode in image name, metadata

Bug #1061150 reported by Brian Rosmaita
This bug affects 1 person
Affects        Status        Importance  Assigned to     Milestone
Glance Client  Fix Released  Low         Flavio Percoco

Bug Description

The python-glanceclient chokes on unicode for image names, metadata keys, and metadata values. (The API is OK, this is a client problem. Examples are given below.)

The problem is twofold: the client calls str() on these values, and it calls unicode() with no 'encoding' parameter. Without that parameter unicode() decodes with the ascii codec, which is why we get the 'ordinal not in range' exception.

The fix is to make sure unicode() is called with utf-8 specified as the encoding throughout the code, and to replace uses of str() with unicode().
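
The failure mode described above can be reproduced in a few lines. This is a minimal sketch, not the client's code: it is written for Python 3, where the ascii decode that Python 2's unicode() performed by default has to be spelled out explicitly, and the variable names are illustrative only.

```python
# Bytes as the shell would hand them over: "Freddy-" followed by U+263F in UTF-8.
raw_name = b"Freddy-\xe2\x98\xbf"

# Explicit equivalent of Python 2's unicode(raw_name) with no encoding argument:
# decoding with the ascii codec fails on the first non-ASCII byte (0xe2).
try:
    raw_name.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = str(exc)

# Naming the encoding, as the proposed fix does, decodes cleanly.
name = raw_name.decode("utf-8")

print(ascii_error)
print(ascii(name))  # ASCII-safe repr, so the demo runs on any terminal
```

The error text has the same shape as the client errors shown in the examples; only the byte position differs, because the client was decoding a longer string.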

Examples:
# unicode in name
$ MERC=$(echo -e "\u263f")
$ echo $MERC
☿
$ glance image-create --name Freddy-$MERC
'ascii' codec can't decode byte 0xe2 in position 30: ordinal not in range(128)

$ BB=$(echo -e "\u2047")
$ CC=$(echo -e "\u2048")
$ echo $BB
⁇
$ echo $CC
⁈

# unicode in metadata value
$ glance image-update --property ascii_key=unicode-${BB} 7a40d9d3-ee1d-48a1-8e2a-6c8ae01b7b4c
'ascii' codec can't decode byte 0xe2 in position 45: ordinal not in range(128)

# unicode in metadata key
$ glance image-update --property unicode_key_${CC}=ascii-value 7a40d9d3-ee1d-48a1-8e2a-6c8ae01b7b4c
'ascii' codec can't decode byte 0xe2 in position 38: ordinal not in range(128)

NOTE: The glance API can handle unicode fine:
$ VAL=$(echo -e "\u2645")
$ KEY=$(echo -e "\u262e")
$ echo $VAL
♅
$ echo $KEY
☮

curl -X PUT \
  -H "User-Agent: curl-by-hand" \
  -H "X-Auth-Token: $AUTH_TOKEN" \
  -H "x-image-meta-name: unicode-$VAL" \
  -H "x-image-meta-property-$KEY: the-key-is-unicode" \
  -H "x-image-meta-property-unicode_value: my-$VAL" \
 http://50.57.98.195:9292/v1/images/7a40d9d3-ee1d-48a1-8e2a-6c8ae01b7b4c

{
    "image": {
        "checksum": "9490463a2a6db02c08fbe9016e168ca4",
        "container_format": "ami",
        "created_at": "2012-10-01T18:23:14",
        "deleted": false,
        "deleted_at": null,
        "disk_format": "ami",
        "id": "7a40d9d3-ee1d-48a1-8e2a-6c8ae01b7b4c",
        "is_public": false,
        "min_disk": 0,
        "min_ram": 0,
        "name": "unicode-\u2645",
        "owner": "637807f80e43444c9676abc18100838c",
        "properties": {
            "unicode_value": "my-\u2645",
            "\u262e": "the-key-is-unicode"
        },
        "protected": false,
        "size": 25165824,
        "status": "active",
        "updated_at": "2012-10-03T18:22:27"
    }
}
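
The \uXXXX sequences in the response are JSON's ASCII-safe spelling of the same characters that were sent, not a sign that anything was lost. The sketch below checks this with Python's json module on a trimmed-down copy of the response; the `body` literal is an abbreviated stand-in, and it assumes the server serializes with ASCII escaping, which is what the output above suggests.

```python
import json

# Abbreviated copy of the response body; the raw string keeps the \u escapes literal.
body = r'{"image": {"name": "unicode-\u2645", "properties": {"\u262e": "the-key-is-unicode"}}}'

image = json.loads(body)["image"]
name = image["name"]
keys = list(image["properties"])

# Parsing turns each escape back into the single character it stands for.
print(len(name), len(keys[0]))

# Re-serializing with the default ensure_ascii=True yields the escaped form again.
escaped = json.dumps(name)
print(escaped)
```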

description: updated
Changed in python-glanceclient:
assignee: nobody → Brian Rosmaita (brian-rosmaita)
Changed in python-glanceclient:
importance: Undecided → Low
Changed in python-glanceclient:
status: New → Triaged
Changed in python-glanceclient:
assignee: Brian Rosmaita (brian-rosmaita) → nobody
Anita Kuno (anteaya)
Changed in python-glanceclient:
assignee: nobody → Anita Kuno (akuno)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to python-glanceclient (master)

Fix proposed to branch: master
Review: https://review.openstack.org/20807

Changed in python-glanceclient:
assignee: Anita Kuno (akuno) → Flavio Percoco Premoli (flaper87)
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to python-glanceclient (master)

Reviewed: https://review.openstack.org/20807
Committed: http://github.com/openstack/python-glanceclient/commit/55cb4f4473a6fc429524e7c4848379013a4d2d1d
Submitter: Jenkins
Branch: master

commit 55cb4f4473a6fc429524e7c4848379013a4d2d1d
Author: Flaper Fesp <email address hidden>
Date: Wed Jan 30 15:18:44 2013 +0100

    Decode input and encode output

    Currently glanceclient doesn't support non-ASCII characters in image
    names and properties (names and values as well). This patch introduces 2
    functions (in utils.py) that help encode and decode strings in a
    safer way.

    About the ensure_(str|unicode) functions:

        They both first try the encoding used by stdin (or Python's
        default encoding if that's None) and fall back to utf-8 if that
        encoding fails to decode a given text.

    About the changes in glanceclient:

        The major change is that all inputs will be decoded and kept as
        such inside the client's functions, and will then be encoded before
        being printed or sent out of the client.

        There are other small changes, all related to encoding to str,
        made in order to avoid failures during some conversions, e.g.
        quoting URL-encoded parameters.

    Fixes bug: 1061150

    Change-Id: I5c3ea93a716edfe284d19f6291d4e36028f91eb2
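
The "decode input, encode output" approach from the commit message can be sketched roughly as follows. This is a Python 3 illustration under stated assumptions, not the code that was merged: `ensure_text`, `ensure_bytes`, and the `incoming`/`outgoing` parameters are hypothetical names chosen here, and only the fallback order (stdin's encoding, else Python's default, then utf-8) is taken from the commit message.

```python
import sys
from urllib.parse import quote


def ensure_text(value, incoming=None):
    """Hypothetical helper: return value as text, decoding bytes if needed."""
    if isinstance(value, str):
        return value
    # Preferred encoding: caller's choice, else stdin's, else Python's default.
    preferred = (incoming
                 or getattr(sys.stdin, "encoding", None)
                 or sys.getdefaultencoding())
    for encoding in (preferred, "utf-8"):
        try:
            return value.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            continue  # try the next candidate
    raise ValueError("value could not be decoded")


def ensure_bytes(value, outgoing="utf-8"):
    """Hypothetical helper: return value as bytes, encoding text if needed."""
    if isinstance(value, bytes):
        return value
    return value.encode(outgoing)


# Decode once at the boundary; ascii fails here, so utf-8 is used instead.
prop_key = ensure_text(b"unicode_key_\xe2\x81\x88", incoming="ascii")

# Work with text inside the client, then encode just before sending.
header_name = ensure_bytes("x-image-meta-property-" + prop_key)

# URL quoting percent-encodes the UTF-8 bytes of the text.
quoted_key = quote(prop_key)
print(quoted_key)
```

Passing `incoming` explicitly keeps the demonstration independent of the terminal it runs in; left unset, the result depends on stdin's encoding, which is the behaviour the commit message describes.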

Changed in python-glanceclient:
status: In Progress → Fix Committed
Brian Waldon (bcwaldon)
Changed in python-glanceclient:
milestone: none → v0.8.0
Brian Waldon (bcwaldon)
Changed in python-glanceclient:
status: Fix Committed → Fix Released