[Swift backend] Upload image hit error: Unicode-objects must be encoded before hashing

Bug #1805332 reported by wangxiyuan on 2018-11-27
This bug affects 3 people
Affects: glance_store
Importance: High
Assigned to: wangxiyuan

Bug Description

env: master branch, Glance using swift backend.

We hit a strange error: when we upload a large image (larger than 1G), glance_store fails with "Unicode-objects must be encoded before hashing". If the image is small enough, the error does not occur.

error log:
https://www.irccloud.com/pastebin/jP3DapNy/

After digging into the code, it appears that when the image data is read in chunks, a data piece may not be bytes, so the checksum update raises the error.

Encoding the data piece to ensure it is bytes solves the problem.
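
A minimal sketch of the failure mode, not taken from glance_store: on Python 3, a hashlib object rejects any str passed to update(), even an empty one. The helper name try_update and the use of sha256 are assumptions for illustration; the exact wording of the TypeError message varies between Python versions.

```python
import hashlib


def try_update(chunk):
    """Return None if the chunk hashes cleanly, else the TypeError message.

    Hypothetical helper for illustration; not part of glance_store.
    """
    checksum = hashlib.sha256()
    try:
        checksum.update(chunk)
    except TypeError as exc:
        return str(exc)
    return None


bytes_result = try_update(b"image data")  # bytes are accepted
str_result = try_update("")               # a str is rejected, even when empty
```

Here bytes_result is None, while str_result holds the "must be encoded before hashing" message, which matches the symptom reported above when one chunk in the stream is not bytes.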

wangxiyuan (wangxiyuan) wrote :
Changed in glance-store:
assignee: nobody → wangxiyuan (wangxiyuan)
Changed in glance-store:
status: New → Triaged
importance: Undecided → High
Brian Rosmaita (brian-rosmaita) wrote :

I really didn't like the utf-8 encoding, because if there's a byte with a value over 127, it will be encoded as 2 bytes. As far as we can tell, it looks like this is happening when a zero-byte read is requested (bytes!! so why are we getting a unicode object??). See this etherpad for more info: https://etherpad.openstack.org/p/glance_store-py3-swift-driver-problem
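
A small sketch of the concern with utf-8 encoding, assuming a chunk of binary data has somehow ended up as a str with one character per original byte (modeled here with a latin-1 decode, which is an assumption for illustration only). Re-encoding as utf-8 changes the length, so the checksum and stored size would no longer match the original image.

```python
raw = bytes([0x41, 0xC8])          # two bytes; the second has value 200 (> 127)
as_text = raw.decode("latin-1")    # hypothetical: chunk arrives as a 2-character str
reencoded = as_text.encode("utf-8")

# The character with code point 200 becomes two bytes under utf-8.
lengths = (len(raw), len(reencoded))
```

With this input, lengths is (2, 3): the data grew by one byte and no longer equals the original.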

Brian Rosmaita (brian-rosmaita) wrote :

timburke in IRC noticed this: https://github.com/openstack/glance_store/blob/0.28.0/glance_store/common/utils.py#L138

Looks like that will cause the same problem for anything that uses the CooperativeReader.

Reviewed: https://review.openstack.org/620234
Committed: https://git.openstack.org/cgit/openstack/glance_store/commit/?id=1d25a2b7a21e95766f9fee378b3d0802d392a85f
Submitter: Zuul
Branch: master

commit 1d25a2b7a21e95766f9fee378b3d0802d392a85f
Author: wangxiyuan <email address hidden>
Date: Tue Nov 27 14:50:50 2018 +0800

    Prevent unicode object error from zero-byte read

    During large file uploads under py3, we are occasionally seeing a
    "unicode objects must be encoded before hashing" error even though
    we are reading from a byte stream. From what I can tell, it looks
    like it's happening when a zero-byte read is requested, so we handle
    that case explicitly. This is a band-aid fix; we still need to track
    down the source.

    Co-authored-by: wangxiyuan <email address hidden>
    Co-authored-by: Brian Rosmaita <email address hidden>

    Related-bug: #1805332
    Change-Id: Ia7653f9fcbe902abc203c10c80ab44a641a4d8f9
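
The idea described in the commit message can be sketched as follows. This is not the committed code: ZeroSafeChunkReader and QuirkyStream are hypothetical names, and QuirkyStream merely simulates a stream that returns str for a zero-byte read. The reader handles the zero-byte case explicitly so the checksum only ever sees bytes.

```python
import hashlib
import io


class QuirkyStream(io.BytesIO):
    """Hypothetical stand-in for a stream that returns str on a zero-byte read."""

    def read(self, size=-1):
        if size == 0:
            return ""  # simulated misbehaviour
        return super().read(size)


class ZeroSafeChunkReader:
    """Hypothetical reader that never forwards a zero-byte read to the stream."""

    def __init__(self, fd, checksum, total):
        self.fd = fd
        self.checksum = checksum
        self.total = total
        self.bytes_read = 0

    def read(self, size):
        # Never read past the declared total.
        size = min(size, self.total - self.bytes_read)
        if size == 0:
            result = b""  # explicit zero-byte case: always bytes
        else:
            result = self.fd.read(size)
        self.bytes_read += len(result)
        self.checksum.update(result)
        return result


reader = ZeroSafeChunkReader(QuirkyStream(b"abcdef"), hashlib.sha256(), total=6)
first = reader.read(4)
second = reader.read(4)   # clamped to the 2 remaining bytes
third = reader.read(4)    # clamped to 0, so the stream is not consulted
```

As the commit message notes, this kind of guard is a workaround; it does not explain why the underlying stream returns a non-bytes object in the first place.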

Reviewed: https://review.openstack.org/644839
Committed: https://git.openstack.org/cgit/openstack/glance_store/commit/?id=9c8364bacfbe831a755b096a92fb7da2ff3c878d
Submitter: Zuul
Branch: stable/stein

commit 9c8364bacfbe831a755b096a92fb7da2ff3c878d
Author: wangxiyuan <email address hidden>
Date: Tue Nov 27 14:50:50 2018 +0800

    Prevent unicode object error from zero-byte read

    During large file uploads under py3, we are occasionally seeing a
    "unicode objects must be encoded before hashing" error even though
    we are reading from a byte stream. From what I can tell, it looks
    like it's happening when a zero-byte read is requested, so we handle
    that case explicitly. This is a band-aid fix; we still need to track
    down the source.

    Co-authored-by: wangxiyuan <email address hidden>
    Co-authored-by: Brian Rosmaita <email address hidden>

    Related-bug: #1805332
    Change-Id: Ia7653f9fcbe902abc203c10c80ab44a641a4d8f9
    (cherry picked from commit 1d25a2b7a21e95766f9fee378b3d0802d392a85f)

tags: added: in-stable-stein