Reconstructor raises UnicodeDecodeError when reverting an obj with non-ascii chars in path

Bug #1679175 reported by Alistair Coles
Affects: OpenStack Object Storage (swift)
Importance: Critical
Assigned to: Alistair Coles

Bug Description

First reported by Clay Gerrard in comment #1 here https://bugs.launchpad.net/swift/+bug/1678018

Quoting Clay from there:

"I'm having trouble duplicating this bug for replicated objects.

In my tests the handling of the object name and metadata in the object server is being treated consistently as utf-8 encoded bytes - not unicode strings.

object-6010: STDOUT: 'Content-Type': 'application/octet-stream'
object-6010: STDOUT: 'ETag': 'd41d8cd98f00b204e9800998ecf8427e'
object-6010: STDOUT: 'X-Object-Meta-Mtime': '1490985870.469097'
object-6010: STDOUT: 'X-Object-Meta-\xe2\x98\x83': '\xe2\x98\x83'
object-6010: STDOUT: 'X-Timestamp': '1490987405.84984'

On master, the EC reconstructor seems to blow up in an entirely different spot:

object-6030: STDERR: Traceback (most recent call last):
object-6030: STDERR: File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 457, in fire_timers
object-6030: STDERR: timer()
object-6030: STDERR: File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 58, in __call__
object-6030: STDERR: cb(*args, **kw)
object-6030: STDERR: File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 214, in main
object-6030: STDERR: result = function(*args, **kwargs)
object-6030: STDERR: File "/vagrant/swift/swift/common/utils.py", line 2726, in _run_func
object-6030: STDERR: self._responses.put(func(*args, **kwargs))
object-6030: STDERR: File "/vagrant/swift/swift/obj/reconstructor.py", line 231, in _get_response
object-6030: STDERR: full_path = _full_path(node, part, path, policy)
object-6030: STDERR: File "/vagrant/swift/swift/obj/reconstructor.py", line 90, in _full_path
object-6030: STDERR: 'policy': policy,
object-6030: STDERR: UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 20: ordinal not in range(128)

I'm creating objects from the command line using u'\N{SNOWMAN}' in the name & metadata

swift upload test ☃ -H 'x-object-meta-☃: ☃'
"
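
The error in the traceback above can be approximated outside Swift. The following is a minimal sketch, written for Python 3, that spells out the ASCII promotion Python 2 performs implicitly when a byte string is combined with unicode values in a format operation. The helper `py2_style_format`, the template, and the node values are hypothetical and are not taken from Swift's `_full_path`.

```python
def py2_style_format(template, values):
    """Emulate Python 2 mixing a byte str into a unicode format operation.

    Python 2 implicitly promotes byte strings with the ASCII codec; Python 3
    never does, so the promotion is spelled out here for illustration.
    """
    promoted = {}
    for key, value in values.items():
        if isinstance(value, bytes):
            value = value.decode('ascii')  # the implicit step in Python 2
        promoted[key] = value
    return template % promoted


# Hypothetical template and node values, not Swift's actual format string.
TEMPLATE = u'%(ip)s:%(port)s/%(device)s%(path)s'
# UTF-8 encoded bytes, matching how the object server treats the name.
path = u'/a/c/\N{SNOWMAN}'.encode('utf-8')

error = None
try:
    py2_style_format(TEMPLATE, {'ip': u'127.0.0.1', 'port': 6030,
                                'device': u'sdb1', 'path': path})
except UnicodeDecodeError as exc:
    error = exc

print(error)
```

The first byte of the encoded snowman is 0xe2, which is the byte the ASCII codec rejects in the report.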

Changed in swift:
assignee: nobody → Alistair Coles (alistair-coles)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (master)

Fix proposed to branch: master
Review: https://review.openstack.org/452750

Changed in swift:
status: New → In Progress
clayg (clay-gerrard) wrote :

Alistair says this is a regression - I don't think I've actually released a swift that has this bug, and that's probably good because it blows up a lot in a bad way :\

Changed in swift:
importance: Undecided → High
description: updated
Alistair Coles (alistair-coles) wrote :

The reconstructor will hang if this exception gets raised - the _get_response method does not return a value, so the queue of responses is never populated and the main thread waits for responses... forever.
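
The failure mode described here can be sketched with the standard library alone. This is an illustration under stated assumptions, not Swift's code: `run_func` copies only the shape of the `_run_func` line shown in the traceback, plain threads stand in for eventlet greenthreads, and `get_response` and `worker` are hypothetical stand-ins.

```python
import queue
import threading


def run_func(responses, func):
    # Same shape as the _run_func line in the traceback: the put only
    # happens if func() returns normally.
    responses.put(func())


def get_response():
    # Stand-in for a worker that fails while building its log string.
    raise UnicodeDecodeError('ascii', b'\xe2', 0, 1,
                             'ordinal not in range(128)')


def worker(responses):
    try:
        run_func(responses, get_response)
    except UnicodeDecodeError:
        pass  # the error is reported elsewhere; nothing was queued


responses = queue.Queue()
thread = threading.Thread(target=worker, args=(responses,))
thread.start()
thread.join()

try:
    # A timeout is used only so this sketch terminates; a consumer that
    # waits without one would block here indefinitely.
    responses.get(timeout=0.1)
    starved = False
except queue.Empty:
    starved = True

print(starved)
```

Because the exception escapes before `put` is called, the consumer has nothing to read, which is why a worker that always returns something (even `None`) avoids the hang.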

Alistair Coles (alistair-coles) wrote :

I'm marking this critical - feel free to disagree, just feels like we shouldn't cut a release with this regression.

Changed in swift:
importance: High → Critical
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (master)

Reviewed: https://review.openstack.org/452750
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=83750cf79c958810767f8a78e739df5d7e3f5345
Submitter: Jenkins
Branch: master

commit 83750cf79c958810767f8a78e739df5d7e3f5345
Author: Alistair Coles <email address hidden>
Date: Mon Apr 3 14:01:26 2017 +0100

    Fix UnicodeDecodeError in reconstructor _full_path function

    Object paths can have non-ascii characters. Device dicts will
    have unicode values. Forming a string using both will cause the
    object path to be coerced to UTF8, which currently causes a
    UnicodeDecodeError. This causes _get_response() to not return
    and the reconstructor hangs.

    The call to _full_path() is moved outside of _get_response()
    (where its result is used in the exception handler logging)
    so that _get_response() will always return even if _full_path()
    raises an exception.

    Unit tests are refactored to split out a new class with those
    tests using an object name and the _full_path method, so that
    the class can be subclassed to use an object name with non-ascii
    characters.

    Existing probe tests are subclassed to repeat using non-ascii
    chars in object paths.

    Change-Id: I4c570c08c770636d57b1157e19d5b7034fd9ed4e
    Closes-Bug: 1679175
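
The approach described in the commit message can be sketched roughly as follows. This is not the merged patch: the function names, signatures, and format string are hypothetical, and decoding the path explicitly as UTF-8 is an assumption about one reasonable way to avoid the ASCII coercion. The point it illustrates is that the fallible path formatting happens before the worker runs, and the worker returns a value on every code path.

```python
def full_path(node, part, path, policy_index):
    """Build a log-friendly description of the object being fetched."""
    if isinstance(path, bytes):
        path = path.decode('utf-8')  # decode explicitly rather than via ASCII
    return u'%(ip)s:%(port)s/%(device)s/%(part)s%(path)s policy#%(policy)d' % {
        'ip': node['ip'], 'port': node['port'], 'device': node['device'],
        'part': part, 'path': path, 'policy': policy_index}


def get_response(node, described_path, fetch, log):
    """Always return: a response on success, None on any failure."""
    try:
        return fetch(node)
    except Exception as exc:
        log(u'Trying to GET %s: %r' % (described_path, exc))
        return None


def failing_fetch(node):
    # Hypothetical fetch callable that simulates an unreachable node.
    raise IOError('connection refused')


node = {'ip': u'127.0.0.1', 'port': 6030, 'device': u'sdb1'}
messages = []
# The description is computed by the caller, outside the worker.
described = full_path(node, 0, u'/a/c/\N{SNOWMAN}'.encode('utf-8'), 1)
result = get_response(node, described, failing_fetch, messages.append)

print(described)
print(result)
```

With this split, an error raised while formatting the path surfaces in the caller instead of silently preventing the worker from producing a result.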

Changed in swift:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/swift 2.14.0

This issue was fixed in the openstack/swift 2.14.0 release.
