Unicode non-ASCII characters in channel specification cause a 500 response from the store

Bug #1828600 reported by Daniel Manrique on 2019-05-10
This bug affects 1 person
Affects: Snap Store

Bug Description

snapcraft release hello-roadmr-unicrash-1 11 stable/#๐Ÿ

This results in a 500 response from the store, with the following traceback:

DataError: invalid byte sequence for encoding "UTF8": 0xed 0xb3 0xb0

  File "piston/resource.py", line 192, in __call__
    result = meth(request, *args, **kwargs)
  File "devportal/api/helpers.py", line 319, in wrapper
    return orig_func(self, request, *args, **kwargs)
  File "devportal/api/v1/handlers.py", line 920, in create
    _channel = get_or_create_channel(_name, snap, create=True)
  File "devportal/models/__init__.py", line 178, in get_or_create_channel
    risk = Risk.objects.get(name__in=channel_split)
  File "django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "django/db/models/query.py", line 374, in get
    num = len(clone)
  File "django/db/models/query.py", line 232, in __len__
  File "django/db/models/query.py", line 1118, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
  File "django/db/models/query.py", line 53, in __iter__
    results = compiler.execute_sql(chunked_fetch=self.chunked_fetch)
  File "django/db/models/sql/compiler.py", line 899, in execute_sql
    raise original_exception

Daniel Manrique (roadmr) on 2019-05-10
information type: Private → Public
Matias Bordese (matiasb) wrote :

This seems to be an encoding issue in the interaction between the terminal, click, snapcraft and the store.

Specifically, the store code fails when it tries to handle a string containing surrogate escapes[1]. Ideally snapcraft should not propagate such strings to the server at all, and should handle the problem as early as possible[2].
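
To illustrate the surrogate escape problem, here is a minimal Python 3 sketch. It assumes the command-line argument is decoded under an ASCII locale with the surrogateescape error handler; the emoji bytes and the has_surrogates helper are hypothetical and are not taken from snapcraft or the store. It shows one way the byte sequence 0xed 0xb3 0xb0 from the traceback can arise, and how a client could detect the problem before sending anything.

```python
# Hypothetical raw argv bytes: an ASCII prefix followed by a 4-byte UTF-8 emoji.
raw_arg = b"stable/#" + b"\xf0\x9f\x8d\xb0"

# Under an ASCII locale, Python 3 decodes undecodable bytes into lone
# surrogates (U+DC80..U+DCFF) instead of raising an error.
channel = raw_arg.decode("ascii", errors="surrogateescape")


def has_surrogates(value: str) -> bool:
    """Return True if the string cannot be encoded as strict UTF-8."""
    try:
        value.encode("utf-8")
    except UnicodeEncodeError:
        return True
    return False


# If the lone surrogates are serialized anyway, byte 0xF0 (now U+DCF0)
# becomes 0xED 0xB3 0xB0, which PostgreSQL rejects as invalid UTF-8.
leaked = channel.encode("utf-8", errors="surrogatepass")
```

A client could call has_surrogates(channel) right after argument parsing and report a clear error instead of sending the request.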

We could also validate store-side, in the handler, that the channels in the given payload are ASCII-only values. Note, however, that the underlying issue applies to any payload we receive from clients that includes surrogate escapes. Branch name validation is already done at the model level; the failure here happens in an intermediate step, while inferring the risk/branch from the given string.
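
A rough sketch of the store-side check described above, written for Python 3. The function name validate_channel_name and the error message are assumptions made for illustration; they are not the store's actual handler code.

```python
def validate_channel_name(channel: str) -> str:
    """Hypothetical helper: accept only ASCII channel strings.

    Lone surrogates and any other non-ASCII characters fail the
    ASCII encode and are rejected before a database query is built.
    """
    try:
        channel.encode("ascii")
    except UnicodeEncodeError:
        raise ValueError("channel must contain only ASCII characters") from None
    return channel
```

In a handler, a ValueError raised here could be mapped to a 400 response, so the request is refused before the risk lookup runs.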

[1] https://click.palletsprojects.com/en/7.x/python3/#python-2-and-3-differences
[2] http://lucumr.pocoo.org/2013/7/2/the-updated-guide-to-unicode/#different-types-of-unicode-strings
