Snapcraft errors when releasing a snap

Bug #1627865 reported by Thomi Richards
10
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Snapcraft
New
Undecided
Unassigned

Bug Description

I've been releasing lots of snaps to the staging store by running:

snapcraft push <snap> --release <channel>

... and they've all gone fine, but one upload I got this:

```
Uploading ./thomir-says_0.0.1a_amd64.snap [===========================================================================================================================================] 100%
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3/dist-packages/snapcraft/storeapi/__init__.py", line 442, in _update_status
    for content in self._get_status():
  File "/usr/lib/python3/dist-packages/snapcraft/storeapi/__init__.py", line 452, in _get_status
    content = requests.get(self.__status_details_url).json()
  File "/usr/lib/python3/dist-packages/requests/models.py", line 808, in json
    return complexjson.loads(self.text, **kwargs)
  File "/usr/lib/python3/dist-packages/simplejson/__init__.py", line 516, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3/dist-packages/simplejson/decoder.py", line 370, in decode
    obj, end = self.raw_decode(s)
  File "/usr/lib/python3/dist-packages/simplejson/decoder.py", line 400, in raw_decode
    return self.scan_once(s, idx=_w(s, idx).end())
simplejson.scanner.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
```

This is when snapcraft is in the 'processing' phase, which I believe is when it's actually relasing the snap, rather than pushing it.

Snapcraft is then stuck in the 'Processing...' stage, and never exits.

The store should never send bad data (I'll try and find something in the staging store logs and report a bug there as well), but snapcraft should handle it better.

Revision history for this message
Sergio Schvezov (sergiusens) wrote : Re: [Bug 1627865] [NEW] Snapcraft errors when releasing a snap

El 26/09/16 a las 18:53, Thomi Richards escribió:
> Public bug reported:
>
> I've been releasing lots of snaps to the staging store by running:
>
> snapcraft push <snap> --release <channel>
>
> ... and they've all gone fine, but one upload I got this:
>
>
> ```
> Uploading ./thomir-says_0.0.1a_amd64.snap [===========================================================================================================================================] 100%
> Exception in thread Thread-1:
> Traceback (most recent call last):
> File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
> self.run()
> File "/usr/lib/python3.5/threading.py", line 862, in run
> self._target(*self._args, **self._kwargs)
> File "/usr/lib/python3/dist-packages/snapcraft/storeapi/__init__.py", line 442, in _update_status
> for content in self._get_status():
> File "/usr/lib/python3/dist-packages/snapcraft/storeapi/__init__.py", line 452, in _get_status
> content = requests.get(self.__status_details_url).json()
> File "/usr/lib/python3/dist-packages/requests/models.py", line 808, in json
> return complexjson.loads(self.text, **kwargs)
> File "/usr/lib/python3/dist-packages/simplejson/__init__.py", line 516, in loads
> return _default_decoder.decode(s)
> File "/usr/lib/python3/dist-packages/simplejson/decoder.py", line 370, in decode
> obj, end = self.raw_decode(s)
> File "/usr/lib/python3/dist-packages/simplejson/decoder.py", line 400, in raw_decode
> return self.scan_once(s, idx=_w(s, idx).end())
> simplejson.scanner.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
> ```
>
>
> This is when snapcraft is in the 'processing' phase, which I believe is when it's actually relasing the snap, rather than pushing it.
>
> Snapcraft is then stuck in the 'Processing...' stage, and never exits.
>
> The store should never send bad data (I'll try and find something in the
> staging store logs and report a bug there as well), but snapcraft should
> handle it better.

Did it actually finish processing on the store side? With the data
returned from the store (which is a state) all snapcraft can do is
timeout but that doesn't feel like a good idea if the store just happens
to be slow to process the snap.

Revision history for this message
Thomi Richards (thomir-deactivatedaccount) wrote :

Hi,

On Tue, Sep 27, 2016 at 11:08 AM, Sergio Schvezov <
<email address hidden>> wrote:

> Did it actually finish processing on the store side? With the data
> returned from the store (which is a state) all snapcraft can do is
> timeout but that doesn't feel like a good idea if the store just happens
> to be slow to process the snap.
>

I don't know - I can't find anything in the staging store logs that looks
like the same error. However, ISTM that the store has returned something,
otherwise we wouldn't see the json decode error. I think it's safe to
assume that if the store returns something that's invalid snapcraft should
treat that as an error, and abort the release process?

Is there some way to make snapcraft log the response body in cases like
this? In any case, catching JSONDecodeError and adding a nice log message
is probably a valuable addition to snapcraft no matter what the eventual
outcome of this bug is.

Cheers,

--
Thomi Richards
<email address hidden>

Revision history for this message
Thomi Richards (thomir-deactivatedaccount) wrote :

Hi Sergio,

I've looked at this a bit closer, and I believe what's happening is that snapcraft is assuming the response from the store is a 200 response. The exact code in snapcraft is:

    content = requests.get(self.__status_details_url).json()

This code should probably check that the response has a 200 status code before trying to decode the contents - something like this perhaps:

    response = requests.get(self.__status_details_url)
    if response.status_code != 200:
        # do error handling here
    else:
        content = response.json()

I believe what's happening here is that there's a race condition between snapcraft pushing the snap to CUD and a record of that upload appearing in the SCA database. If you're unlucky snapcraft will try and get a status upload for the upload before SCA knows about it. In this case you will get a 404 response with the body b'Not Found', which causes the json decoding to blow up, as in the original bug report.

This happens about 1 time in 20 for me, but since all my packets have to travel around the world before they hit the store I suspect this will be worse for people in London (for example).

Let me know if this makes sense.

Cheers,

Revision history for this message
Colin Watson (cjwatson) wrote :

This bites Launchpad quite often, although the symptoms are different since Launchpad actually bothers to check the status code:

  https://bugs.launchpad.net/software-center-agent/+bug/1610948

There's certainly a race here, and your theory seems plausible. I think it's CreateApplicationTask that creates this row, and that's an asynchronous task scheduled by the preceding snap-push request.

Perhaps SnapPushHandler should create the ClickPackageUpload row itself rather than (or, for extra robustness, as well as) deferring the creation to CreateApplicationTask? The task would still fill in more detailed scan results when it has them, but this way we'd at least have a basic UNSCANNED row in SCA's database so that SnapPushStatusHandler is guaranteed to be able to see it immediately.

snapcraft should still of course check the status code as you suggest, even once SCA is fixed.

Revision history for this message
Facundo Batista (facundo) wrote :

The process of uploading is not synchronous, so after pushing a new upload, the client must "wait some time" before it can do the release.

This can be done in a very deterministic way.

The push response returns a 'status_details_url'. The client can hit that rightaway, but should check the 'processed' field in its response: if True, the server side processing ended; if False, it should wait some seconds (1? 3? 5?) and try again (meanwhile, it can even log or inform the user using the 'code' field in the response.

Note that after processing is done, it may not be able to release. The client should check the 'can_release' field in the response of the status_details_url: if False it has the 'errors' field with information for the user; if True it can proceed to the proper release.

The whole dance can be seen in the test_macaroons,py script (see lines 495-525 in http://bazaar.launchpad.net/~facundo/+junk/random/view/head:/test_macaroons.py ).

Any doubt, just ask!

Thanks,

Revision history for this message
Thomi Richards (thomir-deactivatedaccount) wrote :

After some more discussion in the OLS ML, we can confirm a race condition between the push response and the status_details_url returning a 200 response.

https://bugs.launchpad.net/software-center-agent/+bug/1610948 is being used to track the store side of things, but snapcraft should check the response code and not crash on a 404.

Revision history for this message
Thomi Richards (thomir-deactivatedaccount) wrote :
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.