Artifacts not being saved

Bug #1661900 reported by Sergio Cazzolato
Affects          Status        Importance  Assigned to  Milestone
testflinger-cli  Fix Released  Medium      Paul Larson

Bug Description

In some cases the artifacts come back empty, and in other cases they are not available at all. I attach two examples from the latest executions.

-------------------------------------------------------------------------------------

First example: https://platform-qa-jenkins.ubuntu.com/view/performance/job/qa-tests-performance-desktop-intel-gfx-xenial/19/console

2017-02-04 16:40:15,657 frequent-shoes INFO: Running: cp ubuntu-performance-tests/tests/tests.log artifacts
2017-02-04 16:40:15,659 frequent-shoes INFO: END testrun

Running command wget http://testflinger.canonical.com/v1/result/5d79f2b8-11da-4864-8cd8-536af3cbfcf3/artifact -O artifacts.tgz
--2017-02-04 16:40:22-- http://testflinger.canonical.com/v1/result/5d79f2b8-11da-4864-8cd8-536af3cbfcf3/artifact
Resolving testflinger.canonical.com (testflinger.canonical.com)... 185.125.191.182
Connecting to testflinger.canonical.com (testflinger.canonical.com)|185.125.191.182|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 535 [application/octet-stream]
Saving to: 'artifacts.tgz'

     0K 100% 66.7M=0s

2017-02-04 16:40:22 (66.7 MB/s) - 'artifacts.tgz' saved [535/535]

-------------------------------------------------------------------------------------

Second example: https://platform-qa-jenkins.ubuntu.com/view/performance/job/qa-tests-performance-desktop-amd-gfx-xenial/37/console

OK (skipped=11)
2017-02-04 19:44:51,727 unkown-wealth INFO: Running: cp ubuntu-performance-tests/tests/tests.log artifacts
2017-02-04 19:44:51,729 unkown-wealth INFO: END testrun

Running command wget http://testflinger.canonical.com/v1/result/9244a882-369b-4303-8527-09f07496369c/artifact -O artifacts.tgz
--2017-02-04 19:44:58-- http://testflinger.canonical.com/v1/result/9244a882-369b-4303-8527-09f07496369c/artifact
Resolving testflinger.canonical.com (testflinger.canonical.com)... 185.125.191.182
Connecting to testflinger.canonical.com (testflinger.canonical.com)|185.125.191.182|:80... connected.
HTTP request sent, awaiting response... 204 NO CONTENT
Length: 0 [text/html]
Saving to: 'artifacts.tgz'

     0K 0.00 =0s

2017-02-04 19:44:58 (0.00 B/s) - 'artifacts.tgz' saved [0/0]

Revision history for this message
Paul Larson (pwlars) wrote :

First, I can't seem to see that Jenkins job. In fact, I can't find anything like it when I search either: https://platform-qa-jenkins.ubuntu.com/search/?q=qa-tests-performance-desktop-amd-gfx-xenial

Can you also send me the job definition that you submitted?

If the artifacts directory exists on the test host system (not the target device, but the one that testflinger interacts with directly), then it should automatically tar up everything in that directory and save it.
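
To check whether a downloaded tarball actually contains any files, something like the following could be used. It is only a sketch: the function name is made up for illustration, and it assumes the download is a gzip-compressed tar archive, as the `tar -tvzf` output elsewhere in this report suggests.

```python
import io
import tarfile


def artifact_file_names(data):
    """Return the names of regular files in a gzipped tarball held in memory.

    An empty download (for example the 0-byte body that comes with a 204
    response) is treated as "no files" rather than as an error.
    """
    if not data:
        return []
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as archive:
        # Skip directory entries such as "artifacts/" and keep only files.
        return [member.name for member in archive.getmembers() if member.isfile()]
```

For example, `artifact_file_names(open("artifacts.tgz", "rb").read())` returns an empty list when nothing useful was saved.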

One other side note: the way you are getting the artifacts is fine, but there's an easier way now. You can use 'testflinger-cli artifacts <job_id>', which saves the tarball as artifacts.tgz, or you can specify a different filename with --filename.

Paul Larson (pwlars) wrote :

I'm not able to reproduce this, even using the job_ids here that you said didn't work. Can you check again?

$ testflinger-cli artifacts 5d79f2b8-11da-4864-8cd8-536af3cbfcf3
Downloading artifacts tarball...
Artifacts downloaded to artifacts.tgz
$ ls -l artifacts.tgz
-rw-r--r-- 1 plars plars 535 Feb 7 08:42 artifacts.tgz
$ tar -tvzf artifacts.tgz
drwxr-xr-x ubuntu/ubuntu 0 2017-02-04 10:40 artifacts/
-rw-r--r-- ubuntu/ubuntu 792 2017-02-04 10:40 artifacts/tests.log
$ rm artifacts.tgz

$ wget http://testflinger.canonical.com/v1/result/5d79f2b8-11da-4864-8cd8-536af3cbfcf3/artifact -O artifacts.tgz
--2017-02-07 08:43:11-- http://testflinger.canonical.com/v1/result/5d79f2b8-11da-4864-8cd8-536af3cbfcf3/artifact
Resolving testflinger.canonical.com (testflinger.canonical.com)... 185.125.191.182
Connecting to testflinger.canonical.com (testflinger.canonical.com)|185.125.191.182|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 535 [application/octet-stream]
Saving to: ‘artifacts.tgz’

artifacts.tgz 100%[===================>] 535 --.-KB/s in 0s

2017-02-07 08:43:12 (34.1 MB/s) - ‘artifacts.tgz’ saved [535/535]
$ ls -l artifacts.tgz
-rw-r--r-- 1 plars plars 535 Feb 4 10:40 artifacts.tgz

Changed in testflinger:
status: New → Incomplete
assignee: nobody → Paul Larson (pwlars)
Paul Larson (pwlars) wrote :

One possibility you may want to look into: if your job is complete and you *immediately* try to download the artifacts, you might hit a race where the artifact tarball is still uploading but you are already trying to get it. This might result in an HTTP 204 (No Content) because there's nothing there (yet).
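
A minimal sketch of working around this race when fetching the artifact URL directly. It assumes the server keeps answering 204 until the upload finishes and answers 200 with the tarball afterwards, as the logs in this report suggest. The function names, attempt count and delay are illustrative and not part of Testflinger.

```python
import time
import urllib.request


def fetch_status_and_body(url):
    """Return (status_code, body_bytes) for a GET request to url."""
    # urlopen returns normally for any 2xx status, including 204.
    with urllib.request.urlopen(url) as response:
        return response.status, response.read()


def wait_for_artifact(url, attempts=10, delay=30.0,
                      fetch=fetch_status_and_body, sleep=time.sleep):
    """Poll url until it returns 200 with a non-empty body, then return the body.

    Raises TimeoutError if no data is available after the given attempts.
    """
    for attempt in range(attempts):
        status, body = fetch(url)
        if status == 200 and body:
            return body
        # Anything else (for example 204) is taken to mean "not uploaded yet".
        if attempt < attempts - 1:
            sleep(delay)
    raise TimeoutError(f"no artifact data at {url} after {attempts} attempts")
```

The `fetch` and `sleep` parameters exist only so the loop can be exercised without network access or real waiting.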

Sergio Cazzolato (sergio-j-cazzolato) wrote :

This is an example: https://platform-qa-jenkins.ubuntu.com/view/performance/job/qa-tests-performance-desktop-amd-gfx-unity8-snap-zesty/6/console where the artifacts tarball is empty, even though the log shows that the file is copied to the artifacts folder.

Sergio Cazzolato (sergio-j-cazzolato) wrote :

If I now run "wget http://testflinger.canonical.com/v1/result/0741ed70-8915-48e5-aa79-8b00626d90cb/artifact -O artifacts.tgz" I can retrieve the artifacts, so the problem in that case seems to be that the tarball is still being uploaded when I try to download it. Is it possible to add a new state to testflinger, such as post-process, in which the results are uploaded? That way, when the testflinger execution ends, I would know that the artifacts are ready.

Paul Larson (pwlars) wrote :

Discussed on irc, but following up here: Testflinger-agent already handles the upload as a separate piece from the execution, and that's exactly why you can hit something like this. Otherwise a test would run, and continue to say it's running until the results are all submitted, even if there's something that interferes with transmission of the results. Worse, it would block other tests from being run in the meantime. Instead, it caches the result (and artifacts) and tries to submit them right away, but also follows up later if it can't submit them then. This is why there's a 204 return, saying that the request looks ok, but it doesn't have any data for it.

What we talked about doing on IRC is to have testflinger-cli exit with a failure code if we get anything other than a 200 back, so you can detect the non-zero return code and retry as needed. Does that sound reasonable?

Paul Larson (pwlars) wrote :

The cli will now exit with a non-zero return code if there is no data. You can try it by pulling the testflinger-cli branch in your job and running it from a virtualenv, or by using the snap.
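
A rough sketch of how a job could retry based on that return code. It only assumes the `testflinger-cli artifacts <job_id>` command mentioned earlier in this report and that a non-zero exit means "no data yet". The function name, attempt count and delay are illustrative choices.

```python
import subprocess
import time


def download_artifacts(job_id, attempts=5, delay=60.0,
                       run=subprocess.run, sleep=time.sleep):
    """Run 'testflinger-cli artifacts <job_id>' until it exits with 0.

    Returns the number of attempts used. Raises RuntimeError if every
    attempt ends with a non-zero return code.
    """
    command = ["testflinger-cli", "artifacts", job_id]
    for attempt in range(1, attempts + 1):
        # subprocess.run raises FileNotFoundError if the CLI is not installed.
        result = run(command)
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            sleep(delay)
    raise RuntimeError(f"artifacts for {job_id} not available after {attempts} attempts")
```

The `run` and `sleep` parameters default to the real implementations and are there so the retry logic can be tested without the CLI installed.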

affects: testflinger → testflinger-cli
Changed in testflinger-cli:
importance: Undecided → Medium
Paul Larson (pwlars) wrote :

I believe this is fixed in the currently released version of testflinger-cli (see previous comments). Feel free to re-open it if you still have a problem. Thanks!

Changed in testflinger-cli:
status: Incomplete → Fix Released