cpu stress test output on failure is not helpful

Bug #1232077 reported by Jeff Lane 
This bug affects 1 person
Affects             Status        Importance  Assigned to  Milestone
Checkbox (Legacy)   Won't Fix     Undecided   Unassigned
PlainBox (Toolkit)  Fix Released  Undecided   Unassigned

Bug Description

For some reason, the CPU stress test failed on a system being certified. However, when I looked at the output to try to sort out the issue, I was only presented with this:

stress: FAIL: [8086] (416) <-- worker 8093 got signal 9
stress: WARN: [8086] (418) now reaping child worker processes
stress: FAIL: [8086] (422) kill error: No such process
stress: FAIL: [8086] (416) <-- worker 8097 got signal 9
stress: WARN: [8086] (418) now reaping child worker processes
stress: FAIL: [8086] (422) kill error: No such process
stress: FAIL: [8086] (416) <-- worker 8100 got signal 9
stress: WARN: [8086] (418) now reaping child worker processes
stress: FAIL: [8086] (422) kill error: No such process
stress: FAIL: [8086] (416) <-- worker 8099 got signal 9
stress: WARN: [8086] (418) now reaping child worker processes
stress: FAIL: [8086] (422) kill error: No such process
stress: FAIL: [8086] (452) failed run completed in 7200s

Apparently, the rest of the output was dumped to stdout while these few lines went to stderr. Because checkbox chooses stderr in favor of stdout on failed tests, I only get the last few lines of error output, with no context to explain why the test failed.

Daniel Manrique (roadmr) wrote :

So we either pipe stderr to stdout for this (and other) job(s), which should work because checkbox at least knows to use one stream if the other is empty, or we modify the core to output both stderr and stdout in all cases.

Showing only one stream was meant to help the user tell useful output apart from debugging or useless noise. The problem is that, as you see, useful information may be lost despite scripts' best attempts to use stdout/stderr sensibly.
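
As an illustration of the first option (piping stderr to stdout for a job), here is a minimal Python sketch. It is not checkbox code: `run_merged` and the inline `child` command are hypothetical stand-ins, and the only real API used is the standard `subprocess` module.

```python
import subprocess
import sys


def run_merged(cmd):
    """Run cmd with stderr redirected into stdout; return (exit_code, text)."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # same effect as 2>&1 in a shell
        text=True,
    )
    return proc.returncode, proc.stdout


# Hypothetical failing job: prints context on stdout, the error on stderr.
child = (
    "import sys; print('context'); sys.stdout.flush(); "
    "print('boom', file=sys.stderr); sys.exit(1)"
)
code, output = run_merged([sys.executable, "-c", child])
print(code)
print(output)
```

Because both streams share one pipe, the error lines stay next to the output that preceded them, at the cost of no longer knowing which stream a line came from.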

Daniel Manrique (roadmr)
tags: added: checkbox-core
Jeff Lane  (bladernr) wrote :

IMO, the correct behaviour should be that passing tests only retrieve stdout while failures capture both stdout and stderr.

I'm sure checkbox already caches these somehow so that it can grab the appropriate one for the results, so it should be a matter of either concatenating them for failures or simply changing the output to look something like this for failed tests:

STDOUT:
some stuff here
more stuff
usual stuff
lots o output

STDERR:
error message
more error messages

The problem there is that it's possible the error messages will appear out of sync with the non-error output.
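
For reference, the labelled layout above could be produced along these lines. This is only a sketch: `format_output` is a made-up helper name, not something checkbox provides, and it assumes the two streams were already captured as separate strings.

```python
def format_output(passed, stdout, stderr):
    """Return the text to attach to a test result.

    Passing tests keep stdout only; failed tests get both streams,
    each under its own label.
    """
    if passed:
        return stdout
    return "STDOUT:\n{}\n\nSTDERR:\n{}\n".format(
        stdout.rstrip("\n"), stderr.rstrip("\n")
    )


report = format_output(False, "some stuff here\nmore stuff\n", "error message\n")
print(report)
```

Since each stream is stored as one block, nothing in this layout records where an error line fell relative to the normal output.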

The only way I can think of to keep the ordering would be to dump both streams together at the test level and just grab everything, or to do something like this:

Each line of output would become a tuple in an ordered list:

1:(stdout,'something')
2:(stdout,'something more')
3:(stderr,'error message')
4:(stdout,'more output')

Then, when the result is collected, the lines can be replayed in order: for failures, grab everything, and for passes, ignore anything with stderr in the tuple.

Though that may be too complex an answer...
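
A rough Python sketch of the tuple idea, using the four example records above. The `replay` function name is an assumption made for illustration, not an existing checkbox API.

```python
# Ordered list of (stream, line) records, in the order they were emitted.
records = [
    ("stdout", "something"),
    ("stdout", "something more"),
    ("stderr", "error message"),
    ("stdout", "more output"),
]


def replay(records, passed):
    """Replay records in order; drop stderr lines when the test passed."""
    if passed:
        return [line for stream, line in records if stream == "stdout"]
    return [line for _stream, line in records]


print("\n".join(replay(records, passed=False)))
```

For a failure the error message shows up between "something more" and "more output", exactly where it occurred; for a pass it is simply filtered out.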

Daniel Manrique (roadmr) wrote : Re: [Bug 1232077] Re: cpu stress test output on failure is not helpful

On 13-09-27 03:03 PM, Jeff Lane wrote:
> each line of output would become a tuple in an ordered list:
>
> 1:(stdout,'something')
> 2:(stdout,'something more')
> 3:(stderr,'error message')
> 4:(stdout,'more output')

^^ This is *exactly* what plainbox does :) (it even captures timing information).

I'll have a look at what checkbox knows about test output and see how we can
accommodate this.
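
To make the idea concrete, here is a small Python sketch of capturing both streams as ordered, timestamped records. It is an illustration only and not plainbox's actual implementation: `capture`, `pump` and the inline `child` command are hypothetical names, built on the standard `subprocess`, `threading` and `time` modules.

```python
import subprocess
import sys
import threading
import time


def capture(cmd):
    """Run cmd; return (exit_code, [(delay_seconds, stream_name, line), ...])."""
    records = []
    lock = threading.Lock()
    start = time.monotonic()

    def pump(pipe, name):
        # Read one stream line by line and append to the shared record list.
        for line in pipe:
            with lock:
                records.append((time.monotonic() - start, name, line.rstrip("\n")))
        pipe.close()

    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    threads = [
        threading.Thread(target=pump, args=(proc.stdout, "stdout")),
        threading.Thread(target=pump, args=(proc.stderr, "stderr")),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return proc.wait(), records


# Hypothetical failing job writing to both streams.
child = "import sys; print('out line'); print('err line', file=sys.stderr); sys.exit(3)"
code, records = capture([sys.executable, "-c", child])
for delay, name, line in records:
    print("{:.3f} {} {}".format(delay, name, line))
```

The order in the list is the order in which the reader threads saw each line, so interleaving between the two pipes is approximate rather than exact.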


Jeff Lane  (bladernr) wrote :

Oh that figures, every time I have a brilliant idea, someone else has
already had it. heh


Zygmunt Krynicki (zyga)
affects: checkbox → plainbox-provider-checkbox
Daniel Manrique (roadmr) wrote :

This would be a core issue, not one with the provider or job (the job *does* spit everything out). I'll move it to plainbox, create a checkbox-legacy task and set it as Incomplete. Zygmunt, can you confirm that plainbox will capture both stdout and stderr even if the test fails? If so, we can set this as Fix Released for plainbox, while the checkbox-legacy task will be set to Won't Fix.

Changed in plainbox-provider-checkbox:
status: New → Incomplete
Changed in checkbox-legacy:
status: New → Won't Fix
affects: plainbox-provider-checkbox → plainbox
Jeff Lane  (bladernr) wrote :

Zyga, I believe this is already implemented in checkbox-ng and, before that, in plainbox. Can you confirm that it at least works in plainbox and can be pulled into checkbox-ng?

Zygmunt Krynicki (zyga)
Changed in plainbox:
status: Incomplete → Fix Released