Failure to submit build to lava results in overall build failure

Bug #842452 reported by Frans Gifford
This bug affects 1 person
Affects: Linaro Android Build Tools
Status: Fix Released
Importance: Low
Assigned to: Paul Sokolovsky

Bug Description

For example in https://android-build.linaro.org/builds/~linaro-android/panda/#build=270

Failure to connect to LAVA is annoying, but shouldn't cause the build to be marked as failed.

Traceback (most recent call last):
  File "/mnt/jenkins/workspace/linaro-android_panda/build-tools/build-scripts/post-build-lava.py", line 121, in <module>
    main()
  File "/mnt/jenkins/workspace/linaro-android_panda/build-tools/build-scripts/post-build-lava.py", line 98, in main
    lava_job_id = server.scheduler.submit_job(config)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1224, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1575, in __request
    verbose=self.__verbose
  File "/usr/lib/python2.7/xmlrpclib.py", line 1264, in request
    return self.single_request(host, handler, request_body, verbose)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1292, in single_request
    self.send_content(h, request_body)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1439, in send_content
    connection.endheaders(request_body)
  File "/usr/lib/python2.7/httplib.py", line 951, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 811, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 773, in send
    self.connect()
  File "/usr/lib/python2.7/httplib.py", line 1154, in connect
    self.timeout, self.source_address)
  File "/usr/lib/python2.7/socket.py", line 571, in create_connection
    raise err
socket.error: [Errno 110] Connection timed out
Build step 'Execute shell' marked build as failure
Archiving artifacts
Finished: FAILURE
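
As a rough illustration of the behaviour requested here, the submission call could be wrapped so that a connection error is logged rather than propagated. This is only a sketch: it uses Python 3's xmlrpc.client rather than the Python 2.7 xmlrpclib seen in the traceback, and the helper name submit_to_lava is hypothetical, not taken from post-build-lava.py.

```python
import sys
import xmlrpc.client


def submit_to_lava(server, config):
    """Return the LAVA job id, or None if the submission could not be made.

    `server` is expected to look like an xmlrpc.client.ServerProxy, i.e. to
    expose server.scheduler.submit_job(config) as in the traceback above.
    """
    try:
        return server.scheduler.submit_job(config)
    except (OSError, xmlrpc.client.Error) as exc:
        # In Python 3, socket.error is an alias of OSError, so connection
        # timeouts such as [Errno 110] are caught here. xmlrpc.client.Error
        # covers XML-RPC faults and protocol errors.
        print("warning: LAVA submission failed: %s" % exc, file=sys.stderr)
        return None
```

A caller that gets None back can then decide whether to exit with a non-zero status, which is what makes the shell build step fail.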

James Westby (james-w) wrote : Re: [Bug 842452] [NEW] Failure to submit build to lava results in overall build failure

On Tue, 06 Sep 2011 09:20:49 -0000, Frans Gifford <email address hidden> wrote:
> Public bug reported:
>
> For example in https://android-build.linaro.org/builds/~linaro-android/panda/#build=270
>
> Failure to connect to LAVA is annoying, but shouldn't cause the build to
> be marked as failed.

If it doesn't fail the build is there a danger that no-one will notice
that builds aren't being tested?

Thanks,

James

Alexander Sack (asac) wrote : Re: [Linaro-release] [Bug 842452] [NEW] Failure to submit build to lava results in overall build failure

On Tue, Sep 6, 2011 at 2:00 PM, James Westby <email address hidden> wrote:

> On Tue, 06 Sep 2011 09:20:49 -0000, Frans Gifford <email address hidden> wrote:
> > Public bug reported:
> >
> > For example in https://android-build.linaro.org/builds/~linaro-android/panda/#build=270
> >
> > Failure to connect to LAVA is annoying, but shouldn't cause the build to
> > be marked as failed.
>
> If it doesn't fail the build is there a danger that no-one will notice
> that builds aren't being tested?
>

My take on this is that as long as the artifacts still come out it's ok to
mark the build as FAILED. Why would there be any problem with that?

--

 - Alexander

Frans Gifford (fgiff) wrote :

Feel free to mark as invalid if you think this is the correct behaviour.

Alexander Sack (asac) wrote : Re: [Linaro-release] [Bug 842452] Re: Failure to submit build to lava results in overall build failure

On Tue, Sep 6, 2011 at 2:24 PM, Frans Gifford <email address hidden> wrote:

> Feel free to mark as invalid if you think this is the correct behaviour.
>
>
maybe a third state: SUCCESSWITHPROBLEMS would be another option? Not sure
how easy that is to do on jenkins. I guess this stuff ultimately needs to be
visualized on a lava dashboard with info like: build, lava submission, boot,
tests etc. in one shot.

--

 - Alexander

James Westby (james-w) wrote :

On Tue, 06 Sep 2011 13:24:49 -0000, Alexander Sack <email address hidden> wrote:
> On Tue, Sep 6, 2011 at 2:24 PM, Frans Gifford
> <email address hidden> wrote:
>
> > Feel free to mark as invalid if you think this is the correct behaviour.
> >
> >
> maybe a third state: SUCCESSWITHPROBLEMS would be another option? Not sure
> how easy that is to do on jenkins. I guess this stuff ultimately needs to be
> visualized on a lava dashboard with info like: build, lava submission, boot,
> tests etc. in one shot.

It's quite hard to visualise problems with submitting to LAVA on LAVA
itself :-)

Thanks,

James

Paul Sokolovsky (pfalcon) wrote :

On Tue, 06 Sep 2011 13:24:49 -0000
Alexander Sack <email address hidden> wrote:

> On Tue, Sep 6, 2011 at 2:24 PM, Frans Gifford
> <email address hidden> wrote:
>
> > Feel free to mark as invalid if you think this is the correct
> > behaviour.
> >
> >
> maybe a third state: SUCCESSWITHPROBLEMS would be another option? Not
> sure how easy that is to do on jenkins. I guess this stuff ultimately
> needs to be visualized on a lava dashboard with info like: build,
> lava submission, boot, tests etc. in one shot.

Actually, Jenkins has exactly those standard states: success, failure
(build failed), and unstable (build succeeded, but tests failed). It's
just that we don't use Jenkins properly, so it's not clear how easy it
would be to do that quickly.

>
> --
>
> - Alexander
>

--
Best Regards,
Paul
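
To make the three states above concrete, here is a minimal sketch of how the two outcomes discussed in this thread could map onto Jenkins' result names. Treating a failed LAVA submission as "unstable" is an assumption made for illustration (it corresponds to the SUCCESSWITHPROBLEMS idea), and the function name jenkins_result is hypothetical.

```python
def jenkins_result(build_ok, lava_submitted):
    """Map build and LAVA submission outcomes onto Jenkins result names."""
    if not build_ok:
        # The build itself broke: nothing to test.
        return "FAILURE"
    if not lava_submitted:
        # Artifacts exist but were not handed to LAVA (assumed mapping).
        return "UNSTABLE"
    return "SUCCESS"
```

How the chosen result is actually reported back to Jenkins depends on how the job is configured and is not covered by this sketch.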


Alexander Sack (asac) wrote :

On Tue, Sep 6, 2011 at 5:40 PM, Paul Sokolovsky
<email address hidden> wrote:

> On Tue, 06 Sep 2011 13:24:49 -0000
> Alexander Sack <email address hidden> wrote:
>
> > On Tue, Sep 6, 2011 at 2:24 PM, Frans Gifford
> > <email address hidden> wrote:
> >
> > > Feel free to mark as invalid if you think this is the correct
> > > behaviour.
> > >
> > >
> > maybe a third state: SUCCESSWITHPROBLEMS would be another option? Not
> > sure how easy that is to do on jenkins. I guess this stuff ultimately
> > needs to be visualized on a lava dashboard with info like: build,
> > lava submission, boot, tests etc. in one shot.
>
> Actually, Jenkins has exactly those standard states: success, failure
> (build failed), and unstable (build succeeded, but tests failed). It's
> just that we don't use Jenkins properly, so it's not clear how easy it
> would be to do that quickly.
>
>
I think exporting build and test success/fail states through jenkins would
be "nice to have" :).

--

 - Alexander

Alexander Sack (asac) wrote :

On Tue, Sep 6, 2011 at 4:42 PM, James Westby <email address hidden> wrote:

> On Tue, 06 Sep 2011 13:24:49 -0000, Alexander Sack <email address hidden> wrote:
> > On Tue, Sep 6, 2011 at 2:24 PM, Frans Gifford
> > <email address hidden> wrote:
> >
> > > Feel free to mark as invalid if you think this is the correct behaviour.
> > >
> > >
> > maybe a third state: SUCCESSWITHPROBLEMS would be another option? Not sure
> > how easy that is to do on jenkins. I guess this stuff ultimately needs to be
> > visualized on a lava dashboard with info like: build, lava submission, boot,
> > tests etc. in one shot.
>
> It's quite hard to visualise problems with submitting to LAVA on LAVA
> itself :-)
>
>
:) ... that's true.

OTOH, LAVA is a high priority service of Linaro and I hope we can assume
that we don't get long term downtime there ... on build server having a
queuing mechanism etc. to submit and retry submitting jobs and results feels
worth investigating to make things more robust though.
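
A retry loop of the kind suggested above might look roughly like this. It is a sketch only: the name submit_with_retry, the default of three attempts, and the doubling delay are all assumptions, and nothing here reflects what the build tools actually do.

```python
import time


def submit_with_retry(submit, attempts=3, delay=1.0, sleep=time.sleep):
    """Call submit() until it succeeds or the attempts run out.

    Returns whatever submit() returns. If every attempt raises OSError
    (which includes socket timeouts), the last error is re-raised.
    """
    for attempt in range(1, attempts + 1):
        try:
            return submit()
        except OSError:
            if attempt == attempts:
                raise
            sleep(delay)   # wait before the next attempt
            delay *= 2     # back off: 1s, 2s, 4s, ...
```

Retrying inside the build job only helps with short outages. Surviving longer downtime would need the submission to be queued somewhere that outlives the build slave.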

> Thanks,
>
> James
>

James Westby (james-w) wrote :

On Tue, 06 Sep 2011 16:10:12 -0000, Alexander Sack <email address hidden> wrote:
> OTOH, LAVA is a high priority service of Linaro and I hope we can assume
> that we don't get long term downtime there ... on build server having a
> queuing mechanism etc. to submit and retry submitting jobs and results feels
> worth investigating to make things more robust though.

I proposed a system that would have this without having to keep ec2
slaves up (costing us money), but that's not what was
implemented. Fixing that isn't in the pipeline on the Infrastructure
side at this time; I don't know if Android will revisit it.

It's vital that we solve this, as even if LAVA is usually available, we
must not assume that it is 100% available.

Thanks,

James

Changed in linaro-android-build-tools:
importance: Undecided → Low

Paul Sokolovsky (pfalcon) wrote :

OK, we recently tackled this in lp:888547, so each job can now be configured to treat a LAVA submission failure as fatal or not. I'd like to close this ticket.
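
For readers who have not seen lp:888547, the idea of a per-job switch can be sketched as below. The environment variable name LAVA_SUBMIT_FATAL and the non-fatal default are purely hypothetical and chosen for illustration. The actual option introduced by that fix may be named and wired differently.

```python
import os


def exit_code(job_id, env=os.environ):
    """Pick the script's exit status from the submission result.

    job_id is None when the LAVA submission failed. The build step only
    fails (exit status 1) if the job opted in via the hypothetical
    LAVA_SUBMIT_FATAL=1 setting.
    """
    fatal = env.get("LAVA_SUBMIT_FATAL", "0") == "1"
    if job_id is None and fatal:
        return 1
    return 0
```

With this shape, a job that only needs the artifacts leaves the setting unset, while a job whose purpose is testing sets it so that a missed submission is noticed.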

Changed in linaro-android-build-tools:
assignee: nobody → Paul Sokolovsky (pfalcon)
status: New → Triaged

Frans Gifford (fgiff) wrote :

Please close. The reason for raising this was the same as the reason for raising lp:888547, so this is no longer a problem.

Changed in linaro-android-build-tools:
status: Triaged → Fix Released