ci.linaro.org AMI, etc. setup started to get disorganized and bitrot

Bug #1325604 reported by Paul Sokolovsky
Affects: Linaro CI
Status: Confirmed
Importance: High
Assigned to: Unassigned
Milestone: (none)

Bug Description

While working on the ci.linaro.org migration, I find many aspects of its current Jenkins setup, both global and per-job, to be unclear. Many questions were communicated to Fathi; some of them are still unanswered.

Some jobs have been failing for a prolonged time (and affect the stability of the slaves they run on, e.g. causing them to hang), and are apparently not being looked at. My attempts to debug them yield little because of the issues above: I don't see an easy and complete picture; instead it's pretty complicated, if not to say obfuscated.

My attempts to dig in and figure this out show a divergence from some previously agreed maintenance principles. For example, we agreed that we maintain custom, quickly launchable AMIs, whose configuration is kept well organized by linaro-ami-tool. However, looking at the "Precise-64 3exec" build slave configuration, I see that its init script has grown to 70 lines, with lots of stuff being installed on top of the custom AMI (which, per the convention mentioned, should really go into the AMI itself). In particular, that is the place where "linaro-cp.py" appears to be installed; Fathi mentioned it previously, but I couldn't find where it was installed (nor did Fathi give a clear pointer). Besides that, there are currently 11 different slave configurations, which is a bit too many to maintain well and problem-free.

This situation (largely) complicates the migration and complicates debugging issues like lp:1324882 or the recently reported "session timeout too short" issue. Unfortunately, I don't have many suggestions for how to improve the situation, besides redoing the slave setup to get it under control, redoing the job configs to be explicit, clear, and standalone, etc. Of course, that would take a lot of effort and time, which again conflicts with the aim of migrating ci.l.o to Ubuntu 14.04.

Changed in linaro-ci:
importance: Undecided → High
status: New → Confirmed
Fathi Boudra (fboudra) wrote :

* only one question is left and is related to linaro-cp
* All the additions on top of the custom AMI are hot fixes, documented in init, and definitely not 70 lines
* linaro-cp is used on a dedicated slave, hence not on the AMI
* session timeout is too short and is unrelated to any of the issues raised above

So if you feel that the AMI setup is starting to be disorganized, it's because you aren't in the loop on some of the changes.

Chase Qi (chase-qi) wrote :