Comment 4 for bug 1220414

Neil Williams (codehelp) wrote :

Further tests on community.validation.linaro.org show that a change of Priority would be insufficient. At the point where the code was calculating which devices could be assigned to which jobs, there were *no* health check jobs in the list returned by the status=TestJob.SUBMITTED filter. So the dispatcher, running lava_scheduler_daemon and with write access to the DB, saw that the devices were IDLE and assigned the MultiNode jobs. The very next run of the Refreshing Jobs() loop showed the health checks, but by then it was too late: the greedy scheduler model had already assigned jobs to the devices, just as it should.

So the device needs to be marked such that the first job assigned to it *must* be a health check. A new health status of HEALTH_ASSIGN is proposed. _fix_device will refuse to assign a job to a device in health state Looping or Assign unless that job is a health check. Once the health check completes, the health status is updated and normal scheduling proceeds.
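
As a rough illustration of the proposed rule (not the actual lava_scheduler_app code), the gating could look something like the sketch below. The Device and Job classes, the string values of the health constants, and the helper names may_assign and complete_health_check are all hypothetical stand-ins; only HEALTH_ASSIGN and the Looping/Assign states come from the proposal above.

```python
from dataclasses import dataclass

# Hypothetical constants: the real model's values are not shown in this comment.
HEALTH_UNKNOWN = "unknown"
HEALTH_PASS = "pass"
HEALTH_FAIL = "fail"
HEALTH_LOOPING = "looping"
HEALTH_ASSIGN = "assign"  # proposed: next job on this device must be a health check


@dataclass
class Device:
    """Hypothetical stand-in for a scheduler device record."""
    hostname: str
    health_status: str = HEALTH_UNKNOWN


@dataclass
class Job:
    """Hypothetical stand-in for a submitted test job."""
    name: str
    is_health_check: bool = False


def may_assign(device: Device, job: Job) -> bool:
    """Return True if the job may be assigned to the device.

    Devices in the Looping or Assign health states accept only
    health check jobs; all other devices accept any job.
    """
    if device.health_status in (HEALTH_LOOPING, HEALTH_ASSIGN):
        return job.is_health_check
    return True


def complete_health_check(device: Device, passed: bool) -> None:
    """Update the health status once the health check has finished,
    so that normal scheduling can proceed."""
    device.health_status = HEALTH_PASS if passed else HEALTH_FAIL
```

With a check like this in place, an IDLE device marked HEALTH_ASSIGN would be skipped when the greedy scheduler looks for somewhere to run a MultiNode job, even if no health check job appears yet in the submitted list for that pass.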