Activity log for bug #1397445

Date Who What changed Old value New value Message
2014-11-28 21:30:03 Berend Ozceri bug added bug
2014-11-28 21:30:43 Berend Ozceri description
Old value: Workers that have notified the jobs server of their intention to go to sleep via the PRESLEEP command are woken up when a new epoch job is added to the queue, rather than when the job becomes runnable (due to reaching its scheduled epoch). This can create a scenario (on otherwise-quiet queues) where an epoch job that's schedule to run 1<=N<60 seconds from "now" may get stuck in the queue up to 60 seconds. This occurs because when the worker is woken up via the NOOP and issues a GRAB_JOB command, there are no eligible jobs (the job is in the queue, but it's epoch is >=1 second in the future), so it receives a NO_JOB response, and promptly goes back to sleep. When the epoch time of the job arrives, there's no communication from the server to the worker to wake it up, so the worker continues sleeping until its "sleep timeout," at which point it issues a GRAB_JOB and find the runnable job. It feels like in addition to sending NOOPs to workers that are sleeping at the time jobs are added, the same has to be done when epochs of otherwise ineligible jobs are reached.
New value: Workers that have notified the jobs server of their intention to go to sleep via the PRESLEEP command are woken up when a new epoch job is added to the queue, rather than when the job becomes runnable (due to reaching its scheduled epoch). This can create a scenario (on otherwise-quiet queues) where an epoch job that's schedule to run 1<=N<60 seconds from "now" may get stuck in the queue up to 60 seconds. This occurs because when the worker is woken up via the NOOP and issues a GRAB_JOB command, there are no eligible jobs (the job is in the queue, but its epoch is >=1 second in the future), so it receives a NO_JOB response, and promptly goes back to sleep.
When the epoch time of the job arrives, there's no communication from the server to the worker to wake it up, so the worker continues sleeping until its "sleep timeout," at which point it issues a GRAB_JOB and find the runnable job. It feels like in addition to sending NOOPs to workers that are sleeping at the time jobs are added, the same has to be done when epochs of otherwise ineligible jobs are reached.
2014-11-28 21:31:16 Berend Ozceri description
Old value: Workers that have notified the jobs server of their intention to go to sleep via the PRESLEEP command are woken up when a new epoch job is added to the queue, rather than when the job becomes runnable (due to reaching its scheduled epoch). This can create a scenario (on otherwise-quiet queues) where an epoch job that's schedule to run 1<=N<60 seconds from "now" may get stuck in the queue up to 60 seconds. This occurs because when the worker is woken up via the NOOP and issues a GRAB_JOB command, there are no eligible jobs (the job is in the queue, but its epoch is >=1 second in the future), so it receives a NO_JOB response, and promptly goes back to sleep. When the epoch time of the job arrives, there's no communication from the server to the worker to wake it up, so the worker continues sleeping until its "sleep timeout," at which point it issues a GRAB_JOB and find the runnable job. It feels like in addition to sending NOOPs to workers that are sleeping at the time jobs are added, the same has to be done when epochs of otherwise ineligible jobs are reached.
New value: Workers that have notified the jobs server of their intention to go to sleep via the PRESLEEP command are woken up when a new epoch job is added to the queue, rather than when the job becomes runnable (due to reaching its scheduled epoch). This can create a scenario (on otherwise-quiet queues) where an epoch job that's schedule to run 1<=N<60 seconds from "now" may get stuck in the queue up to 60 seconds. This occurs because when the worker is woken up via the NOOP and issues a GRAB_JOB command, there are no eligible jobs (the job is in the queue, but its epoch is >=1 second in the future), so the worker receives a NO_JOB response, and promptly goes back to sleep.
When the epoch time of the job arrives, there's no asynchronous communication from the server to the worker to wake it up, so the worker continues sleeping until its "sleep timeout," at which point it issues a GRAB_JOB and find the runnable job. It feels like in addition to sending NOOPs to workers that are sleeping at the time jobs are added, the same has to be done when epochs of otherwise ineligible jobs are reached.
2014-11-28 21:32:02 Berend Ozceri summary
Old value: Premature NOOP wake-up for epoch jobs
New value: Premature/missing NOOP wake-up for epoch jobs
2014-12-09 06:36:26 chjgcn bug added subscriber chjgcn
2015-01-20 08:10:13 yunfei bug added subscriber yunfei
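
The timing described in the bug report can be illustrated with a small model. This is only a sketch under stated assumptions, not server or worker code: the function name `pickup_time`, its parameters, and the constant `SLEEP_TIMEOUT` are hypothetical, and the 60-second timeout is taken from the figure quoted in the description. It models a single sleeping worker on an otherwise quiet queue and compares the reported behaviour (NOOP only when the job is added) with the proposed one (an additional NOOP when the job's epoch is reached).

```python
# Assumed worker sleep timeout in seconds, matching the 60 s quoted in the report.
SLEEP_TIMEOUT = 60


def pickup_time(submit_at, run_at, wake_on_epoch):
    """Return the time at which a sleeping worker finally grabs an epoch job.

    submit_at     -- time the epoch job is added to the queue
    run_at        -- the job's scheduled epoch (when it becomes runnable)
    wake_on_epoch -- False: reported behaviour (NOOP only on job add)
                     True:  proposed behaviour (extra NOOP at the epoch)
    """
    # The server sends NOOP when the job is added, so the worker wakes up
    # and issues GRAB_JOB at submit time.
    now = submit_at
    while now < run_at:
        # The job is queued but not yet eligible: the worker gets NO_JOB
        # and goes back to sleep.
        if wake_on_epoch:
            # Proposed: the server sends another NOOP when the epoch arrives.
            now = run_at
        else:
            # Reported: nothing wakes the worker until its sleep timeout expires.
            now += SLEEP_TIMEOUT
    # The job is runnable at this GRAB_JOB, so the worker receives it.
    return now


if __name__ == "__main__":
    for delay in (1, 30, 59):
        reported = pickup_time(0, delay, wake_on_epoch=False)
        proposed = pickup_time(0, delay, wake_on_epoch=True)
        print(f"epoch in {delay:2d}s: reported pickup at {reported}s, proposed at {proposed}s")
```

In this model, a job scheduled 10 seconds after submission is picked up 60 seconds after submission under the reported behaviour, and 10 seconds after submission if the server also sends a NOOP when the epoch is reached.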