Gearman

Activity log for bug #1397445

Date	Who	What changed	Old value	New value	Message
2014-11-28 21:30:03	Berend Ozceri	bug			added bug
2014-11-28 21:30:43	Berend Ozceri	description	Workers that have notified the jobs server of their intention to go to sleep via the PRESLEEP command are woken up when a new epoch job is added to the queue, rather than when the job becomes runnable (due to reaching its scheduled epoch). This can create a scenario (on otherwise-quiet queues) where an epoch job that's schedule to run 1<=N<60 seconds from "now" may get stuck in the queue up to 60 seconds. This occurs because when the worker is woken up via the NOOP and issues a GRAB_JOB command, there are no eligible jobs (the job is in the queue, but it's epoch is >=1 second in the future), so it receives a NO_JOB response, and promptly goes back to sleep. When the epoch time of the job arrives, there's no communication from the server to the worker to wake it up, so the worker continues sleeping until its "sleep timeout," at which point it issues a GRAB_JOB and find the runnable job. It feels like in addition to sending NOOPs to workers that are sleeping at the time jobs are added, the same has to be done when epochs of otherwise ineligible jobs are reached.	Workers that have notified the jobs server of their intention to go to sleep via the PRESLEEP command are woken up when a new epoch job is added to the queue, rather than when the job becomes runnable (due to reaching its scheduled epoch). This can create a scenario (on otherwise-quiet queues) where an epoch job that's schedule to run 1<=N<60 seconds from "now" may get stuck in the queue up to 60 seconds. This occurs because when the worker is woken up via the NOOP and issues a GRAB_JOB command, there are no eligible jobs (the job is in the queue, but its epoch is >=1 second in the future), so it receives a NO_JOB response, and promptly goes back to sleep. 
When the epoch time of the job arrives, there's no communication from the server to the worker to wake it up, so the worker continues sleeping until its "sleep timeout," at which point it issues a GRAB_JOB and find the runnable job. It feels like in addition to sending NOOPs to workers that are sleeping at the time jobs are added, the same has to be done when epochs of otherwise ineligible jobs are reached.
2014-11-28 21:31:16	Berend Ozceri	description	Workers that have notified the jobs server of their intention to go to sleep via the PRESLEEP command are woken up when a new epoch job is added to the queue, rather than when the job becomes runnable (due to reaching its scheduled epoch). This can create a scenario (on otherwise-quiet queues) where an epoch job that's schedule to run 1<=N<60 seconds from "now" may get stuck in the queue up to 60 seconds. This occurs because when the worker is woken up via the NOOP and issues a GRAB_JOB command, there are no eligible jobs (the job is in the queue, but its epoch is >=1 second in the future), so it receives a NO_JOB response, and promptly goes back to sleep. When the epoch time of the job arrives, there's no communication from the server to the worker to wake it up, so the worker continues sleeping until its "sleep timeout," at which point it issues a GRAB_JOB and find the runnable job. It feels like in addition to sending NOOPs to workers that are sleeping at the time jobs are added, the same has to be done when epochs of otherwise ineligible jobs are reached.	Workers that have notified the jobs server of their intention to go to sleep via the PRESLEEP command are woken up when a new epoch job is added to the queue, rather than when the job becomes runnable (due to reaching its scheduled epoch). This can create a scenario (on otherwise-quiet queues) where an epoch job that's schedule to run 1<=N<60 seconds from "now" may get stuck in the queue up to 60 seconds. This occurs because when the worker is woken up via the NOOP and issues a GRAB_JOB command, there are no eligible jobs (the job is in the queue, but its epoch is >=1 second in the future), so the worker receives a NO_JOB response, and promptly goes back to sleep. 
When the epoch time of the job arrives, there's no asynchronous communication from the server to the worker to wake it up, so the worker continues sleeping until its "sleep timeout," at which point it issues a GRAB_JOB and find the runnable job. It feels like in addition to sending NOOPs to workers that are sleeping at the time jobs are added, the same has to be done when epochs of otherwise ineligible jobs are reached.
2014-11-28 21:32:02	Berend Ozceri	summary	Premature NOOP wake-up for epoch jobs	Premature/missing NOOP wake-up for epoch jobs
2014-12-09 06:36:26	chjgcn	bug			added subscriber chjgcn
2015-01-20 08:10:13	yunfei	bug			added subscriber yunfei
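
The timing described in the bug description can be illustrated with a small model. This is a minimal sketch, not Gearman code: the function name `pickup_time`, its parameters, and the `wake_on_epoch` flag are hypothetical, and the 60-second sleep timeout is an assumption taken from the figure quoted in the report. It models one sleeping worker on an otherwise quiet queue and returns the time at which that worker finally grabs the epoch job.

```python
def pickup_time(submit_time, epoch, sleep_timeout=60, wake_on_epoch=False):
    """Return the time (in seconds) at which a sleeping worker grabs an epoch job.

    Hypothetical model of the behaviour described in the report:
    - the worker has sent PRESLEEP and is asleep at submit_time
    - the server sends NOOP when the job is added, so the worker
      issues GRAB_JOB at submit_time
    - if the job's epoch has not been reached, the worker gets NO_JOB
      and goes back to sleep
    """
    now = submit_time
    if epoch <= now:
        # The job is already runnable, so the first GRAB_JOB succeeds.
        return now

    if wake_on_epoch:
        # Behaviour suggested in the report (assumption): the server also
        # sends NOOP when the epoch is reached, so the worker grabs the
        # job at that moment.
        return epoch

    # Reported behaviour: nothing wakes the worker at the epoch, so it
    # only retries GRAB_JOB each time its sleep timeout expires.
    while now < epoch:
        now += sleep_timeout
    return now


if __name__ == "__main__":
    for n in (1, 5, 59):
        reported = pickup_time(0, n)
        suggested = pickup_time(0, n, wake_on_epoch=True)
        print(f"epoch in {n}s: extra delay {reported - n}s reported, "
              f"{suggested - n}s with a wake-up at the epoch")
```

Under these assumptions, a job scheduled 1 second ahead waits an extra 59 seconds, which matches the report's "up to 60 seconds" estimate for 1<=N<60.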