Cannot cancel a job from the Gearman queue

Bug #1150071 reported by Khai Do
Affects: Gearman
Status: Fix Released
Importance: Medium
Assigned to: Brian Aker

Bug Description

I have a use case where I would like to cancel jobs from the gearman queue. However, there is no way to do this right now.

Use case:
My client will place the jobs on the gearman queue with an associated UUID. Some jobs take a long time to run, so the gearman server can get backed up with many jobs still on its queue. Sometimes I'll want to cancel a job after placing it on the gearman queue because I've determined that it no longer needs to be run, since some previous job that it depended on failed.

Please add the ability to cancel jobs from the queue. Thanks.

Tags: queue server

Clint Byrum (clint-fewbar) wrote : Re: [Bug 1150071] [NEW] Cannot cancel a job from the Gearman queue

Excerpts from Khai Do's message of 2013-03-06 18:18:14 UTC:
> Public bug reported:
>
> I have a use case where I would like to cancel jobs from the gearman
> queue. However there is no way to do this right now.
>
> Use case:
> My client will place the jobs on the gearman queue with an associated UUID. Some jobs take a long time to run so the gearman server can get backed up with many jobs still on its queue. Sometimes I'll want to cancel a job after placing it on the gearman queue because I've determined that it no longer needs to be run because some previous job that it depended on failed.
>
> Please add the ability to cancel jobs from the queue. Thanks.

IMO this is not necessary. Any time something can be done in workers,
it will be more scalable to do it there.

Simply have a sentinel check on a known distributed location (like a
memcache or redis key) in your workers.

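function do_thing($job)
{
    // Assumes $memcache_client is a connected client created elsewhere.
    global $memcache_client;

    // Decode the workload as an associative array so ['id'] works.
    $id = json_decode($job->workload(), true)['id'];

    // Skip the job if the dispatcher has flagged it as cancelled.
    if ($memcache_client->get("cancelled_$id")) {
        return;
    }
    // .. do work
}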

Then instead of needing to reach into gearmand to cancel a job (locking
the queue, searching it, and removing the job), you are just poking a
highly read-scalable external service, entirely out of band of gearmand.
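
A minimal sketch of both sides of this sentinel pattern, written in Python for brevity. `FlagStore` is a hypothetical in-memory stand-in for memcache or redis, and `cancel_job` and the "done"/"skipped" return values are names made up for illustration; a real worker would be registered through a Gearman client library rather than called directly.

```python
import json


class FlagStore:
    """In-memory stand-in for memcache/redis (hypothetical; swap in a real client)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        # Returns None when the key was never set, like a cache miss.
        return self._data.get(key)


def cancel_job(store, job_id):
    """Dispatcher side: flag a queued job as cancelled, out of band of gearmand."""
    store.set(f"cancelled_{job_id}", True)


def do_thing(store, workload):
    """Worker side: check the sentinel before doing any work."""
    job_id = json.loads(workload)["id"]
    if store.get(f"cancelled_{job_id}"):
        return "skipped"
    # .. do work
    return "done"


store = FlagStore()
cancel_job(store, "job-2")  # job-2 is still queued, but no longer needed
results = [do_thing(store, json.dumps({"id": i})) for i in ("job-1", "job-2")]
```

Note that the cancelled job still travels through the queue and reaches a worker; it is only skipped at the last moment, which is the trade-off this approach makes to keep gearmand itself untouched.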

James E. Blair (corvus) wrote :

That's a good suggestion and may be appropriate for some environments.

In our environment, I would rather not add a dependency on an otherwise unrelated service. In addition to needing to deploy memcache (or anything else) just for this task, both the dispatcher and workers would need to be configured to talk to this fourth system. Effectively there would be two paths of communication between the dispatcher and workers: gearman and memcache.

I can certainly see that in some extremely high-throughput situations, your solution would be an ideal and efficient implementation. But in our environment we're trying to build pluggable systems that talk to each other over a common protocol, and in this case, the protocol simply lacks one of the ideas that we need to express. I think the best solution to that is to add it, and if you choose to ignore it in favor of a different solution, that option will remain.

Clint Byrum (clint-fewbar) wrote : Re: [Bug 1150071] Re: Cannot cancel a job from the Gearman queue

Excerpts from James E. Blair's message of 2013-03-06 22:22:22 UTC:
> That's a good suggestion and may be appropriate for some environments.
>
> In our environment, I would rather not add a dependency on an otherwise
> unrelated service. In addition to needing to deploy memcache (or
> anything else) just for this task, both the dispatcher and workers would
> need to be configured to talk to this fourth system. Effectively there
> would be two paths of communication between the dispatcher and workers:
> gearman and memcache.
>
> I can certainly see that in some extremely high-throughput situations,
> your solution would be an ideal and efficient implementation. But in
> our environment we're trying to build pluggable systems that talk to
> each other over a common protocol, and in this case, the protocol simply
> lacks one of the ideas that we need to express. I think the best
> solution to that is to add it, and if you choose to ignore it in favor
> of a different solution, that option will remain.

Gearman doesn't make much sense in low throughput use cases IMO.

Perhaps you want AMQP (RabbitMQ, ActiveMQ, or QPID) or Redis. They're
more flexible and thus might be a better choice for your needs, and both
will support this use case.

Brian Aker (brianaker) wrote :

This comes up from time to time (and is useful for testing).

The solution is to allow KILL HANDLE and KILL UNIQUE.

The simple solution for this is to close() the connection to the worker, forcing it to drop its work and reconnect.

Brian Aker (brianaker) wrote :

Sorry, didn't finish my thought.

The kill would happen if the job was already being processed (i.e. there is no reason for it to continue).

Brian Aker (brianaker)
Changed in gearmand:
status: New → In Progress
importance: Undecided → Medium
milestone: none → 1.0.4
Brian Aker (brianaker)
Changed in gearmand:
status: In Progress → Fix Committed
milestone: 1.0.4 → 1.1.6
Brian Aker (brianaker)
Changed in gearmand:
status: Fix Committed → Fix Released