Admin function to cancel unique job ids

Bug #1339730 reported by Ricardo Branco
This bug affects 2 people
Affects: Gearman
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: none listed

Bug Description

We have encountered a problem where unique jobs remain stuck in a queue after the worker function was removed before the jobs completed.
When a worker is dropped from Gearman, all the unique jobs that were assigned to it should be removed.

Also, when using --show-unique-jobs, could the output also show which function each job is related to?

Tags: gearadmin
Revision history for this message
chjgcn (chjgcn) wrote :

After applying this patch file to version 1.1.12, you can cancel jobs by function name, job handle, or job unique ID:
    gearadmin -S --cancel-jobs - TEST0,TEST1,TEST2
    gearadmin -S --cancel-jobs H:Linux:4,H:Linux:43,H:Linux:42,H:Linux:47
    gearadmin -S --cancel-unique-jobs - TEST0,TEST1,TEST2
    gearadmin -S --cancel-unique-jobs e919816c-2838-11e4-b83c-6c71d98bafd2,e9196c22-2838-11e4-b83c-6c71d98bafd2
You can also show jobs by function name, job handle, or job unique ID:
    gearadmin -S --show-jobs - TEST0,TEST1,TEST2
    gearadmin -S --show-jobs H:Linux:4,H:Linux:43,H:Linux:42,H:Linux:47
    gearadmin -S --show-unique-jobs - TEST0,TEST1,TEST2
    gearadmin -S --show-unique-jobs e919816c-2838-11e4-b83c-6c71d98bafd2,e9196c22-2838-11e4-b83c-6c71d98bafd2
The show-job(s) and show-unique-job(s) commands output these columns, respectively:
    job_handle, retries, is_ignored, is_queued, when_to_run, priority, is_running, numerator, denominator, unique, function
    unique, retries, is_ignored, is_queued, when_to_run, priority, is_running, numerator, denominator, job_handle, function
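To make the two column orders concrete, here is a minimal sketch of a parser that maps one output line to named fields. The column names come from the lists above; the tab delimiter and the helper name `parse_job_line` are assumptions for illustration, since the thread does not state how the fields are separated.

```python
# Column orders as listed in the comment above.
SHOW_JOBS_COLUMNS = [
    "job_handle", "retries", "is_ignored", "is_queued", "when_to_run",
    "priority", "is_running", "numerator", "denominator", "unique", "function",
]
SHOW_UNIQUE_JOBS_COLUMNS = [
    "unique", "retries", "is_ignored", "is_queued", "when_to_run",
    "priority", "is_running", "numerator", "denominator", "job_handle", "function",
]


def parse_job_line(line, columns, sep="\t"):
    """Map one output line to a dict keyed by column name.

    ASSUMPTION: fields are separated by `sep` (a tab by default); the
    real delimiter used by the patched gearadmin is not stated here.
    """
    fields = line.rstrip("\n").split(sep)
    if len(fields) != len(columns):
        raise ValueError(f"expected {len(columns)} fields, got {len(fields)}")
    return dict(zip(columns, fields))
```

Pass `SHOW_JOBS_COLUMNS` for show-jobs output and `SHOW_UNIQUE_JOBS_COLUMNS` for show-unique-jobs output; only the positions of `job_handle` and `unique` differ between the two.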

The patch file also contains other bug fixes and enhancements, covering areas such as SSL connections, epoch jobs, the HTTP protocol, and the MySQL queue.

chjgcn (chjgcn) wrote :

The newer patch file is at
    https://bugs.launchpad.net/gearmand/+bug/1348865/comments/6
I forgot to add the <cstring> header to
    libgearman/interface/universal.hpp

chjgcn (chjgcn) wrote :

Today I posted a new patch file that adds new gearadmin functions, such as configuring the server and functions and cleaning and restoring canceled jobs, and that fixes a bug in dropping functions.

chjgcn (chjgcn) wrote :

"drop-functions" can drop all functions.

chjgcn (chjgcn) wrote :

All configuration options are now implemented through gearmand and gearadmin, and some small bugs are fixed. For example, when a client/worker disconnected from the server, the second-to-last line used to be:
        Gear connection disconnected: -:-
Now the line displays the host and port of the client/worker. Another example is the comparison between max_queue_size and job_total.

The commands 'gearmand --help' and 'gearadmin --help' show all the options and their meanings.

chjgcn (chjgcn) wrote :

Two months ago, I found a bug in how gearadmin receives text output from gearmand. If the second part of the text output is larger than 8192 bytes, the remainder beyond 8192 bytes is not read or printed. This patch file fixes the bug.
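The truncation described above is the usual result of doing a single fixed-size read on a stream. The following is a minimal sketch of the difference between one read and a read loop; it is not the code from the patch. The function names are hypothetical, and the sketch assumes the text response ends with a line holding a single ".", with an in-memory stream standing in for the connection.

```python
import io

CHUNK_SIZE = 8192  # buffer size mentioned in the comment above


def read_once(stream):
    """Buggy pattern: one fixed-size read drops everything past CHUNK_SIZE."""
    return stream.read(CHUNK_SIZE)


def read_until_terminator(stream, terminator=b".\n"):
    """Keep reading chunks until the response ends with the terminator line.

    ASSUMPTION: the response is terminated by a line containing only "."
    (the `terminator` default); adjust it if the server behaves differently.
    """
    data = bytearray()
    while True:
        chunk = stream.read(CHUNK_SIZE)
        if not chunk:  # end of stream: nothing more to read
            break
        data.extend(chunk)
        if data == terminator or data.endswith(b"\n" + terminator):
            break
    return bytes(data)


# Usage with an in-memory stream standing in for the server connection.
payload = b"".join(b"line %d\n" % i for i in range(2000)) + b".\n"
truncated = read_once(io.BytesIO(payload))
complete = read_until_terminator(io.BytesIO(payload))
```

Here `truncated` holds only the first 8192 bytes, while `complete` holds the whole response.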

chjgcn (chjgcn) wrote :

I made a mistake in the patch file at line 3495. This line,
    while (worker != job->function->worker_list && (worker_wakeup == 0 || worker_wakeup < noop_sent));
should be
    while (worker != job->function->worker_list && (worker_wakeup == 0 || worker_wakeup > noop_sent));
Thanks to yunfei!
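To see why the direction of the comparison matters, here is a simplified model of that do/while condition. It is only a sketch under stated assumptions, not the gearmand code: it assumes the loop walks a circular worker list once, that every worker visited receives one NOOP, that noop_sent counts the NOOPs sent so far, and that worker_wakeup == 0 means "wake all workers". The function name `count_noops` is hypothetical.

```python
def count_noops(num_workers, worker_wakeup, corrected=True):
    """Return how many NOOPs the modeled loop sends.

    ASSUMPTIONS: num_workers >= 1, every worker is asleep and gets one
    NOOP, and index 0 plays the role of job->function->worker_list.
    """
    noop_sent = 0
    index = 0
    while True:  # emulates do { ... } while (condition);
        noop_sent += 1
        index = (index + 1) % num_workers  # advance around the circular list
        if corrected:
            within_limit = worker_wakeup == 0 or worker_wakeup > noop_sent
        else:
            within_limit = worker_wakeup == 0 or worker_wakeup < noop_sent
        if not (index != 0 and within_limit):
            break
    return noop_sent
```

In this model, with 5 workers and worker_wakeup set to 3, the corrected condition sends 3 NOOPs, while the original `<` comparison stops after the first one.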

chjgcn (chjgcn) wrote :

After several days of work in my spare time, I have added more configuration options and more timers for the server and functions. The server now has its own timer, and each job has a new property named 'skip_job', which indicates whether the job is skipped; a job can be skipped or turned back through gearadmin.
The commands 'gearmand --help' and 'gearadmin --help' show all the options and their meanings.

chjgcn (chjgcn) wrote :

There may be a bug related to 'timer_delete', which the system calls when a timer is still alive after gearmand is killed.
In this patch file, these timers are deleted when the gearman_server_options_st object is destroyed.
This patch file also fixes the bug at
        https://bugs.launchpad.net/gearmand/+bug/1390672
