ir.cron - simultaneous start of cron jobs

Bug #715418 reported by Ferdinand
Affects: Odoo Server (MOVED TO GITHUB)
Status: Fix Released
Importance: Wishlist
Assigned to: OpenERP's Framework R&D

Bug Description

We have not checked if this is fixed in v6

Batch-jobs defined via ir_cron may be started simultaneously/concurrently.
(We have seen up to 17 identical executions of the same job)

/base/ir/ir_cron.py starts a timer (via netsvc.startTimer) without
checking whether a timer has already been started for the same point in
time (and without removing such simultaneous timers).
Hence, ir_cron._poolJobs will be invoked several times simultaneously.

I suggest maintaining a list of timers in /base/ir/ir_cron.py that
inhibits duplicate timers.
Alternatively, "duplicate" timers could be removed, although this might
be difficult to implement.
Such a mechanism could also be implemented within netsvc.startTimer; on
the other hand, simultaneous jobs can also be useful (outside ir_cron).

When are jobs "simultaneous"?
I suggest implementing a "time-granularity" that is less than or equal to
the smallest granularity of ir_cron jobs, i.e. "minutes".
To be on the safe side for long-running jobs, the time-granularity could
also be the shortest interval currently defined via ir_cron (e.g. 10
minutes).
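
A minimal sketch of the suggested timer list, assuming Python's standard threading.Timer rather than netsvc.startTimer (whose internals are not shown here). The names TimerRegistry, slot_of and start_timer are hypothetical. Two start times that fall into the same granularity slot count as "simultaneous", and the second request is refused:

```python
import threading


class TimerRegistry:
    """Hypothetical registry that refuses a second timer for the same time slot."""

    def __init__(self, granularity=60):
        self.granularity = granularity  # slot width in seconds (one minute by default)
        self._slots = {}                # (key, slot number) -> threading.Timer
        self._lock = threading.Lock()

    def slot_of(self, when):
        # Two timestamps in the same slot count as "simultaneous".
        return int(when // self.granularity)

    def start_timer(self, key, when, now, func):
        """Start func at `when` unless a timer for the same key and slot exists.

        Returns True if a timer was started, False if it was a duplicate.
        """
        slot = (key, self.slot_of(when))
        with self._lock:
            if slot in self._slots:
                return False
            timer = threading.Timer(max(0.0, when - now), self._fire, args=(slot, func))
            timer.daemon = True
            self._slots[slot] = timer
        timer.start()
        return True

    def _fire(self, slot, func):
        try:
            func()
        finally:
            # The slot stays occupied until the job has finished.
            with self._lock:
                self._slots.pop(slot, None)
```

Setting granularity to the shortest ir_cron interval (e.g. 600 seconds) gives the more conservative behaviour mentioned above for long-running jobs.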

Any ideas?

Gerhard Könighofer (gerhard-koenighofer) wrote :

V6 has the same problem.

Nicolas Bessi - Camptocamp (nbessi-c2c-deactivatedaccount) wrote :

Hello,

In v5 we wrote a module named c2c_cron_audittrail that fixes these problems: instances of the "same cron" are never allowed to run concurrently, while different crons are really threaded. The module can easily be ported to v6, but it would be better to integrate it into the core of the 6.1 series.

Regards

Nicolas

Gerhard Könighofer (gerhard-koenighofer) wrote :

To bring some objectivity to the discussion of ir_cron's behavior, I have developed a very simple test (see attachment "ir_cron_test.py") together with the expected results (see attachment "serverlog.txt").

First, we have to agree that the given test should produce the given results.

If we agree, then I have to report that neither V5 nor V6 nor c2c_cron_audittrail produces those results (each of them shows different results).

This test focuses on concurrency, timing and priority of cron jobs.
The test consists of two cron jobs that run every minute ("frequent_quick") and three cron jobs that run every ten minutes ("rare_slow") and last 270 seconds each. The latter have the same starting time but different priorities.

Nicolas Bessi - Camptocamp (nbessi-c2c-deactivatedaccount) wrote :

The problem is that it is not possible to launch several crons simultaneously in the current version, as the crons are not really threaded. Only the "master cron" is threaded.

c2c_cron_audittrail provides threaded, concurrent crons, but the way it ensures uniqueness is based on model/function instead of cron id, which is too limiting: it does not allow the same cron to run with different args. Another point, where Gerhard's remarks are also right, is that neither OpenERP's base cron management nor c2c_cron_audittrail allows the same cron to run concurrently. This is a useful feature.

We should link this bug on the expert-framework mailing list.

Nicolas

Gerhard Könighofer (gerhard-koenighofer) wrote :
Changed in openobject-server:
assignee: nobody → OpenERP's Framework R&D (openerp-dev-framework)
status: New → Triaged
Ferdinand (office-chricar) wrote :

May we give our ideas:

********** Mr. Könighofer's input ****

The OpenERP-ir-cron behavior is - despite the similar name -
totally different from Unix-cron behavior.
This is neither documented nor expected by the user.

Short explanation:
Unix-cron:
  A job is

    * started at a given point in time,
    * executed with a given (OS-)priority (which indicates which portion
      of CPU-time it will receive) and
    * rerun after the specified interval

This means that jobs run concurrently if their execution intervals
overlap.

OpenERP-ir-cron:
After a schedule-event, which occurs at least once per hour, a list of
jobs will be executed.
The list of jobs contains all jobs, which have their starting-time
("nextcall") expired.
The list of jobs is sorted according to their priority (which has
nothing to do with their CPU-time portion).
After the list of jobs has finished, a new schedule-event is calculated
(V5 and V6 differ in this calculation).
It looks like concurrency is sought to be avoided (a current bug-report
shows that this is not achieved:
https://bugs.launchpad.net/openobject-server/+bug/715418?comments=all).

This means that the starting time and the interval of a single job
depend on the other jobs (their priority and runtime).
In the presence of other jobs, you can never know when a certain job will
be executed.
This is unusable for many applications that we can imagine.
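
To illustrate the dependency described above, here is a small simulation; it is not OpenERP code. It assumes that due jobs run strictly one after another, ordered by priority with the lower number first. The job names and the 270-second duration come from the test described earlier in this thread; the 1-second duration and the priority values are made up.

```python
def run_sequential(jobs, now=0):
    """Simulate one schedule event: run all due jobs one after another.

    `jobs` is a list of (name, priority, nextcall, duration) tuples with
    times in seconds. Returns {name: actual_start_time}.
    """
    due = [job for job in jobs if job[2] <= now]
    due.sort(key=lambda job: job[1])  # assumption: lower number runs first
    clock = now
    starts = {}
    for name, _priority, _nextcall, duration in due:
        starts[name] = clock
        clock += duration  # the next job waits until this one has finished
    return starts


jobs = [
    ("rare_slow", 1, 0, 270),
    ("frequent_quick", 5, 0, 1),
]
print(run_sequential(jobs))  # {'rare_slow': 0, 'frequent_quick': 270}
```

Although both jobs are due at second 0, the quick job only starts after 270 seconds because it has to wait for the slow one.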

Before entering into a discussion about design- and
implementation-details, the requirements (=the behavior) shall be
discussed and specified.

   1. If the current OpenERP-ir-cron behavior shall be kept (for
      whatever reason) the name shall be changed, as it is maliciously
      misleading.
   2. There shall be a module that behaves similarly to a Unix-cron
      (concurrent execution at a precise interval and point in time).

There may be additional constraints on concurrency, but they can only be
discussed after 1) and 2) are decided.

Design ideas for a Unix-cron-like module (ignore the following
paragraphs before the requirements are fixed):
  - concurrent jobs may be started using threads (Python module:
"threading").
  - threads allow a simple communication with the "main-thread" (e.g. a
GUI-refresh after a job has finished)
  - threads shall be "daemons" (i.e. their life-span depends on the
"main-thread").
  - there is no "priority", as threads do not support OS-priority (i.e.
CPU-time portion) and the start-time overrules any priority (i.e.
sequence) rules.
  - a time-granularity is needed for starting the "actual" job. This
time-granularity shall be smaller than the shortest cron interval (i.e.
1 minute)
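
The thread-based design ideas above might be sketched as follows. This is an illustration under assumptions, not code from OpenERP: start_due_jobs and the layout of the job dictionary are hypothetical, and a caller is expected to invoke the function once per time-granularity tick.

```python
import threading


def start_due_jobs(jobs, now):
    """Start every job whose nextcall has passed, each in its own daemon thread.

    `jobs` maps a job name to a dict with "nextcall", "interval" (both in
    seconds) and "func". Returns the list of started threads.
    """
    started = []
    for name, job in jobs.items():
        if job["nextcall"] > now:
            continue
        # Advance by whole intervals so that the schedule does not drift
        # when a tick arrives late.
        missed = (now - job["nextcall"]) // job["interval"] + 1
        job["nextcall"] += missed * job["interval"]
        # Daemon threads end together with the main thread.
        thread = threading.Thread(target=job["func"], name=name, daemon=True)
        thread.start()
        started.append(thread)
    return started
```

Because every due job gets its own thread, a slow job no longer delays the start of the others; there is deliberately no priority field.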

Alternatively, instead of "light-weight" threads, "heavy-weight" tasks
may be designed (e.g.: start a new OpenERP-server for each job, that
executes exactly one function).
Such an approach has the advantage, that CPU-priority may be utilized
and the jobs could even run on different CPUs in a network.
The disadvantage is, that the management of this architecture is more
complex, the communication to the "main-server" is more difficult and it
can be doubted, that CPU-resources are preserved.
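
As a rough illustration of the "heavy-weight" variant, the sketch below starts one operating-system process per job. It launches a plain Python interpreter as a stand-in for a dedicated OpenERP server, which is an assumption made to keep the example self-contained; start_job_process is a hypothetical helper.

```python
import subprocess
import sys


def start_job_process(code):
    """Start one job in a fresh interpreter process and return the Popen handle."""
    return subprocess.Popen(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        text=True,
    )


# Both processes run concurrently; results are collected afterwards.
procs = [start_job_process("print(6 * 7)"), start_job_process("print('ok')")]
outputs = [proc.communicate(timeout=60)[0].strip() for proc in procs]
print(outputs)  # ['42', 'ok']
```

The only channel back to the parent here is the captured standard output, which hints at why communication with the "main-server" is harder in this design.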

**********
see also PEP 3...


Gerhard Könighofer (gerhard-koenighofer) wrote :

The attached file can be used to replace addons/base/ir/ir_cron.py (with V5; read inline comments for changes to V6).

This file solves the problems discussed (and more):
 - jobs are concurrent
 - jobs start precisely at their "Next execution date" and obey their "Interval"
 - duplicate start of jobs is inhibited
 - does not wait for pending jobs during openerp-server restart
 - jobs get terminated at openerp-server shutdown
 - vastly improved logging-information (e.g. CPU-time, elapsed-time)
 - removed usage of old python modules

A function "thread_watchdog" can be used as health monitor for threads:
Simply schedule a job with Object "ir_cron" and Function "thread_watchdog" (no arguments).
Threading anomalies are written to the server-log.

Olivier Dony (Odoo) (odo-openerp) wrote :

A little update on this topic: most of the above remarks have been taken into account in the new cron system that is included in v6.1/trunk. This new system starts concurrent cron jobs, respects their start time as much as possible, works with multiple servers (including avoiding duplication of execution!), etc.
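
The comment does not say how the new system avoids duplicate execution. One common approach, shown here as an in-process sketch with hypothetical names (JobLocks, try_run), is a non-blocking lock per job: whoever gets the lock runs the job, and everyone else skips it instead of waiting. With several servers the lock would have to live somewhere all of them can see, for example in the database; that is an assumption, not something stated in this thread.

```python
import threading


class JobLocks:
    """One non-blocking lock per job id (in-process stand-in for a shared lock)."""

    def __init__(self):
        self._locks = {}
        self._guard = threading.Lock()

    def try_run(self, job_id, func):
        """Run func unless job_id is already running; return True if it ran."""
        with self._guard:
            lock = self._locks.setdefault(job_id, threading.Lock())
        if not lock.acquire(blocking=False):
            return False  # someone else holds this job: skip, do not wait
        try:
            func()
            return True
        finally:
            lock.release()
```

Different job ids use different locks, so distinct jobs can still run concurrently.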

This revamped cron system is available in trunk as of r.3670 rev-id: <email address hidden>

As this feature was clearly out of scope for past stable releases, I will completely close this bug now that it is done for 6.1.

Thanks to everyone for your suggestions and patches!

Changed in openobject-server:
importance: Undecided → Wishlist
milestone: none → 6.1
status: Triaged → Fix Released
summary: - [5.x] ir.cron - simultaneous start of cron jobs
+ ir.cron - simultaneous start of cron jobs
Kyle Waid (midwest) wrote : Re: [Bug 715418] Re: [5.x] ir.cron - simultaneous start of cron jobs

I would strongly recommend against simultaneous cron jobs because of
serious problems with table locking. If I am running the MRP scheduler, for
example, it locks every user out of the system, and if you add multiple
actions running on top of this, it is bound to cause failures.

