tear apart the schedule for cron.daily/weekly/monthly

Bug #1424705 reported by Bjoern
This bug affects 1 person
Affects            Status        Importance  Assigned to   Milestone
OpenStack-Ansible  Fix Released  Wishlist    git-harry
Kilo               Fix Released  Wishlist    Kevin Carter
Trunk              Fix Released  Wishlist    git-harry

Bug Description

Currently the crontab schedule for cron.daily/weekly/monthly appears to be the same for all containers, in particular across the infra nodes.
This causes intermittent service issues, most notably around Keystone, since Keystone is restarted through the logrotate/apache configuration. The goal should be to have different cron schedules across the infra nodes. Personally, I don't know whether randomizing will be enough, since there is still a chance that services will be restarted at the same time. I would therefore recommend fixed times across infra01 to 0x. We probably also want the ability to define a start time from which those crons can run, considering we have customers in different time zones.
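
To illustrate the fixed-times idea above, here is a minimal sketch that spreads the daily run evenly across a set of infra nodes, beginning at a configurable start time. The function name `staggered_slots`, the host names, and the `window_minutes` parameter are hypothetical and only serve the example; this is not how any existing tooling does it.

```python
from datetime import datetime, timedelta


def staggered_slots(hosts, start="02:00", window_minutes=180):
    """Return {host: "minute hour * * *"} with evenly spaced daily slots.

    Assumption: hosts are ordered by name, and the window is split into
    len(hosts) equal steps so that no two hosts share a slot.
    """
    if not hosts:
        return {}
    base = datetime.strptime(start, "%H:%M")
    # Guard against a zero step when there are more hosts than minutes.
    step = max(1, window_minutes // len(hosts))
    slots = {}
    for index, host in enumerate(sorted(hosts)):
        when = base + timedelta(minutes=index * step)
        # Only minute and hour are used, so slots wrap past midnight.
        slots[host] = f"{when.minute} {when.hour} * * *"
    return slots


if __name__ == "__main__":
    infra = [f"infra{n:02d}" for n in range(1, 6)]
    for host, fields in staggered_slots(infra, "02:00", 150).items():
        print(host, fields)
```

Because the slots are derived from the sorted host list rather than drawn at random, two nodes cannot land on the same minute as long as the window has at least as many minutes as there are hosts.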

Kevin Carter (kevin-carter) wrote :

I think we should research the best way to do this across multiple hosts referencing an architecture of at least 5 infra nodes. In this way we can ensure that we're building the right logic into the system for larger deployments.

git-harry (git-harry) wrote :

So from the bug description it sounds like there is a particular issue with keystone because rotating the logs is restarting apache. It sounds like copytruncate or piped logs would solve the problem here.

Are there any other known cases where the cron jobs cause issues?

I think the proposed solution is a low-priority feature request, and the keystone issue should be addressed as described above: why restart the service if there is no need to?
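
As a rough illustration of why copytruncate avoids the restart: the log is copied aside and the original file is truncated in place, so a writer that holds the file open in append mode can keep using the same file handle. The sketch below imitates that behaviour in Python; the function name and file names are hypothetical, and this is not logrotate's actual implementation. Note that anything written between the copy and the truncate is lost, which is the usual caveat with this approach.

```python
import os
import shutil
import tempfile


def copytruncate(path, rotated_path):
    """Copy the log aside, then empty the original file in place."""
    shutil.copyfile(path, rotated_path)
    # The file itself is kept, so open handles on it stay valid.
    os.truncate(path, 0)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        log = os.path.join(workdir, "keystone.log")  # hypothetical log name
        with open(log, "a") as writer:  # stands in for the running service
            writer.write("before rotation\n")
            writer.flush()
            copytruncate(log, log + ".1")
            # Same handle, no reopen or restart needed.
            writer.write("after rotation\n")
        with open(log) as current:
            print(current.read(), end="")
```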

no longer affects: openstack-ansible/trunk
Changed in openstack-ansible:
status: Triaged → In Progress
Changed in openstack-ansible:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix merged to os-ansible-deployment (kilo)

Reviewed: https://review.openstack.org/201142
Committed: https://git.openstack.org/cgit/stackforge/os-ansible-deployment/commit/?id=d8216491a8b2c2167fde5c850cd22398c19dc721
Submitter: Jenkins
Branch: kilo

commit d8216491a8b2c2167fde5c850cd22398c19dc721
Author: git-harry <email address hidden>
Date: Fri Jun 26 13:09:53 2015 +0100

    Add role system-crontab-coordination

    Currently every host, both containers and bare metal, has a crontab
    configured with the same values for minute, hour, day of week etc. This
    means that there is the potential for a service interruption if, for
    example, a cron job were to cause a service to restart.

    This commit adds a new role which attempts to adjust the times defined
    in the entries in the default /etc/crontab to reduce the overlap
    between hosts.

    Change-Id: I18bf0ac0c0610283a19c40c448ac8b6b4c8fd8f5
    Closes-bug: #1424705
    (cherry picked from commit e148635e789568a2c886c3a669884bd0d82d9a72)
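
The commit message does not say how the role picks the new times. Purely as an illustration of the general idea, the sketch below shifts the minute and hour of the cron.daily/weekly/monthly entries in a crontab by an offset derived from the host name. The function names, the hash-based offset, and the `window_minutes` parameter are assumptions made for this example and should not be read as the role's actual logic. A hash-based offset only reduces overlap; two hosts can still collide.

```python
import hashlib
import re

# Matches "minute hour <rest>" for run-parts entries of daily/weekly/monthly.
CRON_LINE = re.compile(
    r"^(\d+)\s+(\d+)(\s+.*run-parts.*/etc/cron\.(?:daily|weekly|monthly).*)$"
)


def host_offset(hostname, window_minutes):
    """Deterministic per-host offset in minutes (assumption: hash-based)."""
    digest = hashlib.sha256(hostname.encode("utf-8")).hexdigest()
    return int(digest, 16) % window_minutes


def adjust_crontab(text, hostname, window_minutes=240):
    """Shift matching entries by the host offset, leaving other lines alone."""
    offset = host_offset(hostname, window_minutes)
    adjusted = []
    for line in text.splitlines():
        match = CRON_LINE.match(line)
        if match:
            minute, hour = int(match.group(1)), int(match.group(2))
            # Shift within a 24-hour day; relative spacing of entries is kept.
            total = (hour * 60 + minute + offset) % (24 * 60)
            line = f"{total % 60} {total // 60}{match.group(3)}"
        adjusted.append(line)
    return "\n".join(adjusted) + "\n"
```

Shifting every entry by the same per-host amount keeps the daily, weekly, and monthly jobs on one host in their original order while moving them relative to other hosts.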

Davanum Srinivas (DIMS) (dims-v) wrote : Fix included in openstack/openstack-ansible 11.2.14

This issue was fixed in the openstack/openstack-ansible 11.2.14 release.

no longer affects: openstack-ansible/juno