Unexplainable time jumps in CRON
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
rsyslog (Ubuntu) |
Fix Released
|
Undecided
|
Unassigned | ||
Trusty |
Fix Released
|
Medium
|
Seyeong Kim |
Bug Description
[Impact]
on Trusty
logging to syslog via cron causes timestamp abnormals after several hours.
[Test Case]
1. run below first,
while true ; do logger "hello syslog" ; sleep 1; done
2. register below script to crontab
for i in {1..100} ; do logger "hello syslog via cron" ; sleep 1; done
3. monitor syslog with below script
tail -f -n 100 /var/log/syslog
You can see weird timestamp in several hours.
[Regression Potential]
it's ratelimiting turn off code, could be issue with being off rate-limit
[Other Info]
below commit fixes this issue in my test
From 7a2e2473476d2b1
From: Tomas Heinrich <email address hidden>
Date: Thu, 24 Jul 2014 13:47:14 +0200
Subject: [PATCH 1/1] bugfix: plug a memleak in imuxsock
The hash table for the system socket was allocated twice. The other
one being in activateListene
This commit practically reverts: 34a77cde2423303
The issue seems to be that the hash table is not initialized (to NULL)
rather then not being allocated.
#######
## old description
On my main server I see unexplainable time jumps backwards in the syslog. Those jumps affect CRON.
Example:
Feb 10 06:48:01 nostromo CRON[20351]: (root) CMD ( /storage/
Feb 10 06:49:01 nostromo CRON[20364]: (root) CMD ( /storage/
Feb 10 06:50:01 nostromo CRON[20386]: (root) CMD ( /storage/
Feb 7 05:40:01 nostromo CRON[20389]: (root) CMD ( /storage/
Feb 10 06:50:01 nostromo CRON[20390]: (root) CMD ( /storage/
Feb 10 06:50:01 nostromo CRON[20391]: (root) CMD ( /storage/
For debugging I did the following:
Start xclock and watch xclock and tail -f /var/log/syslog in parallel. When CRON logged a wrong time, xclock did NOT show any time jump but seemed to freeze for a fraction of a second.
Open a screen and start a script that will once per second read the time (in unix seconds) and compare the read time with the time read a second ago. If the current time was smaller, the script would send an email with a process list from before and after the jump. The script also never detected any time jump.
In summary, my current impression is that there might be a bug in CRON because no other programm seems to be able to see the "wrong" time. The server in question is syslog server for 4 servers and 3 network devices. The time jumps exclusively show in syslog entries from the local CRON instance. Not in any remote syslog entry and not in any other local syslog entry, e.g. from DHCPD, bind, tftpd, etc. etc.
Also, after a reboot, things work ok for several days upto about 2 or 3 weeks. Then the "time jumps" start to occur with increasing frequency.
I don't use user crontabs but maintain all jobs in /etc/crontab. I have number of jobs which are triggered every minute and another number of jobs which are triggered every 5 minutes (maybe some CRON internal counter overflow problem?).
Hardware:
Asus P9D-V
Intel Xeon E3-1240L V3
16GB ECC RAM
128GB SSD System
3x3TB ZFS RaidZ2 storage
1x3TB Misc. data
CMOS battery already changed and board inspected.
nostromo:~ # lsb_release -rd
Description: Ubuntu 14.04.2 LTS
Release: 14.04
nostromo:~ # apt-cache policy cron
cron:
Installed: 3.0pl1-124ubuntu2
Candidate: 3.0pl1-124ubuntu2
Version table:
*** 3.0pl1-124ubuntu2 0
500 http://
100 /var/lib/
description: | updated |
Changed in rsyslog (Ubuntu Trusty): | |
assignee: | nobody → Seyeong Kim (xtrusia) |
importance: | Undecided → Medium |
tags: | added: sts-sru-needed |
Script used for monitoring:
#!/bin/bash
oldTime=$(date +%s) $(date +%s) ut=$(ps faux) $currentPsOutpu t $currentTime
oldPsOutput=$(ps faux)
while sleep 1
do
currentTime=
currentPsOutp
if [ "$currentTime" -lt "$oldTime" ] # clock change detected?
then
(
echo '========='
echo "$currentTime < $oldTime"
echo "$oldPsOutput"
echo ':::::::::'
echo "$currentPsOutput"
) | mail -s "Time jump" root
fi
oldPsOutput=
oldTime=
done