Comment 9 for bug 1442674

Dale Harris (rodmur-u) wrote : Re: [Bug 1442674] Re: sysstat log corruption, time isn't changing

Oh okay, I didn't know -L was hard-coded in there; I missed that.
I just made sure the two cron jobs can't conflict with each other,
which is why I said it was a configuration error on my part.

I did have:

*/5 * * * * root command -v debian-sa1 > /dev/null && debian-sa1 -L 1 1
59 23 * * * root command -v debian-sa1 > /dev/null && debian-sa1 -L 60 2

But I've gone to:

2-58/2 * * * * root command -v debian-sa1 > /dev/null && debian-sa1 -L 1 1
59 23 * * * root command -v debian-sa1 > /dev/null && debian-sa1 -L 60 2

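I'm pretty sure the two were conflicting: the */5 job also fires at
00:00, when the 23:59 job (60-second interval, two samples) should
still be running. The new schedule skips minute 0, and I haven't seen
any problems since making that change.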
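
For what it's worth, here is a minimal sketch of how to check a cron minute field against the rotation job. It assumes standard five-field cron syntax, and it assumes the 23:59 job (60-second interval, 2 samples) can still be running during minutes 59, 0 and 1; that window is my guess, not something taken from the sysstat source. The names expand_minutes and rotation_conflicts are made up for this example.

```python
def expand_minutes(field):
    """Expand a cron minute field such as '*/5', '2-58/2' or '59' into a set of minutes."""
    minutes = set()
    for part in field.split(","):
        rng, _, step = part.partition("/")
        step = int(step) if step else 1
        if rng == "*":
            lo, hi = 0, 59
        elif "-" in rng:
            lo, hi = (int(x) for x in rng.split("-"))
        else:
            lo = hi = int(rng)
        minutes.update(range(lo, hi + 1, step))
    return minutes


def rotation_conflicts(field, window=(59, 0, 1)):
    """Return the minutes where the frequent job fires inside the assumed rotation window."""
    # ASSUMPTION: the 23:59 job may still be active during minutes 59, 0 and 1.
    return sorted(expand_minutes(field) & set(window))


for field in ("*/5", "*/2", "2-58/2", "5-55/10"):
    print(field, rotation_conflicts(field))
```

Under these assumptions, "*/5" and "*/2" both collide at minute 0, while "2-58/2" and the original "5-55/10" do not.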

On Wed, Aug 19, 2015 at 8:39 PM, Bill Cole
<email address hidden> wrote:
> Note that the "-L" flag is hardcoded in sa1's call to sadc, so
> whatever locking it does is not fixing the issue. Seems like a
> possible upstream bug?
>
> I've had this issue in sysstat data files (n=9 out of 48 sa?? files
> from 3 different machines) on machines where the sysstat resolution
> has been changed to every 2 minutes by changing this line in
> /etc/cron.d/sysstat:
>
> 5-55/10 * * * * root command -v debian-sa1 > /dev/null && debian-sa1 1 1
>
> To:
>
> */2 * * * * root command -v debian-sa1 > /dev/null && debian-sa1 1 1
>
> I've been able to salvage the files by looking for repeating binary
> patterns to figure out the size and offsets of sadc records, finding the
> one which is a runt, and rebuilding the file by splicing together the
> sections before and after it, making sure the resulting file is
> identical in size to the other full-day files that are parseable. This
> is not an easily documented process, as it requires eyeballing hexdump
> output and making educated guesses, but there are useful tips:
>
> 1. With "-S XALL" in SADC_OPTIONS, each record is 8-9KB, and for some
>    reason the records seem to alternate between 2 sizes (!) that
>    differ by 16 bytes, e.g. 8528 bytes and 8544 bytes.
> 2. The file starts with a header of 850-900 bytes, so a good place to
>    start looking for the patterns at the end of records (useful!) is
>    ~9K into the file.
> 3. EVERY time I've had this problem, the 2nd record has been a runt,
>    500-1500 bytes shorter than the normal records.
>
> Because excising the runt record yields what seems to be a perfectly
> good file, my guess is that the root cause is a collision between the
> 23:59 run and the 00:00 run, with the sadc -L flag for some reason
> failing to do its job.
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1442674
>
> Title:
> sysstat log corruption, time isn't changing
>
> Status in sysstat package in Ubuntu:
> Invalid
>
> Bug description:
> I'm seeing some log file corruption in sysstat, the time stamp
> occasionally just stops changing. So the output I typically see is
> like:
>
>
> # sar
>
> 12:00:00 AM all 0.00 0.00 0.00 0.00 0.00 0.00
> 12:00:00 AM all 0.00 0.00 0.00 0.00 0.00 0.00
> 12:00:00 AM all 0.00 0.00 0.00 0.00 0.00 0.00
> End of system activity file unexpected
>
> Typical output should be something like:
>
> 11:58:01 PM all 0.11 0.00 0.02 0.00 0.00 99.87
> 11:59:01 PM all 0.06 0.00 0.02 0.00 0.00 99.92
> 12:00:01 AM all 0.03 0.00 0.02 0.00 0.00 99.95
> Average: all 0.10 0.00 0.04 0.00 0.00 99.86
>
>
> I'm attaching some example log files: sa09 is good, sa10 is corrupted.
>
> ProblemType: Bug
> DistroRelease: Ubuntu 14.04
> Package: sysstat 10.2.0-1
> ProcVersionSignature: Ubuntu 3.13.0-46.77-generic 3.13.11-ckt15
> Uname: Linux 3.13.0-46-generic x86_64
> ApportVersion: 2.14.1-0ubuntu3.8
> Architecture: amd64
> Date: Fri Apr 10 14:44:10 2015
> SourcePackage: sysstat
> UpgradeStatus: No upgrade log present (probably fresh install)
> mtime.conffile..etc.cron.d.sysstat: 2015-03-25T18:44:29.274292
> mtime.conffile..etc.sysstat.sysstat: 2015-03-25T18:44:29.198292
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/ubuntu/+source/sysstat/+bug/1442674/+subscriptions
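
The salvage procedure Bill describes in the quoted message could be sketched roughly as below. This is only an illustration: it assumes the record start offsets have already been found by hand (for example from hexdump output), it does not parse the sadc file format, and find_runts, splice_out and the 64-byte tolerance are hypothetical choices for this example. The tolerance is there because normal records reportedly alternate between two sizes 16 bytes apart, while the runt is 500-1500 bytes short.

```python
from collections import Counter


def find_runts(offsets, file_size, tolerance=64):
    """Return (index, start, length) for records clearly shorter than the typical record.

    offsets   -- byte offsets where each record starts (found by hand)
    file_size -- total size of the data file in bytes
    tolerance -- slack for the normal 16-byte size alternation (assumed value)
    """
    bounds = list(offsets) + [file_size]
    lengths = [end - start for start, end in zip(bounds, bounds[1:])]
    typical = Counter(lengths).most_common(1)[0][0]
    return [
        (i, bounds[i], n)
        for i, n in enumerate(lengths)
        if n < typical - tolerance
    ]


def splice_out(data, start, length):
    """Return a copy of data with the bytes [start, start + length) removed."""
    return data[:start] + data[start + length:]


# Made-up offsets: an 880-byte header, then records of 8528, 7600 (runt),
# 8544, 8528 and 8544 bytes.
offsets = [880, 9408, 17008, 25552, 34080]
print(find_runts(offsets, file_size=42624))
```

Any splicing like this is best done on a copy of the sa file, checking afterwards that sar can read the result.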

--
Dale Harris
<email address hidden>
/.-)