Activity log for bug #885944

Date Who What changed Old value New value Message
2011-11-04 01:49:05 Jordan Sissel bug added bug
2011-11-16 11:31:41 Dan Carley bug added subscriber Dan Carley
2012-02-19 06:32:54 Michael Leinartas graphite: status New Fix Committed
2012-02-19 06:32:54 Michael Leinartas graphite: milestone 0.9.10
2012-02-29 03:48:56 Linux Benchmark Solutions bug added subscriber Linux Benchmark Solutions
2012-02-29 04:14:35 Francois Mikus description
Old value: I found my fresh graphite install abusing my disk pretty heavily. Upon investigation, I found carbon creating new whisper files and then issuing an 800 MB write to that file. This is repeatable for all metrics not previously seen. strace output here: https://raw.github.com/gist/1338425/3da1dd9e541cc62f7419c706e5631b6774147624/gistfile1.txt I tracked this in whisper.py to here (this code is copied from 0.9.9's whisper) for secondsPerPoint,points in archiveList: archiveInfo = struct.pack(archiveInfoFormat, archiveOffsetPointer, secondsPerPoint, points) fh.write(archiveInfo) archiveOffsetPointer += (points * pointSize) zeroes = '\x00' * (archiveOffsetPointer - headerSize) fh.write(zeroes) This code, to me, says to write a few headers and then pad the rest of the file with zeroes. This zero-fill operation causes a huge amount of bytes to be written to disk and explains the heavy I/O usage I observed. I think the 'zeroing' action can be better written like this: fh.seek(archiveOffsetPointer - headerSize - 1) fh.write("\0") The above should achieve the same results as the original code but without incurring huge amounts of disk activity. I'm pretty sure this is a problem and am quite happy to write a patch for this. Thoughts?
New value:
I found my fresh graphite install abusing my disk pretty heavily. Upon investigation, I found carbon creating new whisper files and then issuing an 800 MB write to that file. This is repeatable for all metrics not previously seen.

strace output here: https://raw.github.com/gist/1338425/3da1dd9e541cc62f7419c706e5631b6774147624/gistfile1.txt

I tracked this in whisper.py to here (this code is copied from 0.9.9's whisper)

  for secondsPerPoint,points in archiveList:
    archiveInfo = struct.pack(archiveInfoFormat, archiveOffsetPointer, secondsPerPoint, points)
    fh.write(archiveInfo)
    archiveOffsetPointer += (points * pointSize)
  zeroes = '\x00' * (archiveOffsetPointer - headerSize)
  fh.write(zeroes)

This code, to me, says to write a few headers and then pad the rest of the file with zeroes. This zero-fill operation causes a huge amount of bytes to be written to disk and explains the heavy I/O usage I observed.

I think the 'zeroing' action can be better written like this:

  fh.seek(archiveOffsetPointer - headerSize - 1)
  fh.write("\0")

The above should achieve the same results as the original code but without incurring huge amounts of disk activity. I'm pretty sure this is a problem and am quite happy to write a patch for this. Thoughts?
2012-06-01 00:05:00 Michael Leinartas graphite: status Fix Committed Fix Released
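
The description in this log contrasts two ways of preallocating a whisper file: writing every zero byte explicitly, or seeking to the last byte and writing a single zero. The following is a minimal sketch of that contrast, not whisper's actual code. The function names, the header bytes, and the sizes are hypothetical and chosen only for illustration. One detail worth noting: Python's seek() is absolute by default, so in this sketch the seek target is the final file size minus one (header length plus data size, minus one). Whether the skipped region is really left unallocated on disk depends on the filesystem.

```python
import os
import tempfile


def create_zero_filled(path, header, data_size):
    """Write the header, then explicitly write data_size zero bytes."""
    with open(path, "wb") as fh:
        fh.write(header)
        fh.write(b"\x00" * data_size)


def create_sparse(path, header, data_size):
    """Write the header, then seek to the final byte and write one zero."""
    with open(path, "wb") as fh:
        fh.write(header)
        if data_size > 0:
            # seek() is absolute by default, so aim at the last byte of
            # the whole file: header length + data size - 1.
            fh.seek(len(header) + data_size - 1)
            fh.write(b"\x00")


def demo():
    """Create both files and report their sizes and whether they match."""
    header = b"HDR0" * 4          # hypothetical 16-byte header
    data_size = 1024 * 1024       # hypothetical 1 MiB data region
    with tempfile.TemporaryDirectory() as tmp:
        filled = os.path.join(tmp, "filled.bin")
        sparse = os.path.join(tmp, "sparse.bin")
        create_zero_filled(filled, header, data_size)
        create_sparse(sparse, header, data_size)
        with open(filled, "rb") as fa, open(sparse, "rb") as fb:
            same_content = fa.read() == fb.read()
        return os.path.getsize(filled), os.path.getsize(sparse), same_content


if __name__ == "__main__":
    print(demo())
```

Both files read back identically, because a region skipped by seek() reads as zeroes. The difference is in how many bytes the process hands to the kernel: the whole data region in the first case, one byte in the second.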