carbon/whisper performing 800mb writes
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Graphite | Fix Released | Undecided | Unassigned | | |
Bug Description
I found my fresh graphite install abusing my disk pretty heavily. Upon investigation, I found carbon creating new whisper files and then issuing an 800mb write to each new file. This is repeatable for every metric not previously seen.
strace output here: https:/
I tracked this in whisper.py to here (this code is copied from 0.9.9's whisper)
for secondsPerPoint
archiveInfo = struct.
fh.
archiveOffs
zeroes = '\x00' * (archiveOffsetP
fh.write(zeroes)
This code, to me, says to write a few headers and then pad the rest of the file with zeroes. The zero-fill builds the entire padding as one string in memory and writes it in a single call, so a huge number of bytes is written to disk at once. That explains the heavy I/O usage I observed.
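To see how the padding can get this large, here is a rough size estimate. It is a sketch that assumes the on-disk layout implied by the struct formats in whisper.py (a metadata header, one info record per archive, then fixed-size points). The function name and the example retentions are made up for illustration and are not my actual schema.

```python
import struct

# Struct formats assumed to match whisper.py; sizes are computed, not hard-coded.
METADATA_SIZE = struct.calcsize("!2LfL")    # file-level header
ARCHIVE_INFO_SIZE = struct.calcsize("!3L")  # one record per archive
POINT_SIZE = struct.calcsize("!Ld")         # timestamp + value


def whisper_file_size(archives):
    """Estimate file size in bytes for a list of (secondsPerPoint, points) pairs."""
    header = METADATA_SIZE + ARCHIVE_INFO_SIZE * len(archives)
    data = sum(points * POINT_SIZE for _seconds_per_point, points in archives)
    return header + data


if __name__ == "__main__":
    # Hypothetical schema: 1-second resolution kept for two years.
    two_years = 60 * 60 * 24 * 365 * 2
    print(whisper_file_size([(1, two_years)]))
```

A single high-resolution, long-retention archive is enough to push the zero-fill into the hundreds of megabytes.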
I think the 'zeroing' action can be better written like this:
fh.seek(
fh.write("\0")
The above should achieve the same results as the original code but without incurring huge amounts of disk activity.
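A fuller sketch of the seek-and-write idea, for illustration only: `create_padded_file`, `header`, and `total_size` are hypothetical names, not whisper's actual API, and the code is written for Python 3 bytes rather than the Python 2 strings in 0.9.9.

```python
import os
import tempfile


def create_padded_file(path, header, total_size):
    """Write `header`, then extend the file to `total_size` bytes
    by seeking to the last byte and writing a single zero."""
    if total_size < len(header):
        raise ValueError("total_size is smaller than the header")
    with open(path, "wb") as fh:
        fh.write(header)
        if total_size > len(header):
            fh.seek(total_size - 1)  # jump to the final byte of the file
            fh.write(b"\x00")        # the skipped range reads back as zeroes


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "example.wsp")
    create_padded_file(path, b"HEAD", 1024 * 1024)
    print(os.path.getsize(path))
```

One caveat: on filesystems that support sparse files, the skipped range is not actually allocated on disk, so the space is not reserved up front the way the full zero-fill reserves it.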
I'm pretty sure this is a problem and am quite happy to write a patch for this. Thoughts?
Changed in graphite:
status: Fix Committed → Fix Released
(Keeping in mind that the '800mb' figure is due to my carbon schema configuration.)
My proposed fix above will do two things. First, it prevents carbon from allocating an 800mb string and writing it, which saves carbon from ballooning into memory that python tends not to give back to the system. top(1) shows carbon currently using 812MB of memory, which maps to the roughly 800mb string plus the rest of what carbon needs (which is small).
Second, it prevents the disk thrashing caused by writing a massive chunk to disk.
If there is a strong reason why writing the entire file full of zeroes is necessary, please let me know. If nothing else, we can certainly fix the first issue I mentioned (high memory usage due to massive strings) while leaving the second (disk thrashing on creates) as it is.
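If the full zero-fill has to stay, the memory half could be handled with chunked writes. This is only a sketch: `zero_fill` and the 16 KiB chunk size are my own illustrative choices, not anything in whisper.

```python
CHUNK_SIZE = 16384  # hypothetical chunk size; any modest value would do


def zero_fill(fh, remaining, chunk_size=CHUNK_SIZE):
    """Write `remaining` zero bytes to `fh` without ever holding
    more than `chunk_size` bytes of padding in memory."""
    zeroes = b"\x00" * chunk_size
    while remaining > 0:
        n = min(chunk_size, remaining)
        fh.write(zeroes[:n])  # the last chunk may be shorter
        remaining -= n
```

This still writes every byte to disk, so the I/O cost on create is unchanged, but peak memory stays at one chunk instead of the whole file.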