Corrupt / empty files causing problems
Bug #874826 reported by Scott Smith
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
carbon | New | Undecided | Unassigned |
Bug Description
Earlier today my graphite server OOM'ed (long story). At the same time, carbon was creating a bunch of new metrics. The resulting files were left empty (0 bytes) and caused a lot of problems: data never got written to them, the graphite-web metrics tree didn't show *any* stats in that directory, etc.
I'm not sure what you think should be proper behavior here. This may be acceptable to you. I would like to propose two things:
1. [most important] It seems like at least the tree browser should ignore the corrupt files.
2. It would also be nice if carbon recreated a file that was 0 bytes.
Obviously an edge case, but could prevent a lot of confusion.
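To make the two proposals concrete, here is a rough sketch of what each check might look like. It is not taken from the carbon or graphite-web code; the function names are hypothetical, and it assumes metric files are plain files ending in `.wsp` and that "corrupt" here only means zero bytes long.

```python
import os

WSP_SUFFIX = ".wsp"  # assumption: metric files end in .wsp


def is_usable_metric_file(path):
    """Return True for a regular, non-empty file ending in .wsp."""
    try:
        return (path.endswith(WSP_SUFFIX)
                and os.path.isfile(path)
                and os.path.getsize(path) > 0)
    except OSError:
        # The file vanished or is unreadable; treat it as unusable.
        return False


def browsable_entries(directory):
    """Proposal 1: list subdirectories and usable metric files only,
    so one empty file does not hide the rest of the directory."""
    entries = []
    for name in sorted(os.listdir(directory)):
        full = os.path.join(directory, name)
        if os.path.isdir(full) or is_usable_metric_file(full):
            entries.append(name)
    return entries


def remove_if_empty(path):
    """Proposal 2: delete a zero-byte file so that a normal
    'create if missing' step can recreate it. Returns True if removed."""
    try:
        if os.path.isfile(path) and os.path.getsize(path) == 0:
            os.remove(path)
            return True
    except OSError:
        pass
    return False
```

A writer could call `remove_if_empty(path)` just before its existing "does this file exist?" check, and a tree browser could build its listing from `browsable_entries(directory)`. Partially written files with a bad header would need a stricter check than file size.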
affects: graphite → carbon
I had something very similar affect me. From my (internal) bug report (which we haven't investigated yet):
"...one device's CPU whisper directory had 2 okay files, one partially-written file (maybe with a bad header as well?), and several zero-byte files. While in this state, metrics collection seemed to not be working. Metrics were coming in (possibly getting stored to the okay files, but I can't remember), but the problem didn't correct itself even after I removed the partial file and restarted daemons all over the place. Completely removing both nodes' whisper directories allowed things to work again once all the whisper files got created again."
In hindsight, maybe it was the zero-byte files causing the problem? It doesn't sound like I tried removing those.
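For anyone wanting to test that theory, a small script along these lines could list the zero-byte files under a whisper directory before deciding what to remove. It is only a sketch: `find_zero_byte_wsp` is a made-up helper name, the storage root is whatever directory your install uses, and it only detects empty files, not partially written ones.

```python
import os


def find_zero_byte_wsp(root):
    """Walk root and return a sorted list of .wsp files that are 0 bytes."""
    empty = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".wsp"):
                continue
            full = os.path.join(dirpath, name)
            try:
                if os.path.getsize(full) == 0:
                    empty.append(full)
            except OSError:
                # File disappeared or is unreadable; skip it.
                continue
    return sorted(empty)
```

Calling `find_zero_byte_wsp(root)` with the whisper storage directory returns the candidate paths, which can be reviewed and then removed by hand instead of wiping the whole directory.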