Comment 2 for bug 874826

Nicholas Leskiw (nleskiw) wrote : Re: [Bug 874826] Re: Corrupt / empty files causing problems

Definitely rm any empty files, and check directory owners and permissions.

I've seen that cause problems. Check the logs for 'corrupt whisper file' errors.
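
As a rough illustration of the check described above, here is a minimal Python sketch that walks a whisper storage directory and reports zero-byte .wsp files along with their owner and permission bits. The function name find_empty_whisper_files is made up for this example, and the storage root is passed in as an argument because its location depends on the install. It only reports; removing the files is left to the operator.

```python
import os
import stat


def find_empty_whisper_files(root):
    """Yield (path, uid, gid, mode) for every zero-byte .wsp file under root.

    Hypothetical helper for illustration; not part of carbon or whisper.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".wsp"):
                continue
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            if st.st_size == 0:
                # Include ownership and mode so permission problems
                # can be spotted in the same pass.
                yield path, st.st_uid, st.st_gid, stat.filemode(st.st_mode)


if __name__ == "__main__":
    import sys

    # Usage: python find_empty.py <whisper storage root>
    for path, uid, gid, mode in find_empty_whisper_files(sys.argv[1]):
        print(f"{mode} {uid}:{gid} {path}")
```

Reviewing the printed list before deleting anything seems safer than removing files directly from the script.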

-Nick

On Mar 5, 2012, at 11:24 AM, Darrell Bishop <email address hidden> wrote:

> I had something very similar affect me. From my (internal) bug report
> (which we haven't investigated yet):
>
> "...one device's CPU whisper directory had 2 okay files, one partially-
> written file (maybe with a bad header as well?), and several zero-byte
> files. While in this state, metrics collection seemed to not be
> working. Metrics were coming in (possibly getting stored to the okay
> files, but I can't remember), but the problem didn't correct itself even
> after I removed the partial file and restarted daemons all over the
> place. Completely removing both nodes' whisper directories allowed
> things to work again once all the whisper files got created again."
>
> In hindsight, maybe it was the zero-byte files causing the problem? It
> doesn't sound like I tried removing those.
>
> --
> You received this bug notification because you are subscribed to
> Graphite.
> https://bugs.launchpad.net/bugs/874826
>
> Title:
> Corrupt / empty files causing problems
>
> Status in Graphite - Enterprise scalable realtime graphing:
> New
>
> Bug description:
> Earlier today my graphite server OOM'ed (long story). At the same
> time, carbon was creating a bunch of new metrics. These were empty
> files and caused a lot of problems (data never got written to them,
> graphite-web metrics tree didn't show *any* stats in that directory,
> etc).
>
> I'm not sure what you think should be proper behavior here. This may
> be acceptable to you. I would like to propose two things:
>
> 1. [most important] It seems like at least the tree browser should ignore the corrupt files.
> 2. It would also be nice if carbon recreated a file that was 0 bytes.
>
> Obviously an edge case, but could prevent a lot of confusion.
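
Proposal 1 in the quoted bug description (having the tree browser ignore corrupt files) could look something like the sketch below. This is not graphite-web's actual code; list_usable_metrics and the is_usable hook are hypothetical names, and the default check only treats zero-byte files as unusable. A stricter check, such as attempting to read the whisper header, could be passed in instead.

```python
import os


def list_usable_metrics(directory, is_usable=None):
    """Return sorted metric names for .wsp files in directory that pass is_usable.

    Hypothetical sketch: by default a file is usable if it is non-empty.
    """
    if is_usable is None:
        def is_usable(path):
            return os.path.getsize(path) > 0

    names = []
    for entry in sorted(os.listdir(directory)):
        path = os.path.join(directory, entry)
        if not (entry.endswith(".wsp") and os.path.isfile(path)):
            continue
        try:
            if is_usable(path):
                names.append(entry[: -len(".wsp")])
        except OSError:
            # An unreadable file is skipped so one bad entry does not
            # hide every other metric in the directory.
            continue
    return names
```

The point of the design is that a bad file costs only its own entry, so the remaining metrics in the directory stay visible.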