puppet report files fill up boot server

Bug #1098208 reported by Ian Wells
This bug affects 1 person
Affects          Status        Importance  Assigned to      Milestone
Cisco Openstack  Fix Released  Critical    Mark T. Voelker
Folsom           Fix Released  Critical    Mark T. Voelker

Bug Description

Since the installed machines all run puppet on a schedule, each one drops a new puppet report every half hour. This fills up the boot server quite rapidly (ca. 1 month).

Since the report files go under /var/lib/puppet and apt-cacher-ng uses /var/cache/apt-cacher-ng, the reports commonly fill up the /var filesystem, and the first symptom is a complaint about package fetching from the installed machines.
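The coupling described above only applies when both directories live on the same filesystem. A minimal sketch for checking that on a given boot server follows; the function name `same_filesystem` is made up for illustration, and it assumes mount points without spaces in their names.

```shell
#!/bin/sh
# Succeeds when both paths are on the same mounted filesystem.
same_filesystem() {
    # df -P prints one POSIX-format line per path; the last column of
    # line 2 is the mount point (assumes no spaces in the mount point).
    mount_a=$(df -P "$1" | awk 'NR==2 {print $NF}')
    mount_b=$(df -P "$2" | awk 'NR==2 {print $NF}')
    [ "$mount_a" = "$mount_b" ]
}
```

For example, `same_filesystem /var/lib/puppet /var/cache/apt-cacher-ng && df -h /var/lib/puppet` would print the usage of the shared filesystem when the two do share one.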

Do something about disposing of or rotating the puppet reports. (And alerting people to their contents, while we're at it.)

Revision history for this message
Mark T. Voelker (mvoelker) wrote :

Assigning to Michael per earlier discussion.

Revision history for this message
Ian Wells (ijw-ubuntu) wrote :

This fills up the boot server to the point that the apt-cache fails (apt-get install fails on target systems) and no log files can be written on the boot server, and does it in the space of weeks. No logs on the bootserver seems critical to me.

I've had to mitigate it repeatedly by hand on end user systems.

Mitigation:

sudo rm -rf /var/lib/puppet/reports/<target machines>/* # Note: we never check or read these, or direct end users to them
sudo /etc/init.d/apt-cacher-ng stop
sudo rm -rf /var/cache/apt-cacher-ng/* # cache emptied, corruption deleted, will rebuild itself
sudo /etc/init.d/apt-cacher-ng start

Changed in openstack-cisco:
importance: High → Critical
Revision history for this message
Mark T. Voelker (mvoelker) wrote :

I believe the suggestion here was to simply add a logrotate config for these logs to our build. Does that satisfy your requirement? If not, what other suggestion do you have?

Revision history for this message
Ian Wells (ijw-ubuntu) wrote :

No problem with that (though, since it makes one file per report I don't know if logrotate will do it); I'm just upping the urgency based on the number of reports it's generated.

Revision history for this message
Mark T. Voelker (mvoelker) wrote :

If not a logrotate, what about a simple cron job? We could add puppetry to drop a file in /etc/cron.daily that looks something like:

mtvoelke@molas:/etc/cron.daily$ ls -lh /etc/cron.daily/puppet_logs
-rwxr-xr-x 1 root root 62 Feb 14 09:33 /etc/cron.daily/puppet_logs
mtvoelke@molas:/etc/cron.daily$
mtvoelke@molas:/etc/cron.daily$ cat /etc/cron.daily/puppet_logs
#!/bin/bash
find /var/lib/puppet/reports -mtime +7 | xargs rm
mtvoelke@molas:/etc/cron.daily$

This should delete logs older than 7 days, which feels like a reasonable default. Adjust for whatever timeframe you like, obviously. Or change rm to something that gzips them, or whatever.
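One possible hardening of the script above, offered as a sketch rather than as the fix that was actually shipped: restrict `find` to regular files so the per-machine directories are never handed to `rm`, and let `find` do the deletion itself so unusual file names cannot confuse `xargs`. The function name and the `REPORT_DIR` and `MAX_AGE_DAYS` environment overrides are hypothetical, and `-delete` assumes GNU (or BSD) find.

```shell
#!/bin/bash
# Sketch of a revised /etc/cron.daily/puppet_logs (hypothetical variant).
prune_puppet_reports() {
    report_dir=$1
    max_age_days=$2
    # Nothing to do if the reports directory does not exist.
    [ -d "$report_dir" ] || return 0
    # -type f   : match report files only, never the per-host directories
    # -mtime +N : last modified more than N days ago (find counts whole
    #             24-hour periods, so +7 means at least 8 days old)
    # -delete   : remove each match directly, with no pipe to xargs/rm
    find "$report_dir" -type f -mtime "+$max_age_days" -delete
}

prune_puppet_reports "${REPORT_DIR:-/var/lib/puppet/reports}" "${MAX_AGE_DAYS:-7}"
```

Swapping `-delete` for `-exec gzip {} +` would compress old reports instead of removing them, though the compressed files would then need their own expiry.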

Revision history for this message
Ian Wells (ijw-ubuntu) wrote : Re: [Bug 1098208] Re: puppet report files fill up boot server

We should probably add a separate bug to get puppet report statuses into
the monitoring. We care if puppet is unhappy, but we probably don't
detect it right now.
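As a rough illustration of what such a check might look like, the sketch below lists the machines whose most recent report says the run failed. It rests on several assumptions that are not confirmed in this thread: that reports are stored as `<reports dir>/<machine>/<timestamp>.yaml`, so the last name in sorted order is the newest, and that the report format in use carries a top-level `status: failed` line. The function name `failed_hosts` is invented for the example.

```shell
#!/bin/sh
# Print the name of every host whose newest report has status "failed".
failed_hosts() {
    report_dir=$1
    for host_dir in "$report_dir"/*/; do
        [ -d "$host_dir" ] || continue
        # Globs expand in sorted order, so the last match is the newest
        # report (assumes timestamp-based file names).
        latest=
        for report in "$host_dir"*.yaml; do
            [ -f "$report" ] && latest=$report
        done
        [ -n "$latest" ] || continue
        # Assumption: the report-level status sits at shallow indentation.
        if grep -Eq '^ {0,2}status: failed[[:space:]]*$' "$latest"; then
            basename "$host_dir"
        fi
    done
}
```

A monitoring check could run `failed_hosts /var/lib/puppet/reports` and alert whenever the output is non-empty.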

Revision history for this message
Michael DeHaan (mdehaan) wrote :

Note that the files in the reports directory were actually directories, hence the -rf.

Revision history for this message
Ian Wells (ijw-ubuntu) wrote :

The top level consists of directories (one per machine), but the next level
down is files, and these are what we need to delete (probably the only thing
we need to delete, in fact: the directories don't multiply beyond the number
of machines).
--
Ian.

Revision history for this message
Clifford Baeseman (cbaeseman) wrote :

The current crontab entry does not contain the user field and thus fails with a syntax error.
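For context: entries in the system crontab format (/etc/crontab and files under /etc/cron.d) take a user name between the five schedule fields and the command, unlike per-user crontabs. The entry in question is not shown in this thread, so the lines below are hypothetical. The sketch checks that the sixth field of an entry names an existing account; it does not handle `@daily`-style shortcuts or environment-variable lines.

```shell
#!/bin/sh
# Succeeds when a system-crontab line has a valid user in field 6.
has_valid_user_field() {
    # Disable globbing so the "*" schedule fields are not expanded
    # into file names while the line is split into words.
    set -f
    set -- $1
    set +f
    # Need five schedule fields, a user, and at least one command word.
    [ "$#" -ge 7 ] || return 1
    # The sixth field must be an account known to the system.
    id -u "$6" >/dev/null 2>&1
}
```

With this, `has_valid_user_field '0 3 * * * root find /var/lib/puppet/reports -type f -mtime +7 -delete'` succeeds, while the same line without `root` fails because `find` is then read as the user name.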
