[customer-bp] Shotgun should ensure enough disk space for diagnostic snapshot

Bug #1328879 reported by Tomasz Adam Jaroszewski
This bug affects 2 people
Affects: Fuel for OpenStack
Status: Won't Fix
Importance: High
Assigned to: Bogdan Dobrelya

Bug Description

Based on customer's support ticket.

The customer noticed that the diagnostic snapshot generator does not check free disk space before or while generating snapshots. This can fill all available space and prevent snapshots from being generated.
When the space is filled, the customer does not receive any notification, only a never-ending "Loading..." from Fuel (shotgun).

Matthew Mosesohn (raytrac3r) wrote : Re: [Fuel] Shotgun should ensure enough disk space for diagnostic snapshot

This should be fixed in two parts:
1 - If any part of the shotgun task fails, mcollective needs to be notified of the failure so that it can be reported back to the user.
2 - Ensure a minimum of 500 MB free disk space wherever the dump directory is located. Ideally 5 GB free, but this will require a little math to ensure we can accommodate large environments.
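
A minimal sketch of the check described in part 2, assuming Python 3 (for `shutil.disk_usage`). The function names and the `MIN_FREE_BYTES` constant are hypothetical and are not taken from the shotgun code base; the 500 MB floor is the figure suggested above.

```python
import shutil

# 500 MB floor suggested in the comment above (hypothetical constant name).
MIN_FREE_BYTES = 500 * 1024 ** 2


def has_enough_space(dump_dir, min_free=MIN_FREE_BYTES):
    """Return True if the filesystem holding dump_dir has at least min_free bytes free."""
    usage = shutil.disk_usage(dump_dir)
    return usage.free >= min_free


def ensure_space(dump_dir, min_free=MIN_FREE_BYTES):
    """Raise an error that a caller could catch and report back to the user."""
    if not has_enough_space(dump_dir, min_free):
        raise RuntimeError(
            "Not enough free space in %s: need at least %d bytes"
            % (dump_dir, min_free)
        )
```

Calling `ensure_space()` before the dump starts would turn a silent hang into an explicit error, which covers part 1 only if the caller forwards that error to the user.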

summary: - [Fuel] Diagnostic snapshot generator (shotgun) is not checking
- free/available space for snaps
+ [Fuel] Shotgun should ensure enough disk space for diagnostic snapshot
Changed in fuel:
milestone: none → 5.1
status: New → Confirmed
importance: Undecided → High
assignee: nobody → Fuel Python Team (fuel-python)
tags: added: customer-found
Tomasz Bartos (tbartos) wrote :

2. Since the diagnostic snapshot contains
/var/log/remote
the check shouldn't use a fixed value.

It should be at least the size of /var/log/remote plus some space for configs.
So /var/log/remote size + 500 MB, or something like
/var/log/remote size + 20-30 MB * number of nodes in the cluster.
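
The second formula can be sketched as follows, assuming Python 3. The helper names are hypothetical, and the 30 MB per-node default is simply the upper end of the range mentioned above, not a measured value.

```python
import os

MB = 1024 ** 2


def dir_size(path):
    """Sum the sizes of regular files under path, skipping symlinks and unreadable files."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                # The file vanished or is unreadable; ignore it in the estimate.
                pass
    return total


def required_space(log_dir, node_count, per_node=30 * MB):
    """Estimate bytes needed: size of the log directory plus a per-node allowance."""
    return dir_size(log_dir) + per_node * node_count
```

For example, `required_space("/var/log/remote", 10)` would return the current size of the remote logs plus 300 MB. The estimate ignores compression, so it is likely an upper bound for a compressed snapshot.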

Dima Shulyak (dshulyak)
tags: added: shotgun
Changed in fuel:
assignee: Fuel Python Team (fuel-python) → Vladimir Kozhukalov (kozhukalov)
Vladimir Kozhukalov (kozhukalov) wrote :

It is not shotgun's job to check whether enough free space is available. Shotgun is a tool to gather some files from a bunch of nodes. It is like forcing MySQL to check whether enough free space is available before writing data into a table. It is rather a monitoring issue. We really need master node monitoring that checks free space, among other things, and notifies the user.

Vladimir Kozhukalov (kozhukalov) wrote :

About failing. Lots of shotgun tasks can fail during a snapshot, and one of the most important requirements was not to fail the whole snapshot if one of the tasks fails. That is why we have try-except wrappers wherever possible. In any case, the user can look in the shotgun log to figure out what is going on.
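
The try-except pattern described here might look roughly like the sketch below. It is an illustration only, not shotgun's actual code: `run_tasks`, the logger name, and the `(name, callable)` task format are all assumptions. It also records failures so that they could be reported instead of only being logged.

```python
import logging

logger = logging.getLogger("snapshot-sketch")  # hypothetical logger name


def run_tasks(tasks):
    """Run every (name, callable) task; log and record failures instead of aborting."""
    failures = []
    for name, func in tasks:
        try:
            func()
        except Exception as exc:
            # One failing task must not stop the remaining ones.
            logger.exception("task %s failed", name)
            failures.append((name, str(exc)))
    return failures
```

A caller could pass the returned list on to whatever reports task status to the user, so that a failed dump does not look like an endless "Loading...".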

As a quick workaround we can:
1) expose the shotgun log in the web UI
2) add an additional parameter to the shotgun config, something like minspace=1G, and run the df command before making a snapshot.
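
Workaround 2 could be sketched as below, assuming Python 3 and a POSIX `df` that supports `-P` and `-k`. The `minspace` value format ("1G", "500M") follows the example above; the function names are hypothetical, and how shotgun would actually read its config is not shown.

```python
import subprocess

UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def parse_size(text):
    """Convert a string such as '1G' or '500M' to bytes; plain digits are bytes."""
    text = text.strip().upper()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def parse_df_available(df_output):
    """Extract available bytes from `df -Pk` output (header plus one data line)."""
    fields = df_output.strip().splitlines()[-1].split()
    # In POSIX format the fourth column is 'Available', in 1024-byte blocks.
    return int(fields[3]) * 1024


def enough_space(path, minspace):
    """Run df for path and compare the available space with minspace."""
    output = subprocess.check_output(["df", "-Pk", path]).decode()
    return parse_df_available(output) >= parse_size(minspace)
```

The parsing assumes the filesystem name contains no spaces. Splitting the parsing from the `df` call keeps the logic testable without touching a real filesystem.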

Vladimir Kozhukalov (kozhukalov) wrote :

Another major point here is to configure logrotate on the master node so that old log directories are removed.

Vladimir Kozhukalov (kozhukalov) wrote :
Changed in fuel:
status: Confirmed → Invalid
Dmitry Pyzhov (dpyzhov)
summary: - [Fuel] Shotgun should ensure enough disk space for diagnostic snapshot
+ [customer-bp] Shotgun should ensure enough disk space for diagnostic
+ snapshot
Changed in fuel:
milestone: 5.1 → 6.0
assignee: Vladimir Kozhukalov (kozhukalov) → Bogdan Dobrelya (bogdando)
status: Invalid → Confirmed
Bogdan Dobrelya (bogdando) wrote :

That is actually a feature request, and it is superseded by https://blueprints.launchpad.net/fuel/+spec/manage-logs-with-free-space-consideration

Changed in fuel:
status: Confirmed → Won't Fix
tags: added: release-notes
tags: added: module-shotgun