Fuel for OpenStack

atop floods /run partition

Bug #1530167 reported by Alexey Lebedeff on 2015-12-30

This bug affects 2 people

Affects	Status	Importance	Assigned to	Milestone
Fuel for OpenStack	Fix Released	High	Michael Polenchuk	Fuel for OpenStack 9.0
7.0.x	Won't Fix	High	Michael Polenchuk	Fuel for OpenStack 7.0-updates
8.0.x	Fix Released	High	Michael Polenchuk	Fuel for OpenStack 8.0

Bug Description

I'm running Mirantis OpenStack 7.0 in VirtualBox using the launch_8gb.sh script.
After a few days the /run partition became 100% full, and the controller node stopped functioning as a result.
The root cause is atop, which actively writes data to /run/atop/atop.acct. That is a bad idea, given that /run is on tmpfs.
There is an unfixed upstream bug about this issue: https://bugs.launchpad.net/ubuntu/+source/atop/+bug/1393175
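
One way to confirm the symptom is to check how full the filesystem behind /run is. The following is a minimal sketch and not part of the original report; the function names and the 90% default threshold are hypothetical, and it relies only on the POSIX output format of `df -P`.

```shell
#!/bin/sh
# Print the usage percentage (without the % sign) of the filesystem holding $1.
# Relies on the POSIX "df -P" layout, where column 5 is the capacity.
mount_use_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Report whether the filesystem holding $1 has reached the limit in $2
# (hypothetical default: 90%). Returns 0 if below, 1 if at or above,
# 2 if the usage could not be determined.
check_mount() {
    path=$1
    limit=${2:-90}
    used=$(mount_use_percent "$path")
    [ -n "$used" ] || return 2
    if [ "$used" -ge "$limit" ]; then
        echo "WARNING: $path is ${used}% full"
        return 1
    fi
    echo "OK: $path is ${used}% full"
}
```

Running `check_mount /run` and then `ls -lh /run/atop/atop.acct` on an affected node should show whether the accounting file is what consumes the tmpfs.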

Tags:

Aleksey Zvyagintsev (azvyagintsev) on 2015-12-31

Changed in fuel:
milestone:	none → 7.0-updates
importance:	Undecided → High
status:	New → Confirmed
tags:	added: area-linux

Revision history for this message

Dmitry Teselkin (teselkin-d) wrote on 2016-01-12:

This looks like a common problem.

atop uses /var/run/atop.acct (by default) to record accounting data about processes. Depending on system behaviour, it may grow relatively slowly or quite fast [1] (see screenshot [2]).

Although there is an opinion that the accounting file should be rewritten daily, I don't think it's a good idea, since in that case you lose some data that might be needed for debugging.

Anyway, the problem doesn't seem to be related to atop itself, but to the way it is started.

Reassigning to the fuel-library team so they can fix the Puppet manifests.

[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=650222
[2] https://bugs.debian.org/cgi-bin/bugreport.cgi?att=1;bug=650222;msg=26;filename=650222-screenshot.png

Dmitry Teselkin (teselkin-d) wrote on 2016-01-12:

Attachment: atop.sh (1.2 KiB, text/x-sh)

It is possible to use a custom accounting file by passing it to atop via the ATOPACCT environment variable (from the man page [1]):

---
With the environment variable ATOPACCT the name of a specific process accounting file can be specified (accounting should have been activated on beforehand). When this environment variable is present but its contents is empty, process accounting will not be used at all.
---

However, it's not that simple: it works as expected only when ATOPACCT is set to an empty value, in which case accounting is disabled. To collect accounting data in a custom file, additional actions are required:
* the file must be created first
* accounting must be turned on manually
* the path to the custom accounting file must be passed to atop via the ATOPACCT environment variable
* when atop is stopped, accounting must be turned off manually

On Ubuntu, the 'accton' command (from the 'acct' package) is used to enable/disable accounting.

I'm attaching a simple wrapper that automates all the steps above.
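
The attached atop.sh is not reproduced on this page. As a rough illustration only, the four steps listed above could be sketched as follows. This is not the attached wrapper: the default file path, the variable names and the function names are assumptions, and `accton off` assumes a GNU acct version that accepts the `off` argument.

```shell
#!/bin/sh
# Hypothetical defaults; all three can be overridden through the environment.
ACCT_FILE=${ACCT_FILE:-/var/log/atop/atop.acct}
ACCTON=${ACCTON:-accton}
ATOP=${ATOP:-atop}

# Steps 1 and 2: create the file, then turn process accounting on for it.
start_accounting() {
    mkdir -p "$(dirname "$ACCT_FILE")" &&
    touch "$ACCT_FILE" &&
    chmod 600 "$ACCT_FILE" &&
    "$ACCTON" "$ACCT_FILE"
}

# Step 4: turn process accounting off again.
stop_accounting() {
    "$ACCTON" off
}

# Step 3: run atop in the foreground with ATOPACCT pointing at the custom
# file, then switch accounting off once atop has exited.
run_atop() {
    start_accounting || return 1
    status=0
    ATOPACCT=$ACCT_FILE "$ATOP" "$@" || status=$?
    stop_accounting
    return "$status"
}
```

A call such as `run_atop -w /var/log/atop/atop.raw 60` (the raw log path is only an example) would need root privileges, since enabling process accounting is a privileged operation. The sketch does not handle the wrapper itself being killed by a signal; a real wrapper would probably need a trap for that.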

[1] http://linux.die.net/man/1/atop

Bogdan Dobrelya (bogdando) on 2016-01-12

tags:	added: area-library removed: area-linux
tags:	added: team-bugfix

Michael Polenchuk (mpolenchuk) wrote on 2016-01-13:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=686329
<quote>
note that with 1.27, upstream changed the default location of the accounting file to /tmp/atop.d/atop.acct
</quote>

Dmitry Bilunov (dbilunov) wrote on 2016-01-13:

Resolving this issue in the straightforward manner, just by moving the acct file from tmpfs to an HDD, may cause other problems: on systems with a high process churn rate, writing the acct file causes high I/O usage. I have observed systems on which it caused production degradation; however, those problems were caused by spawning thousands of processes each minute.

Another option is disabling the acct mechanism: if you set the ATOPACCT environment variable to something like /dev/null, atop will run with accounting disabled ("no procacct" mode). It will still be able to collect valuable statistics; only events that happen either too fast or entirely inside the sampling interval could be dropped.
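
A minimal sketch of such a launch, assuming the behaviour quoted from the man page earlier in this thread (ATOPACCT present but empty means process accounting is not used at all). The function name and the `ATOP` override are hypothetical.

```shell
#!/bin/sh
# Hypothetical override so a different atop binary can be substituted.
ATOP=${ATOP:-atop}

# Start atop with ATOPACCT set but empty, which according to the quoted
# man page excerpt disables process accounting entirely.
run_atop_noacct() {
    ATOPACCT= "$ATOP" "$@"
}
```

For example, `run_atop_noacct -w /var/log/atop/atop.raw 60` would write raw samples every 60 seconds without touching any accounting file; the raw log path is only an example.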

We should consider how much data would flow through the acct interface (and be written to the acct file, if enabled).
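
A back-of-the-envelope estimate can be sketched as below. It assumes one fixed-size accounting record per exited process and a record size of 64 bytes (the size of the Linux `struct acct_v3`); both the function names and that default are assumptions, and the real rate depends on the kernel's accounting format.

```shell
#!/bin/sh
# Bytes of accounting data written per day:
#   processes per minute * bytes per record * 1440 minutes per day.
# $1 = processes exiting per minute, $2 = record size (assumed default: 64).
acct_bytes_per_day() {
    procs_per_min=$1
    record_size=${2:-64}
    echo $(( procs_per_min * record_size * 1440 ))
}

# Whole days until a filesystem with $1 free bytes is filled at that rate.
# $2 = processes per minute, $3 = optional record size.
days_until_full() {
    free_bytes=$1
    per_day=$(acct_bytes_per_day "$2" "${3:-64}")
    if [ "$per_day" -le 0 ]; then
        echo "never"
        return 0
    fi
    echo $(( free_bytes / per_day ))
}
```

Under these assumptions, `acct_bytes_per_day 1000` gives 92160000 bytes, roughly 88 MiB per day, so a tmpfs of about 200 MiB would fill in about two days at 1000 processes per minute.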

Dmitry Pyzhov (dpyzhov) on 2016-01-13

no longer affects:

fuel/mitaka

OpenStack Infra (hudson-openstack) wrote on 2016-01-14: Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/267643

Changed in fuel:
status:	Triaged → In Progress

OpenStack Infra (hudson-openstack) wrote on 2016-01-20: Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/267643
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=57911011587fbf24b9e68b2a67fc3f6d1bef967e
Submitter: Jenkins
Branch: master

commit 57911011587fbf24b9e68b2a67fc3f6d1bef967e
Author: Michael Polenchuk <email address hidden>
Date: Thu Jan 14 17:57:57 2016 +0300

    Disable accounting procacct mode

    Turn off process accounting ('no procacct' mode).
    But atop still has an ability to collect valuable stats.

    Bring in internal "custom_acct_file" option:
    * false - use atop default accounting file
    * /path_to/atop.acct - custom one
    * undef - disable accounting procacct mode

    DocImpact: 'custom_accounting_file' is system wide process
    accounting file (valid values is above).

    Change-Id: Ida00dc663dd8c6494c479de2ae2f0f7ab6014a84
    Closes-Bug: #1530167

Changed in fuel:
status:	In Progress → Fix Committed

OpenStack Infra (hudson-openstack) wrote on 2016-01-21: Fix proposed to fuel-library (stable/8.0)

Fix proposed to branch: stable/8.0
Review: https://review.openstack.org/270707

OpenStack Infra (hudson-openstack) wrote on 2016-01-25: Fix merged to fuel-library (stable/8.0)

Reviewed: https://review.openstack.org/270707
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=45c261069386de07160d7416cdeceac09dc2f1c1
Submitter: Jenkins
Branch: stable/8.0

commit 45c261069386de07160d7416cdeceac09dc2f1c1
Author: Michael Polenchuk <email address hidden>
Date: Thu Jan 14 17:57:57 2016 +0300

    Disable accounting procacct mode

    Turn off process accounting ('no procacct' mode).
    But atop still has an ability to collect valuable stats.

    Bring in internal "custom_acct_file" option:
    * false - use atop default accounting file
    * /path_to/atop.acct - custom one
    * undef - disable accounting procacct mode

    DocImpact: 'custom_accounting_file' is system wide process
    accounting file (valid values is above).

    Change-Id: Ida00dc663dd8c6494c479de2ae2f0f7ab6014a84
    Closes-Bug: #1530167
    (cherry picked from commit 57911011587fbf24b9e68b2a67fc3f6d1bef967e)

OpenStack Infra (hudson-openstack) wrote on 2016-01-27: Fix proposed to fuel-library (stable/7.0)

Fix proposed to branch: stable/7.0
Review: https://review.openstack.org/273025

Alexander Kurenyshev (akurenyshev) on 2016-02-03

tags:

added: on-verification

Alexander Kurenyshev (akurenyshev) wrote on 2016-02-08:

#10

Verified on ISO 478.
atop writes its logs to the /var/log/atop directory. The directory size increased only insignificantly over 4 days.

tags:

removed: on-verification

OpenStack Infra (hudson-openstack) wrote on 2016-04-07: Change abandoned on fuel-library (stable/7.0)

#11

Change abandoned by Michael Polenchuk (<email address hidden>) on branch: stable/7.0
Review: https://review.openstack.org/273025
Reason: outdated

Serhii Ovsianikov (sovsianikov) wrote on 2016-04-08:

#12

Why was this fix abandoned for 7.0? I always hit this bug in my test lab: Fuel 7.0, 1 controller, 3 Ceph nodes, 1 LMA, 1 Elasticsearch, 1 InfluxDB.

Please fix it and include the patch in the next MU for 7.0.

Thank you

root@node-15:~# df -h
Filesystem                 Size  Used Avail Use% Mounted on
udev                       990M   12K  990M   1% /dev
tmpfs                      201M  2.9M  198M  99% /run
/dev/dm-3                   15G  3.0G   11G  22% /
none                       4.0K     0  4.0K   0% /sys/fs/cgroup
none                       5.0M     0  5.0M   0% /run/lock
none                      1001M   60M  942M   6% /run/shm
none                       100M     0  100M   0% /run/user
/dev/sda3                  196M   39M  148M  21% /boot
/dev/mapper/logs-log       9.8G  3.1G  6.2G  34% /var/log
/dev/mapper/mysql-root      20G  2.9G   16G  16% /var/lib/mysql
/dev/mapper/mongo-mongodb  139G   11G  121G   9% /var/lib/mongo
You have new mail in /var/mail/root

tags:

added: support

Alexander Petrov (apetrov-n) on 2016-06-16

tags:

added: on-verification

Alexander Petrov (apetrov-n) wrote on 2016-06-16:

#13

Verified on MOS 9.0 ISO 479

Changed in fuel:
status:	Fix Committed → Fix Released

Evgeny Sikachev (esikachev) on 2016-06-16

tags:

removed: on-verification


Remote bug watches

debbugs #650222 [done important upstream pending]
debbugs #686329 [done minor]