lots of empty /tmp/profiledir directories not cleaned up

Bug #1673144 reported by Steven Hardy
Affects: tripleo
Status: Expired
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

In my (freshly built by quickstart) environment I'm seeing a bunch of profiledir.* directories being created in /tmp (not yet sure by what) and never cleaned up:

drwx------. 2 root root 6 Mar 15 16:14 profiledir.cdmNHc
drwx------. 2 root root 6 Mar 15 16:15 profiledir.BnXDaA
drwx------. 2 root root 6 Mar 15 16:15 profiledir.x5JnGD
drwx------. 2 root root 6 Mar 15 16:16 profiledir.z3BKAt
drwx------. 2 root root 6 Mar 15 16:16 profiledir.5wsNQd
drwx------. 2 root root 6 Mar 15 16:17 profiledir.tXwzLc
drwx------. 2 root root 6 Mar 15 16:17 profiledir.R6Fu04
drwx------. 2 root root 6 Mar 15 16:18 profiledir.wEH3Je
drwx------. 2 root root 6 Mar 15 16:18 profiledir.39hFMV
drwx------. 2 root root 6 Mar 15 16:19 profiledir.csEzBl
drwx------. 2 root root 6 Mar 15 16:19 profiledir.Z1aCT4
drwx------. 2 root root 6 Mar 15 16:20 profiledir.2R8nPm
drwx------. 2 root root 6 Mar 15 16:20 profiledir.4sljYT
drwx------. 2 root root 6 Mar 15 16:21 profiledir.ZKBkug

I rebuilt my undercloud and it's still happening. Is anyone else seeing this?
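
For reference, a quick way to watch the growth rate (plain shell, nothing tripleo-specific; the interval is arbitrary):

# count the stale directories
ls -d /tmp/profiledir.* | wc -l
# or watch the count tick up in (near) real time
watch -n 30 'ls -d /tmp/profiledir.* | wc -l'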

Revision history for this message
Steven Hardy (shardy) wrote :

To clarify, the snippet above shows how frequently they're getting created; after a while you end up with thousands of empty directories. auditd etc. hasn't yet shown what is creating them.

Changed in tripleo:
status: New → Triaged
importance: Undecided → High
milestone: none → pike-1
Revision history for this message
Dougal Matthews (d0ugal) wrote :

http://codesearch.openstack.org/?q=profiledir&i=nope&files=&repos=

Looks like it might come from disk image builder?

Revision history for this message
Steven Hardy (shardy) wrote :

> Looks like it might come from disk image builder?

So, I've never run dib on this undercloud, and ps shows it's not running. *But* there's a ML thread saying o-r-c depends on dib-run-parts, and running o-r-c manually does create a directory.
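
A rough way to see that last point (assumes os-refresh-config is on PATH and is safe to re-run by hand on this undercloud):

before=$(ls -d /tmp/profiledir.* 2>/dev/null | wc -l)
sudo os-refresh-config
after=$(ls -d /tmp/profiledir.* 2>/dev/null | wc -l)
# one new profiledir.* per run would implicate o-r-c
echo "profiledirs: $before -> $after"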

[stack@undercloud diskimage-builder]$ cd /usr/lib/python2.7/site-packages/diskimage_builder/
[stack@undercloud diskimage_builder]$ grep -R profiledir ./*
./lib/dib-run-parts:PROFILE_DIR=$(mktemp -d --tmpdir profiledir.XXXXXX)
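
For context, the usual shell idiom for a script-owned temp directory is to remove it on exit; a minimal sketch of what that looks like (not the actual dib-run-parts code; whether the script has this and fails to reach it, or lacks it entirely, is the open question here):

PROFILE_DIR=$(mktemp -d --tmpdir profiledir.XXXXXX)
# remove the directory however the script exits: normal return, error, or signal
trap 'rm -rf "$PROFILE_DIR"' EXIT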

os-collect-config is configured to run os-refresh-config; it has no collectors configured, but it is running:

[stack@undercloud diskimage_builder]$ cat /etc/os-collect-config.conf
[DEFAULT]
command = os-refresh-config

[stack@undercloud diskimage_builder]$ sudo systemctl status os-collect-config
● os-collect-config.service - Collect metadata and run hook commands.
   Loaded: loaded (/usr/lib/systemd/system/os-collect-config.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2017-03-14 18:09:42 UTC; 1 day 15h ago
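
One way to cross-check that it's this service firing on every polling interval (again, nothing tripleo-specific):

# follow the service's journal in one terminal...
sudo journalctl -u os-collect-config -f
# ...and watch the newest profiledir entries appear in step in another
watch -n 10 'ls -dt /tmp/profiledir.* | head -5'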

So my guess is o-c-c is running o-r-c, which is creating (and not cleaning up) a temporary directory.

However, attempting to debug this shows there are three distinct copies of dib-run-parts on the undercloud (note /usr/bin and /bin below share an inode):

[stack@undercloud diskimage_builder]$ pwd
/usr/lib/python2.7/site-packages/diskimage_builder
[stack@undercloud diskimage_builder]$ ls -li ./lib/ | grep dib-run-parts
92351388 -rw-r--r--. 1 root root 4159 Mar 16 09:55 dib-run-parts
[stack@undercloud diskimage_builder]$ ls -li /usr/local/bin/ | grep dib-run-parts
14722659 -rwxr-xr-x. 1 root root 4127 Mar 14 06:53 dib-run-parts
[stack@undercloud diskimage_builder]$ ls -li /usr/bin/ | grep dib-run-parts
14390549 -rwxr-xr-x. 1 root root 4159 Mar 16 10:04 dib-run-parts
[stack@undercloud diskimage_builder]$ ls -li /bin/ | grep dib-run-parts
14390549 -rwxr-xr-x. 1 root root 4159 Mar 16 10:04 dib-run-parts

/usr/local/bin/dib-run-parts appears to be the one running, and adding pstree to that shows:
systemd(1)---os-collect-conf(708)---os-refresh-conf(24634)---dib-run-parts(24635)---pstree(24643)
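
(Presumably via something like pstree -sp $$ dropped into the script: -s prints the chain of ancestor processes up to systemd, -p adds the PIDs shown above.)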

Which proves it, I think. I guess we should be disabling os-collect-config, but I'm not sure why this hasn't been apparent before.
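
If disabling it is the right call, the sketch is straightforward (assuming nothing else on the undercloud relies on o-c-c):

sudo systemctl stop os-collect-config
sudo systemctl disable os-collect-config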

Revision history for this message
Steven Hardy (shardy) wrote :

[stack@undercloud diskimage_builder]$ rpm -qf /usr/local/bin/dib-run-parts
file /usr/local/bin/dib-run-parts is not owned by any package
[stack@undercloud diskimage_builder]$ rpm -qf /bin/dib-run-parts
dib-utils-0.0.11-1.el7.noarch
[stack@undercloud diskimage_builder]$ rpm -qf /usr/bin/dib-run-parts
dib-utils-0.0.11-1.el7.noarch
[stack@undercloud diskimage_builder]$ rpm -qf ./lib/dib-run-parts
diskimage-builder-2.0.1-0.20170314023517.756923c.el7.centos.noarch

I'm guessing the /usr/local version comes from the dib image build, but it'd be good to confirm that; the duplication here is pretty confusing.
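
For what it's worth, plain PATH ordering is why the /usr/local copy wins; the shell can list all candidates in lookup order:

type -a dib-run-parts
# expected here: /usr/local/bin first, then /usr/bin, then /bin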

Revision history for this message
Gregory Haynes (greghaynes) wrote :

I think you're correct about /usr/local/bin coming from the dib image build. That seems potentially buggy IMO and might be worth a separate DIB bug: "don't leave a potentially conflicting dib-run-parts script in the built image's /usr/local/bin". (I'm thinking: what if someone wants to pip install a different dib-run-parts as part of an image build?)

Revision history for this message
Ian Wienand (iwienand) wrote :

dib-run-parts gets copied into the image by the "dib-run-parts" element, but nothing ever removes it:

https://git.openstack.org/cgit/openstack/diskimage-builder/tree/diskimage_builder/elements/dib-run-parts/root.d/90-base-dib-run-parts

Revision history for this message
Ian Wienand (iwienand) wrote :

I have proposed [1] to clean up the dib-run-parts installation. Also see [2].

[1] https://review.openstack.org/446769
[2] http://lists.openstack.org/pipermail/openstack-dev/2017-March/114222.html

Changed in tripleo:
milestone: pike-1 → pike-2
Changed in tripleo:
milestone: pike-2 → pike-3
Changed in tripleo:
milestone: pike-3 → pike-rc1
Revision history for this message
Ben Nemec (bnemec) wrote :

Just checked on an undercloud I deployed today and I don't see this profiledir spam anymore.

Changed in tripleo:
status: Triaged → Fix Released
Revision history for this message
Sergii Golovatiuk (sgolovatiuk) wrote :

Description
===========
I still see a lot of profiledir.* directories in a freshly built (by quickstart) environment on master.

Steps to reproduce
==================
* Install the environment
* Wait 1h or longer
* ls -ld /tmp/profiledir.*

Expected result
===============
A clean /tmp directory.

Actual result
=============
/tmp is bloated with profiledir.* directories.

Environment
===========
1. master

Changed in tripleo:
milestone: pike-rc1 → none
status: Fix Released → Confirmed
importance: High → Medium
Changed in tripleo:
status: Confirmed → Triaged
milestone: none → rocky-3
milestone: rocky-3 → queens-3
Changed in tripleo:
status: Triaged → Confirmed
Revision history for this message
Sergii Golovatiuk (sgolovatiuk) wrote :

The bug is indeed related to dib-run-parts.

There are a couple of ways to confirm that:

1. Modify PROFILE_DIR=$(mktemp -d --tmpdir profiledir.XXXXXX) to PROFILE_DIR=$(mktemp -d profiledir.XXXXXX). Without --tmpdir, mktemp creates the directory relative to the process's working directory, which for the service is /, so / gets bloated with profiledir.XXXXXX instead, pinning the creation on this script.

2. Enable audit:
auditctl -w /tmp/ -p war -k CATCH
Create some load in a second window:
stress -c 4 -i 4
Catch the culprit with:
tail -f /var/log/audit/audit.log | awk '/mktemp/ {split($12,a,"="); system("ps -fwhwp" a[2])}'

The output will be:

[root@undercloud stack]# tail -f /var/log/audit/audit.log | awk '/mktemp/ {split($12,a,"="); system("ps -fwhwp" a[2])}'
28939 ? Z 0:00 [dib-run-parts] <defunct>
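
(To unpack the one-liner: awk matches audit records mentioning mktemp, splits the 12th whitespace-separated field, which carries pid=NNNN in these records, on "=", and hands the PID to ps. The process has already exited by the time ps runs, which is why it shows up defunct, but the name still identifies dib-run-parts.)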

Revision history for this message
Sergii Golovatiuk (sgolovatiuk) wrote :

What process runs dib-run-parts every minute?

Changed in tripleo:
milestone: queens-3 → queens-rc1
Changed in tripleo:
milestone: queens-rc1 → rocky-1
Changed in tripleo:
milestone: rocky-1 → rocky-2
Changed in tripleo:
milestone: rocky-2 → rocky-3
Changed in tripleo:
status: Confirmed → Triaged
Changed in tripleo:
milestone: rocky-3 → rocky-rc1
Changed in tripleo:
milestone: rocky-rc1 → stein-1
Changed in tripleo:
milestone: stein-1 → stein-2
Revision history for this message
Emilien Macchi (emilienm) wrote : Cleanup EOL bug report

This is an automated cleanup. This bug report has been closed because it
is older than 18 months and there is no open code change to fix this.
After this time it is unlikely that the circumstances which led to
the observed issue can be reproduced.

If you can reproduce the bug, please:
* reopen the bug report (set to status "New")
* AND add the detailed steps to reproduce the issue (if applicable)
* AND leave a comment "CONFIRMED FOR: <RELEASE_NAME>"
  Only still supported release names are valid (FUTURE, PIKE, QUEENS, ROCKY, STEIN).
  Valid example: CONFIRMED FOR: FUTURE

Changed in tripleo:
importance: Medium → Undecided
status: Triaged → Expired