Collecting Guru Meditation Report fails on SELinux systems in 'enforcing' mode

Bug #1756044 reported by Abhishek Sharma M
This bug affects 4 people
Affects: oslo.reports
Status: Fix Released
Importance: Medium
Assigned to: Ben Nemec

Bug Description

While collecting a Guru Meditation Report (GMR) with the "kill -USR2 <pid>" command on a system configured with SELinux in 'enforcing' mode, we get the following stack trace:

nova-api: Traceback (most recent call last):
nova-api: File "/usr/lib/python2.7/site-packages/oslo_reports/guru_meditation_report.py", line 211, in handle_signal
nova-api: res = cls(version, frame).run()
nova-api: File "/usr/lib/python2.7/site-packages/oslo_reports/guru_meditation_report.py", line 259, in run
nova-api: return super(GuruMeditation, self).run()
nova-api: File "/usr/lib/python2.7/site-packages/oslo_reports/report.py", line 77, in run
nova-api: return "\n".join(six.text_type(sect) for sect in self.sections)
nova-api: File "/usr/lib/python2.7/site-packages/oslo_reports/report.py", line 77, in <genexpr>
nova-api: return "\n".join(six.text_type(sect) for sect in self.sections)
nova-api: File "/usr/lib/python2.7/site-packages/oslo_reports/report.py", line 102, in __str__
nova-api: return self.view(self.generator())
nova-api: File "/usr/lib/python2.7/site-packages/oslo_reports/report.py", line 131, in newgen
nova-api: res = gen()
nova-api: File "/usr/lib/python2.7/site-packages/oslo_reports/generators/process.py", line 38, in __call__
nova-api: return pm.ProcessModel(psutil.Process(os.getpid()))
nova-api: File "/usr/lib/python2.7/site-packages/oslo_reports/models/process.py", line 66, in __init__
nova-api: children = process.children()
nova-api: File "/usr/lib64/python2.7/site-packages/psutil/__init__.py", line 336, in wrapper
nova-api: return fun(self, *args, **kwargs)
nova-api: File "/usr/lib64/python2.7/site-packages/psutil/__init__.py", line 913, in children
nova-api: **if p.ppid() == self.pid:**
nova-api: File "/usr/lib64/python2.7/site-packages/psutil/_common.py", line 293, in wrapper
nova-api: return fun(self)
nova-api: File "/usr/lib64/python2.7/site-packages/psutil/__init__.py", line 622, in ppid
nova-api: return self._proc.ppid()
nova-api: File "/usr/lib64/python2.7/site-packages/psutil/_pslinux.py", line 1092, in wrapper
nova-api: **raise AccessDenied(self.pid, self._name)**
nova-api: **AccessDenied: psutil.AccessDenied (pid=1)**
nova-api: **Unable to run Guru Meditation Report!**

This happens because the nova-api process is looking for a specific directory (named after a PID) in /proc and compares every directory there with the one it wants to find. Along the way it encounters a directory that SELinux rules do not allow it to access or search, and because psutil does not catch the resulting "AccessDenied" exception, the whole report fails. Ideally, the exception would be caught and the search would continue. I raised a defect against psutil for this: https://github.com/giampaolo/psutil/issues/1246. Giampaolo has a valid point in the discussion we had there. I think this exception must be handled in oslo.reports when it uses psutil to collect the GMR.

Other defects have been raised around this issue as well, such as https://bugzilla.redhat.com/show_bug.cgi?id=1292787, but they have not reached a conclusion.
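The failure mode described above can be sketched in a few lines. The following is a minimal illustration, not the psutil or oslo.reports implementation: it walks a list of process-like objects, compares each parent PID with the target, and skips any process whose lookup raises instead of aborting the whole scan. The function name `tolerant_children` and its parameters are hypothetical; the only things assumed about each object are a `pid` attribute and a `ppid()` method, which `psutil.Process` provides.

```python
def tolerant_children(parent_pid, processes, skip_errors):
    """Return (children, skipped_pids) for parent_pid.

    processes   -- iterable of objects with a .pid attribute and a .ppid() method
    skip_errors -- tuple of exception types to tolerate per process
    (Hypothetical helper for illustration; not part of psutil or oslo.reports.)
    """
    children = []
    skipped = []
    for proc in processes:
        try:
            # This per-process lookup is the call that raised AccessDenied
            # for pid=1 in the traceback above.
            if proc.ppid() == parent_pid:
                children.append(proc)
        except skip_errors:
            # Record the unreadable process and keep scanning.
            skipped.append(proc.pid)
    return children, skipped
```

With psutil installed, this could be called as `tolerant_children(os.getpid(), psutil.process_iter(), (psutil.AccessDenied, psutil.NoSuchProcess))`. Note that any skipped process might itself have been a child, so the result can be incomplete; the `skipped` list is returned so a caller can at least report that.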

Divya K Konoor (dikonoor) wrote :

Adding a fix in psutil might not be the right approach, as the psutil library should raise an error whenever it encounters one rather than suppress it. The error could have different causes. Fixing this at the nova service level might not be the right place either, as other OpenStack services use oslo.reports and would run into the same problem. Considering this, oslo.reports might be a suitable layer to provide the fix.

Ben Nemec (bnemec) wrote :

I don't believe we can fix this in oslo.reports. We could catch the exception, but because that means the call failed, we don't have the data we need to proceed. I don't see any way to exclude pids from the children() call, so there's no way to retry in a way that could work.

Based on the response to https://bugzilla.redhat.com/show_bug.cgi?id=1292787 I think the recommendation should be to "setenforce 0" before running GMR and then turn it back on after they complete.
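As a rough sketch of that workaround, the snippet below switches SELinux to permissive mode, sends SIGUSR2 to the target process, waits, and then switches back. It assumes the system was in enforcing mode beforehand (it always restores with "setenforce 1"), that it runs with enough privilege to call setenforce, and that a fixed wait is long enough for the report to finish. The names `selinux_permissive`, `request_gmr`, and `wait_seconds` are hypothetical.

```python
import contextlib
import os
import signal
import subprocess
import time


@contextlib.contextmanager
def selinux_permissive(run=subprocess.check_call):
    """Put SELinux in permissive mode for the duration of the block.

    Assumption: the system was enforcing before, so we restore with
    "setenforce 1" unconditionally.
    """
    run(["setenforce", "0"])
    try:
        yield
    finally:
        # Re-enable enforcing mode even if signalling the process failed.
        run(["setenforce", "1"])


def request_gmr(pid, wait_seconds=5.0, run=subprocess.check_call,
                send_signal=os.kill, sleep=time.sleep):
    """Ask the process `pid` for a Guru Meditation Report via SIGUSR2."""
    with selinux_permissive(run=run):
        send_signal(pid, signal.SIGUSR2)
        # The report is produced by the target process after the signal
        # arrives, so give it time before re-enabling enforcing mode.
        sleep(wait_seconds)
```

The `run`, `send_signal`, and `sleep` parameters exist only so the sequence can be exercised without touching a real system; in normal use `request_gmr(<pid>)` with the defaults is equivalent to running "setenforce 0", "kill -USR2 <pid>", and "setenforce 1" by hand.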

Changed in oslo.reports:
status: New → Confirmed
importance: Undecided → Medium
OpenStack Infra (hudson-openstack) wrote : Fix proposed to oslo.reports (master)

Fix proposed to branch: master
Review: https://review.openstack.org/558858

Changed in oslo.reports:
assignee: nobody → Ben Nemec (bnemec)
status: Confirmed → In Progress
Divya K Konoor (dikonoor) wrote :

bnemec, Please see my comments in changeset https://review.openstack.org/558858 on this.

OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.reports (master)

Reviewed: https://review.openstack.org/558858
Committed: https://git.openstack.org/cgit/openstack/oslo.reports/commit/?id=8d0e5fcc95ff2a2093373c35697f1ec8a9ff113b
Submitter: Zuul
Branch: master

commit 8d0e5fcc95ff2a2093373c35697f1ec8a9ff113b
Author: Ben Nemec <email address hidden>
Date: Wed Apr 4 15:47:22 2018 +0000

    Document workaround for AccessDenied error

    On SELinux-enabled platforms it is possible for the report process
    to fail with an AccessDenied error when it tries to read information
    about the process being debugged. Per [1], the recommended solution
    is to temporarily disable SELinux during debugging and then turn it
    on again once the report has completed successfully.

    1: https://bugzilla.redhat.com/show_bug.cgi?id=1292787

    Change-Id: Ic12d5658858bb085448e1b437b548111d3c79583
    Closes-Bug: 1756044

Changed in oslo.reports:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/oslo.reports 1.28.0

This issue was fixed in the openstack/oslo.reports 1.28.0 release.
