ansible-hardening role runs out of memory when running for ~60 nodes

Bug #1800169 reported by Robert Varjasi on 2018-10-26
This bug affects 5 people
Affects: openstack-ansible
Importance: Undecided
Assigned to: Jesse Pretorius

Bug Description

I am using OSA 17.1.2, and the ansible-hardening role consumed all 11 GB of memory on the deployment host, which is unexpected. I found that the file ansible-hardening/tasks/rhel7stig/main.yml uses include_tasks statements to pull in task files that, I think, could be static imports instead. The affected lines:

- include_tasks: accounts.yml
- include_tasks: aide.yml
- include_tasks: auditd.yml
- include_tasks: auth.yml
- include_tasks: file_perms.yml
- include_tasks: graphical.yml
- include_tasks: kernel.yml
- include_tasks: lsm.yml
- include_tasks: misc.yml
- include_tasks: sshd.yml

I got this error message when I ran ansible-hardening:

2018-10-26 11:52:51,266 p=677 u=root | Friday 26 October 2018 11:52:51 +0000 (0:00:14.270) 0:21:45.680 ********
2018-10-26 11:52:51,941 p=677 u=root | TASK [ansible-hardening : include_tasks] *****************************************************************************************************************************************************************************************************
2018-10-26 11:52:51,941 p=677 u=root | Friday 26 October 2018 11:52:51 +0000 (0:00:00.675) 0:21:46.355 ********
2018-10-26 11:52:57,589 p=677 u=root | included: /etc/ansible/roles/ansible-hardening/tasks/rhel7stig/sshd.yml for ceph-osd8, ceph-osd9, ceph-osd2, ceph-osd3, ceph-osd1, ceph-osd6, ceph-osd7, ceph-osd4, ceph-osd5, compute6, compute26, compute11, compute36, compute37, compute10, compute35, compute34, compute33, compute32, compute17, compute19, compute15, compute39, compute38, compute31, compute30, compute1, compute3, compute2, compute5, compute4, compute7, compute18, compute9, compute8, compute29, compute12, compute20, compute21, compute22, compute23, compute24, compute25, compute40, haproxy1, haproxy2, controller3, controller2, controller1, ceph-mon2, ceph-mon3, ceph-mon1, network2, network1
2018-10-26 11:53:30,166 p=677 u=root | TASK [ansible-hardening : Copy login warning banner] *****************************************************************************************************************************************************************************************
2018-10-26 11:53:30,171 p=677 u=root | Friday 26 October 2018 11:53:30 +0000 (0:00:38.226) 0:22:24.585 ********
2018-10-26 11:53:30,209 p=677 u=root | ERROR! Unexpected Exception, this is probably a bug: [Errno 12] Cannot allocate memory
2018-10-26 11:53:30,209 p=677 u=root | to see the full traceback, use -vvv
2018-10-26 11:53:30,211 p=677 u=root | the full traceback was:

Traceback (most recent call last):
  File "/opt/ansible-runtime/bin/ansible-playbook", line 106, in <module>
    exit_code = cli.run()
  File "/opt/ansible-runtime/local/lib/python2.7/site-packages/ansible/cli/playbook.py", line 122, in run
    results = pbex.run()
  File "/opt/ansible-runtime/local/lib/python2.7/site-packages/ansible/executor/playbook_executor.py", line 154, in run
    result = self._tqm.run(play=play)
  File "/opt/ansible-runtime/local/lib/python2.7/site-packages/ansible/executor/task_queue_manager.py", line 290, in run
    play_return = strategy.run(iterator, play_context)
  File "/opt/ansible-runtime/local/lib/python2.7/site-packages/ansible/plugins/strategy/linear.py", line 277, in run
    self._queue_task(host, task, task_vars, play_context)
  File "/etc/ansible/roles/plugins/strategy/linear.py", line 209, in _queue_task
    _play_context
  File "/opt/ansible-runtime/local/lib/python2.7/site-packages/ansible/plugins/strategy/__init__.py", line 254, in _queue_task
    worker_prc.start()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 130, in start
    self._popen = Popen(self)
  File "/usr/lib/python2.7/multiprocessing/forking.py", line 121, in __init__
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

I noticed huge memory increases whenever the play processed an include_tasks. When I tested the same inclusions as static import_tasks, memory usage stayed at 1.5 GB.
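
For illustration, the static variant of the affected lines might look like the sketch below. It is not the committed patch, only the same ten file names from the list above with include_tasks swapped for import_tasks, and it assumes none of these files needs to be selected at runtime.

```yaml
# Sketch only: static imports of the task files listed in this report.
# Assumption: every file is always included, so nothing has to be
# resolved while the play is running.
- import_tasks: accounts.yml
- import_tasks: aide.yml
- import_tasks: auditd.yml
- import_tasks: auth.yml
- import_tasks: file_perms.yml
- import_tasks: graphical.yml
- import_tasks: kernel.yml
- import_tasks: lsm.yml
- import_tasks: misc.yml
- import_tasks: sshd.yml
```

With import_tasks, Ansible reads the files once when the playbook is parsed, rather than processing a separate include for the hosts during the run.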

Robert Varjasi (robert.varjasi) wrote :

The related Ansible issue is discussed here: https://github.com/ansible/ansible/issues/30441. I tried Ansible v2.4.6 with OSA 17.1.2, but nothing changed: the role still ran out of memory.

Fix proposed to branch: master
Review: https://review.openstack.org/614329

Changed in openstack-ansible:
assignee: nobody → Jesse Pretorius (jesse-pretorius)
status: New → In Progress

Reviewed: https://review.openstack.org/614329
Committed: https://git.openstack.org/cgit/openstack/ansible-hardening/commit/?id=f381cc02af7914bbf4bcb081a64baafabbbd2a31
Submitter: Zuul
Branch: master

commit f381cc02af7914bbf4bcb081a64baafabbbd2a31
Author: Jesse Pretorius <email address hidden>
Date: Tue Oct 30 18:33:07 2018 +0000

    Switch to using import_tasks for static inclusion

    Using dynamic inclusion (include_tasks) should only be done
    if the tasks to include are based on a conditional and there
    is no expectation for the tag on the include task to be applied
    to all included tasks. Using include_tasks for static inclusion
    dramatically raises memory consumption. Using include_tasks also
    breaks the ability to use a tag applied to the include.

    In this patch we fix all inclusions to ensure that they are set
    properly to dynamic or static inclusions where necessary.

    We also remove the unnecessary leading whitespace in the main
    task file.

    Change-Id: Idff86d4a90d3309f0e9ae3b9f0559b37e25dc26f
    Closes-Bug: #1800169
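
The distinction drawn in the commit message can be sketched as follows. This is a hedged illustration, not code from the patch: the `sshd` tag and the per-package-manager file name are hypothetical, chosen only to show one static and one dynamic case.

```yaml
# Static inclusion: the file is read at parse time, and the tag on the
# import is inherited by every task inside sshd.yml, so running with
# "--tags sshd" selects those tasks.
- import_tasks: sshd.yml
  tags:
    - sshd   # hypothetical tag name

# Dynamic inclusion: appropriate when the file to load is only known at
# runtime, here because it depends on a gathered fact. A tag placed on
# this task would apply to the include itself, not to the included tasks.
- include_tasks: "packages_{{ ansible_pkg_mgr }}.yml"   # hypothetical file name
```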

Changed in openstack-ansible:
status: In Progress → Fix Released

Change abandoned by Jesse Pretorius (odyssey4me) (<email address hidden>) on branch: stable/rocky
Review: https://review.opendev.org/630139
Reason: No longer working on this. Please restore this patch if it is still required.
