overcloud deployment with ceph Failed to create temporary directory in /tmp/ceph_ansible_tmp

Bug #1884816 reported by John Fulton
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Francesco Pantano
Milestone: (none listed)

Bug Description

While deploying Ceph and OpenStack with TripleO, the ceph-ansible playbook run fails with:

        "Tuesday 23 June 2020 16:08:27 +0000 (0:00:00.232) 0:01:16.170 ********** ",
        "<192.168.24.12> SSH: EXEC ssh -o ControlMaster=auto -o ControlPersist=600s -o StrictHostKeyChecking=no -o 'IdentityFile=\"/home/stack/.ssh/id_rsa_tripleo\"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User=\"tripleo-admin\"' -o ConnectTimeout=60 -o ControlPath=/home/stack/ansible-ssh 192.168.24.12 '/bin/sh -c '\"'\"'( umask 77 && mkdir -p \"` echo /tmp/ceph_ansible_tmp `\"&& mkdir /tmp/ceph_ansible_tmp/ansible-tmp-1592928507.9015691-312110-107140094840620 && echo ansible-tmp-1592928507.9015691-312110-107140094840620=\"` echo /tmp/ceph_ansible_tmp/ansible-tmp-1592928507.9015691-312110-107140094840620 `\" ) && sleep 0'\"'\"''",
        "<192.168.24.12> (1, b'', b'ControlSocket /home/stack/ansible-ssh already exists, disabling multiplexing\\r\\nmkdir: cannot create directory \\xe2\\x80\\x98/tmp/ceph_ansible_tmp/ansible-tmp-1592928507.9015691-312110-107140094840620\\xe2\\x80\\x99: Permission denied\\n')",
        "mkdir: cannot create directory ‘/tmp/ceph_ansible_tmp/ansible-tmp-1592928507.9015691-312110-107140094840620’: Permission denied",
        "fatal: [control-plane-controller-0]: UNREACHABLE! => {",
        " \"msg\": \"Failed to create temporary directory.In some cases, you may have been able to authenticate and did not have permissions on the target directory. Consider changing the remote tmp path in ansible.cfg to a path rooted in \\\"/tmp\\\", for more error information use -vvv. Failed command was: ( umask 77 && mkdir -p \\\"` echo /tmp/ceph_ansible_tmp `\\\"&& mkdir /tmp/ceph_ansible_tmp/ansible-tmp-1592928507.9015691-312110-107140094840620 && echo ansible-tmp-1592928507.9015691-312110-107140094840620=\\\"` echo /tmp/ceph_ansible_tmp/ansible-tmp-1592928507.9015691-312110-107140094840620 `\\\" ), exited with result 1\",",
        " \"unreachable\": true",

John Fulton (jfulton-org) wrote :

WORKAROUND

1. Run rm -rf /tmp/ceph_ansible_tmp/ on all overcloud nodes.
2. Run a stack update.
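
Step 1 can be scripted. The following is only a sketch: it builds the ssh command for each node and prints it rather than running it. The node list is a placeholder (only 192.168.24.12 appears in the log above), and the use of sudo is an assumption based on the directory being owned by root.

```python
import shlex

def cleanup_command(host, user="tripleo-admin", remote_tmp="/tmp/ceph_ansible_tmp"):
    """Build (but do not run) the ssh argv that removes the stale directory."""
    # sudo is an assumption: the stale directory is reported as root:root.
    remote = "sudo rm -rf -- " + shlex.quote(remote_tmp)
    return ["ssh", f"{user}@{host}", remote]

# Placeholder inventory: replace with the addresses of all overcloud nodes.
nodes = ["192.168.24.12"]

commands = [cleanup_command(node) for node in nodes]
for argv in commands:
    # Print for review; pass argv to subprocess.run() once the list looks right.
    print(shlex.join(argv))
```

Printing first keeps the sketch safe to try; the stack update in step 2 still has to be run afterwards.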

John Fulton (jfulton-org) wrote :

Explanation:

When ceph-ansible is run by tripleo-ansible the remote temp directory is set like this [1]:

 ANSIBLE_REMOTE_TEMP=/tmp/ceph_ansible_tmp

The problem is that before ceph-ansible is run with the value above, something else connects first and creates the same directory with mode 0700 and owner:group root:root.

Then, when tripleo-admin connects, that user cannot write to that directory.
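
To make the failure concrete, here is a simplified model of the POSIX permission check that mkdir hits. It is a sketch only: it ignores ACLs and capabilities, applies to non-root users, and the numeric ids for tripleo-admin are made up for illustration.

```python
import stat

def can_create_entry(mode, owner_uid, owner_gid, uid, gids):
    """Simplified check: creating an entry in a directory needs write AND execute.

    Applies to a non-root user; ACLs and capabilities are ignored.
    """
    if uid == owner_uid:
        needed = stat.S_IWUSR | stat.S_IXUSR
    elif owner_gid in gids:
        needed = stat.S_IWGRP | stat.S_IXGRP
    else:
        needed = stat.S_IWOTH | stat.S_IXOTH
    return mode & needed == needed

ROOT_UID, ROOT_GID = 0, 0
ADMIN_UID, ADMIN_GIDS = 1001, {1001}  # hypothetical ids for tripleo-admin

# /tmp/ceph_ansible_tmp as described above: root:root, mode 0700.
blocked = can_create_entry(0o700, ROOT_UID, ROOT_GID, ADMIN_UID, ADMIN_GIDS)
# The same directory with a world-writable mode (illustrative value).
allowed = can_create_entry(0o777, ROOT_UID, ROOT_GID, ADMIN_UID, ADMIN_GIDS)
print(blocked, allowed)
```

With mode 0700 only the owner class has any bits set, so tripleo-admin falls into "other" and the mkdir of the ansible-tmp-* subdirectory is denied.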

[1] https://github.com/openstack/tripleo-ansible/blob/stable/train/tripleo_ansible/roles/tripleo-ceph-run-ansible/tasks/main.yml#L36

John Fulton (jfulton-org) wrote :

As per this warning:

Jun 23 10:57:56 controller-0 ansible-lineinfile[48655]: [WARNING] Module remote_tmp /tmp/ceph_ansible_tmp did not exist and was created with a mode of 0700, this may cause issues when running as another user. To avoid this, create the remote_tmp dir with the correct permissions manually

Ansible is suggesting I "create the remote_tmp dir with the correct permissions".

This is because of a change introduced in Ansible 2.9.10:

 https://github.com/ansible/ansible/issues/68218

My plan is to patch tripleo-ansible to do what they recommend. In other words:

Have tripleo-ansible create the remote_tmp dir with the correct permissions before it runs ceph-ansible (which uses that remote_tmp dir).
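
The idea can be sketched outside Ansible as follows. This is not the tripleo-ansible patch; it only illustrates "create the directory, then set the mode explicitly". The mode 0o777 is an assumption, since this report does not say which mode the fix uses, and the demo runs in a throwaway directory instead of /tmp/ceph_ansible_tmp.

```python
import os
import stat
import tempfile

def ensure_remote_tmp(path, mode=0o777):
    """Create path if needed and force its permission bits to mode."""
    os.makedirs(path, exist_ok=True)
    # chmod is used because the mode passed to makedirs is filtered by the umask.
    os.chmod(path, mode)
    return stat.S_IMODE(os.stat(path).st_mode)

with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, "ceph_ansible_tmp")
    os.mkdir(target, 0o700)  # simulate the directory left behind with mode 0700
    fixed_mode = ensure_remote_tmp(target)
    print(oct(fixed_mode))
```

Doing this before ceph-ansible runs means the directory already exists with usable permissions, so nothing creates it implicitly with mode 0700.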

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-ansible (master)

Fix proposed to branch: master
Review: https://review.opendev.org/737660

Changed in tripleo:
status: Triaged → In Progress
Changed in tripleo:
assignee: John Fulton (jfulton-org) → Francesco Pantano (fmount)
Changed in tripleo:
assignee: Francesco Pantano (fmount) → nobody
assignee: nobody → Francesco Pantano (fmount)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-ansible (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/737759

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-ansible (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/737761

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-ansible (master)

Reviewed: https://review.opendev.org/737660
Committed: https://git.openstack.org/cgit/openstack/tripleo-ansible/commit/?id=1b440db7f69b1f0e69016ed5a409e84b6fab1f9d
Submitter: Zuul
Branch: master

commit 1b440db7f69b1f0e69016ed5a409e84b6fab1f9d
Author: John Fulton <email address hidden>
Date: Tue Jun 23 23:54:59 2020 +0000

    Make tripleo_ceph_run_ansible handle change in remote_tmp

    Ansible's remote_tmp module changed in 2.9.10 [1] causing
    the reported bug this patch addresses. The error from the
    module suggests creating the remote_tmp dir with the correct
    permissions. This patch does that within external_deploy_steps
    by having tripleo-ansible create the remote_tmp dir accordingly
    before running ceph-ansible (which uses that remote_tmp dir).

    [1] https://github.com/ansible/ansible/issues/68218

    Change-Id: I0350d5253571a2b0d12a0a2f25e5469c9d1fefe0
    Closes-Bug: #1884816

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-ansible (stable/train)

Reviewed: https://review.opendev.org/737761
Committed: https://git.openstack.org/cgit/openstack/tripleo-ansible/commit/?id=c7571039ff738e71ad19e60aa344c2e1ed0781b9
Submitter: Zuul
Branch: stable/train

commit c7571039ff738e71ad19e60aa344c2e1ed0781b9
Author: John Fulton <email address hidden>
Date: Tue Jun 23 23:54:59 2020 +0000

    Make tripleo_ceph_run_ansible handle change in remote_tmp

    Ansible's remote_tmp module changed in 2.9.10 [1] causing
    the reported bug this patch addresses. The error from the
    module suggests creating the remote_tmp dir with the correct
    permissions. This patch does that within external_deploy_steps
    by having tripleo-ansible create the remote_tmp dir accordingly
    before running ceph-ansible (which uses that remote_tmp dir).

    [1] https://github.com/ansible/ansible/issues/68218

    Change-Id: I0350d5253571a2b0d12a0a2f25e5469c9d1fefe0
    Closes-Bug: #1884816
    (cherry picked from commit 1b440db7f69b1f0e69016ed5a409e84b6fab1f9d)

tags: added: in-stable-train
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-ansible (stable/ussuri)

Reviewed: https://review.opendev.org/737759
Committed: https://git.openstack.org/cgit/openstack/tripleo-ansible/commit/?id=91f22bd56f8d829092850562701a2cf019b53512
Submitter: Zuul
Branch: stable/ussuri

commit 91f22bd56f8d829092850562701a2cf019b53512
Author: John Fulton <email address hidden>
Date: Tue Jun 23 23:54:59 2020 +0000

    Make tripleo_ceph_run_ansible handle change in remote_tmp

    Ansible's remote_tmp module changed in 2.9.10 [1] causing
    the reported bug this patch addresses. The error from the
    module suggests creating the remote_tmp dir with the correct
    permissions. This patch does that within external_deploy_steps
    by having tripleo-ansible create the remote_tmp dir accordingly
    before running ceph-ansible (which uses that remote_tmp dir).

    [1] https://github.com/ansible/ansible/issues/68218

    Change-Id: I0350d5253571a2b0d12a0a2f25e5469c9d1fefe0
    Closes-Bug: #1884816
    (cherry picked from commit 1b440db7f69b1f0e69016ed5a409e84b6fab1f9d)

tags: added: in-stable-ussuri
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-ansible 0.6.0

This issue was fixed in the openstack/tripleo-ansible 0.6.0 release.

