Instance creation fails with Permission denied: '/var/lib/nova/instances/<instance-id>'

Bug #1799903 reported by Jose Luis Franco
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: Jose Luis Franco

Bug Description

Queens CI jobs tripleo-ci-centos-7-scenario001-multinode-oooq-container and tripleo-ci-centos-7-scenario004-multinode-oooq-container are failing during tempest validation steps after the overcloud installation. When checking nova-compute logs in the subnode, we can find the following log in both jobs:

2018-10-24 18:51:34.047 10 ERROR nova.compute.manager [req-a0823e4d-c889-422b-bfe8-dad5a1bf34b8 0eeba3bfe4884ed298270ced47f55b2f e2a73114c65e4fc9b62f6446d21ed9e6 - default default] [instance: b899f92a-4d76-42ee-b73b-a4d343489879] Instance failed to spawn: OSError: [Errno 13] Permission denied: '/var/lib/nova/instances/b899f92a-4d76-42ee-b73b-a4d343489879'
2018-10-24 18:51:34.047 10 ERROR nova.compute.manager [instance: b899f92a-4d76-42ee-b73b-a4d343489879] Traceback (most recent call last):
2018-10-24 18:51:34.047 10 ERROR nova.compute.manager [instance: b899f92a-4d76-42ee-b73b-a4d343489879] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2239, in _build_resources
2018-10-24 18:51:34.047 10 ERROR nova.compute.manager [instance: b899f92a-4d76-42ee-b73b-a4d343489879] yield resources
2018-10-24 18:51:34.047 10 ERROR nova.compute.manager [instance: b899f92a-4d76-42ee-b73b-a4d343489879] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2019, in _build_and_run_instance
2018-10-24 18:51:34.047 10 ERROR nova.compute.manager [instance: b899f92a-4d76-42ee-b73b-a4d343489879] block_device_info=block_device_info)
2018-10-24 18:51:34.047 10 ERROR nova.compute.manager [instance: b899f92a-4d76-42ee-b73b-a4d343489879] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 3091, in spawn
2018-10-24 18:51:34.047 10 ERROR nova.compute.manager [instance: b899f92a-4d76-42ee-b73b-a4d343489879] block_device_info=block_device_info)
2018-10-24 18:51:34.047 10 ERROR nova.compute.manager [instance: b899f92a-4d76-42ee-b73b-a4d343489879] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 3405, in _create_image
2018-10-24 18:51:34.047 10 ERROR nova.compute.manager [instance: b899f92a-4d76-42ee-b73b-a4d343489879] fileutils.ensure_tree(libvirt_utils.get_instance_path(instance))
2018-10-24 18:51:34.047 10 ERROR nova.compute.manager [instance: b899f92a-4d76-42ee-b73b-a4d343489879] File "/usr/lib/python2.7/site-packages/oslo_utils/fileutils.py", line 41, in ensure_tree
2018-10-24 18:51:34.047 10 ERROR nova.compute.manager [instance: b899f92a-4d76-42ee-b73b-a4d343489879] os.makedirs(path, mode)
2018-10-24 18:51:34.047 10 ERROR nova.compute.manager [instance: b899f92a-4d76-42ee-b73b-a4d343489879] File "/usr/lib64/python2.7/os.py", line 157, in makedirs
2018-10-24 18:51:34.047 10 ERROR nova.compute.manager [instance: b899f92a-4d76-42ee-b73b-a4d343489879] mkdir(name, mode)
2018-10-24 18:51:34.047 10 ERROR nova.compute.manager [instance: b899f92a-4d76-42ee-b73b-a4d343489879] OSError: [Errno 13] Permission denied: '/var/lib/nova/instances/b899f92a-4d76-42ee-b73b-a4d343489879'
2018-10-24 18:51:34.047 10 ERROR nova.compute.manager [instance: b899f92a-4d76-42ee-b73b-a4d343489879]

http://logs.openstack.org/29/611329/3/check/tripleo-ci-centos-7-scenario001-multinode-oooq-container/71d71be/logs/subnode-2/var/log/containers/nova/nova-compute.log.txt.gz#_2018-10-24_18_51_34_047

http://logs.openstack.org/29/611329/3/check/tripleo-ci-centos-7-scenario004-multinode-oooq-container/38cfc40/logs/subnode-2/var/log/containers/nova/nova-compute.log.txt.gz#_2018-10-24_18_09_10_044
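The traceback ends in os.makedirs, which fails with Errno 13 when the process lacks write or search permission on the directory it is creating into. A minimal diagnostic sketch, assuming it is run as the same user as nova-compute: the helper name diagnose_mkdir and the example instance path are hypothetical and not part of nova or oslo.utils. It reports the nearest existing ancestor of a path, who owns it, and whether the current process could create a subdirectory there.

```python
import os
import stat


def diagnose_mkdir(path):
    """Report why creating `path` might fail with EACCES (illustrative helper)."""
    ancestor = os.path.abspath(path)
    # Walk up until we reach a directory that actually exists.
    while not os.path.isdir(ancestor):
        parent = os.path.dirname(ancestor)
        if parent == ancestor:
            break
        ancestor = parent
    info = os.stat(ancestor)
    return {
        "nearest_existing_dir": ancestor,
        "owner_uid": info.st_uid,
        "owner_gid": info.st_gid,
        "mode": oct(stat.S_IMODE(info.st_mode)),
        # mkdir needs write and search (execute) permission on the parent.
        "can_create_here": os.access(ancestor, os.W_OK | os.X_OK),
    }


if __name__ == "__main__":
    # Hypothetical path; substitute the instance directory from the log.
    print(diagnose_mkdir("/var/lib/nova/instances/example-instance-id"))
```

If nearest_existing_dir is /var/lib/nova/instances and its owner is not the nova uid seen in the container, the failure above is what one would expect.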

Martin Schuppert (mschuppert) wrote :

It seems the host_prep_tasks step that creates /var/lib/nova/instances was not run [1], so the directory did not exist when the nova_statedir_ownership.py script ran to change the owner to the correct nova user inside the container [2]:

Oct 24 17:54:53 centos-7-inap-mtl01-0003385017 dockerd-current[12459]: INFO:nova_statedir:Applying nova statedir ownership
Oct 24 17:54:53 centos-7-inap-mtl01-0003385017 dockerd-current[12459]: INFO:nova_statedir:Target ownership for /var/lib/nova: 42436:42436
Oct 24 17:54:53 centos-7-inap-mtl01-0003385017 dockerd-current[12459]: INFO:nova_statedir:Checking uid: 42436 gid: 42436 path: /var/lib/nova/
Oct 24 17:54:53 centos-7-inap-mtl01-0003385017 dockerd-current[12459]: INFO:nova_statedir:Ownership of /var/lib/nova already 42436:42436
Oct 24 17:54:53 centos-7-inap-mtl01-0003385017 dockerd-current[12459]: INFO:nova_statedir:Checking uid: 42436 gid: 42436 path: /var/lib/nova/.ssh/
Oct 24 17:54:53 centos-7-inap-mtl01-0003385017 dockerd-current[12459]: INFO:nova_statedir:Ownership of /var/lib/nova/.ssh already 42436:42436
Oct 24 17:54:53 centos-7-inap-mtl01-0003385017 dockerd-current[12459]: INFO:nova_statedir:Checking uid: 42436 gid: 42436 path: /var/lib/nova/.ssh/config
Oct 24 17:54:53 centos-7-inap-mtl01-0003385017 dockerd-current[12459]: INFO:nova_statedir:Nova statedir ownership complete
Oct 24 17:54:53 centos-7-inap-mtl01-0003385017 oci-systemd-hook[70456]: systemdhook <debug>: 93a3239cbc44: Skipping as container command is /docker-config-scripts/nova_statedir_ownership.py, not init or systemd

[1] https://github.com/openstack/tripleo-heat-templates/blob/stable/queens/docker/services/nova-compute.yaml#L232
[2] http://logs.openstack.org/29/611329/3/check/tripleo-ci-centos-7-scenario004-multinode-oooq-container/38cfc40/logs/subnode-2/var/log/journal.txt.gz#_Oct_24_17_54_53
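The journal excerpt lists only /var/lib/nova, .ssh and .ssh/config, which is consistent with an ownership pass that can only act on paths that already exist. The following is a rough sketch of that behaviour, not the actual nova_statedir_ownership.py: the function name ownership_pass and its apply flag are assumptions made for illustration.

```python
import os


def ownership_pass(statedir, uid, gid, apply=False):
    """Walk statedir and report (optionally fix) paths not owned by uid:gid.

    Illustrative only. Symlinked directories are not followed.
    """
    visited, changed = [], []
    for dirpath, _dirnames, filenames in os.walk(statedir):
        # os.walk yields every real directory once as dirpath.
        for path in [dirpath] + [os.path.join(dirpath, f) for f in filenames]:
            visited.append(path)
            st = os.lstat(path)
            if (st.st_uid, st.st_gid) != (uid, gid):
                changed.append(path)
                if apply:
                    os.lchown(path, uid, gid)  # needs sufficient privileges
    return visited, changed


if __name__ == "__main__":
    # 42436 is the uid/gid shown in the journal excerpt above.
    visited, changed = ownership_pass("/var/lib/nova", 42436, 42436)
    print("visited:", visited)
    print("would change:", changed)
```

A directory that is missing during the walk never appears in visited, so if /var/lib/nova/instances is created afterwards by a different user, nothing corrects its ownership.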

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/613259

Jose Luis Franco (jfrancoa) wrote :

Proposed the revert in https://review.openstack.org/#/c/613259/ and added it as a dependency of https://review.openstack.org/#/c/611329/ to confirm that the removal of {{role}}HostPrepTasks is the cause of the error. In queens we can still deploy via Heat without config-download (as is the case when deploying with Ceph services enabled, as stated in https://github.com/openstack/tripleo-quickstart/commit/a0aad6e280af998b7283b465c3d83f518971d16c )

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-heat-templates (stable/queens)

Change abandoned by Juan Antonio Osorio Robles (<email address hidden>) on branch: stable/queens
Review: https://review.openstack.org/613259
Reason: clearing up the gate to free up resources.

Changed in tripleo:
assignee: nobody → Jose Luis Franco (jfrancoa)
milestone: stein-1 → stein-2

OpenStack Infra (hudson-openstack) wrote :

Change abandoned by Juan Antonio Osorio Robles (<email address hidden>) on branch: stable/queens
Review: https://review.openstack.org/613259
Reason: clearing up the gate.

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/queens)

Reviewed: https://review.openstack.org/613259
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=697e4ef1749c9cc4e33d2edbb3d1387cd71ac34a
Submitter: Zuul
Branch: stable/queens

commit 697e4ef1749c9cc4e33d2edbb3d1387cd71ac34a
Author: Jose Luis Franco <email address hidden>
Date: Thu Oct 25 10:18:05 2018 +0000

    Revert "Don't run host_prep_tasks from {{role}}HostPrepDeployment"

    {{role}}HostPrepDeployment resources are still needed in queens as
    config-download is not the default deployment procedure. Therefore,
    if config-download is not used we would skip the host_prep_tasks,
    which are needed for several services to work correctly (i.e:
    nova-compute as stated in LP 1799903)

    This reverts commit 3ce99b522ca99250560c5cc135125fcbeb943515.

    Change-Id: I2cd06a7f261d3587bb50683ae2b54631983ea077
    Closes-Bug: #1799903

tags: added: in-stable-queens
Changed in tripleo:
status: New → Fix Released

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 8.1.0

This issue was fixed in the openstack/tripleo-heat-templates 8.1.0 release.
