periodic-tripleo-ci-centos-8-ovb-1ctlr_2comp-featureset020-master is failing while executing cold migration & resize tempest tests with Stderr: 'mkdir: cannot create directory ‘/var/lib/nova/instances/<uuid>’: Permission denied\n'

Bug #1903033 reported by Sandeep Yadav on 2020-11-05
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Martin Schuppert
Milestone: wallaby-2

Bug Description

Description:-

periodic-tripleo-ci-centos-8-ovb-1ctlr_2comp-featureset020-master is failing while executing cold migration & resize tempest tests with Stderr: 'mkdir: cannot create directory ‘/var/lib/nova/instances/<uuid>’: Permission denied\n'

Logs:-

https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-1ctlr_2comp-featureset020-master/a542b39/logs/undercloud/var/log/tempest/stestr_results.html.gz
~~~

tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps:-

Test: test_server_connectivity_cold_migration

~~~
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/tempest/common/utils/__init__.py", line 89, in wrapper
    return f(*func_args, **func_kwargs)
  File "/usr/lib/python3.6/site-packages/tempest/scenario/test_network_advanced_server_ops.py", line 227, in test_server_connectivity_cold_migration
    'VERIFY_RESIZE')
  File "/usr/lib/python3.6/site-packages/tempest/common/waiters.py", line 96, in wait_for_server_status
    raise lib_exc.TimeoutException(message)
tempest.lib.exceptions.TimeoutException: Request timed out
Details: (TestNetworkAdvancedServerOps:test_server_connectivity_cold_migration) Server 6665fc35-4c29-434e-9254-7b4a2d5976ec failed to reach VERIFY_RESIZE status and task state "None" within the required time (300 s). Current status: ACTIVE. Current task state: None.
~~~

https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-1ctlr_2comp-featureset020-master/a542b39/logs/overcloud-novacompute-1/var/log/containers/nova/nova-compute.log.txt.gz

~~~
Stderr: 'mkdir: cannot create directory ‘/var/lib/nova/instances/6665fc35-4c29-434e-9254-7b4a2d5976ec’: Permission denied\n'
~~~

Test: test_server_connectivity_resize

~~~
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/tempest/common/utils/__init__.py", line 89, in wrapper
    return f(*func_args, **func_kwargs)
  File "/usr/lib/python3.6/site-packages/tempest/scenario/test_network_advanced_server_ops.py", line 194, in test_server_connectivity_resize
    'VERIFY_RESIZE')
  File "/usr/lib/python3.6/site-packages/tempest/common/waiters.py", line 96, in wait_for_server_status
    raise lib_exc.TimeoutException(message)
tempest.lib.exceptions.TimeoutException: Request timed out
Details: (TestNetworkAdvancedServerOps:test_server_connectivity_resize) Server 6c855bbc-35ae-4022-87a8-5ac742f16ac5 failed to reach VERIFY_RESIZE status and task state "None" within the required time (300 s). Current status: ACTIVE. Current task state: None.
~~~

https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-1ctlr_2comp-featureset020-master/a542b39/logs/overcloud-novacompute-0/var/log/containers/nova/nova-compute.log.txt.gz
~~~
Stderr: 'mkdir: cannot create directory ‘/var/lib/nova/instances/6c855bbc-35ae-4022-87a8-5ac742f16ac5’: Permission denied\n'
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 10259, in migrate_disk_and_power_off
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server self._remotefs.create_dir(dest, inst_base)
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/volume/remotefs.py", line 94, in create_dir
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server on_completion=on_completion)
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/volume/remotefs.py", line 185, in create_dir
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server on_execute=on_execute, on_completion=on_completion)
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/utils.py", line 105, in ssh_execute
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server return processutils.execute(*ssh_cmd, **kwargs)
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py", line 441, in execute
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server cmd=sanitized_cmd)
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server Command: ssh -o BatchMode=yes 172.17.0.60 mkdir -p /var/lib/nova/instances/6c855bbc-35ae-4022-87a8-5ac742f16ac5
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server Exit code: 1
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server Stdout: ''
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server Stderr: 'mkdir: cannot create directory ‘/var/lib/nova/instances/6c855bbc-35ae-4022-87a8-5ac742f16ac5’: Permission denied\n'
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server During handling of the above exception, another exception occurred:
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 10079, in _error_out_instance_on_exception
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server yield
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 5580, in _resize_instance
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server timeout, retry_interval)
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 10263, in migrate_disk_and_power_off
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server exception.ResizeError(reason=reason))
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server nova.exception.InstanceFaultRollback: Instance rollback performed due to: Resize error: not able to execute ssh command: Unexpected error while running command.
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server Command: ssh -o BatchMode=yes 172.17.0.60 mkdir -p /var/lib/nova/instances/6c855bbc-35ae-4022-87a8-5ac742f16ac5
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server Exit code: 1
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server Stdout: ''
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server Stderr: 'mkdir: cannot create directory ‘/var/lib/nova/instances/6c855bbc-35ae-4022-87a8-5ac742f16ac5’: Permission denied\n'
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server During handling of the above exception, another exception occurred:
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 309, in dispatch
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 229, in _do_dispatch
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args)
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/exception_wrapper.py", line 78, in wrapped
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server function_name, call_dict, binary)
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server self.force_reraise()
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server raise value
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/exception_wrapper.py", line 69, in wrapped
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server return f(self, context, *args, **kw)
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 189, in decorated_function
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server "Error: %s", e, instance=instance)
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server self.force_reraise()
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server raise value
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 159, in decorated_function
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server return function(self, context, *args, **kwargs)
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/compute/utils.py", line 1434, in decorated_function
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server return function(self, context, *args, **kwargs)
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 217, in decorated_function
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server kwargs['instance'], e, sys.exc_info())
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server self.force_reraise()
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server raise value
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 205, in decorated_function
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server return function(self, context, *args, **kwargs)
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 5544, in resize_instance
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server self._revert_allocation(context, instance, migration)
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server self.force_reraise()
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server raise value
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 5541, in resize_instance
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server instance_type, clean_shutdown, request_spec)
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 5599, in _resize_instance
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server request_spec)
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib64/python3.6/contextlib.py", line 99, in __exit__
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server self.gen.throw(type, value, traceback)
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 10092, in _error_out_instance_on_exception
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server raise error.inner_exception
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server nova.exception.ResizeError: Resize error: not able to execute ssh command: Unexpected error while running command.
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server Command: ssh -o BatchMode=yes 172.17.0.60 mkdir -p /var/lib/nova/instances/6c855bbc-35ae-4022-87a8-5ac742f16ac5
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server Exit code: 1
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server Stdout: ''
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server Stderr: 'mkdir: cannot create directory ‘/var/lib/nova/instances/6c855bbc-35ae-4022-87a8-5ac742f16ac5’: Permission denied\n'
2020-11-05 03:30:39.845 8 ERROR oslo_messaging.rpc.server
~~~
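The traceback shows nova-compute shelling out over ssh to create the destination instance directory (nova/virt/libvirt/volume/remotefs.py), and a non-zero exit rolling the resize back, which is why tempest sees the server return to ACTIVE. A minimal sketch of that flow; the helper names here are illustrative, not nova's actual API:

```python
import subprocess
import types


def build_create_dir_cmd(host, inst_base):
    # Mirrors the failing command from the log:
    #   ssh -o BatchMode=yes 172.17.0.60 mkdir -p /var/lib/nova/instances/<uuid>
    # BatchMode=yes makes ssh fail immediately instead of prompting for input.
    return ['ssh', '-o', 'BatchMode=yes', host, 'mkdir', '-p', inst_base]


class ResizeError(Exception):
    pass


def create_remote_dir(host, inst_base, run=subprocess.run):
    """Create inst_base on the destination host; raise on failure."""
    cmd = build_create_dir_cmd(host, inst_base)
    result = run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # nova wraps this in InstanceFaultRollback and reverts the resize,
        # so the instance ends up ACTIVE again instead of VERIFY_RESIZE.
        raise ResizeError('not able to execute ssh command: %s' % result.stderr)
```

Because the mkdir runs as the nova user on the destination, any ownership problem on /var/lib/nova/instances surfaces exactly as the `Permission denied` seen above.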

Marios Andreou (marios-b) wrote :

This is not a blocker because we have waived fs20 for master promotion (http://38.145.34.55/config/CentOS-8/master.ini) - it is quite broken [1].

For anyone looking into this bug: in the latest builds [1] over the last few days there seems to be a new, infra-related issue (failing node introspection), for which I will file a separate bug.

That said, the issue reported in this bug is still happening. Some more recent examples from the 14th are at [2][3]:

    * Stderr: 'mkdir: cannot create directory ‘/var/lib/nova/instances/5581e212-2b2e-4a7c-ae9e-56734970f3c6’: Permission denied\n'

[1] https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-ovb-1ctlr_2comp-featureset020-master
[2] https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-1ctlr_2comp-featureset020-master/bebf3e9/logs/undercloud/var/log/tempest/stestr_results.html.gz
[3] https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-1ctlr_2comp-featureset020-master/bebf3e9/logs/overcloud-novacompute-0/var/log/containers/nova/nova-compute.log.txt.gz

Martin Schuppert (mschuppert) wrote :

I have a reproducer for this and will check it in more detail.

The root cause is that the base instance directory /var/lib/nova/instances has the wrong ownership.
It needs to be owned by nova:nova (42436:42436), the nova user inside the container:

~~~
[root@overcloud-novacompute-0 instances]# ll /var/lib/nova/
total 0
drwxr-xr-x. 4 root root 32 Nov 19 15:14 instances
~~~

Will check on the root cause.
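The mismatch above can be spotted with a quick programmatic check; a minimal sketch, assuming the 42436 uid/gid quoted in the comment (the function names are hypothetical):

```python
import os

# nova user inside the container, per the comment above
NOVA_UID = NOVA_GID = 42436


def owned_by(path, uid, gid):
    """Return True if path is owned by the given uid:gid."""
    st = os.stat(path)
    return (st.st_uid, st.st_gid) == (uid, gid)


def check_instances_dir(path='/var/lib/nova/instances'):
    if not owned_by(path, NOVA_UID, NOVA_GID):
        # root:root here is exactly the broken state shown in the `ll`
        # output; nova cannot mkdir under it, hence the Permission denied.
        return 'wrong ownership: expected %d:%d' % (NOVA_UID, NOVA_GID)
    return 'ok'
```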

Changed in tripleo:
assignee: nobody → Martin Schuppert (mschuppert)
Martin Schuppert (mschuppert) wrote :

With change Ic6f053d56194613046ae0a4a908206ebb453fcf4 the call that triggered run() was removed, so as a result the script no longer actually runs.
I'll submit a fix as soon as Gerrit is up again.
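A minimal sketch of the failure mode described here (hypothetical stand-in, not the actual tripleo-heat-templates code): the function holding the ownership fix is still defined, but the top-level call invoking it was dropped, so executing the script becomes a no-op:

```python
# Stand-in for the ownership of /var/lib/nova/instances.
state = {'owner': 'root'}


def run():
    # The fix logic itself is intact...
    state['owner'] = 'nova'


# ...but after the refactor, the equivalent of this top-level line was
# removed, so the script runs to completion without applying the fix:
# run()
```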

tags: added: queens-backport-potential
Changed in tripleo:
status: Triaged → In Progress
yatin (yatinkarel) wrote :

Only master/queens were impacted and are fixed with https://review.opendev.org/q/I5050f198f0109faa9299de85e01b0dbe4e5a30ab.

The original cherry-picks for the other branches have not merged yet; once they merge, the issue will appear on those branches too, so it would be better to squash the fix into those unmerged patches: https://review.opendev.org/q/Ic6f053d56194613046ae0a4a908206ebb453fcf4

Changed in tripleo:
milestone: wallaby-1 → wallaby-2
Martin Schuppert (mschuppert) wrote :

I have squashed the fix into the unmerged patches for victoria, ussuri and train at: https://review.opendev.org/q/Ic6f053d56194613046ae0a4a908206ebb453fcf4

yatin (yatinkarel) wrote :

All required patches merged, this can be closed. Thanks Martin.

Changed in tripleo:
status: In Progress → Fix Released

This issue was fixed in the openstack/tripleo-heat-templates 14.0.0 release.
