Privileged user permissions issue in Quobyte volumes

Bug #1438603 reported by Silvan Kaiser
Affects: Cinder | Status: Invalid | Importance: Undecided | Assigned to: Silvan Kaiser

Bug Description

Tempest snapshot tests fail when using os_privileged_user_* settings in cinder.conf. Example data can be found in CI build http://176.9.127.22:8081/refs-changes-35-155735-221/

Tests fail with SnapshotBuildErrorException. The Cinder volume log (c-vol.log, http://176.9.127.22:8081/refs-changes-35-155735-221/logs/screen-c-vol.log.txt) shows the underlying "Permission denied" message returned by qemu for the snapshot operations.

Silvan Kaiser (2-silvan)
Changed in cinder:
assignee: nobody → Silvan Kaiser (2-silvan)
description: updated
Revision history for this message
Silvan Kaiser (2-silvan) wrote :

After some analysis:
The issue stems from change https://review.openstack.org/#/c/131741/ introducing the capability to access the Nova API with a privileged and _different_ user. The Quobyte driver currently has a hardcoded setting of nas_secure_file_permissions=true, resulting in chmod 660 on stored elements. Since the snapshot operations partially use the Nova API and therefore the privileged (different) user, they fail with permission denied messages.
I am now thinking about how best to fix this...
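For illustration, a minimal sketch (not driver code; the helper name and temp file are hypothetical) of what the hardcoded secure mode amounts to: mode 660 leaves no access bits for users outside the owning group, which is exactly what locks out a different privileged user.

```python
import os
import stat
import tempfile

def apply_secure_permissions(path):
    # Hypothetical sketch: with nas_secure_file_permissions enabled the
    # driver restricts volume files to owner/group read-write (mode 660),
    # so a process running as a different user outside the owning group
    # is locked out entirely.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IWGRP)

with tempfile.NamedTemporaryFile(delete=False) as f:
    path = f.name
apply_secure_permissions(path)
mode = stat.S_IMODE(os.stat(path).st_mode)
print(oct(mode))  # 0o660 -- no access bits at all for "other"
os.unlink(path)
```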

Revision history for this message
Silvan Kaiser (2-silvan) wrote :

So, currently I see three possible solutions:

- Switch to insecure mode statically. Things do work, and it will be documented that this data is not protected by file permissions.
- Implement a configurable solution. By default things will be insecure, but one can change that setting (which will in turn break the snapshot capability, *ick*).
- Add an implementation that ensures the privileged user configured for the Nova API access is also able to access the 'secure' 660 files. This would be a bit more complicated, I think, and probably can easily be broken by misconfiguration.
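For reference, the second (configurable) option might look like this in cinder.conf. This is a sketch only, assuming the driver would honour the existing nas_secure_file_* options instead of hardcoding them; the backend section name is illustrative:

```ini
[quobyte-backend]
volume_driver = cinder.volume.drivers.quobyte.QuobyteDriver
# Sketch of option 2: expose the currently hardcoded secure mode as
# operator-configurable settings. Setting these to false restores
# snapshot capability at the cost of file-permission protection.
nas_secure_file_permissions = false
nas_secure_file_operations = false
```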

I'll go with the first approach for now as the Kilo deadline is _very_ near.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/169283

Changed in cinder:
status: New → In Progress
Revision history for this message
Silvan Kaiser (2-silvan) wrote :

After some more discussion I am looking at this again, as I received a hint that the issue likely lies not with the privileged user access but more generally with the secure mode. Will update this accordingly.

Changed in cinder:
status: In Progress → Incomplete
Revision history for this message
Silvan Kaiser (2-silvan) wrote :

After some more testing and reviewing:

The underlying error occurs when qemu-img tries to create the snapshot file (while Cinder is in nas_secure mode) and returns "Permission denied: Permission denied" on this command, because the backing file is inaccessible.
This happens with nas_secure_file_permissions and -operations activated, which enforce mode 660 under a non-privileged user.

Switching nas_secure_file_* to false yields no errors, as the resulting 666 permissions prevent this from occurring.
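The 660-vs-666 difference can be shown with a simplified POSIX-style permission check. This is a sketch only: the uid/gid values are made up, and a real kernel check also considers root overrides and ACLs.

```python
def can_access(file_uid, file_gid, mode, uid, gids):
    """Simplified POSIX check: can a process with the given uid and
    group memberships open a file for read/write?"""
    if uid == file_uid:
        return mode & 0o600 == 0o600          # owner bits apply
    if file_gid in gids:
        return mode & 0o060 == 0o060          # group bits apply
    return mode & 0o006 == 0o006              # "other" bits apply

# Cinder (uid/gid 1001) owns the file; qemu-img runs as a different
# user (uid 1002) that is not in Cinder's group:
print(can_access(1001, 1001, 0o660, 1002, {1002}))  # False -> "Permission denied"
print(can_access(1001, 1001, 0o666, 1002, {1002}))  # True  -> snapshot succeeds
```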

Revision history for this message
Silvan Kaiser (2-silvan) wrote :

So my previous comment about this issue arising due to https://review.openstack.org/#/c/131741/ is _wrong_. It's just that I hit
this issue after I had configured the privileged access correctly and incorrectly deduced cum hoc ergo propter hoc.

Silvan Kaiser (2-silvan)
Changed in cinder:
status: Incomplete → In Progress
Revision history for this message
Silvan Kaiser (2-silvan) wrote :

Working on a new patch to correct this. The Quobyte Cinder driver will chown volumes created by online snapshots through Nova to the Cinder UID and GID.
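A sketch of that idea (hypothetical helper; the path and UID/GID values are illustrative, and a real driver would execute the command via Cinder's rootwrap rather than build a bare argument list):

```python
def build_chown_command(path, uid, gid):
    # Hypothetical sketch of the proposed fix: after Nova finishes an
    # online snapshot, the driver would chown the new snapshot file back
    # to the Cinder service's UID:GID, so the secure 660 mode keeps the
    # file accessible to Cinder without opening it up to everyone.
    return ["chown", "%d:%d" % (uid, gid), path]

print(build_chown_command("/mnt/quobyte/volume-snap.qcow2", 165, 165))
# ['chown', '165:165', '/mnt/quobyte/volume-snap.qcow2']
```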

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on cinder (master)

Change abandoned by Silvan Kaiser (<email address hidden>) on branch: master
Review: https://review.openstack.org/169283
Reason: Working on a better solution. Moving to a hardcoded insecure setting was originally intended only as a quick fix.
The new approach is to chown snapshot files created by Nova online snapshotting to the Cinder UID:GID.

Revision history for this message
Silvan Kaiser (2-silvan) wrote :

Further update:
While working on a new patch and doing more tests it turned out that these errors show up only when running the tempest volume test suite as a whole. Running individual tests produces no errors of this type.

Doing more investigation....

Revision history for this message
Silvan Kaiser (2-silvan) wrote :

Standalone run of one of the affected tempest tests:

ubuntu@manualvm:/opt/stack/tempest$ tox -e all -- tempest.api.volume.test_volumes_snapshots.VolumesV1SnapshotTestJSON.test_volume_from_snapshot
all develop-inst-nodeps: /opt/stack/tempest
all installed: anyjson==0.3.3,Babel==1.3,boto==2.38.0,cffi==1.1.2,cliff==1.13.0,cmd2==0.6.8,cryptography==0.9.2,debtcollector==0.5.0,ecdsa==0.13,enum34==1.0.4,extras==0.0.3,fasteners==0.12.0,fixtures==1.3.1,functools32==3.2.3.post1,httplib2==0.9.1,idna==2.0,ipaddress==1.0.7,iso8601==0.1.10,jsonschema==2.5.1,linecache2==1.0.0,monotonic==0.2,msgpack-python==0.4.6,netaddr==0.7.15,netifaces==0.10.4,os-testr==0.2.0,oslo.concurrency==2.2.0,oslo.config==1.14.0,oslo.context==0.4.0,oslo.i18n==2.1.0,oslo.log==1.6.0,oslo.serialization==1.7.0,oslo.utils==1.8.0,paramiko==1.15.2,pbr==1.2.0,posix-ipc==1.0.0,prettytable==0.7.2,pyasn1==0.1.8,pycparser==2.14,pycrypto==2.6.1,pyOpenSSL==0.15.1,pyparsing==2.0.3,python-mimeparse==0.1.4,python-subunit==1.1.0,pytz==2015.4,PyYAML==3.11,retrying==1.3.3,six==1.9.0,stevedore==1.6.0,-e git+https://git.openstack.org/openstack/tempest.git@91ebdfac2f6021aa1bccceaf4274bb801cc0891b#egg=tempest-master,tempest-lib==0.6.1,testrepository==0.0.20,testscenarios==0.5.0,testtools==1.8.0,traceback2==1.4.0,unittest2==1.1.0,wheel==0.24.0,wrapt==1.10.5
all runtests: PYTHONHASHSEED='2573821559'
all runtests: commands[0] | find . -type f -name *.pyc -delete
all runtests: commands[1] | bash tools/pretty_tox.sh tempest.api.volume.test_volumes_snapshots.VolumesV1SnapshotTestJSON.test_volume_from_snapshot
running testr
running=OS_STDOUT_CAPTURE=${OS_STDOUT_CAPTURE:-1} \
OS_STDERR_CAPTURE=${OS_STDERR_CAPTURE:-1} \
OS_TEST_TIMEOUT=${OS_TEST_TIMEOUT:-500} \
OS_TEST_LOCK_PATH=${OS_TEST_LOCK_PATH:-${TMPDIR:-'/tmp'}} \
${PYTHON:-python} -m subunit.run discover -t ${OS_TOP_LEVEL:-./} ${OS_TEST_PATH:-./tempest/test_discover} --list
running=OS_STDOUT_CAPTURE=${OS_STDOUT_CAPTURE:-1} \
OS_STDERR_CAPTURE=${OS_STDERR_CAPTURE:-1} \
OS_TEST_TIMEOUT=${OS_TEST_TIMEOUT:-500} \
OS_TEST_LOCK_PATH=${OS_TEST_LOCK_PATH:-${TMPDIR:-'/tmp'}} \
${PYTHON:-python} -m subunit.run discover -t ${OS_TOP_LEVEL:-./} ${OS_TEST_PATH:-./tempest/test_discover} --load-list /tmp/tmp46hYRf
{0} tempest.api.volume.test_volumes_snapshots.VolumesV1SnapshotTestJSON.test_volume_from_snapshot [77.302988s] ... ok

======
Totals
======
Ran: 1 tests in 91.0000 sec.
 - Passed: 1
 - Skipped: 0
 - Expected Fail: 0
 - Unexpected Success: 0
 - Failed: 0
Sum of execute time for each test: 77.3030 sec.

==============
Worker Balance
==============
 - Worker 0 (1 tests) => 0:01:17.302988s

Slowest Tests:

Test id Runtime (s)
-------------------------------------------------------------------------------------------------------------------------------------- -----------
tempest.api.volume.test_volumes_snapshots.VolumesV1SnapshotTestJSON.test_volume_from_snapshot[id-677863d1-3142-456d-b6ac-9924f667a7f4] 77.303
_____________________________________________________________________________________________________ summary _____________...


Revision history for this message
Silvan Kaiser (2-silvan) wrote :

Running the same test alongside its siblings:

all develop-inst-nodeps: /opt/stack/tempest
all installed: anyjson==0.3.3,Babel==1.3,boto==2.38.0,cffi==1.1.2,cliff==1.13.0,cmd2==0.6.8,cryptography==0.9.2,debtcollector==0.5.0,ecdsa==0.13,enum34==1.0.4,extras==0.0.3,fasteners==0.12.0,fixtures==1.3.1,functools32==3.2.3.post1,httplib2==0.9.1,idna==2.0,ipaddress==1.0.7,iso8601==0.1.10,jsonschema==2.5.1,linecache2==1.0.0,monotonic==0.2,msgpack-python==0.4.6,netaddr==0.7.15,netifaces==0.10.4,os-testr==0.2.0,oslo.concurrency==2.2.0,oslo.config==1.14.0,oslo.context==0.4.0,oslo.i18n==2.1.0,oslo.log==1.6.0,oslo.serialization==1.7.0,oslo.utils==1.8.0,paramiko==1.15.2,pbr==1.2.0,posix-ipc==1.0.0,prettytable==0.7.2,pyasn1==0.1.8,pycparser==2.14,pycrypto==2.6.1,pyOpenSSL==0.15.1,pyparsing==2.0.3,python-mimeparse==0.1.4,python-subunit==1.1.0,pytz==2015.4,PyYAML==3.11,retrying==1.3.3,six==1.9.0,stevedore==1.6.0,-e git+https://git.openstack.org/openstack/tempest.git@91ebdfac2f6021aa1bccceaf4274bb801cc0891b#egg=tempest-master,tempest-lib==0.6.1,testrepository==0.0.20,testscenarios==0.5.0,testtools==1.8.0,traceback2==1.4.0,unittest2==1.1.0,wheel==0.24.0,wrapt==1.10.5
all runtests: PYTHONHASHSEED='125578107'
all runtests: commands[0] | find . -type f -name *.pyc -delete
all runtests: commands[1] | bash tools/pretty_tox.sh tempest.api.volume.test_volumes_snapshots.VolumesV1SnapshotTestJSON
running testr
running=OS_STDOUT_CAPTURE=${OS_STDOUT_CAPTURE:-1} \
OS_STDERR_CAPTURE=${OS_STDERR_CAPTURE:-1} \
OS_TEST_TIMEOUT=${OS_TEST_TIMEOUT:-500} \
OS_TEST_LOCK_PATH=${OS_TEST_LOCK_PATH:-${TMPDIR:-'/tmp'}} \
${PYTHON:-python} -m subunit.run discover -t ${OS_TOP_LEVEL:-./} ${OS_TEST_PATH:-./tempest/test_discover} --list
running=OS_STDOUT_CAPTURE=${OS_STDOUT_CAPTURE:-1} \
OS_STDERR_CAPTURE=${OS_STDERR_CAPTURE:-1} \
OS_TEST_TIMEOUT=${OS_TEST_TIMEOUT:-500} \
OS_TEST_LOCK_PATH=${OS_TEST_LOCK_PATH:-${TMPDIR:-'/tmp'}} \
${PYTHON:-python} -m subunit.run discover -t ${OS_TOP_LEVEL:-./} ${OS_TEST_PATH:-./tempest/test_discover} --load-list /tmp/tmpi5ePgK
{0} tempest.api.volume.test_volumes_snapshots.VolumesV1SnapshotTestJSON.test_snapshot_create_get_list_update_delete [4.529798s] ... ok
{0} tempest.api.volume.test_volumes_snapshots.VolumesV1SnapshotTestJSON.test_snapshot_create_with_volume_in_use [32.171595s] ... ok
{0} tempest.api.volume.test_volumes_snapshots.VolumesV1SnapshotTestJSON.test_snapshots_list_details_with_params [2.493639s] ... FAILED
{0} tempest.api.volume.test_volumes_snapshots.VolumesV1SnapshotTestJSON.test_snapshots_list_with_params [2.654264s] ... FAILED
{0} tempest.api.volume.test_volumes_snapshots.VolumesV1SnapshotTestJSON.test_volume_from_snapshot [3.125463s] ... FAILED

==============================
Failed 3 tests - output below:
==============================

tempest.api.volume.test_volumes_snapshots.VolumesV1SnapshotTestJSON.test_snapshots_list_details_with_params[id-220a1022-1fcd-4a74-a7bd-6b859156cda2]
----------------------------------------------------------------------------------------------------------------------------------------------------

Captured pythonlogging:
~~~~~~~~~~~~~~~~~~~~~~~
    2015-07-09 09:52:26,287 17084 INFO [tempest_lib.com...

Revision history for this message
Silvan Kaiser (2-silvan) wrote :

As this bug report contains a lot of my early wrong ideas, I created a clearer and more detailed new version here: https://bugs.launchpad.net/cinder/+bug/1473116

Will close this ticket now.

Changed in cinder:
status: In Progress → Invalid