[OSSA 2013-001] No authentication on block device used for os-volume_boot

Bug #1069904 reported by Phil Day on 2012-10-22
272
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Undecided
Pádraig Brady
Folsom
High
Pádraig Brady
OpenStack Security Advisory
Undecided
Thierry Carrez
nova (Debian)
Fix Released
Unknown

Bug Description

We found this problem in our Diablo code base - I think by inspection its still valid in upstream as well but a bit harder to check as the code has changed (BootFromVolumeController no longer exists, and os-volume_boot now just inherits from the servers API).

Fillling anyway as its pretty serious, in the hope that someone can verify or dismiss it.

Boot from volume allows a volume to be passed to the create method via the block_device_mapping parameter. This parameter is not validated as having to be a volume belonging to the user creating the instance, so providing I know the valid ID of a volume belonging to another user I can create VM and gain access to that volume (c.f volume attachment which does make explicit checks for both the ownership and status of a volume)

The volume ownership and status should be explicitly checked in the compute.api layer

CVE References

Vish Ishaya (vishvananda) wrote :

The connection code attaches to the volume using api commands to cinder. Volumes belonging to other users will not be visible, so this shouldn't be possible in folsom.

Changed in nova:
status: New → Invalid
Thierry Carrez (ttx) wrote :

@Vish: would Essex be affected ?

Thierry Carrez (ttx) wrote :

Also would Folsom with nova-volumes be affected ?

Vish Ishaya (vishvananda) wrote :

attach_volume is definitely not affected. It appears that attach_volume_boot has an elevated context so it may be possible to boot off of other peoples volumes. This appears to affect folsom/nova-volumes as well, although it should be verified. A simple fix would be to make sure we don't have an elevated context before doing the volume_api.get and volume_api.get_snapshot code in stable/essex here:

 375 # TODO(yamahata): default name and description
 376 snapshot = self.volume_api.get_snapshot(context,
 377 bdm['snapshot_id'])
 378 vol = self.volume_api.create(context, bdm['volume_size'],
 379 '', '', snapshot)
 380 # TODO(yamahata): creating volume simultaneously
 381 # reduces creation time?
 382 self.volume_api.wait_creation(context, vol)
 383 self.db.block_device_mapping_update(
 384 context, bdm['id'], {'volume_id': vol['id']})
 385 bdm['volume_id'] = vol['id']
 386
 387 if bdm['volume_id'] is not None:
 388 volume = self.volume_api.get(context, bdm['volume_id'])
 389 self.volume_api.check_attach(context, volume)
 390 cinfo = self._attach_volume_boot(context,
 391 instance,
 392 volume,
 393 bdm['device_name'])

Thierry Carrez (ttx) wrote :

Looks like a deeper analysis is needed on this.. Vish, anyone you'd suggest would be a good fit to look into it ?

Vish Ishaya (vishvananda) wrote :

I know rmk has some essex installs with volumes. I will add him to the list to see if he can verify.

Vish Ishaya (vishvananda) wrote :

Verified with a trunk devstack using nova-volumes. Here is a fix for trunk.

Vish Ishaya (vishvananda) wrote :

Verified with a trunk devstack using nova-volumes. Here is a fix for trunk.

Thierry Carrez (ttx) wrote :

Now I'm confused. I understood that Grizzly was unaffected and now you say you have a trunk patch...
Which series are actually affected ?

Adding nova-core for security patch review.

Vish Ishaya (vishvananda) wrote :

ttx: only using nova-volumes is affected which means it doesn't affect grizzly. The fix involves checking at the api layer which is actually still valuable in trunk even though the security issue is not present. It should apply against folsom just fine. Rmk is working on a version for essex.

Vish Ishaya (vishvananda) wrote :

(the reason the patch is still valuable against trunk is it makes the failure happen at the api layer which causes the create to fail fast instead of putting the instance into error later in the process)

Russell Bryant (russellb) wrote :

Patch mostly looks good to me ... my only comment/question is whether we could catch something more specific than just "Exception".

Thierry Carrez (ttx) wrote :

@vish: thx for the explanation. As far as this security bug goes, I'll just consider it doesn't apply to Grizzly (even if pushing the fix there can't hurt).

Adding Daviey so that he can look into stable/diablo being affected or not.

Russell Bryant (russellb) wrote :

+2 on the patch. The Exception thing is just a potential cleanup, not a problem.

Russell Bryant (russellb) wrote :

@nova-core: We need one more review here so we can proceed

Thierry Carrez (ttx) wrote :

@nova-core: one more review still needed

Pádraig Brady (p-draigbrady) wrote :

+2 from me

Thierry Carrez (ttx) wrote :

Refreshing patches: volume.patch does not apply cleanly to Grizzly
rmk: are you working on an Essex patch ?

Thierry Carrez (ttx) wrote :
Thierry Carrez (ttx) wrote :
Thierry Carrez (ttx) wrote :

@rmk: any progress on the Essex patch ?

Adding the full Diablo maintenance team (rather than just Dave) to evaluate Diablo's vulnerability, although having the Essex patch will help in assessing that.

My proposed impact description:

Title: Boot from volume allows access to random volumes
Reporter: Phil Day (HP)
Products: Nova
Affects: ??

Description:
Phil Day from HP reported a vulnerability in volume attachment in nova-volume, affecting the boot-from-volume feature. By passing a specific volume ID, an authenticated user may be able to boot from a volume he doesn't own, potentially resulting in full access to that 3rd-party volume contents. Folsom setups making use of Cinder are not affected.

Thierry Carrez (ttx) wrote :

Looks like rmk is not making progress on the essex patch. Anyone else in nova-core want to have a try ?

Michael Still (mikalstill) wrote :

Here's the first pass at an essex patch. I haven't been able to test it though, so please review with care.

Thierry Carrez (ttx) wrote :

@nova-core: please have a look at proposed essex patch in comment 23

Vish Ishaya (vishvananda) wrote :

essex patch looks valid to me +1

Thierry Carrez (ttx) wrote :

Still need one more +1 on the Essex patch
Everyone alright with proposed impact description ?

Last chance to propose a patch for Diablo, otherwise we'll release this without the diablo fix...

Chuck Short (zulcss) wrote :

+1 from me

Thierry Carrez (ttx) wrote :

Please validate impact description:

========
Title: Boot from volume allows access to random volumes
Reporter: Phil Day (HP)
Products: Nova
Affects: Essex, Folsom

Description:
Phil Day from HP reported a vulnerability in volume attachment in nova-volume, affecting the boot-from-volume feature. By passing a specific volume ID, an authenticated user may be able to boot from a volume he doesn't own, potentially resulting in full access to that 3rd-party volume contents. Folsom setups making use of Cinder are not affected.
========

Thierry Carrez (ttx) wrote :

New refresh of the Grizzly patch

Nikola Đipanov (ndipanov) wrote :

Thierry - re the grizzly patch - it looks good although the same thing is done in compute manager's _setup_block_device_mapping.

Just a heads up: I am trying to rework BDMs at the moment as part of https://blueprints.launchpad.net/nova/+spec/improve-block-device-handling - and I am hoping to propose more thorough validations as a part of those changes. so I am likely to remove/rework the bit of nova.api. The patch should be ready during next week.

Thierry Carrez (ttx) wrote :

Remaining tasks before disclosure:
Anyone: please review impact statement

Russell Bryant (russellb) wrote :

Notification sent out to stakeholders. Proposed CRD is Thursday, January 24th at 1500 UTC.

Mark McLoughlin (markmc) wrote :

Bit confused as to why there's a grizzly patch if we're saying grizzly isn't affected?

Thierry Carrez (ttx) wrote :

@Mark: there is a bug in Grizzly, but without security consequences. See comment 10.

Jamie Strandboge (jdstrand) wrote :

I just started looking at this for Ubuntu. The essex patch fails in its testsuite with various errors like this:
======================================================================
ERROR: test_run_with_snapshot (nova.tests.api.ec2.test_cloud.CloudTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/«BUILDDIR»/nova-2012.1.3+stable-20120827-4d2a4afe/nova/tests/api/ec2/test_cloud.py", line 2014, in test_run_with_snapshot
    ec2_instance_id = self._run_instance(**kwargs)
  File "/«BUILDDIR»/nova-2012.1.3+stable-20120827-4d2a4afe/nova/tests/api/ec2/test_cloud.py", line 1441, in _run_instance
    rv = self.cloud.run_instances(self.context, **kwargs)
  File "/«BUILDDIR»/nova-2012.1.3+stable-20120827-4d2a4afe/nova/api/ec2/cloud.py", line 1341, in run_instances
    block_device_mapping=kwargs.get('block_device_mapping', {}))
  File "/«BUILDDIR»/nova-2012.1.3+stable-20120827-4d2a4afe/nova/compute/api.py", line 690, in create
    scheduler_hints=scheduler_hints)
  File "/«BUILDDIR»/nova-2012.1.3+stable-20120827-4d2a4afe/nova/compute/api.py", line 395, in _create_instance
    if bdm.get('snapshot_id') is not None and bdm['volume_id'] is None:
KeyError: "'volume_id'\n-------------------- >> begin captured logging << --------------------\nnova.service: AUDIT: Starting compute node (version 2012.1.3-LOCALBRANCH:LOCALREVISION)\nnova.virt.fake: INFO: Compute_service record created for acde74615e9e4451837745036802057d \nnova.service: AUDIT: Starting scheduler node (version 2012.1.3-LOCALBRANCH:LOCALREVISION)\nnova.service: AUDIT: Starting network node (version 2012.1.3-LOCALBRANCH:LOCALREVISION)\nnova.service: AUDIT: Starting volume node (version 2012.1.3-LOCALBRANCH:LOCALREVISION)\nnova.api.ec2.cloud: AUDIT: Create snapshot of volume vol-00000001\nnova.volume.manager: INFO: snapshot snapshot-00000001: creating\nnova.api.ec2.cloud: AUDIT: Create snapshot of volume vol-00000001\nnova.volume.manager: INFO: snapshot snapshot-00000002: creating\n--------------------- >> end captured logging << ---------------------"

This is because of the check for "bdm['volume_id'] is None". We need to verify that 'volume_id' is a key in bdm before accessing the value for the key. Attached is an updated patch for essex that passes its test suite on Ubuntu.

Jamie Strandboge (jdstrand) wrote :

FYI, the folsom patch suffers from the same problem. After adjusting it to use:
+ def _validate_bdm(self, context, instance):
+ for bdm in self.db.block_device_mapping_get_all_by_instance(
+ context, instance['uuid']):
+ # NOTE(vish): For now, just make sure the volumes are accessible.
+ if bdm['snapshot_id'] is not None and \
+ 'volume_id' in bdm and bdm['volume_id'] is None:
+ try:
+ self.volume_api.get_snapshot(context, bdm['snapshot_id'])
+ except Exception:
+ raise exception.InvalidBDMSnapshot(id=bdm['snapshot_id'])
+ elif 'volume_id' in bdm and bdm['volume_id'] is not None:
+ try:
+ self.volume_api.get(context, bdm['volume_id'])
+ except Exception:
+ raise exception.InvalidBDMVolume(id=bdm['volume_id'])
+

it passes its testsuite as well (I would attach the patch, but I had to massage it a little for the version of Folsom in Ubuntu).

Jamie Strandboge (jdstrand) wrote :

Visually inspecting Diablo, it seems affected too (the original report was against Diablo, so I am not sure why it is marked Incomplete...).

Attached is a patch for Diablo backported from Essex with my changes above. It does pass the testsuite included in Diablo on Ubuntu, but beyond that I haven't tested it.

Thierry Carrez (ttx) wrote :

Argh. We really need that private Gerrit and tests to validate patches better. Thanks for catching that. I suggest we push back publication to next Tuesday (this was a really short embargo period) and communicate the new set of patches to the stakeholders. Russell, thoughts?

About Diablo, we were waiting for a confirmation since the Diablo code base mentioned in the report is based on an early and heavily patched milestone :) The diablo-maint team didn't really pick it up though, hence the "Incomplete" status.

Pádraig Brady (p-draigbrady) wrote :

Red Hat are OK with this going out today at least. The ammendments simple.
Though when verifiying them I noticed the logic could be simplified somewhat.
Also the existing patches didn't contain commit messages.
Also all didn't apply cleanly to the latest trees.
Also the essex and diablo ones duplicated the InvalidSnapshot exception class.

So I redid all the patches against the latest trees , and collated all the info
from this bug report into the commit messages, and verified pertinent tests pass.

patches following...

Pádraig Brady (p-draigbrady) wrote :

redone diablo patch

Pádraig Brady (p-draigbrady) wrote :

redone essex patch

Pádraig Brady (p-draigbrady) wrote :

redone folsom patch

Pádraig Brady (p-draigbrady) wrote :

redone grizzly patch

Thierry Carrez (ttx) wrote :

DISCLOSURE DELAYED to Tuesday, January 29th at 1500 UTC.

Russell Bryant (russellb) wrote :

Updated Essex patch to fix pep8 error

Russell Bryant (russellb) wrote :

Updated Folsom patch to fix a pep8 issue

Russell Bryant (russellb) wrote :

And updated Grizzly patch to fix pep8

It would be good to get another ACK on this

Sean Dague (sdague) wrote :

+2 on russellb's patch, looks good to me

Vish Ishaya (vishvananda) wrote :

+2

Pádraig Brady (p-draigbrady) wrote :

Note I'll push these patches to E,F,G tomorrow at 1500 UTC,
and approve them immediately as preapproved.

I want to amend the Author to Vish (who I double checked with).

Thierry Carrez (ttx) on 2013-01-29
information type: Private Security → Public Security

Fix proposed to branch: master
Review: https://review.openstack.org/20698

Changed in nova:
assignee: nobody → Pádraig Brady (p-draigbrady)
status: Invalid → In Progress

Reviewed: https://review.openstack.org/20698
Committed: http://github.com/openstack/nova/commit/24fffd9d8b77e9b71e8013fc22c172f76bb4e84c
Submitter: Jenkins
Branch: master

commit 24fffd9d8b77e9b71e8013fc22c172f76bb4e84c
Author: Vishvananda Ishaya <email address hidden>
Date: Thu Jan 24 10:07:33 2013 +0000

    validate specified volumes to boot from at the API layer

    This causes the create to fail fast instead of putting
    the instance into error later in the process.

    Related to bug: 1069904
    Change-Id: I5f7c8d20d3ebf33ce1ce64bf0a8418bd2b5a6411

Changed in nova:
status: In Progress → Fix Committed

Reviewed: https://review.openstack.org/20700
Committed: http://github.com/openstack/nova/commit/243d516cea9d3caa5a8267b12d2f577dcb24193b
Submitter: Jenkins
Branch: stable/essex

commit 243d516cea9d3caa5a8267b12d2f577dcb24193b
Author: Vishvananda Ishaya <email address hidden>
Date: Thu Jan 24 10:45:19 2013 +0000

    disallow boot from volume from specifying arbitrary volumes

    Fix a vulnerability in volume attachment in nova-volume, affecting the
    boot-from-volume feature. By passing a specific volume ID, an
    authenticated user may be able to boot from a volume they don't own,
    potentially resulting in full access to that 3rd-party volume.

    Fixes bug: 1069904, CVE-2013-0208
    Change-Id: I5f7c8d20d3ebf33ce1ce64bf0a8418bd2b5a6411

Reviewed: https://review.openstack.org/20699
Committed: http://github.com/openstack/nova/commit/317cc0af385536dee43ef2addad50a91357fc1ad
Submitter: Jenkins
Branch: stable/folsom

commit 317cc0af385536dee43ef2addad50a91357fc1ad
Author: Vishvananda Ishaya <email address hidden>
Date: Thu Jan 24 10:07:33 2013 +0000

    disallow boot from volume from specifying arbitrary volumes

    Fix a vulnerability in volume attachment in nova-volume, affecting the
    boot-from-volume feature. By passing a specific volume ID, an
    authenticated user may be able to boot from a volume they don't own,
    potentially resulting in full access to that 3rd-party volume.
    Folsom setups making use of Cinder are not affected.

    Fixes bug: 1069904, CVE-2013-0208
    Change-Id: I5f7c8d20d3ebf33ce1ce64bf0a8418bd2b5a6411

Changed in nova (Debian):
status: Unknown → Fix Released
Thierry Carrez (ttx) on 2013-02-21
Changed in nova:
milestone: none → grizzly-3
status: Fix Committed → Fix Released
Thierry Carrez (ttx) on 2013-04-04
Changed in nova:
milestone: grizzly-3 → 2013.1
Thierry Carrez (ttx) on 2013-05-24
summary: - No authentication on block device used for os-volume_boot
+ [OSSA 2013-001] No authentication on block device used for os-
+ volume_boot
Changed in ossa:
assignee: nobody → Thierry Carrez (ttx)
status: New → Fix Released
Sean Dague (sdague) on 2014-09-15
no longer affects: nova/diablo
no longer affects: nova/essex
To post a comment you must log in.
This report contains Public Security information  Edit
Everyone can see this security related information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.