data from previous tenants accessible with nova baremetal

Bug #1174153 reported by Robert Collins on 2013-04-29
This bug affects 3 people
Affects                    Status        Importance  Assigned to    Milestone
Ironic                     Fix Released  High        Josh Gachnang  2015.1.0
OpenStack Compute (nova)   Fix Released  High        Josh Gachnang  2015.1.0
OpenStack Security Notes   Fix Released  High        Robert Clark

Bug Description

At the moment the baremetal driver resets the partition table on the first hard disk, but doesn't wipe the data. This has two holes: other disks have their partition tables preserved; tenant data is able to be read by the new instance.

Wiping disks can be slow (particularly in cases where TRIM cannot be relied on), so we probably want to only do it when the new instance is for a new tenant.
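
A minimal sketch of the "only wipe for a new tenant" idea, assuming the deploy code can see the tenant that last used the node. The function names, the tenant identifiers, and the choice to wipe when the history is unknown are all hypothetical; nothing here is existing baremetal driver code.

```python
from typing import List, Optional


def needs_wipe(previous_tenant: Optional[str], new_tenant: str) -> bool:
    """Return True when the node's disks should be wiped before reuse.

    previous_tenant is None when the node's history is unknown; this sketch
    assumes the safe choice in that case is to wipe.
    """
    if previous_tenant is None:
        return True
    return previous_tenant != new_tenant


def disks_to_wipe(all_disks: List[str],
                  previous_tenant: Optional[str],
                  new_tenant: str) -> List[str]:
    """Select every disk on the node, not only the first one."""
    if needs_wipe(previous_tenant, new_tenant):
        return list(all_disks)
    return []
```

Covering every disk addresses the first hole described above; the tenant comparison is what avoids the slow wipe when the same tenant redeploys.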

Thierry Carrez (ttx) wrote :

Looks like a pretty significant vulnerability to me, or am I missing something ?

Robert Collins (lifeless) wrote :

We failed to communicate the experimental nature of baremetal during Grizzly: it was not intended to be supported at all. It is a limitation, not a vulnerability IMO. There are a raft of caveats with cross-tenant use of the same hardware, of which this is just one.

Michael Still (mikal) wrote :

I think the experience from Essex shows that people will still be using Grizzly in a couple of years, and we'll still be explaining why we didn't intend people to use this feature. I think we either need to fix bugs as they are reported, or clearly document that we don't expect anyone to use it.

Robert Collins (lifeless) wrote :

The fix for this can be backported once it's done; I'm merely arguing we don't need to treat it as a security vulnerability and fire-drill it : without secure boot [not supported in Grizzly, and likely too large to backport sensibly], and without full openflow hardware networking during the boot process [definitely not a Grizzly thing], it is impossible to trust multiple tenants on baremetal at all - because the vectors for attack are so low level that instances may be running in a virtual environment and unaware of it, with the virtual environment capturing secrets, forcing entropy pools to be predictable and other such hostile behaviour.

My suggestion for Grizzly is that we document these caveats thoroughly - I'd be delighted to do that - and then as and when we tackle them in H and I and beyond update the documentation accordingly.

Baremetal's primary use case *today* is for deploying a cluster run by a single group of sysadmins : that is one tenant from a security perspective. For that, it is totally usable.

Thierry Carrez (ttx) wrote :

@Robert, I'm fine with documenting that. We actually have a process (OSSN) to document that sort of thing (we issued one for libvirt-lxc drivers and how they should not be used in multi-tenant environments either).

Let me plug the OSSN crew in.

Thierry Carrez (ttx) wrote :

The gist of what needs to be issued as an OSSN is:

"without secure boot, and without full openflow hardware networking during the boot process, it is impossible to trust multiple tenants on baremetal at all - because the vectors for attack are so low level that instances may be running in a virtual environment and unaware of it, with the virtual environment capturing secrets, forcing entropy pools to be predictable and other such hostile behaviour."

Robert Clark (robert-clark) wrote :

I've just seen this, we're happy to issue an OSSN and will draft something today.

-Rob

Changed in ossn:
importance: Undecided → High
assignee: nobody → Robert Clark (robert-clark)

Robert Clark (robert-clark) wrote :

Nova Baremetal Exposes Previous Tenant Data
-----

### Summary ###
Data of previous tenants may be exposed to new ones when using Nova Baremetal

### Affected Services / Software ###
Keystone, Databases

### Discussion ###
Nova Baremetal is intended for testing and development only; it is not intended to be production ready. Experience has shown that, despite that warning, the OpenStack community is keen to embrace new technologies and deploy at-risk. This OSSN serves to signpost some of the risks.

Without secure boot, and without full openflow hardware networking during the boot process, it is impossible to trust multiple tenants on baremetal at all - because the vectors for attack are so low level that instances may be running in a virtual environment and unaware of it, with the virtual environment capturing secrets, forcing entropy pools to be predictable and other such hostile behaviour.

### Recommended Actions ###
Do not use Nova Baremetal where secure separation of tenants on hardware is a requirement without a full verifiable boot chain and network hardware.

### Contacts / References ###
This OSSN: https://bugs.launchpad.net/ossn/+bug/1174153

Thierry Carrez (ttx) wrote :

Sounds good.

Robert Clark (robert-clark) wrote :

Nova Baremetal Exposes Previous Tenant Data
-----

### Summary ###
Data of previous tenants may be exposed to new ones when using Nova Baremetal

### Affected Services / Software ###
Keystone, Databases

### Discussion ###
Nova Baremetal is intended for testing and development only; it is not intended to be production ready. Experience has shown that, despite that warning, the OpenStack community is keen to embrace new technologies and deploy at-risk. This OSSN serves to signpost some of the risks.

Without secure boot, and without full openflow hardware networking during the boot process, it is impossible to trust multiple tenants on baremetal at all - because the vectors for attack are so low level that instances may be running in a virtual environment and unaware of it, with the virtual environment capturing secrets, forcing entropy pools to be predictable and other such hostile behaviour.

### Recommended Actions ###
Do not use Nova Baremetal where secure separation of tenants on hardware is a requirement without a full verifiable boot chain and network hardware.

### Contacts / References ###
This OSSN : https://bugs.launchpad.net/ossn/+bug/1174153
OpenStack Security ML : <email address hidden>
OpenStack Security Group : https://launchpad.net/~openstack-ossg

Robert Clark (robert-clark) wrote :

Crossposted to OpenStack/OpenStack Dev - 2nd July 2013

Changed in ossn:
status: New → Fix Released

devananda (devananda) on 2013-10-03
Changed in ironic:
status: New → Triaged
importance: Undecided → High

Roman Prykhodchenko (romcheg) wrote :

I think wiping the node is the thing that should be discussed.
There are multiple approaches to doing that. Overwriting a block device over the network does not look like a good idea because it's extremely slow. I mentioned some thoughts in one of the session proposals here: http://summit.openstack.org/cfp/details/57 and am going to repost them here.

Roman Prykhodchenko (romcheg) wrote :

Possible solutions
--------------------------
- Build a special undeploy image and use it for either
  - Securely erasing the volume on the node side
  - Exporting a volume to the manager and erasing it on the manager side
- Create a separate boot configuration on the node that loads a kernel and a ramdisk with undeploy scripts in it

Food For Thought
--------------------------
- Should wiping be a part of deploying or undeploying?
- Should we wipe all nodes or wipe them on-demand?
  - Wiping all nodes might not be required for everyone
  - Securely wiping a node requires a lot of time

Is there anything we'll have to do on Nova side?

Changed in nova:
status: Triaged → Incomplete

Daniel Berrange (berrange) wrote :

We've had to deal with this problem before in Nova, in the libvirt driver's LVM volume backend.

In that case we wipe the data at VM teardown, to ensure future VMs don't see any data from previous tenants.

commit 9d2ea970422591f8cdc394001be9a2deca499a5f
Author: Pádraig Brady <email address hidden>
Date: Fri Nov 23 14:59:13 2012 +0000

    Don't leak info from libvirt LVM backed instances

    * nova/virt/libvirt/utils.py (remove_logical_volumes):
    Overwrite each logical volume with zero
    (clear_logical_volume): LV obfuscation implementation.
    (logical_volume_size): A utility function used by
    clear_logical_volume()

    Fixes bug: 1070539
    Change-Id: I4e1024de8dfe9b0be3b0d6437c836d2042862f85

We made this behaviour configurable with a nova.conf setting

commit 71946855591a41dcc87ef59656a8a340774eeaf2
Author: Pádraig Brady <email address hidden>
Date: Tue Feb 11 11:51:39 2014 +0000

    libvirt: support configurable wipe methods for LVM backed instances

    Provide configurable methods to clear these volumes.
    The new 'volume_clear' and 'volume_clear_size' options
    are the same as currently supported by cinder.

    * nova/virt/libvirt/imagebackend.py: Define the new options.
    * nova/virt/libvirt/utils.py (clear_logical_volume): Support the
    new options. Refactor the existing dd method out to
    _zero_logic_volume().
    * nova/tests/virt/libvirt/test_libvirt_utils.py: Add missing test cases
    for the existing clear_logical_volume code, and for the new code
    supporting the new clearing methods.
    * etc/nova/nova.conf.sample: Add the 2 new config descriptions
    to the [libvirt] section.

    Change-Id: I5551197f9ec89ae2f9b051696bccdeb1af2c031f
    Closes-Bug: #889299

IMHO we should move this config setting & code out of the libvirt section into the general nova.conf section and re-use the logic for baremetal.
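
To make the suggestion concrete, here is a rough sketch of a driver-independent clearing helper. It is not the code from either commit above: the function name, the "none" and "zero" method names, and the convention that a size of 0 means "the whole volume" are assumptions made for this example, loosely modelled on the 'volume_clear' and 'volume_clear_size' options the commit message mentions.

```python
import os

CHUNK = 1024 * 1024  # write in 1 MiB blocks


def clear_volume(path: str, method: str = "zero", clear_size_mib: int = 0) -> int:
    """Overwrite the start of `path` (or all of it) and return bytes written.

    method: "none" skips clearing, "zero" writes zero bytes (hypothetical names).
    clear_size_mib: how many MiB to clear; 0 is taken to mean the whole volume.
    """
    if method == "none":
        return 0
    if method != "zero":
        raise ValueError("unsupported clear method: %s" % method)

    with open(path, "r+b") as dev:
        total = dev.seek(0, os.SEEK_END)  # size of the file or block device
        limit = total if clear_size_mib == 0 else min(total, clear_size_mib * CHUNK)
        dev.seek(0)
        zeros = bytes(CHUNK)
        written = 0
        while written < limit:
            n = min(CHUNK, limit - written)
            dev.write(zeros[:n])
            written += n
        dev.flush()
        os.fsync(dev.fileno())  # make sure the zeros reach the device
    return written
```

Because the helper only needs a path, the same logic could in principle be pointed at a baremetal node's disk as easily as at a logical volume, which is the reuse argued for above.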

Sean Dague (sdague) on 2014-09-04
Changed in nova:
status: Incomplete → Won't Fix

I just want to point out that this is being worked on, and expected to be completed in Kilo:

https://review.openstack.org/#/q/status:open+branch:master+topic:bp/decom-nodes,n,z

This patch series covers more than wiping disks (e.g. firmware updates), but disk erase is in there and configurable at the ironic level, as I understand it.

Reviewed: https://review.openstack.org/155561
Committed: https://git.openstack.org/cgit/openstack/ironic/commit/?id=7589d1faaf5415c12b2e738c6c09b17b65993235
Submitter: Jenkins
Branch: master

commit 7589d1faaf5415c12b2e738c6c09b17b65993235
Author: Josh Gachnang <email address hidden>
Date: Tue Feb 17 17:42:18 2015 -0800

    Implement execute clean steps

    This implements executing the clean steps in the conductor. The RPC
    API version is bumped to allow async clean steps to call back
    to the conductor.

    Adds node.clean_step to store the current cleaning operation.

    The conductor will get a list of clean steps from the node's drivers,
    order them by priority, and then have the drivers execute the steps
    in order.

    Adds a config option to enable cleaning, defaulting to True.

    Related-bug: #1174153

    Implements blueprint implement-cleaning-states
    Change-Id: I96af133c501f86a6e620c4684ee65abad2111f7b
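
The ordering described in that commit message can be sketched roughly as follows. This is not the conductor code: the step dictionaries, the step names, and the helper functions are hypothetical, and the sketch assumes that a higher priority runs earlier and that a priority of 0 disables a step.

```python
from typing import Any, Callable, Dict, List

CleanStep = Dict[str, Any]


def order_clean_steps(steps: List[CleanStep]) -> List[CleanStep]:
    """Drop disabled steps, then sort the rest from highest to lowest priority."""
    enabled = [s for s in steps if s["priority"] > 0]
    return sorted(enabled, key=lambda s: s["priority"], reverse=True)


def run_clean_steps(steps: List[CleanStep],
                    execute: Callable[[CleanStep], None]) -> List[str]:
    """Execute each enabled step in order and return the names that ran."""
    done = []
    for step in order_clean_steps(steps):
        execute(step)  # stands in for the driver call
        done.append(step["step"])
    return done


if __name__ == "__main__":
    example = [
        {"step": "erase_devices", "priority": 10},
        {"step": "update_firmware", "priority": 0},
        {"step": "reset_bios", "priority": 20},
    ]
    print(run_clean_steps(example, lambda step: None))
```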

Changed in nova:
assignee: nobody → Josh Gachnang (joshnang)
status: Won't Fix → In Progress

devananda (devananda) wrote :

The feature work in Ironic to address this long-standing bug has been done; however, Nova needs to be updated to understand the changes in Ironic.

Addressed by the following:

Add support for cleaning in Ironic driver
  https://review.openstack.org/#/c/161474/

Adjust resource tracker for new Ironic states
  https://review.openstack.org/#/c/164313/

Changed in nova:
milestone: none → kilo-3
devananda (devananda) on 2015-03-17
Changed in ironic:
milestone: none → kilo-3
devananda (devananda) on 2015-03-18
Changed in ironic:
status: Triaged → In Progress
assignee: nobody → Josh Gachnang (josh-gachnang)
assignee: Josh Gachnang (josh-gachnang) → Josh Gachnang (joshnang)
Changed in nova:
assignee: Josh Gachnang (joshnang) → Jim Rollenhagen (jim-rollenhagen)
Thierry Carrez (ttx) on 2015-03-19
Changed in ironic:
status: In Progress → Fix Committed
Thierry Carrez (ttx) on 2015-03-19
Changed in ironic:
status: Fix Committed → Fix Released
Thierry Carrez (ttx) on 2015-03-20
Changed in nova:
milestone: kilo-3 → kilo-rc1
Changed in nova:
status: In Progress → Won't Fix

Looks like we still have 2 reviews in progress in **Nova**:
https://review.openstack.org/#/c/161474/
https://review.openstack.org/#/c/164313/

Changed in nova:
status: Won't Fix → In Progress
Changed in nova:
assignee: Jim Rollenhagen (jim-rollenhagen) → Josh Gachnang (joshnang)

Reviewed: https://review.openstack.org/161474
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=597507d66b7de6a4d17863b2713e6def2c436f7e
Submitter: Jenkins
Branch: master

commit 597507d66b7de6a4d17863b2713e6def2c436f7e
Author: Josh Gachnang <email address hidden>
Date: Wed Mar 4 15:26:54 2015 -0800

    Add support for cleaning in Ironic driver

    Ironic added a new state between deleting and available called cleaning
    where tasks like erasing drives occur after each delete. This patch
    ensures that instances can be considered deleted in Nova as soon as a
    node enters cleaning state, otherwise, instances could be stuck in
    deleting state in Nova for hours or days.

    This is necessary to implement the Ironic cleaning spec:
    https://blueprints.launchpad.net/ironic/+spec/implement-cleaning-states

    Closes-Bug: 1174153
    Change-Id: Ie04823c40efc08f887429a6b8e6219558c3e4efa
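
The behaviour that commit describes can be sketched as a polling loop that stops as soon as the node has left the deleting state, rather than waiting for cleaning to finish. This is an illustration, not the Nova driver code: the state strings, the function names, and the polling parameters are assumptions.

```python
import time
from typing import Callable, Optional

# States in which the instance is treated as gone from Nova's point of view.
# The exact strings are assumptions for this sketch.
TORN_DOWN_STATES = {None, "cleaning", "clean failed", "available"}


def instance_is_gone(provision_state: Optional[str]) -> bool:
    """True once the node has moved past 'deleting', even if still cleaning."""
    return provision_state in TORN_DOWN_STATES


def wait_for_teardown(get_state: Callable[[], Optional[str]],
                      interval: float = 0.0,
                      max_polls: int = 10) -> bool:
    """Poll the node state; return True as soon as the instance counts as deleted."""
    for _ in range(max_polls):
        if instance_is_gone(get_state()):
            return True
        time.sleep(interval)
    return False
```

With this shape, a disk erase that takes hours only delays the node becoming available again; it does not hold the instance in the deleting state.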

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx) on 2015-04-10
Changed in nova:
status: Fix Committed → Fix Released
Thierry Carrez (ttx) on 2015-04-30
Changed in nova:
milestone: kilo-rc1 → 2015.1.0
Thierry Carrez (ttx) on 2015-04-30
Changed in ironic:
milestone: kilo-3 → 2015.1.0