Instance type with disk set to 0 can cause DoS

Bug #1739646 reported by Mohammed Naser
26
This bug affects 3 people
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
High
Matt Riedemann
Ocata
Fix Committed
High
Matt Riedemann
Pike
Fix Committed
High
Matt Riedemann
Queens
Fix Committed
High
Matt Riedemann
OpenStack Security Advisory
Won't Fix
Undecided
Unassigned
OpenStack Security Notes
New
Undecided
Unassigned

Bug Description

In OpenStack at the moment, there is the ability to create instance types with disk size 0. The API documentation states the following:

"The size of the root disk that will be created in GiB. If 0 the root disk will be set to exactly the size of the image used to deploy the instance. However, in this case filter scheduler cannot select the compute host based on the virtual image size. Therefore, 0 should only be used for volume booted instances or for testing purposes."

In a cloud environment where a deployer wants to offer boot-from-volume instances, those instance types will be there. However, this means that a user can upload an image of 4TB and boot small instances where each one will have 4TB of storage, potentially exhausting the disks local storage (or Ceph cluster if using Ceph for ephemeral storage).

I'm not sure if this is a security issue or it should be published as an advisory, but I believe there should be an option to disable the feature of booting an instance with the exact size of the image used so deployers have the ability/choice to provide boot-from-volume instance types.

I can confirm this in our environment that if a customer creates an instance with 200GB of ephemeral disk space, they can take an image of it, then create an instance with that image on an instance type that has no ephemeral disk space and get 200GB of disk.

Tags: security
Revision history for this message
Mohammed Naser (mnaser) wrote :

I've added Matt to this bug as he has helped discuss the details of this issue and figure out parts of it.

Revision history for this message
Matt Riedemann (mriedem) wrote :

Adding mdbooth given his background in the nova block device and imagebackend code.

I did point out https://review.openstack.org/#/c/511965/ to mnaser but that's really just a spec that's up for discussion at this point, and is mostly an artifact from discussions at the Boston Forum for Pike, and trying to document a lot of the use cases and previous attempts around volume-backed flavors.

For this bug, it's probably not unreasonable to add a config option to nova to determine if a flavor with root disk=0 will not be allowed unless you're booting from volume. We could check this in the API.

For private clouds maybe they don't care. I'm not sure what we'd default this to.

1. Default to enforce makes the most sense for new deployments and security.
2. Default to not enforce would make it backward compatible until a deployer opts into the change.

#1 is probably the better choice though all things considered; the default behavior of something shouldn't be to hand you a loaded gun to shoot yourself with unknowingly, even though I live in 'merica!

I recognize this would introduce config-driven API behavior, which we really try to avoid in Nova, but it's a compromise until we have something built more into the API, like some of the ideas in the spec linked above.

Revision history for this message
Mohammed Naser (mnaser) wrote :

As much as I would like us to default to enforce, I think there will be existing installation which will break as I am willing to bet that there are people who rely on this (as a matter of fact, I know some of our customers do this because it's easier than creating and deleting volumes).

We'd ideally work to transition them out but I can imagine it being a surprise to a user. I think a more sensible thing is to introduce the flag set to non-enforcing but with a warning when the service starts that in <foo>-release, this will be defaulted to enforce.

I'm not sure about the feasibility and the ability to detect if we're using the default value *or* if it was manually added by the deployed in the config.

Revision history for this message
Jeremy Stanley (fungi) wrote :

Since this report concerns a possible security risk, an incomplete security advisory task has been added while the core security reviewers for the affected project or projects confirm the bug and discuss the scope of any vulnerability along with potential solutions.

Changed in ossa:
status: New → Incomplete
description: updated
Revision history for this message
Matt Riedemann (mriedem) wrote :

mnaser: I'm OK with doing the backward compatible thing by default too.

We'd remove the option if/when we have a replacement, which is maybe something that could come out of that spec.

Otherwise we could probably add a microversion in Rocky to indicate if a flavor with disk=0 enforces boot from volume? Just an idea. That gets a bit weird though because the value exposed out of the API wouldn't be part of the flavor resource, it would be from the config in the API at that point in time, because the config could change after the flavor was created. Would definitely require some thought on that.

Revision history for this message
Matt Riedemann (mriedem) wrote :

As far as how far back this goes in nova, I'm sure it impacts at least all of the currently supported upstream stable branches. I don't know at what point instance_type.disk==0 was introduced though, that would require some digging.

Revision history for this message
Matthew Booth (mbooth-9) wrote :

I believe instance_type.disk == 0 is paleolithic in OpenStack terms. It recall it causing me headaches round about Havana, at least. Pretty sure it just affects everybody.

I believe the relevant test here is in compute.API._check_requested_image, which intentionally skips the flavor disk size check for volumes. This suggests a kludgy non-code workaround, which is to set the flavor disk size to 1 instead of 0. This would give users up to 1GB of local disk space per instance. This is 1GB more than the operator intends, but it's unlikely to be a practical attack.

I would also be in favour of the config setting to disable the size 0 special behaviour. I think the combination of this and pointing out the kludge should cover most use cases until we fix this properly.

Revision history for this message
Mohammed Naser (mnaser) wrote :

Hope everyone enjoyed their holidays!

Can we follow up on this on an agreed solution?

Revision history for this message
Tristan Cacqueray (tristan-cacqueray) wrote :

This sounds like a class B1/B2 according to VMT's taxonomy ( https://security.openstack.org/vmt-process.html#incident-report-taxonomy ). Unless a backward compatible fix can be proposed to stable branches, then I suggest we subscribe the ossg-coresec group and create an OSSN task.

Revision history for this message
Mohammed Naser (mnaser) wrote :

If this is not something that needs to be embaroged by the security team, could we open this up then as I'd like to discuss potential solutions openly.

Revision history for this message
Jeremy Stanley (fungi) wrote :

Yes, given seemingly low impact, I'll subscribe the OSSG core security reviewers and if no further objections in the next few days we can switch this to public as a class B(1 or 2) report.

Revision history for this message
Matt Riedemann (mriedem) wrote :

Per comment 3:

> I'm not sure about the feasibility and the ability to detect if we're using the default value *or* if it was manually added by the deployed in the config.

I think there might be a recent-ish feature in oslo.config that will allow us to know this, but we probably can't rely on that since it wouldn't be in older versions of oslo.config on stable branches.

Also, this bug came back to mind because I think another (public) bug reported is a duplicate of essentially the same issue:

https://bugs.launchpad.net/nova/+bug/1758278

Revision history for this message
Jeremy Stanley (fungi) wrote :

Ahh, yes there were no subsequent objections to switching this bug to public, so I'll do that now and triage as a class B1 report. The security notes editors may want to consider drafting an OSSN related to this when it gets fixed in master.

information type: Private Security → Public
Changed in ossa:
status: Incomplete → Won't Fix
description: updated
tags: added: security
Revision history for this message
Matt Riedemann (mriedem) wrote :

I was working on a backward-compatible fix for this today, should have the patch posted in the next hour or so, but now that this is public is there any reason to not just post that to Gerrit and duplicate bug 1758278 against this?

Matt Riedemann (mriedem)
Changed in nova:
status: New → Triaged
assignee: nobody → Matt Riedemann (mriedem)
Revision history for this message
Jeremy Stanley (fungi) wrote :

Oh, thanks, I guess if there are stable backports coming after all we can keep it as a class A report. Once there are backports linked from Gerrit I'll whip up an impact description and request a CVE assignment for this.

information type: Public → Public Security
Changed in ossa:
status: Won't Fix → Incomplete
tags: removed: security
Matt Riedemann (mriedem)
Changed in nova:
importance: Undecided → High
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/561284

Changed in nova:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/563692

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/563700

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/563719

Revision history for this message
Matt Riedemann (mriedem) wrote :
Revision history for this message
Jeremy Stanley (fungi) wrote :

It looks like the proposed patches don't actually engage the security protection they introduce, requiring an admin to make a policy configuration change before their environment will be protected from the issue described here. Am I interpreting that accurately, or misreading?

If this introduces the need to alter configuration for the deployment after the patch is applied, it's pretty clearly still class B1 in our taxonomy ("...default config value is insecure") and so more appropriate for a security note than an advisory.

Revision history for this message
Matt Riedemann (mriedem) wrote :

> It looks like the proposed patches don't actually engage the security protection they introduce, requiring an admin to make a policy configuration change before their environment will be protected from the issue described here. Am I interpreting that accurately, or misreading?

That is correct and intentional so that we don't introduce backward incompatible API behavior by default. This would land in Rocky with a backward compatible setting, and then in Stein we'll update the default policy rule to be admin-only, but it at least gives operators time to adjust to the change.

Looking at the Class B1 description:

"A vulnerability that can only be fixed in master, security note for stable branches, e.g., default config value is insecure"

The "default config value is insecure" is true, but the "can only be fixed in master" is not really true, since we can backport a fix for this, it's just not enabled by default. Maybe I'm reading too much into the intent of B1 though.

If a B1 classification means that vendors don't have to try and patch this all the way back to its origin release, then that's probably not a bad thing since it looks like this goes back to at least Diablo:

https://github.com/openstack/nova/commit/935c43b414c1685163957590a6fb77fd8ddbac2f

Revision history for this message
Jeremy Stanley (fungi) wrote :

It's only considered "fixed" from an advisory standpoint if, upon upgrading, the default behavior changes to solve the vulnerability. If operators need to alter their configuration as well, then this requires a bit more narrative and is the realm of a security note. Look at it this way... upgraded or new installations will continue to expose the vulnerability described here, as the default configuration is insecure. The plan is for it to be "fixed" (default behavior changes) at some point in a future release.

This is where we walk a fine line between vulnerabilities whose solutions can be safely backported to stable branches with minimally-disruptive changes in behavior, and those which can only be addressed through deprecation of the old unsafe behavior (regardless of whether we give the operator an easily-applied workaround, say, in the form of a configuration option).

Revision history for this message
Matt Riedemann (mriedem) wrote :

Got it. Thanks for your patience in describing the differences. With that I've removed the -W on the change on master and it should be ready for review from the nova trolls.

Revision history for this message
Jeremy Stanley (fungi) wrote :

Sounds like we've got consensus on class B1, so I'll mark the OSSA task won't fix, add a new OSSN task, switch the bug type to just public rather than public security and add the security bugtag instead.

information type: Public Security → Public
Changed in ossa:
status: Incomplete → Won't Fix
tags: added: security
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/561284
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=763fd62464e9a0753e061171cc1fd826055bbc01
Submitter: Zuul
Branch: master

commit 763fd62464e9a0753e061171cc1fd826055bbc01
Author: Matt Riedemann <email address hidden>
Date: Fri Apr 13 13:44:33 2018 -0400

    Add policy rule to block image-backed servers with 0 root disk flavor

    This adds a new policy rule which defaults to behave in a
    backward compatible way, but will allow operators to enforce
    that servers created with a zero disk flavor must also be
    volume-backed servers.

    Allowing users to upload their own images and create image-backed
    servers on local disk with zero root disk size flavors can be
    potentially hazardous if the size of the image is unexpectedly
    large, since it can consume the local disk (or shared storage pool).

    It should be noted that disabling the new policy rule will
    result in a non-backward compatible API behavior change and no
    microversion is being introduced for this because enforcement via
    a new microversion would not close the security gap on any previous
    microversions.

    Related compute API reference and user documentation is updated
    to mention the policy rule along with a release note since
    this is tied to a security bug, which will be backported to stable
    branches.

    Change-Id: Id67e1285a0522474844de130c9263e11868f67fb
    Closes-Bug: #1739646

Changed in nova:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/queens)

Reviewed: https://review.openstack.org/563692
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=7bcd581c78bb5916bf4b52e213322e7b56283572
Submitter: Zuul
Branch: stable/queens

commit 7bcd581c78bb5916bf4b52e213322e7b56283572
Author: Matt Riedemann <email address hidden>
Date: Fri Apr 13 13:44:33 2018 -0400

    Add policy rule to block image-backed servers with 0 root disk flavor

    This adds a new policy rule which defaults to behave in a
    backward compatible way, but will allow operators to enforce
    that servers created with a zero disk flavor must also be
    volume-backed servers.

    Allowing users to upload their own images and create image-backed
    servers on local disk with zero root disk size flavors can be
    potentially hazardous if the size of the image is unexpectedly
    large, since it can consume the local disk (or shared storage pool).

    It should be noted that disabling the new policy rule will
    result in a non-backward compatible API behavior change and no
    microversion is being introduced for this because enforcement via
    a new microversion would not close the security gap on any previous
    microversions.

    Related compute API reference and user documentation is updated
    to mention the policy rule along with a release note since
    this is tied to a security bug, which will be backported to stable
    branches.

    Conflicts:
          nova/policies/servers.py
          nova/tests/unit/test_policy.py

    NOTE(mriedem): The conflict is due to not having change
    Iedd3fea0e86648fae364f075915555dcb2c4f199 in Queens for trusted
    certs.

    Change-Id: Id67e1285a0522474844de130c9263e11868f67fb
    Closes-Bug: #1739646
    (cherry picked from commit 763fd62464e9a0753e061171cc1fd826055bbc01)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/pike)

Reviewed: https://review.openstack.org/563700
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=0bf75621bbd25d4ce8a3588f112cf714891556eb
Submitter: Zuul
Branch: stable/pike

commit 0bf75621bbd25d4ce8a3588f112cf714891556eb
Author: Matt Riedemann <email address hidden>
Date: Fri Apr 13 13:44:33 2018 -0400

    Add policy rule to block image-backed servers with 0 root disk flavor

    This adds a new policy rule which defaults to behave in a
    backward compatible way, but will allow operators to enforce
    that servers created with a zero disk flavor must also be
    volume-backed servers.

    Allowing users to upload their own images and create image-backed
    servers on local disk with zero root disk size flavors can be
    potentially hazardous if the size of the image is unexpectedly
    large, since it can consume the local disk (or shared storage pool).

    It should be noted that disabling the new policy rule will
    result in a non-backward compatible API behavior change and no
    microversion is being introduced for this because enforcement via
    a new microversion would not close the security gap on any previous
    microversions.

    Related compute API reference and user documentation is updated
    to mention the policy rule along with a release note since
    this is tied to a security bug, which will be backported to stable
    branches.

    Conflicts:
          doc/source/user/flavors.rst
          nova/tests/functional/wsgi/test_servers.py

    NOTE(mriedem): The doc/source/user/flavors.rst conflict is due to
    not having Ia57c93ef1e72ccf134ba6fc7fcb85ab228d68a47 in Pike.
    Rather than backport that, or drop the note about volume-backed
    instances for this backport, I have elected to just copy the wording
    for that particular section on "Root Disk GB".
    The nova/tests/functional/wsgi/test_servers.py conflict is due to
    not having I294c54e5a22dd6e5b226a4b00e7cd116813f0704 in Pike.

    Change-Id: Id67e1285a0522474844de130c9263e11868f67fb
    Closes-Bug: #1739646
    (cherry picked from commit 763fd62464e9a0753e061171cc1fd826055bbc01)
    (cherry picked from commit 7bcd581c78bb5916bf4b52e213322e7b56283572)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/ocata)

Reviewed: https://review.openstack.org/563719
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=8392c7f2656ae624877e3df539681c0a8f8b4926
Submitter: Zuul
Branch: stable/ocata

commit 8392c7f2656ae624877e3df539681c0a8f8b4926
Author: Matt Riedemann <email address hidden>
Date: Fri Apr 13 13:44:33 2018 -0400

    Add policy rule to block image-backed servers with 0 root disk flavor

    This adds a new policy rule which defaults to behave in a
    backward compatible way, but will allow operators to enforce
    that servers created with a zero disk flavor must also be
    volume-backed servers.

    Allowing users to upload their own images and create image-backed
    servers on local disk with zero root disk size flavors can be
    potentially hazardous if the size of the image is unexpectedly
    large, since it can consume the local disk (or shared storage pool).

    It should be noted that disabling the new policy rule will
    result in a non-backward compatible API behavior change and no
    microversion is being introduced for this because enforcement via
    a new microversion would not close the security gap on any previous
    microversions.

    Related compute API reference and user documentation is updated
    to mention the policy rule along with a release note since
    this is tied to a security bug, which will be backported to stable
    branches.

    Conflicts:
          api-ref/source/parameters.yaml
          doc/source/admin/flavors2.rst
          nova/policies/servers.py
          nova/tests/functional/wsgi/test_servers.py

    NOTE(mriedem): The api-ref/source/parameters.yaml conflict is due
    to If646149efb7eec8c90bf7d07c39ff4c495349941 not being in Pike.
    The doc/source/admin/flavors2.rst conflict is due to the doc
    not being in Ocata - it was migrated from the central admin-guide
    in Ifa0039e270e54ea2fb58ab18ce6724e5e8e061a1.
    The nova/policies/servers.py conflict is due to two changes in Pike:
    I17b6ca6e17c777ae7d337bf70ec4774ffe5187a8 and
    I050c4f5f19aa79a682e076cc3e47eba597f272dd. The DocumentedRuleDefault
    class was added to oslo.policy starting in 1.21.1 in Pike which is
    newer than what stable/ocata supports in global-requirements so we
    can't use it in this backport.
    The nova/tests/functional/wsgi/test_servers.py conflict is due to
    Ifcaaf285c8f98a1d0e8bbbc87b2f57fbce057346 and
    I294c54e5a22dd6e5b226a4b00e7cd116813f0704 not being in Ocata.

    Change-Id: Id67e1285a0522474844de130c9263e11868f67fb
    Closes-Bug: #1739646
    (cherry picked from commit 763fd62464e9a0753e061171cc1fd826055bbc01)
    (cherry picked from commit 7bcd581c78bb5916bf4b52e213322e7b56283572)
    (cherry picked from commit 0bf75621bbd25d4ce8a3588f112cf714891556eb)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 18.0.0.0b3

This issue was fixed in the openstack/nova 18.0.0.0b3 development milestone.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 16.1.5

This issue was fixed in the openstack/nova 16.1.5 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 17.0.6

This issue was fixed in the openstack/nova 17.0.6 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 15.1.4

This issue was fixed in the openstack/nova 15.1.4 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.openstack.org/603910
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=c8e65a5eb11515cfe70f8e6850b842cd594af6a5
Submitter: Zuul
Branch: master

commit c8e65a5eb11515cfe70f8e6850b842cd594af6a5
Author: Mohammed Naser <email address hidden>
Date: Wed Sep 19 16:58:32 2018 -0400

    Default zero disk flavor to RULE_ADMIN_API in Stein

    The policy to allow booting instances without a volume when
    root_gb is set to 0 was to be set to default to admin-only
    in Stein.

    Depends-On: I537c299b0cd400982189f35b31df74755422737e

    Co-Authored-By: Matt Riedemann <email address hidden>

    Related-Bug: #1739646
    Change-Id: I247402b6c4ff8a7cb71ef247a218478194d68ff8

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.