scheduler hints are unbounded and never deleted

Bug #1673085 reported by Matt Riedemann
This bug affects 1 person
Affects                      Status       Importance  Assigned to  Milestone
OpenStack Compute (nova)     In Progress  Undecided   Unassigned
OpenStack Security Advisory  Won't Fix    Undecided   Unassigned
OpenStack Security Notes     Won't Fix    Undecided   Unassigned

Bug Description

I'm initially reporting this as a potential security issue, but it might not be one; I'm just looking for feedback from the VMT.

The scheduler_hints in the compute API are stored in the request_specs.spec column in the nova_api database:

https://github.com/openstack/nova/blob/15.0.1/nova/db/sqlalchemy/api_models.py#L171

There is no limit on the size of the keys or values, or number of hints, in the API:

https://github.com/openstack/nova/blob/15.0.1/nova/api/openstack/compute/schemas/scheduler_hints.py#L18

There are some pre-defined hints, but additionalProperties=True in the json schema means that one can provide any hints they want.

So I could boot a server with a scheduler_hints dict that has a million keys, each a million characters long. At best, that just results in a 500 because the column size limit in the database rejects the JSON blob. According to the MySQL 5.7 docs:

https://dev.mysql.com/doc/refman/5.7/en/string-type-overview.html

"TEXT[(M)] [CHARACTER SET charset_name] [COLLATE collation_name]

A TEXT column with a maximum length of 65,535 (2^16 − 1) characters. The effective maximum length is less if the value contains multibyte characters. Each TEXT value is stored using a 2-byte length prefix that indicates the number of bytes in the value."

At worst, I can work backward from a million until I find the largest size that still fits in the request_specs.spec column, and then just hammer the compute API, filling up the nova_api database.

So there are two issues:

1. No key/value size limit in the API json schema for scheduler hints.

2. No quota limit on the number of hints one can provide (unlike user-provided metadata key/value pairs, where each key and value is limited to 255 characters and the quota limits the number of items to 128).

Add to this the fact that we never delete request_specs entries from the nova_api database automatically (that's being worked on here: https://review.openstack.org/#/c/391060/ ).

This might not be a security issue, it might just be poor API design and we can tighten things up to avoid a 500 error with quota limits and json schema validation on the key/value size on each hint, and also delete request specs when we delete an instance.
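A minimal sketch of the kind of bounds described above, written as plain Python rather than as nova's actual validation code. The function name check_scheduler_hints and the default limits (255 characters, 128 hints) are assumptions borrowed from the metadata limits mentioned above; they are not existing nova settings.

```python
def check_scheduler_hints(hints, max_len=255, max_hints=128):
    """Return a list of problems found in a scheduler_hints dict.

    Hypothetical helper: the limits mirror the metadata limits and are
    assumptions, not values nova enforces today.
    """
    problems = []
    # Issue 2: bound the number of hints per request.
    if len(hints) > max_hints:
        problems.append("too many hints: %d > %d" % (len(hints), max_hints))
    # Issue 1: bound the size of each key and value.
    for key, value in hints.items():
        if len(key) > max_len:
            problems.append("hint key longer than %d characters" % max_len)
        # Only string values are measured; other JSON types are left
        # unchecked in this sketch.
        if isinstance(value, str) and len(value) > max_len:
            problems.append("hint value longer than %d characters" % max_len)
    return problems


print(check_scheduler_hints({"group": "abc"}))
print(check_scheduler_hints({"k" * 256: "v" * 256}))
```

An API layer could turn a non-empty result into a 400 response instead of letting the oversized blob reach the database and fail with a 500.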

Tags: api security
Matt Riedemann (mriedem) wrote :

For reference, here is the nova API subteam meeting log where this came up:

http://eavesdrop.openstack.org/meetings/nova_api/2017/nova_api.2017-03-15-13.00.log.html

summary: - scheduler hints are unbounded
+ scheduler hints are unbounded and never deleted
Matt Riedemann (mriedem) wrote :

If I know that scheduler hints are stored in the request_specs.spec column and the size limit on that column is 65,535, then I could reasonably do something like:

# Build 64,000 small hints as a plain dict.
hints = {}
for x in range(64000):  # xrange() on Python 2
    hints[str(x)] = str(x)

And then use those hints for the os:scheduler_hints field in the server create request body.
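To put a number on that, here is a rough sketch that measures the JSON-serialized size of such a dict against the TEXT limit quoted in the bug description. The helper name serialized_size and the constant TEXT_MAX_BYTES are made up for illustration, and the real column holds more than the hints alone, so this is only a lower bound on what would be stored.

```python
import json

TEXT_MAX_BYTES = 65535  # MySQL TEXT limit quoted in the bug description


def serialized_size(hints):
    """Size in bytes of the hints dict once dumped to JSON (UTF-8)."""
    return len(json.dumps(hints).encode("utf-8"))


# Same 64,000 hints as in the snippet above.
hints = {str(x): str(x) for x in range(64000)}
size = serialized_size(hints)

# Report the size and whether it exceeds the column limit.
print(size, size > TEXT_MAX_BYTES)
```

Each entry costs at least ten bytes of JSON, so 64,000 entries land well above the limit; that is the case that produces a 500 rather than a stored row.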

Jeremy Stanley (fungi) wrote :

To start, we can almost certainly switch this report to Public Security and skip the embargo since the potential risk is already mentioned in the meeting log (and even called out as a possible security vulnerability). There doesn't seem to be much novel in this report which an interested attacker couldn't easily work out from the public discussion and a quick skim through the source code.

As for whether we issue an advisory, I think this will mostly boil down to whether there's a safe way to "fix" it in supported stable branches, or if this will need to be paired with some configuration to actually tune and enable the mitigation.

Matt Riedemann (mriedem) wrote :

The fixes would be:

1. Change the scheduler_hints json schema request validation to limit the length on key/value pairs. Changing the API like that is backward incompatible and generally requires a microversion, which we wouldn't backport to stable branches.

2. To limit the number of hints per server create request, we could introduce a new quota limit configuration option. This would implicitly change the behavior of the API, so it might be in the same boat as #1 for backports.
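As a rough illustration of what fix #1 could look like, here is a bounded JSON schema expressed as a Python dict. This is a hypothetical sketch, not the schema nova ships: the limits (255 and 128), the restriction of values to strings or lists of strings, and the use of maxProperties in place of a quota option are all assumptions, and any such change would still need a microversion as noted above.

```python
import re

MAX_LEN = 255    # assumed, mirrors the metadata key/value limit
MAX_HINTS = 128  # assumed, mirrors the metadata quota

# Hypothetical bounded schema for the scheduler hints object.
bounded_hints_schema = {
    "type": "object",
    # Cap the number of hints in a single request.
    "maxProperties": MAX_HINTS,
    # Keys must be 1..MAX_LEN characters long.
    "patternProperties": {
        "^.{1,%d}$" % MAX_LEN: {
            "type": ["string", "array"],
            "maxLength": MAX_LEN,  # applies to string values
            "items": {"type": "string", "maxLength": MAX_LEN},
        },
    },
    # Keys that do not match the pattern above are rejected.
    "additionalProperties": False,
}

# JSON Schema patterns use search semantics, so the anchors matter.
key_pattern = next(iter(bounded_hints_schema["patternProperties"]))
print(bool(re.search(key_pattern, "k" * 255)))
print(bool(re.search(key_pattern, "k" * 256)))
```

Arbitrary hint names are still accepted here; only their length, the size of their values, and their count are bounded.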

--

So given we can't really backport at least a microversion bump, then there is probably no reason to issue an advisory for this.

Jeremy Stanley (fungi) wrote :

Given comment #4 and the public nature of the issue, I'm going to go ahead and triage this report as Class B1 (A vulnerability that can only be fixed in master, security note for stable branches, e.g., default config value is insecure): https://security.openstack.org/vmt-process.html#incident-report-taxonomy

Changed in ossa:
status: New → Won't Fix
information type: Private Security → Public
Jeremy Stanley (fungi) wrote :

I've added an OSSN task so the security note editors can decide if this is one they want to cover with some sort of warning for stable branches.

Luke Hinds (lhinds) wrote :

So if a backport is not possible, what would the full mitigation steps be for ocata and previous?

additionalProperties=False

Matt Riedemann (mriedem) wrote :

@lhinds, no, you can't set additionalProperties=False in the jsonschema because that changes the API in a backward-incompatible way without a microversion, so it could break deployments/users.

Luke Hinds (lhinds) wrote :

Thanks Matt, so would I be right in saying there is no mitigation / recommended actions? For an OSSN we need to outline the problem, and more importantly provide recommended actions to take?

From what I can see, this is an issue that can only be addressed with code changes?

Luke Hinds (lhinds) wrote :

It's looking like this cannot be resolved with an OSSN. I will give this another week and, unless shown otherwise, will mark it as Won't Fix under OpenStack Security Notes only.

Robert Clark (robert-clark) wrote :

Looks to me like B1 + OSSA (public) is the most appropriate path?

Jeremy Stanley (fungi) wrote :

OSSA are specific to issues fixed in supported stable branches. If they can only be fixed in master (and so future major releases), we don't issue advisories because there is no fix for operators to apply.

Luke Hinds (lhinds)
Changed in ossn:
status: New → Won't Fix
Matt Riedemann (mriedem) wrote :
Sean Dague (sdague)
Changed in nova:
status: New → In Progress