Glance doesn't check location's data

Bug #1551498 reported by Feilong Wang
12
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Glance
New
Undecided
Feilong Wang

Bug Description

Currently, when adding new location for an image, Glance doesn't check the checksum, so we can't grantee every location for an image is bit-for-bit identical, which is incompatible with the idea of image immutability.

Feilong Wang (flwang)
Changed in glance:
assignee: nobody → Fei Long Wang (flwang)
Feilong Wang (flwang)
summary: - Glance won't check location's data
+ Glance doesn't check location's data
Revision history for this message
Brian Rosmaita (brian-rosmaita) wrote :

I don't know that Glance needs to check this. The data at the location is out of Glance's control, so even if glance checked at the time the image was added, the bits at the location could be changed. The immutability guarantee is that once an image goes active, the image's UUID is associated with a particular set of bits that can be confirmed by the checksum. But the prudent image consumer will verify the image bits using the checksum, because they may have become corrupted in transit.

Or am I misunderstanding the bug?

Revision history for this message
Flavio Percoco (flaper87) wrote :

I agree with Brian. When setting remote locations, users are on their own. Glance has never had any control over that. We seriously need to clarify the story around image locations as it's not clear to folks when/why they should/would enable/use it.

That being said, I'm with Brian here. Checking the data is not part of Glance's business, or at least it's never been. If that's something folks would like to change, we need a spec for it and I think using the task engine would be the only way to do this properly.

Revision history for this message
Stuart McLaren (stuart-mclaren) wrote :

> But the prudent image consumer will verify the image bits using the checksum

But there is no checksum in this case...

 $ glance --os-image-api-version 1 image-create --disk-format raw --container-format bare --name x1 --location http://www.google.com

 $ glance image-show a431ffa8-22fd-4000-af7c-9884b70ad671
 +------------------+--------------------------------------+
 | Property | Value |
 +------------------+--------------------------------------+
 | checksum | None <<<< no checksum

Revision history for this message
Chris Martin (6-chris-z) wrote :

I would use the ability to set a checksum for images which were uploaded to their backend out-of-band (rather than through the Glance API). I have an automation which conditionally migrates image data between OpenStack deployments based on checksums matching or not. Without the checksum I need to use the size field, which is not a great assurance that the images match.

Suggestion via hemanthm in IRC: when a client adds a location for an image, the API could accept a checksum field. The client would essentially say "For image X, please add location Y, which I attest has checksum Z".

The Glance server could then do one of two things:
1. Trust the client and accept the checksum
2. Download the image data from the backend and compute the checksum. If it matches what the client has specified, then store the checksum for the image, otherwise set the image to an error state (maybe "killed" status)

1 would be easiest to implement, but Glance would be trusting the image location setter to not be evil (probably OK for my organization's use case.)
2 would provide a stronger assurance of image integrity than the current arrangement, but Glance would need to do more work to get there.

Revision history for this message
Steve Lewis (steve-lewis) wrote :

Incidentally, this is somewhat related to https://review.openstack.org/#/c/448680/
(Change-Id: I6d6d6c7e9252d015d778a2836b25deb110d20b5d)

  Added spec for multihash support

  This spec provides a future-proof mechanism for providing a
  self-describing hash for each image.

In that spec, change set 3, line 135 there is mention of lazily updating the ``multihash`` field on image download. I added a comment there that the lazy-update could be a task. Likewise option 2 in Chris Martin's previous comment could be achieved via a task. I don't know if that is the right solution but this issue is certainly a design consideration for that spec.

Revision history for this message
Brian Rosmaita (brian-rosmaita) wrote :

Just a point of information:

The v1 API does allow you to specify a checksum at the time of image creation. So for Stuart's example above, if you do:

$ glance --os-image-api-version 1 image-create --disk-format raw --container-format bare --name x2 --location http://www.google.com --checksum 12345678

An image-show gives you:

glance image-show d68b98ea-d1a3-4971-b38b-3c1c6945177e
+------------------+----------------------------------------------------+
| Property | Value |
+------------------+----------------------------------------------------+
| checksum | 12345678 |
| container_format | bare |
| created_at | 2017-04-06T02:00:13Z |
| disk_format | raw |
| id | d68b98ea-d1a3-4971-b38b-3c1c6945177e |
| locations | [{"url": "http://www.google.com", "metadata": {}}] |
etc.

In v2, the checksum is read-only in the model, so you can't do this.

As far as Glance downloading and verifying the checksum, that makes sense if the location is glance's own backend, but there's no restriction at the moment. There's no point verifying the checksum for arbitrary locations Glance doesn't own, because the user could just modify the data at that location. So the consumer is going to have to verify the checksum anyway.

I'm just curious about Chris's statement that for option 1, "Glance would be trusting the image location setter to not be evil". If the location setter doesn't supply the correct checksum, then consumers of the image won't be able to validate it by the checksum. Or is the worry that since md5 isn't cryptographically strong, an evil location setter could replace an image with a modified image having the same checksum? If that's the worry, we should wait to change this until after the multihash spec is implemented (it's under review now, https://review.openstack.org/448680).

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.