Glance

Add request id to returned objects

Bug #1525259 reported by Abhishek Kekane on 2015-12-11

This bug affects 1 person

Affects		Status	Importance	Assigned to	Milestone
	Glance	Triaged	Wishlist	Abhishek Kekane

Bug Description

We are adding support for returning ‘x-openstack-request-id’ to the caller as per the design proposed in cross-project specs:
http://specs.openstack.org/openstack/openstack-specs/specs/return-request-id.html

Problem Description:
Cannot add a new property of list type to the warlock.model object.

The requirement we are trying to meet is to make the request id available to the user of the client library [3]. The user typically doesn't have access to the headers, so the request id needs to be part of the payload returned from each method. In other clients that work with simple data types, we've subclassed dict, list, etc. to add the extra property. This adds the request id to the return value without making a breaking change to the API of the client library.

How is a model object created:
Let’s take an example of glanceclient.api.v2.images.get() call [1]:

Here after getting the response we call model() method. This model() does the job of creating a warlock.model object(essentially a dict) based on the schema given as argument (image schema retrieved from glance in this case). Inside model() the raw() method simply return the image schema as JSON object. The advantage of this warlock.model object over a simple dict is that it validates any changes to object based on the rules specified in the reference schema. The keys of this model object are available as object properties to the caller.

Underlying reason:
The schema for different sub APIs is returned a bit differently. For images, metadef APIs glance.schema.Schema.raw() is used which returns a schema containing “additionalProperties”: {“type”: “string”}. Whereas for members and tasks APIs glance.schema.Schema.minimal() is used to return schema object which does not contain “additionalProperties”.

So we can add extra properties of any type to the model object returned from members or tasks API but for images and metadef APIs we can only add properties which can be of type string. Also for the latter case we depend on the glance configuration to allow additional properties.

As per our analysis we have come up with two approaches for resolving this issue:

Approach #1: Inject request_ids property in the warlock model object in glance client
Here we do the following:
1. Inject the ‘request_ids’ as additional property into the model object(returned from model())
2. Return the model object which now contains request_ids property

Limitations:
1. Because the glance schemas for images and metadef only allows additional properties of type string, so even though natural type of request_ids should be list we have to make it as a comma separated ‘string’ of request ids as a compromise.
2. Lot of extra code is needed to wrap objects returned from the client API so that the caller can get request ids. For example we need to write wrapper classes for dict, list, str, tuple, generator.
3. Not a good design as we are adding a property which should actually be a base property but added as additional property as a compromise.
4. There is a dependency on glance whether to allow custom/additional properties or not. [2]

Approach #2: Add ‘request_ids’ property to all schema definitions in glance

Here we add ‘request_ids’ property as follows to the various APIs (schema):

“request_ids”: {
  "type": "array",
  "items": {
    "type": "string"
  }
}

Doing this will make changes in glance client very simple as compared to approach#1.
This also looks a better design as it will be consistent.
We simply need to modify the request_ids property in various API calls for example glanceclient.v2.images.get().

[1] https://github.com/openstack/python-glanceclient/blob/master/glanceclient/v2/images.py#L179
[2] https://github.com/openstack/glance/blob/master/glance/api/v2/images.py#L944
[3] http://specs.openstack.org/openstack/openstack-specs/specs/return-request-id.html

Tags:

Revision history for this message

Stuart McLaren (stuart-mclaren) wrote on 2015-12-11:

Couple of comments:

Approach #1: Inject request_ids property in the warlock model object in glance client

We've (well I've!) typically pushed back on this. Up until now we've managed to maintain a one-to-one mapping between the json returned by Glance and the subsequent warlock object. It's one of the few things we've managed to keep as originally intended.

Approach #2: Add ‘request_ids’ property to all schema definitions in glance

Glance would be returning the request_id in the body rather than the header (or in addition to the header). I'm not sure we should modify the server API to fix what is really a client (specifically, python client) issue.

Is the key thing here to give users of the library "something" they can use to read the request_ids generated by the particular method call?

If so, what about something like the following?

requests has a 'hook' capability (http://docs.python-requests.org/en/latest/user/advanced/#event-hooks):

--------------------------

import functools
import requests

def append_content_length(target, r, *args, **kwargs):
target.append(r.headers['content-length'])

content_lengths = []

hook = functools.partial(append_content_length, content_lengths)

requests.get('http://docs.openstack.org/developer/nova', hooks=dict(response=hook))
requests.get('http://docs.openstack.org/developer/glance', hooks=dict(response=hook))

print content_lengths
----------------------------

Running it gives:

$ python contentlength.py
['321', '17671', '323', '12710']

Glance client would do this by accepting an empty list, ie change:

def delete(image):

def delete(image, request_id_list=None):

callers of the method would provide an empty list if they wanted the request ids:

request_ids = []
glanceclient.delete(image, request_ids)
print request_ids

The client code would have to shepherd the hook function down to the calls to requests but wouldn't have to worry about returning anything back.

Revision history for this message

Abhishek Kekane (abhishek-kekane) wrote on 2015-12-14:

Hi Stuart,

Thank you for comments.

I understand approach #1 mapping between glance and glanceclient.

Regarding your alternative solution for approach #2 to use request-id, we have already tried this and discussed this in cross-project sessions and comes to conclusion to add wrappers around response to add request_id.

adding hooks to requests has some drawbacks as follows:

1. The client code need to shepherd the hook function down to the calls to requests.
2. Need to modify each method call to pass request_id_list which will be a big change IMO.
3. Using this way request_id will be only used for logging purpose and not returned to caller and will not be handy to the user for tracing the logs.
4. Need changes in cross-projects as well for example nova making call to glance or cinder to glance.
5. This will not be consistent with other clients.

The requirement we are trying to meet is to make the request id available to the user of the client library. The user typically doesn't have access to the headers, so we are adding wrapper classes around response to add a request id to it which can be returned back to the caller. In case of glanceclient as it is using warlock model we are finding it difficult without making changes in schema, but In other clients it works with simple data types, as we have subclassed dict, list, etc. to add the extra property. This adds the request id to the return value without making a breaking change to the API of the client library.

Revision history for this message

Stuart McLaren (stuart-mclaren) wrote on 2015-12-14:

> 3. Using this way request_id will be only used for logging purpose and not returned to caller and will not be handy to the user for tracing the logs.

Can you add more information here?

I see the request_ids being available after the method call, like so:

# caller provides list for the request_ids to be saved in
request_ids = []
# caller calls the method with the supplied empty list
glanceclient.delete(image, request_ids)
# request_ids have now been populated in the provided list and are available for general use by the caller
print request_ids

Revision history for this message

Abhishek Kekane (abhishek-kekane) wrote on 2015-12-14:

Hi Sturart,

Here my point is, If I am using python-glanceclient in some third party applications and if some api fails due to some unknown reason then there is no way to get X-Openstack-Request-Id from the client.

Revision history for this message

Stuart McLaren (stuart-mclaren) wrote on 2015-12-14:

Hi Abhishek,

If I make this change:

    def delete(self, image_id, request_ids=None):
        """Delete an image."""
        hook = dict(response=functools.partial(request_id_hook, request_ids))
        url = '/v2/images/%s' % image_id
        self.http_client.delete(url, hooks=hook)

And then view the 'cli' (shell) as a third party consumer of the library (ie something that calls the glance client delete method), and modify the do_image_delete function (glanceclient/v2/shell.py) to look as follows:

            request_ids = []
            gc.images.delete(args_id, request_ids=request_ids)
            print request_ids

The third party app (in this case, the cli) has access to the request_ids:

$ glance image-delete 11887d68-7faf-4821-9a42-3ec63451067c
['req-a0892f56-2626-4069-8787-f8bf56e0c1cc']

Revision history for this message

Erno Kuvaja (jokke) wrote on 2015-12-14:

Thanks Abhishek for filing the lite-spec for this.

@Stuart, Thanks for digging into this!

I personally prefer this alternative approach the most at the moment. It looks like the "most right way" to do this in our current model. I'm really not fond of original approach 2 as that would give the impression that request_id is a property in our image (so which request id we refer here, the one that created the image, the last one that modified it or something else). If using the hooks provides us access to the req id's without changing our schema and without wrapping the whole warlock object around, that really sounds like the way to go. and the implementation in the client seems fairly clean and easy to understand as well.

Revision history for this message

Stuart McLaren (stuart-mclaren) wrote on 2015-12-14:

Abhishek,

Do you agree that the alternative suggestion makes the request_ids available to 3rd party apps?

Thanks.

Revision history for this message

Abhishek Kekane (abhishek-kekane) wrote on 2015-12-15:

Hi Stuart, Erno,

Thank you for your suggestions.

We are analyzing this approach to check if there are any corner-cases and will come up with detail information in a day or two.
If we find it is working in all the cases then we will start implementing the same design.

Flavio Percoco (flaper87) on 2015-12-15

Changed in glance:
status:	New → Triaged

Revision history for this message

Brian Rosmaita (brian-rosmaita) wrote on 2015-12-15:

Abhishek, thanks for all the work on this. You've written quite an exhaustive spec-lite!

I agree with Erno and Stuart. I really like the approach Stuart has mapped out. Hopefully you won't find any weird corner cases!

Revision history for this message

Ravi Shekhar Jethani (ravishekar-jethani) wrote on 2015-12-17:

#10

Hi Stuart, Erno,

We have looked at the solution and we understand it like this:

Every API method which is making an http request will be passed a list by the caller.

If the ‘request’ is successful then the request id (extracted from the response object) will be *appended* to this list. Although the request is successful but the corresponding operation maybe a failure. For e.g. calling delete for an invalid image id.

If the ‘request’ fails for some reason(for e.g. toke authentication failure or glance service is down) then there will
be no request id in the response received so list will not be modified.

By analyzing various API methods we can categorize them based on the no. of requests made:

1. Methods making only a single ‘request’:
This is the simplest case. Examples: images.create(), image_members.list() etc. Here after the call request id list will look like: [‘request-id’]

2. Methods making 2 to 3 ‘requests’:
Here after the call request id list will look like: [‘request-id1’, ‘request-id2’, ‘request-id3’]. Example: images.update_locations()
Here first we fetch the image's model via get()[1st req id] then we make a patch request to update the image[2nd req id] at last we fetch the updated image model[3rd req id]

3. Methods making n number of ‘requests’:
Here after the call request id list will look like: [‘request-id1’, ‘request-id2’, ‘request-id3’, …]. For example: images.list()
Here no. of request ids returned depends on page_size and limit factors. If no. of images to list is 5 and page_size=1 then no. of request ids returned is 6. First five for each page request(1 image per page). An extra request id is returned while calling paginate() one last time to raise StopIeration.

Some concerns:

For case #2 many intermediate requests are being made so should the caller get all the request ids or just one?

Revision history for this message

Abhishek Kekane (abhishek-kekane) wrote on 2015-12-18:

#11

Hi Stuart, Erno

We are making changes to v2 api only as v1 will be deprecated.
In addition to above analysis, as of now I have analyzed nova and cinder which is calling glanceclient.

In case of nova, it is using v1 version by default (except image-show can use v2 and v1 both) so even if request_ids list is passed from nova to glanceclient it will not break but it doesn't append request-id to the list as we are making changes to v2 only.

Similarly in case of cinder, service provider can configure (glance_api_version) which api version of glance should be used and in case of v1 it will not append request-id to request-ids list.

So should we make changes to v1 api as well so that it will return request-ids in case v1 api is used?
Please suggest.

Revision history for this message

Abhishek Kekane (abhishek-kekane) wrote on 2015-12-21:

#12

Hi Stuart, Erno,

A potential issue with the current approach:

Let’s take an example here: images.get() API call

Here we are making two 'requests': First the http_client.get() as a part of images.Contoller.get() itself
and second in schemas.Controller.get() via the images.Controller.model() to fetch glance schema for creating warlock model object.

Inserting 'hook' into the first request is straight forward.

To insert hook in second case we need to pass request ids list to schemas.Controller.get() from model()
and model() will inturn get this list from images.Controller.get() API method.

Now here we have a problem, we cannot pass extra arguements to model() like:

self.model(request_ids=request_ids, **body)

Alternative way: If we cannot pass request ids list to model() let’s try returning one:

diff --git a/glanceclient/v2/images.py b/glanceclient/v2/images.py
index c065535..b090c02 100644
--- a/glanceclient/v2/images.py
+++ b/glanceclient/v2/images.py
@@ -38,10 +38,11 @@ class Controller(object):

     @utils.memoized_property
     def model(self):
- schema = self.schema_client.get('image')
+ request_ids = []
+ schema = self.schema_client.get('image', request_ids=request_ids)
         warlock_model = warlock.model_factory(schema.raw(),
                                               schemas.SchemaBasedModel)
- return warlock_model
+ return warlock_model, request_ids

     @utils.memoized_property
     def unvalidated_model(self):
@@ -183,7 +184,9 @@ class Controller(object):
         # NOTE(bcwaldon): remove 'self' for now until we have an elegant
         # way to pass it into the model constructor without conflict
         body.pop('self', None)
- return self.model(**body)
+ model, req_ids = self.model(**body)
+ request_ids.extend(req_ids)
+ return model

But again this approach is also giving problem:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "glanceclient/v2/images.py", line 188, in get
    model, req_ids = self.model(**body)
TypeError: 'tuple' object is not callable
>>>
Also this means we need to make lot of changes to every method that uses model().

This is only required if we want the request id for fetching schema from glance. If we do not want to retrieve this particular request id then there will be no issue so please suggest whether
we should exclude this request id or not.

Hi Stuart, Erno,

A potential issue with the current approach:

Let’s take an example here: images.get() API call

Here we are making two 'requests':  First the http_client.get() as a part of images.Contoller.get() itself
and second in schemas.Controller.get() via the images.Controller.model() to fetch glance schema for creating warlock model object.

Inserting  'hook' into the first request is straight forward.

To insert hook in second case we need to pass request ids list to schemas.Controller.get() from model()
and model() will inturn get this list from images.Controller.get() API method.

Now here we have a problem, we cannot pass extra arguements to model() like:

self.model(request_ids=request_ids, **body)

Alternative way: If we cannot pass request ids list to model() let’s try returning one:

@utils.memoized_property
     def model(self):
-        schema = self.schema_client.get('image')
+        request_ids = []
+        schema = self.schema_client.get('image', request_ids=request_ids)
         warlock_model = warlock.model_factory(schema.raw(),
                                               schemas.SchemaBasedModel)
-        return warlock_model
+        return warlock_model, request_ids

@utils.memoized_property
     def unvalidated_model(self):
@@ -183,7 +184,9 @@ class Controller(object):
         # NOTE(bcwaldon): remove 'self' for now until we have an elegant
         # way to pass it into the model constructor without conflict
         body.pop('self', None)
-        return self.model(**body)
+        model, req_ids = self.model(**body)
+        request_ids.extend(req_ids)
+        return model

Revision history for this message

Abhishek Kekane (abhishek-kekane) wrote on 2016-02-03:

#13

Implementation, https://review.openstack.org/#/c/261288/

Revision history for this message

Cao ShuFeng (caosf-fnst) wrote on 2016-04-21:

#14

Hi, everyone:
Can we use this[1] to solve this problem?
And I write an example[2] about this.

Thanks

[1]: http://wrapt.readthedocs.org/en/latest/wrappers.html#object-proxy
[2]: http://paste.openstack.org/show/494907/

Revision history for this message

Abhishek Kekane (abhishek-kekane) wrote on 2016-07-15:

#15

Hi All,

Summary for current status of this specs-lite:

As described in the description we are not able to follow implementation decided in cross-project specs [1] in glance. To overcome this Stuart has suggested to use hooks in comment #5 for which we have added a patch [2].

As of now Brant has come-up with a solution where we can use object-proxy wrapper from wrapt module to return the request-id to the caller. This solution is kind of similar to what we have proposed in cross-project specs [1]. I have a POC [3] for this patch for which Erno has raised his concern for returning all request-ids in case of pagination. I am working to resolve his concern.

[1] http://specs.openstack.org/openstack/openstack-specs/specs/return-request-id.html
[2] https://review.openstack.org/#/c/261288
[3] https://review.openstack.org/#/c/316052/2

Revision history for this message

Ravi Shekhar Jethani (ravishekar-jethani) wrote on 2016-07-25:

#16

Hi All,

The concern raised by Errno [1] has been resolved. Following is the current status:

Case 1: API calls making only a single request to complete operation
This case has no issues so far. The request id generated is simply given to the caller as an
attribute of the returned object. Examples: get(), create() api calls of image api

Case 2: API calls making an arbitrary no. of requests
This is the case for which Errno had raised concern.
A way in which we can solve this is by using same request id for each request
for fetching next page of images and at last returning only one request id as this is the
only request id used for all requests. For example if we have 1000 images and page_size is 10
then there will be 100 requests and we will see 100 entries in log. But the value of
request id will be same in all cases. This single request id will be returned to caller.

Case 3: API calls making a fixed no. of requests to complete operation
This case has an issue that the caller can only get request id of only the last
request. For example update() call in image api [2]. Here the user will only get
the request id of the request made as a part of 'self.get(image_id)' call.
The request id of the first get() call or the http_client.patch() will not
be available to the caller. IMO this is OK because if the 'patch' request fails
the corresponding request will be available as part of the exception logged.

I can see two alternatives here:

a. Return request id of the intermediate patch() instead of the last get().

b. One other solution possible here is that all the requests use the
same request id as in Case 2 above.

For a. and b. solutions we need to add '**kwargs' to get() so that we can pass request id
from a previous request to the current get() call.

[1] https://review.openstack.org/#/c/316052/2

[2] https://github.com/openstack/python-glanceclient/blob/master/glanceclient/v2/images.py#L251

Kindly make suggestions.

Revision history for this message

Ravi Shekhar Jethani (ravishekar-jethani) wrote on 2016-08-09:

#17

Added new patch [1] addressing issues explained in the above note.

Revisiting note#16 we can see three cases:

Case 1: Already fixed. No issues so far.

Case 2: The new patch only returns a single request id to the caller. This request id is the first intermediate request id generated as a part of the api call. All subsequent intermediate requests use this same request id and at the end this request id is returned to the caller.

Case 3: the new patch implements solution `a` from the above note.

[1] https://review.openstack.org/#/c/352892

Abhishek Kekane (abhishek-kekane) on 2016-11-03

summary:

- Add request_ids attribute to v2 schemas
+ Add request id to returned objects

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2017-01-24: Fix included in openstack/python-glanceclient 2.6.0

#18

This issue was fixed in the openstack/python-glanceclient 2.6.0 release.