Launchpad's WADL collections don't specify the entry type of which it is a collection

Bug #650967 reported by Manish Sinha (मनीष सिन्हा)
This bug affects 1 person
Affects           Status     Importance  Assigned to  Milestone
Launchpad itself  Won't Fix  High        Unassigned

Bug Description

I am working on a WADL to .NET assembly generator. Development is pretty slow because I keep hitting shortcomings in Launchpad's WADL.

I am trying to walk the links and references in the WADL

So the top list of resources is defined by
<wadl:resources base="https://api.edge.launchpad.net/1.0/">
    <wadl:resource path="" type="#service-root"/>
  </wadl:resources>

so we have the resource "service-root"

for which we have

<wadl:resource_type id="service-root">
    <wadl:doc>The root of the web service.</wadl:doc>
    <wadl:method name="GET" id="service-root-get">
      <wadl:response>
        <wadl:representation href="#service-root-json"/>
        <wadl:representation mediaType="application/vnd.sun.wadl+xml" id="service-root-wadl"/>
      </wadl:response>
    </wadl:method>
  </wadl:resource_type>

From this I get "service-root-json"

for which I have

<wadl:representation mediaType="application/json" id="service-root-json">

    <wadl:param style="plain"
                path="$['languages_collection_link']"
                name="languages_collection_link">
      <wadl:link resource_type="https://api.edge.launchpad.net/1.0/#languages"/>
    </wadl:param>
.....
</wadl:representation>

so from here we get "languages"

for which we move next to

<wadl:resource_type id="languages">
    <wadl:doc xmlns="http://www.w3.org/1999/xhtml">
      The collection of languages.
    </wadl:doc>
    <wadl:method name="GET" id="languages-get">
      <wadl:response>
        <wadl:representation
  href="https://api.edge.launchpad.net/1.0/#language-page"/>
        <wadl:representation
  mediaType="application/vnd.sun.wadl+xml"
  id="languages-wadl"/>
      </wadl:response>
    </wadl:method>
.....
</wadl:resource_type>

from here we get "language-page"

Next we go to language-page:

<wadl:representation mediaType="application/json"
                       id="language-page">

    <wadl:param style="plain" name="resource_type_link" path="$['resource_type_link']">
      <wadl:link/>
    </wadl:param>

    <wadl:param style="plain" name="total_size" path="$['total_size']" required="true"/>

    <wadl:param style="plain" name="start" path="$['start']" required="true"/>

    <wadl:param style="plain" name="next_collection_link" path="$['next_collection_link']">
      <wadl:link resource_type="#language-page-resource"/>
    </wadl:param>

    <wadl:param style="plain" name="prev_collection_link" path="$['prev_collection_link']">
      <wadl:link resource_type="#language-page-resource"/>
    </wadl:param>

    <wadl:param style="plain" name="entries" path="$['entries']" required="true"/>

    <wadl:param style="plain" name="entry_links" path="$['entries'][*]['self_link']">
      <wadl:link resource_type="https://api.edge.launchpad.net/1.0/#language"/>
    </wadl:param>
  </wadl:representation>

So here, the language-page does not specify that it is a collection/list over "language".
I know I have
<wadl:link resource_type="https://api.edge.launchpad.net/1.0/#language"/>
but in this we also have
<wadl:link resource_type="#language-page-resource"/>
AND
<wadl:link resource_type="#language-page-resource"/>

This is ambiguous. How will the tool know which link to choose?
How will the tool I am working on traverse further?
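
The ambiguity can be reproduced with a few lines of Python. This is only a sketch: it parses a trimmed copy of the language-page representation above and lists every typed link it contains. The namespace URI and the helper name `typed_links` are assumptions for illustration; the snippets in this report do not show the namespace declaration.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI; it is not shown in the snippets quoted in this report.
WADL_NS = "http://research.sun.com/wadl/2006/10"
NS = {"wadl": WADL_NS}

# Trimmed copy of the language-page representation quoted above.
SAMPLE_WADL = f"""
<wadl:application xmlns:wadl="{WADL_NS}">
  <wadl:representation mediaType="application/json" id="language-page">
    <wadl:param style="plain" name="next_collection_link" path="$['next_collection_link']">
      <wadl:link resource_type="#language-page-resource"/>
    </wadl:param>
    <wadl:param style="plain" name="prev_collection_link" path="$['prev_collection_link']">
      <wadl:link resource_type="#language-page-resource"/>
    </wadl:param>
    <wadl:param style="plain" name="entries" path="$['entries']" required="true"/>
    <wadl:param style="plain" name="entry_links" path="$['entries'][*]['self_link']">
      <wadl:link resource_type="https://api.edge.launchpad.net/1.0/#language"/>
    </wadl:param>
  </wadl:representation>
</wadl:application>
""".strip()


def typed_links(wadl_text, representation_id):
    """Return (param name, resource_type) for each typed link in one representation."""
    root = ET.fromstring(wadl_text)
    found = []
    for rep in root.iterfind(".//wadl:representation", NS):
        if rep.get("id") != representation_id:
            continue
        for param in rep.iterfind("wadl:param", NS):
            link = param.find("wadl:link", NS)
            # Skip params without a link, and links without a resource_type.
            if link is not None and link.get("resource_type"):
                found.append((param.get("name"), link.get("resource_type")))
    return found


links = typed_links(SAMPLE_WADL, "language-page")
for name, resource_type in links:
    print(name, "->", resource_type)
```

Three typed links come back, and nothing in the markup itself marks the third one as the entry type.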

Revision history for this message
Manish Sinha (मनीष सिन्हा) (manishsinha) wrote :

The only way I can think of is to create a customized tool specifically for Launchpad's WADL.

tags: added: traversal wadl
Revision history for this message
Leonard Richardson (leonardr) wrote :

This might be easy to fix. We can add "rel" attributes "next" and "prev" to the two links to language-page-resource. Those are well-known relationship names defined in the HTML standard, and your tool can eliminate them as unsuitable for this purpose (by understanding what they're for and using them for navigation).

The 'language' link will be the only remaining link. And I think we could use the well-known "contents" relationship to express the relationship between an entry and the collection containing it.

   <wadl:param style="plain" name="entry_links" path="$['entries'][*]['self_link']">
      <wadl:link resource_type="https://api.edge.launchpad.net/1.0/#language" rev="contents"/>
    </wadl:param>

(Note that I'm using rev instead of rel. The "contents" relationship means "table of contents", not "contents of a list". Using rel="contents" would imply that a 'language' was a table of contents for a 'language-page'; we want the opposite.)

The only thing Launchpad-specific here is the idea that for a JSON representation, the WADL "path" attribute should be interpreted as JSONPath rather than in some other way. "next", "prev", and "contents" all mean the same thing as in HTML and standards like AtomPub that use the well-known HTML relationships. (However, I've never seen anyone use "contents" in this way.)

If the JSONPath thing is too much, we could hard-code the link in the <representation> tag:
<wadl:param name="contents_type" value="https://api.edge.launchpad.net/1.0/#language">
 <wadl:link rev="contents"/>
</wadl:param>

I'm pretty sure that's legal. Which of these would work for you?

Revision history for this message
Leonard Richardson (leonardr) wrote :

On second thought, I don't think "contents" is the right relationship for that last example. It implies that language-page is the "table of contents" for the language resource type itself, not for specific language entries. I think we'd have to come up with something new, define it in a metadata file, and link to that metadata file in the 'profile' attribute.

So you'd still have something Launchpad-specific (the semantics of this new relationship), but it would be somewhat standardized. Or we could go with the JSONPath thing we have now.

Changed in launchpad-foundations:
status: New → Confirmed
importance: Undecided → High
Revision history for this message
Manish Sinha (मनीष सिन्हा) (manishsinha) wrote :

Well, I thought about it again, but choosing the right <param> out of so many would be tough work. I would have preferred it if the representation itself stated the type it contains.

I first looked at href, but I don't think href is meant for that. Is there any way in which I can un-ambiguously choose the <param> tag? Or an attribute?

In WADL, how are containers defined?

Revision history for this message
Manish Sinha (मनीष सिन्हा) (manishsinha) wrote :

I had a look at the language-page representation again.

That collection makes perfect sense to any developer who knows what it is. Sadly, tools are not so intelligent. The only way I can find out is by looking at the mediaType attribute. If it is not present, I will treat the representation as the final container. If it is set to JSON or any other type, I will look for either:
1) the first link, or
2) the link inside a parameter whose name foo is provided by the user, such that <param name="foo"> <link/></param>

This will find out the link's resource_type and return an array of that entry. The conversion from json/xml/s-expressions/your_fav_type_here can be done via an interface. The user has to provide the implementation.
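
One possible reading of this heuristic, sketched in Python rather than C#. The function name `guess_entry_type`, the namespace URI, and the choice to skip links that carry no resource_type are all assumptions made for illustration, not part of the proposal itself.

```python
import xml.etree.ElementTree as ET

WADL_NS = "http://research.sun.com/wadl/2006/10"  # assumed namespace URI
NS = {"wadl": WADL_NS}

# Trimmed copy of the language-page representation from the bug description.
REPRESENTATION = ET.fromstring(f"""
<wadl:representation xmlns:wadl="{WADL_NS}"
                     mediaType="application/json" id="language-page">
  <wadl:param style="plain" name="resource_type_link" path="$['resource_type_link']">
    <wadl:link/>
  </wadl:param>
  <wadl:param style="plain" name="next_collection_link" path="$['next_collection_link']">
    <wadl:link resource_type="#language-page-resource"/>
  </wadl:param>
  <wadl:param style="plain" name="entry_links" path="$['entries'][*]['self_link']">
    <wadl:link resource_type="https://api.edge.launchpad.net/1.0/#language"/>
  </wadl:param>
</wadl:representation>
""".strip())


def guess_entry_type(representation, param_name=None):
    """Guess the entry type: the first typed link, or the link in a named param."""
    if representation.get("mediaType") is None:
        # No mediaType: treated as the final container, nothing to look up.
        return None
    for param in representation.iterfind("wadl:param", NS):
        link = param.find("wadl:link", NS)
        if link is None or link.get("resource_type") is None:
            continue  # untyped links cannot name an entry type
        if param_name is None or param.get("name") == param_name:
            return link.get("resource_type")
    return None


print(guess_entry_type(REPRESENTATION))                 # rule 1: first typed link
print(guess_entry_type(REPRESENTATION, "entry_links"))  # rule 2: user-named param
```

On this representation, rule 1 returns the pagination link's type instead of the entry type, so only rule 2, with the parameter name supplied by the user, finds #language.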

Revision history for this message
Leonard Richardson (leonardr) wrote :

WADL does not define any notion of "container" at all. That's why I came up with the odd-looking <param> tag.

If you look at the WADL description of AtomPub in the WADL standard, you'll see that it never says that feeds contain entries, even though that's the whole point of a feed. If you get a feed *document*, using a client that understands the 'element' attribute, you'll see pretty quickly that all the <entry> tags are inside a <feed> tag. But you can't make any general assumptions about what contains what, without knowledge of the Atom media type.

HTML *does* define a notion of "container", or at least "contained". That's why I suggested using "contents" as the rev attribute of the existing, strange <param> tag. You'll usually be okay using the standard resource relationships defined in HTML.

If I add this rev attribute, you can look for the <param> to which the collection is a table of contents, and the type of that parameter will be the type of object in the collection. Just as if I add the "next" attribute to one of the other <param> tags, you'll know which link to follow to get the next page of the collection. You will be giving your client special knowledge, but it will be knowledge based on standard resource relationships, and it should be at least somewhat reusable.

Again, that would look like this:

   <wadl:param style="plain" name="entry_links" path="$['entries'][*]['self_link']">
      <wadl:link resource_type="https://api.edge.launchpad.net/1.0/#language" rev="contents"/>
    </wadl:param>

You'll know that whichever link has rev="contents" is a link to the contents of the collection, and its type is the type of those contents. You don't even have to decode the JSONPath or follow the links, actually.
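
A minimal sketch of how a client might consume these hints, in Python. The rel="next", rel="prev", and rev="contents" attributes in the sample are the proposed additions and are not present in the WADL quoted in the bug description; the namespace URI and function names are likewise assumptions.

```python
import xml.etree.ElementTree as ET

WADL_NS = "http://research.sun.com/wadl/2006/10"  # assumed namespace URI
NS = {"wadl": WADL_NS}

# language-page with the PROPOSED rel/rev hints added (hypothetical markup).
HINTED = ET.fromstring(f"""
<wadl:representation xmlns:wadl="{WADL_NS}"
                     mediaType="application/json" id="language-page">
  <wadl:param style="plain" name="next_collection_link" path="$['next_collection_link']">
    <wadl:link resource_type="#language-page-resource" rel="next"/>
  </wadl:param>
  <wadl:param style="plain" name="prev_collection_link" path="$['prev_collection_link']">
    <wadl:link resource_type="#language-page-resource" rel="prev"/>
  </wadl:param>
  <wadl:param style="plain" name="entry_links" path="$['entries'][*]['self_link']">
    <wadl:link resource_type="https://api.edge.launchpad.net/1.0/#language" rev="contents"/>
  </wadl:param>
</wadl:representation>
""".strip())


def navigation_params(representation):
    """Map 'next'/'prev' to the name of the param that carries that link."""
    nav = {}
    for param in representation.iterfind("wadl:param", NS):
        link = param.find("wadl:link", NS)
        if link is None:
            continue
        # rel may hold several space-separated relationship names.
        for rel in (link.get("rel") or "").split():
            if rel in ("next", "prev"):
                nav[rel] = param.get("name")
    return nav


def collection_entry_type(representation):
    """Return the resource_type of the link marked rev="contents", if any."""
    for link in representation.iterfind("wadl:param/wadl:link", NS):
        if "contents" in (link.get("rev") or "").split():
            return link.get("resource_type")
    return None


print(navigation_params(HINTED))
print(collection_entry_type(HINTED))
```

Neither function evaluates the JSONPath in the path attribute or follows any link; the relationship names alone separate navigation from contents.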

I'm afraid I can't do much more for you besides adding these semantic hints to the generated WADL--I'm busy with other projects.

Revision history for this message
Manish Sinha (मनीष सिन्हा) (manishsinha) wrote :

Thanks Leonard, I think I will get it done this way.

It looks like the WADL schema is just too flexible and ambiguous for any tool to traverse and generate the code/assembly.

Revision history for this message
Robert Collins (lifeless) wrote : Re: [Bug 650967] Re: Launchpad's WADL collections don't specify the entry type of which it is a collection

What would work best for a .net code generator? Are there alternative
standards for declaring REST APIs with existing toolchains?

Revision history for this message
Gary Poster (gary) wrote :

Alternative standards: AIUI, no. Leonard and I have discussed this, driven in part by the fact that the REST community largely seems to have rejected WADL.

A nice example is http://bitworking.org/news/193/Do-we-need-WADL . I was unconvinced by some of his arguments, perhaps in part because we are using WADL, with some success AFAICT, to do what he says is unnecessary or a bad idea: generate a programmatic API on the fly. His alternative suggestions are to reuse an existing spec (ATOM or HTML, for instance), or write an RFC.

Writing an RFC is not the right solution for our use case.

Leonard and I have talked about trying to reuse an existing spec. My understanding is that this would be expensive to do, and result in something that would again need our own extensions, or at least semantics, to fully define how to construct a programmatic API from the result. IOW, it would be an expensive change for little to no benefit.

We could add the rev attributes that Leonard describes, and Manish could implement them, and there is some precedent. However, the value is questionable, in my estimation--we would be *setting* precedent by using the rev attributes in WADL this way.

From a perhaps-deceptive 10,000-foot view, I personally think Manish should keep these extended semantics out of his wadl processor, and make his C# equivalents to launchpadlib and lazr.restfulclient be aware of the containment and any other additional extra-WADL semantics we have. That's what we do in our Python code.

If adding the rev attributes makes this more palatable or easier for Manish, I'm happy to see that happen. Meanwhile, AFAIK we're already on the cutting edge, with no other way forward except those of our own devising.

Revision history for this message
Robert Collins (lifeless) wrote :

Thanks Gary, appreciate that summary.

Revision history for this message
Leonard Richardson (leonardr) wrote :

"WADL schema is just too flexible and ambiguous for any tool to traverse and generate the code/assembly."

Exactly. WADL is explicitly not designed as a substrate for generated code. It's a guide to the documents served by the web service, describing how they link to each other and how you can use them to formulate your HTTP requests. I think we've talked about this before, but it bears repeating.

I suspect you want something like WSDL: a document that uses a standardized type system to describe a distributed object model which you can map onto client-side classes in a statically compiled language.

The main anti-WADL argument is that WADL is just a friendlier version of WSDL, a way of writing programmatic "API" clients instead of web-browser-like clients that just process documents. The trend, insofar as there is a trend, is away from documents that describe anything resembling client-side data structures, and towards documents that describe *themselves*, the way HTML documents do. (At this point, the main alternative to WADL is HTML5.)

You're understandably concerned about putting Launchpad-specific code in your WADL tool. I would say that by introducing any notion of "collection" into your tool, you've already introduced Launchpad-specific code. WADL doesn't define "collection", so of course it doesn't define "type of the collection". We respect this distinction by keeping collection-aware code in lazr.restfulclient, not in wadllib.

Now, using our WADL as though it were WSDL should be possible, because the documents we send _do_ correspond pretty closely to classical data structures. We *do* serve collections, and each collection *does* have a homogenous "type". And I've done what I can to express these facts using WADL's vocabulary.

WADL makes it easy to point out the interesting parts of a document, and so I'm able to say: "if you look *here* in the bug-page document, you'll find a bunch of links to bug documents". That's my way of saying "a bug-page is a collection of bugs". I can't say that that's what bug-page "is", because as far as WADL is concerned, a bug-page is just a JSON document that contains certain links.
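
To make "look *here*" concrete, here is a rough Python sketch of what the path `$['entries'][*]['self_link']` selects from a page document. The JSON below is invented sample data (the URLs and field values are hypothetical), and the path is evaluated by hand rather than with a JSONPath library.

```python
import json

# Invented sample page; real documents carry more fields per entry.
SAMPLE_PAGE = json.loads("""
{
  "total_size": 2,
  "start": 0,
  "entries": [
    {"self_link": "https://api.example.invalid/1.0/+languages/aa", "code": "aa"},
    {"self_link": "https://api.example.invalid/1.0/+languages/ab", "code": "ab"}
  ]
}
""")


def entry_links(page):
    """Hand-written equivalent of the path $['entries'][*]['self_link']."""
    return [entry["self_link"] for entry in page["entries"] if "self_link" in entry]


links = entry_links(SAMPLE_PAGE)
for link in links:
    print(link)
```

The WADL says only that the values found at this path are links to documents of a given resource type; the idea that the page "is a collection" is something the client adds.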

There's some more I could do to distinguish the various links from each other--that's the rel/rev stuff I've talked about earlier in this bug. But if you want a "collection" that has a homogenous "type", you'll either need to hard-code those ideas into your code, or become comfortable with relating documents to each other through their rel and rev attributes.

Curtis Hovey (sinzui)
Changed in launchpad:
status: Confirmed → Triaged
Revision history for this message
Robert Collins (lifeless) wrote :

Looking back on this it seems to me that there isn't even a guarantee that our collections will, or should, be homogeneous. Certainly some of them already are not homogeneous (the things a builder built are of different types). I'm going to mark this as won't fix because of this: The type of a collection is 'collection of links', and the type of a dereferenced thing is self describing when returned. Clients that need type data should build that up on top of the wadl, and put safeguards to handle unexpected heterogeneity.
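
A client-side safeguard along these lines might look like the following Python sketch. It assumes each returned entry carries a resource_type_link field describing its own type; the sample entries, URLs, and function name are invented for illustration.

```python
from collections import defaultdict


def group_by_type(entries):
    """Group entries by their self-described type instead of assuming one type."""
    groups = defaultdict(list)
    for entry in entries:
        # Entries that do not describe themselves are kept under None.
        groups[entry.get("resource_type_link")].append(entry)
    return dict(groups)


# Invented, deliberately heterogeneous collection.
ENTRIES = [
    {"self_link": "https://api.example.invalid/1.0/a/1",
     "resource_type_link": "https://api.example.invalid/1.0/#type-a"},
    {"self_link": "https://api.example.invalid/1.0/b/7",
     "resource_type_link": "https://api.example.invalid/1.0/#type-b"},
    {"self_link": "https://api.example.invalid/1.0/a/2",
     "resource_type_link": "https://api.example.invalid/1.0/#type-a"},
]

groups = group_by_type(ENTRIES)
expected = "https://api.example.invalid/1.0/#type-a"
unexpected = {t: len(items) for t, items in groups.items() if t != expected}
print("expected entries:", len(groups.get(expected, [])))
print("unexpected types:", unexpected)
```

A statically typed client could map the expected group onto its generated class and route everything else to a generic fallback rather than failing.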

Changed in launchpad:
status: Triaged → Won't Fix