HTTP2 Expression of Interest

Here’s my (rather rushed) personal submission to the Internet Engineering Task Force (IETF) in response to their Call for Expressions of Interest in new work around HTTP; specifically, a new wire-level protocol for the semantics of HTTP (i.e., what will become HTTP/2.0), and new HTTP authentication schemes. You can also review the submissions of Facebook, Firefox, Google, Microsoft, Twitter and others.

[The views expressed in this submission are mine alone and not (necessarily) those of Citrix, Google, Equinix or any other current, former or future client or employer.]

My primary interest is in the consistent application of HTTP to (“cloud”) service interfaces, with a view to furthering the goals of the Open Cloud Initiative (OCI); namely widespread and ideally consistent interoperability through the use of open standard formats and interfaces.

In particular, I strongly support the use of the existing metadata channel (headers) over envelope overlays (SOAP) and alternative/ancillary representations (typically in JSON/XML) as this should greatly simplify interfaces while ensuring consistency between services. The current approach to cloud “standards” calls on vendors to define their own formats and interfaces and to maintain client libraries for the myriad languages du jour. In an application calling on multiple services this can result in a small amount of business logic calling on various bulky, often poorly written and/or unmaintained libraries. The usual counter to the interoperability problems this creates is to write “adapters” (à la ODBC) which expose a lowest-common-denominator interface, thus hiding functionality and creating an “impedance mismatch”. Ultimately this gives rise to performance, security, cost and other issues.

By using HTTP as intended it is possible to construct (cloud) services that can be consumed using nothing more than the built-in, standards-compliant HTTP client. I’m not writing to discuss whether this is a good idea, but to expose a use case that I would like to see considered, and one that we have already applied with some success in the Open Cloud Computing Interface (OCCI).

To illustrate the original scope, early versions of HTTP (RFC 2068) included not only the Link header (recently revived by Mark Nottingham in RFC 5988) but also LINK and UNLINK verbs to manipulate it (recently proposed for revival by James Snell). Unfortunately hypertext, and in particular HTML (which includes linking in-band rather than out-of-band), arguably stole HTTP’s thunder, leaving the overwhelming majority of formats that lack in-band linking (images, documents, virtual machines, etc.) high and dry and resulting in inconsistent linking styles (HTML vs XML vs PDF vs DOC etc.). This limited the extent of web linking as well as the utility of HTTP for innovative applications including APIs. Indeed HTTP could easily and simply meet the needs of many “Semantic Web” applications, but that is beyond the scope of this particular discussion.

To illustrate by way of example, consider the following synthetic request/response for an image hosting site which incorporates Web Linking (RFC 5988), Web Categories (draft-johnston-http-category-header) and Web Attributes (yet to be published):

GET /1.jpg HTTP/1.0

HTTP/1.0 200 OK
Content-Length: 69730
Content-Type: image/jpeg
Link: <http://creativecommons.org/licenses/by-sa/3.0/>; rel="license"
Link: </2.jpg>; rel="next"
Category: dog; label="Dog"; scheme="http://example.org/animals"
Attribute: name="Spot"

In order to “animate” resources, consider the use of the Link header to start a virtual machine in the Open Cloud Computing Interface (OCCI):

Link: </compute/123;action=start>; rel="http://schemas.ogf.org/occi/infrastructure/compute/action#start"
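
A client could then trigger the transition by following that link – a hypothetical exchange, assuming the action is invoked by POSTing to the link target (the exact rendering was still under discussion in the OCCI working group):

POST /compute/123;action=start HTTP/1.1
Host: example.com

HTTP/1.1 202 Accepted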

The main objection to the use of the metadata channel in this fashion (beyond the application of common sense in determining what constitutes data vs metadata) relates to implementation issues (e.g. arbitrary limitations, i18n, handling of multiple headers, etc.), which could be largely resolved through specification. For example, the (necessary) use of e.g. RFC 2231 encoding for header values (but not keys) in e.g. RFC 5988 Web Linking gives rise to unnecessary complexity that may lead to interoperability, security and other issues, all of which could be resolved through the specification of Unicode for keys and/or values. Another concern is the absence of features such as a standardised ability to return a collection (e.g. multiple responses). I originally suggested that HTTP 2.0 incorporate such ideas in 2009.

I’ll leave the determination of what would ultimately be required for such applications to the working group (should this use case be considered interesting by others); while better support for performance, scalability and mobility is obviously required, this has already been discussed at length. I strongly support Poul-Henning Kamp’s statement that “I think it would be far better to start from scratch, look at what HTTP/2.0 should actually do, and then design a simple, efficient and future proof protocol to do just that, and leave behind all the aggregations of badly thought out hacks of HTTP/1.1” (and agree that we should incorporate the concept of an “HTTP Router”). I also support Tim Bray’s statement that “I’m inclined to err on the side of protecting user privacy at the expense of almost all else”, and believe that we should prevent eavesdroppers from learning anything about an encrypted transaction; something we failed to do with DNSSEC even given alternatives like dnscurve that ensure confidentiality as well as integrity.

Is HTTP the HTTP of cloud computing?

Ok so after asking Is OCCI the HTTP of cloud computing? I realised that the position may have already been filled and that the question was more Is AtomPub already the HTTP of cloud computing?

After all, my strategy for OCCI was to follow Google’s example with GData by adding some necessary functionality (a search interface, caching directives, resource-specific attributes, etc.). Most of the heavy lifting was actually being done by AtomPub, thus avoiding a huge amount of tedious and error-prone protocol writing (around 20,000 words of it) – something which OGF and the OCCI working group isn’t really geared up for anyway. This is clearly a workable and well-proven approach as it has been adopted strategically by both Microsoft and Google and also tactically by Salesforce and IBM, among others. Best of all, adding things like queries and versioning is a manageable workload while starting from scratch is most certainly not.

But what if there were an easier way? Recall that the problem we are trying to solve is exposing a flexible interface to an arbitrarily large collection of interconnected compute, storage and network resources. We need to be able to describe and manipulate the resources (CRUD), associate them with each other via rich links (e.g. links with attributes like local identifiers – eth0, sda, etc.) and change their state (start, stop, restart, etc.), among other things.

Representational State Transfer (REST)

Actually we’re not talking about exposing the resources themselves (that would be impossible) but various representations of those resources – like Plato’s shadows on the cave walls – this is the “REpresentational” in “REpresentational State Transfer (REST)”. There’s an infinite number of possible representations so it’s impossible to try to capture them all now, but here are some examples:

  • An Open Virtualisation Format (OVF) serialisation of a compute resource
  • A platform-specific descriptor file (e.g. VMX)
  • A complete archive of the virtual machine with its dependencies (OVA)
  • A graphical image of the console at a given point in time (‘snapshot’)
  • A video stream of the console for archiving/audit purposes (ala Citrix’s Project Iris)
  • The console itself (e.g. SSH, ICA, RDP, VNC)
  • Build documentation (e.g. PDF, ODF)
  • Esoteric enterprise requirements (e.g. NMS configuration)

It doesn’t take a rocket scientist to spot the correlation between this and HTTP’s existing content negotiation functionality (whereby a client can ask for a specific representation of a given resource – e.g. HTML vs PDF) so this is already pretty much solved for us (see HTTP’s Accept: header for the details). For bonus points this information should also be exposed in the URI, as it’s not always possible or convenient to set headers (for example, via a format suffix like /123.ovf).
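
To make the negotiation concrete, here’s a hypothetical exchange for the OVF case (the application/ovf media type is illustrative only):

GET /compute/123 HTTP/1.1
Host: example.com
Accept: application/ovf

HTTP/1.1 200 OK
Content-Type: application/ovf
Content-Length: 4130

[OVF descriptor follows]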

Web Linking

But what about the links? As I explained yesterday the web is built on links embedded in HTML documents using the A tag. Atom also provides enhanced linking functionality via the LINK element, where it is also possible to specify content types, languages, etc. In this case however we want to allow resources to be arbitrary types and more often than not we won’t have the ability to link within the payload itself. This leaves us with two options: put the links in the payload anyway by relying on a meta-model like Atom (or one we roll ourselves) or find some way to represent them within HTTP itself.

Enter HTTP headers, which are also extensible and, as it turns out, in the process of being extended (or at least refined) to handle this very requirement by a fellow down under, Mark Nottingham. See the “Web Linking” IETF Internet-Draft (draft-nottingham-http-link-header, at the time of writing version 05) for the nitty-gritty details and the ietf-http-wg list for some current discussions. Basically it clarifies the existing Link: headers and the result looks something like this:

Link: <http://example.com/TheBook/chapter2>; rel="previous"; title="previous chapter"

The Link: header itself is also extensible so we can faithfully represent our model by adding e.g. the local device name when linking storage and network resources to compute resources and other requisite attributes. It would be helpful if the content-type were also specified (Atom allows for multiple links of the same relation provided the content-type differs for example) but language is already covered by HTTP (it doesn’t seem useful to advertise French links to someone who already asked to speak English).
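
For example, a compute resource might advertise its attached storage with the local device name carried as an extension attribute (a hypothetical rendering; the relation URI and attribute name are illustrative):

Link: </storage/456>; rel="http://schemas.ogf.org/occi/infrastructure#storage"; device="sda"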

It’s also interesting to note that earlier versions of the HTTP RFCs actually [poorly] specified both the Link: headers as well as LINK and UNLINK methods for maintaining links between web resources. John Pritchard had a crack at clarification in the Efficient HyperLink Maintenance for HTTP I-D but like most I-Ds this one seems to have died after 6 months, and with it the methods themselves. It seems to me that adding HTTP methods at this time is a drastic (and almost certainly infeasible) action, especially for something that could just as easily be accomplished via headers à la Set-Cookie: (too bad the I-D doesn’t specify how to add/delete/modify links!). In the simplest sense a Link: header appearing in a PUT or POST could replace the existing one(s), but something more elegant for acting on individual links would be nice – probably a discussion worth having on the ietf-http-wg list.
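
In that simplest rendering, updating a resource’s links would just mean sending them along with the representation (a sketch of the idea, not a specified mechanism):

PUT /compute/123 HTTP/1.1
Host: example.com
Content-Type: application/ovf
Link: </storage/456>; rel="http://schemas.ogf.org/occi/infrastructure#storage"; device="sda"
Link: </network/789>; rel="http://schemas.ogf.org/occi/infrastructure#network"; device="eth0"

[updated representation follows]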

Organisation of Information

Looking back to Atom for a second, let’s see how its key functionality maps onto HTTP:

  • Atom id -> HTTP URL
  • Atom updated -> HTTP Last-Modified: Header
  • Atom title and summary -> Atom/HTTP Slug: Header or equivalent
  • Atom link -> HTTP Link: Header
  • Atom category -> ???

Houston, we have a problem. OCCI use cases range from embedded hypervisors exposing a single resource to a single entry-point for an entire enterprise or the “Great Global Grid” – we need a way to organise, categorise and search for the information, likely including:

  • Free text search via a Google-style “?q=firewall” syntax
  • Taxonomy via categories (already done for Atom) for things like “Operating System” and “Data Center”
  • Folksonomy via [user] tags (already done for Atom and bearing in mind that tag spaces are cool) for things like “testlab”

Fortunately the good work already done in this area for Atom would be relatively easy to port to a Category: HTTP header, following the Link: header example above. In the meantime a standard search interface (including category support) is trivial and, thanks to Google, already done.
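
A hypothetical query against an entry point, combining free-text search with a category filter (the syntax is illustrative only):

GET /compute?q=firewall&category=testlab HTTP/1.1
Host: example.com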

Structured Data Formats

HTML also resolves another pressing issue – what format to use for submitting key-value pairs (which constitutes a large part of what we need to do with OCCI). It gives us two options: the form encodings every browser already supports, namely application/x-www-form-urlencoded for simple key-value pairs and multipart/form-data where file uploads are involved.

The advantages of being able to create a resource from a web form simply by POSTing to the collection of resources (e.g. http://example.com/compute), and with HTML 5 by PUTting the resource in place directly (e.g. http://example.com/compute/<uuid>) are immediately obvious. Not only does this help make the human and programmable web one and the same (which in turn makes it much easier for developers/users to kick the tyres and understand the API) but it means that scripting even advanced tasks with curl/wget would be trivial. Plus there’s no place for time-wasting religious arguments about angle brackets (XML) over curly braces (JSON).
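
A minimal sketch of creating a compute resource this way (the attribute names are illustrative, not drawn from any final OCCI rendering):

POST /compute HTTP/1.1
Host: example.com
Content-Type: application/x-www-form-urlencoded

name=webserver&cores=2&memory=4096

HTTP/1.1 201 Created
Location: http://example.com/compute/123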

RESTful State Machines

Something else which had not sat well with me until I spent the weekend ingesting the RESTful Web Services book (by Leonard Richardson and Sam Ruby) was the “actuator” concept we picked up from the Sun Cloud APIs. This breaks away from RESTful principles by exposing an RPC-style API for triggering state changes (e.g. start, stop, restart). Granted it’s an improvement on the alternative (GETting a resource and PUTting it back with an updated state), as Tim Bray explains in RESTful Casuistry (to which Roy Fielding and Bill de hÓra also responded), but it still “feels funky”. Sure, it doesn’t make any sense to try to “force” a monitored status to some other value (for example setting a “state” attribute to “running”), especially when we can’t be sure that’s the state we’ll get to (maybe there will be an error or the transition will be dependent on some outcome over which we have no control). Similarly it doesn’t make much sense to treat states as nouns, for example adding a “running” state to a collection of states (even if a resource can be “running” and “backing up” concurrently). But is using URLs as “buttons” representing verbs/transitions the best answer?

What makes more sense [to me] is to request a transition and check back for updates (e.g. by polling or HTTP server push). If it’s RESTful to POST comments to an article (which in addition to its own contents acts as a collection of zero or more comments) then POSTing a request to change state to a [sub]resource also makes sense. As a bonus these can be parametrised (for example a “resize” request can be accompanied by a “size” parameter and a “stop” request sent with clarification as to whether an “ACPI Off” or “Pull Cord” is required). Transitions that take a while, like “format” on a storage resource, can simply return HTTP 202 Accepted so we’ve got support for asynchronous actions as well – indeed some requests (e.g. “backup”) may not even be started immediately. We may also want to consider using something like Post Once Exactly (POE) to ensure that requests like “restart” aren’t executed repeatedly and that we can cancel requests that the system hasn’t had a chance to deal with yet.

Exactly how this should look in terms of URL layout I’m not sure (perhaps http://example.com/<resource>/requests) but being able to enumerate the possible actions as well as acceptable parameters (e.g. an enum for variations on “stop” or a range for “resize”) would be particularly useful for clients.
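
A sketch of that layout (hypothetical throughout; the requests collection and parameter names are illustrative):

POST /compute/123/requests HTTP/1.1
Host: example.com
Content-Type: application/x-www-form-urlencoded

action=stop&method=acpioff

HTTP/1.1 202 Accepted
Location: http://example.com/compute/123/requests/1

The client could then poll (or be pushed updates on) the returned request resource to track the transition.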

Collections

This is all well and good for individual resources, but collections are still a serious problem. There are many use cases which involve retrieving an arbitrarily large number of resources, and making an HTTP request for each (as well as requests for enumeration etc.) doesn’t make sense. More importantly, it doesn’t scale – particularly in enterprise environments where requests via proxies and filters can suffer from high latency (if not low bandwidth).

One potential solution is to strap multiple HTTP message entities together as a multipart document, but that’s hardly clean and results in some hairy coding on the client side (e.g. manual manipulation of HTTP messages that would otherwise be fully automated). The best solution we currently have for this problem (as evidenced by widespread deployment) is AtomPub so I’m still fairly sure it’s going to have to make an appearance somewhere, even if it doesn’t wrap all of the resources by default.
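
For illustration, such a multipart response might look like this (a sketch only; parsing the boundaries is exactly the hairy part):

GET /compute HTTP/1.1
Host: example.com

HTTP/1.1 200 OK
Content-Type: multipart/mixed; boundary=frontier

--frontier
Content-Type: application/ovf

[first resource]
--frontier
Content-Type: application/ovf

[second resource]
--frontier--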

Is AtomPub already the HTTP of cloud computing?

A couple of weeks ago I asked Is OCCI the HTTP of cloud computing? I explained the limitations of HTTP in this context, which basically stem from the fact that the payloads it transfers are opaque. That’s fine when they’re [X]HTML because you can express links between resources within the resources themselves, but what about when they’re some other format – like OVF describing a virtual machine as may well be the case for OCCI? If I want to link between a virtual machine and its network(s) and/or storage device(s) then I’m out of luck… I need to either find an existing meta-model or roll my own from scratch.

That’s where Atom (or more specifically, AtomPub) comes in… in the simplest sense it adds a light, RESTful XML layer to HTTP which you can extend as necessary. It provides for collections (a ‘feed’ of multiple resources or ‘entries’ in a single HTTP message) as well as a simple meta-model for linking between resources, categorising them, etc. It also gives some metadata relating to unique identifiers, authors/contributors, caching information, etc., much of which can be derived from HTTP (e.g. URL <-> Atom ID, Last-Modified <-> updated).
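
In practice that means an entire collection can come back in a single response – a minimal, illustrative sketch of an Atom feed wrapping compute resources:

GET /compute HTTP/1.1
Host: example.com

HTTP/1.1 200 OK
Content-Type: application/atom+xml

<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Compute Resources</title>
  <id>http://example.com/compute</id>
  <updated>2009-05-01T12:00:00Z</updated>
  <entry>
    <id>http://example.com/compute/123</id>
    <title>webserver</title>
    <updated>2009-05-01T11:00:00Z</updated>
    <link rel="edit" href="http://example.com/compute/123"/>
    <content type="application/ovf" src="http://example.com/compute/123.ovf"/>
  </entry>
</feed>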

Although it was designed with syndication in mind, it is a very good fit for creating APIs, as evidenced by its extensive use by Google (GData), Microsoft, Salesforce and IBM, among others.

I’d explain in more detail but Mohanaraj Gopala Krishnan has done a great job already in his AtomPub, Beyond Blogs presentation.

The only question that remains is whether or not this is the best we can do… stay tuned for the answer. The biggest players in cloud computing seem to think so (except Amazon, whose APIs predate Google’s and Microsoft’s) but maybe there’s an even simpler approach that’s been sitting right under our noses the whole time.

Is OCCI the HTTP of Cloud Computing?

The Web is built on the Hypertext Transfer Protocol (HTTP), a client-server protocol that simply allows client user agents to retrieve and manipulate resources stored on a server. It follows that a single protocol could prove similarly critical for Cloud Computing, but what would that protocol look like?

The first place to look for the answer is limitations in HTTP itself. For a start the protocol doesn’t care about the payload it carries (beyond its Internet media type, such as text/html), which doesn’t bode well for realising the vision of the [Semantic] Web as a “universal medium for the exchange of data”. Surely it should be possible to add some structure to that data in the simplest way possible, without having to resort to carrying complex, opaque file formats (as is the case today)?

Ideally any such scaffolding added would be as light as possible, providing key attributes common to all objects (such as updated time) as well as basic metadata such as contributors, categories, tags and links to alternative versions. The entire web is built on hyperlinks so it follows that the ability to link between resources would be key, and these links should be flexible such that we can describe relationships in some amount of detail. The protocol would also be capable of carrying opaque payloads (as HTTP does today) and for bonus points transparent ones that the server can seamlessly understand too.

Like HTTP this protocol would not impose restrictions on the type of data it could carry, but it would be seamlessly (and safely) extensible so as to support everything from contacts to contracts, biographies to books (or entire libraries!). Messages should be able to be serialised for storage and/or queuing as well as signed and/or encrypted to ensure security. Furthermore, despite significant performance improvements introduced in HTTP 1.1, it would need to be able to stream many objects (possibly millions) as efficiently as possible in a single request. Already we’re asking a lot from something that must be extremely simple and easy to understand.

XML

It doesn’t take a rocket scientist to work out that this “new” protocol is going to be XML-based, building on top of HTTP in order to take advantage of the extensive existing infrastructure. Those of us who know even a little about XML will be ready to point out that the “X” in XML means “eXtensible” so we need to be specific as to the schema for this assertion to mean anything. This is where things get interesting. We could of course go down the WS-* route and try to write our own, but surely someone else has crossed this bridge before – after all, organising and manipulating objects is one of the primary tasks for computers.

Who better to turn to for inspiration than a company whose mission it is to “organize the world’s information and make it universally accessible and useful”: Google. They use a single protocol for almost all of their APIs, GData, and while most people don’t bother to look under the hood (no doubt thanks to the myriad client libraries made available under the permissive Apache 2.0 license), when you do you may be surprised at what you find: everything from contacts to calendar items, and pictures to videos, is a feed (with some extensions for things like searching and caching).

OCCI

Enter the OGF’s Open Cloud Computing Interface (OCCI) whose (initial) goal it is to provide an extensible interface to Cloud Infrastructure Services (IaaS). To do so it needs to allow clients to enumerate and manipulate an arbitrary number of server side “resources” (from one to many millions) all via a single entry point. These compute, network and storage resources need to be able to be created, retrieved, updated and deleted (CRUD) and links need to be able to be formed between them (e.g. virtual machines linking to storage devices and network interfaces). It is also necessary to manage state (start, stop, restart) and retrieve performance and billing information, among other things.

The OCCI working group basically has two options now in order to deliver an implementable draft this month as promised: follow Amazon or follow Google (the whole while keeping an eye on other players including Sun and VMware). Amazon use a simple but sprawling XML-based API with a PHP-style flat namespace and while there is growing momentum around it, it’s not without its problems. Not only do I have my doubts about its scalability outside of a public cloud environment (calls like ‘DescribeImages’ would certainly choke with anything more than a modest number of objects and we’re talking about potentially millions) but there is a raft of intellectual property issues as well:

  • Copyrights (specifically section 3.3 of the Amazon Software License) prevent the use of Amazon’s “open source” clients with anything other than Amazon’s own services.
  • Patents pending like #20070156842 cover the Amazon Web Services APIs and we know that Amazon have been known to use patents offensively against competitors.
  • Trademarks like #3346899 prevent us from even referring to the Amazon APIs by name.

While I wish the guys at Eucalyptus and Canonical well and don’t have a bad word to say about Amazon Web Services, this is something I would be bearing in mind while actively seeking alternatives, especially as Amazon haven’t worked out whether the interfaces are IP they should protect. Even if these issues were resolved via royalty free licensing it would be very hard as a single vendor to compete with truly open standards (RFC 4287: Atom Syndication Format and RFC 5023: Atom Publishing Protocol) which were developed at IETF by the community based on loose consensus and running code.

So what does all this have to do with an API for Cloud Infrastructure Services (IaaS)? In order to facilitate future extension my initial designs for OCCI have been as modular as possible. In fact the core protocol is completely generic, describing how to connect to a single entry point, authenticate, search, create, retrieve, update and delete resources, etc. all using existing standards including HTTP, TLS, OAuth and Atom. On top of this are extensions for compute, network and storage resources as well as state control (start, stop, restart), billing, performance, etc. in much the same way as Google have extensions for different data types (e.g. contacts vs YouTube movies).

Simply by standardising at this level OCCI may well become the HTTP of Cloud Computing.