15 July 2012

HTTP2 Expression of Interest

Here's my (rather rushed) personal submission to the Internet Engineering Task Force (IETF) in response to their Call for Expressions of Interest in new work around HTTP; specifically, a new wire-level protocol for the semantics of HTTP (i.e., what will become HTTP/2.0), and new HTTP authentication schemes. You can also review the submissions of Facebook, FirefoxGoogle, Microsoft, Twitter and others.

[The views expressed in this submission are mine alone and not (necessarily) those of Citrix, Google, Equinix or any other current, former or future client or employer.]

My primary interest is in the consistent application of HTTP to ("cloud") service interfaces, with a view to furthering the goals of the Open Cloud Initiative (OCI); namely widespread and ideally consistent interoperability through the use of open standard formats and interfaces.

In particular, I strongly support the use of the existing metadata channel (headers) over envelope overlays (SOAP) and alternative/ancillary representations (typically in JSON/XML) as this should greatly simplify interfaces while ensuring consistency between services. The current approach to cloud "standards" calls on vendors to define their own formats and interfaces and to maintain client libraries for the myriad languages du jour. In an application calling on multiple services this can result in a small amount of business logic calling on various bulky, often poorly written and/or unmaintained libraries. The usual counter to the interoperability problems this creates is to write "adapters" (ala ODBC) which expose a lowest-common-denominator interface, thus hiding functionality and creating an "impedence mismatch". Ultimately this gives rise to performance, security, cost and other issues.

By using HTTP as intended it is possible to construct (cloud) services that can be consumed using nothing more than the built-in, standards compliant HTTP client. I'm not writing to discuss whether this is a good idea, but to expose a use case that I would like to see considered, and one that we have already applied with an amount of success in the Open Cloud Computing Interface (OCCI).

To illustrate the original scope, versions of HTTP (RFC 2068) included not only the Link header (recently revived by Mark Nottingham in RFC 5988) but also LINK and UNLINK verbs to manipulate it (recently proposed for revival by James Snell). Unfortunately hypertext, and in particular HTML (which includes linking in-band rather than out-of-band) arguably stole HTTP's thunder, leaving the overwhelming majority of formats that lack in-band linking (images, documents, virtual machines, etc.) high and dry and resulting in inconsistent linking styles (HTML vs XML vs PDF vs DOC etc.). This limited the extent of web linking as well as the utility of HTTP for innovative applications including APIs. Indeed HTTP could easily and simply meet the needs of many "Semantic Web" applications, but that is beyond the scope of this particular discussion.

To illustrate by way of example, consider the following synthetic request/response for an image hosting site which incorporates Web Linking (RFC 5988), Web Categories (draft-johnston-http-category-header) and Web Attributes (yet to be published):
GET /1.jpg HTTP/1.0

HTTP/1.0 200 OK
Content-Length: 69730
Content-Type: image/jpeg
Link: http://creativecommons.org/licenses/by-sa/3.0/; rel="license"
Link: /2.jpg; rel="next"
Category: dog; label="Dog"; scheme="http://example.org/animals"
Attribute: name="Spot"
In order to "animate" resources, consider the use of the Link header to start a virtual machine in the Open Cloud Computing Interface (OCCI):
Link: /compute/123;action=start;
  rel="http://schemas.ogf.org/occi/infrastructure/compute/action#start"
Note: Links should be enclosed in angle brackets but Blogger strips them out.

The main objection to the use of the metadata channel in this fashion (beyond the application of common sense in determining what constitutes data vs metadata) is implementation issues (e.g. arbitrary limitations, i18n, handling of multiple headers, etc.) which could be largely resolved through specification. For example, the (necessary) use of e.g. RFC 2231 encoding for header values (but not keys) in e.g. RFC 5988 Web Linking gives rise to unnecessary complexity that may lead to interoperability, security and other issues which could be resolved through the specification of Unicode for keys and/or values. Another concern is the absence of features such as a standardised ability to return a collection (e.g. multiple responses). I originally suggested that HTTP 2.0 incorporate such ideas in 2009.

I'll leave the determination of what would ultimately be required for such applications to the working group (should this use case be considered interesting by others), and while better support for performance, scalability and mobility are obviously required this has already been discussed at length. I strongly support Poul-Henning Kamp's statement that "I think it would be far better to start from scratch, look at what HTTP/2.0 should actually do, and then design a simple, efficient and future proof protocol to do just that, and leave behind all the aggregations of badly thought out hacks of HTTP/1.1." (and agree that we should incorporate the concept of a "HTTP Router") as well as Tim Bray's statement that: "I’m inclined to err on the side of protecting user privacy at the expense of almost all else" (and believe that we should prevent eavesdroppers from learning anything about an encrypted transaction; something we failed to do with DNSSEC even given alternatives like dnscurve that ensure confidentiality as well as integrity).

No comments:

Post a Comment

Note: only a member of this blog may post a comment.