If it’s dangerous it’s NOT cloud computing

Having written something similar over the weekend myself (How Open Cloud could have saved Sidekick users’ skins) I was getting ready to compliment this post, but, fear-mongering title aside (Cloud Computing is Dangerous), I was dismayed to see this:

“Let’s call it what it is, it’s a cloud app — your data when using a Sidekick is hosted in some elses data center.”

I simply cannot and will not accept this, and I’m not the only one:

“Help me out here. I’m seeing really smart people I totally respect jump on this T-Mobile issue as a ‘Cloud’ failure. Am I losing my mind?”

For a start, Sidekicks predate cloud by half a dozen years, with the first releases back in 2001. Are we saying that they were so far ahead (like Google) that we just hadn’t come up with a name for their technology yet? No. Is BlackBerry cloud? No, it isn’t either. This was a legacy n-tier Internet-facing application that catastrophically failed, as many such applications do. It was NOT cloud. As Alexis Richardson pointed out to Redmonk’s James Governor, “if it loses your data – it’s not a cloud”.

While I know that this analogy is inconvenient for some vendors it works and it’s the best we have: Cloud is resilient in the same way that the electricity grid is resilient. Power stations do fail and we (generally) don’t hear about it. Similarly datacenters fail, get disconnected, overheat, flood, burn to the ground and so on, but these events should not cause any more than a minor interruption for end users. Otherwise how are they different from “legacy” web applications? Sure, occasionally we’ll have cloud computing “blackouts” but we’ll learn to live with them just as we do today when the electricity goes out.

As a more specific example, if an Amazon DC fails you’ll lose your EC2 instances (the cost/performance hit of running in lock-step across high-latency links is way too high for live redundancy). However, the virtual machine image itself should be automagically replicated by S3 across multiple independent availability zones, so it’s just a case of starting the instances again. If you’re using S3 directly (or Gmail for that matter) you should never need to know that something went wrong.
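To make “just starting them again” concrete, here’s a minimal sketch of what that recovery step might look like. It uses the modern boto3 SDK purely for illustration (which obviously post-dates this post), and the AMI ID, region and zone names are placeholders rather than anything from a real incident:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Placeholder AMI ID -- in practice this is the image that S3 has already
# replicated for you across multiple independent facilities.
IMAGE_ID = "ami-0123456789abcdef0"

def relaunch(failed_zone, healthy_zone):
    """Start a replacement instance in a healthy availability zone."""
    response = ec2.run_instances(
        ImageId=IMAGE_ID,
        InstanceType="m1.small",
        MinCount=1,
        MaxCount=1,
        Placement={"AvailabilityZone": healthy_zone},
    )
    instance_id = response["Instances"][0]["InstanceId"]
    print(f"Lost capacity in {failed_zone}; replacement {instance_id} "
          f"starting in {healthy_zone}")
    return instance_id

relaunch("us-east-1a", "us-east-1b")
```

The point being that recovery is an API call, not a data recovery exercise – the image was never only in the failed facility.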

But Salesforce predates cloud by almost a decade, you say? This data point was a thorn in my side until I found this article (Salesforce suffers gridlock as database collapses) and the associated Oracle press release (Salesforce.com’s 267,000 Subscribers To Go On Demand With Oracle® Grid). With wording like “one of its four data hubs collapsed” in what “appears to be a database cluster crash” I’m starting to question whether Salesforce really is as “cloudy” as they claim (and are assumed) to be. Indeed the URL I’m staring at as I use Salesforce.com now (https://na1.salesforce.com/home/home.jsp – emphasis mine) would suggest that it is anything but. NA1 is one of half a dozen different data centers and their “cloud” only appears as a single point when you log in (http://login.salesforce.com/), at which time you are redirected to the one that hosts your data. Is it any wonder then that it’s Google and Amazon that are topping the surveys now rather than Microsoft and Salesforce?

Don’t get me wrong – Salesforce.com is a great company with a great product suite that I use and recommend every day. They may well be locked in to a legacy n-tier architecture but they do a great job of keeping it running at large scale and I almost can’t believe it’s not cloud. I see it as “Software. As a Service”, bearing in mind that it’s replacing some piece of software that traditionally would have run on the desktop by delivering it over the Internet via the browser. SaaS is, if anything, a subset of cloud and I’m sure that nobody here would suggest that any old LAMP application constitutes cloud. But we digress…

I honestly thought we had this issue resolved last year, having spent an inordinate amount of time discussing, blogging, writing Wikipedia articles and generally trying to extract sense (and consensus) from the noise. I was apparently wrong, as even our self-appointed spokesman has foolishly conceded that what can only be described as gross negligence in IT operations and a crass act of stupidity is somehow a failure of the cloud computing model itself. I agree completely with Chris Hoff in that “This T-Mobile debacle is a good thing. It will help further flush out definitions and expectations of Cloud. (I can dream, right?)” – it’s high time for us to revisit and nail the issue of what is (and more importantly, what is not) cloud once and for all.

How Open Cloud could have saved Sidekick users’ skins

The cloud computing scandal of the week is shaping up to be the catastrophic loss of millions of Sidekick users’ data. This is an unfortunate and completely avoidable event that Microsoft’s Danger subsidiary and T-Mobile (along with the rest of the cloud computing community) will surely very soon come to regret.

There are plenty of theories as to what went wrong – the most credible being that a SAN upgrade was botched, possibly by a large outsourcing contractor, and that no backups were taken despite space being available (though presumably not on the same SAN!). Note that while most cloud services exceed the capacity/cost ceiling of SANs and therefore employ cheaper horizontal scaling options (like the Google File System), this is, or should I say was, a relatively small amount of data. As such there is no excuse whatsoever for not having reliable, off-line backups – particularly given that Danger is owned by Microsoft (previously considered one of the “big 4” cloud companies, even by myself). It was a paid-for service too (~$20/month or $240/year?), which makes even the most expensive cloud offerings like Apple’s MobileMe look like a bargain (though if it’s any consolation, the fact that the service was paid for rather than free may well come back to bite them by way of the inevitable class action lawsuits).

“Real” cloud storage systems transparently ensure that multiple copies of data are automatically maintained on different nodes, at least one of which is ideally geographically independent. The very fact that the term “SAN” appears in the conversation suggests this was a legacy architecture far more likely to fail – newer architectures are safer in the same way that today’s aircraft are far safer than yesterday’s and today’s electricity grids far more reliable than earlier ones (and the Sidekick, after all, predates Android and the iPhone by some years). It’s hard to say with any real authority what is and what is not cloud computing, beyond “I know it when I see it, and this ain’t it”.
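As a rough illustration of what “transparently” means here, a placement policy along the following lines is all it takes to guarantee that at least one copy always lives somewhere else. This is a hypothetical sketch with made-up node and region names, not any particular provider’s implementation:

```python
import random

# Hypothetical inventory: node name -> region it lives in.
NODES = {
    "node-a1": "us-east", "node-a2": "us-east",
    "node-b1": "eu-west", "node-b2": "eu-west",
    "node-c1": "ap-south",
}

def place_replicas(primary_region, copies=3):
    """Choose replica nodes, forcing at least one outside the primary region."""
    local = [n for n, r in NODES.items() if r == primary_region]
    remote = [n for n, r in NODES.items() if r != primary_region]
    chosen = random.sample(remote, 1)                      # geographic independence
    chosen += random.sample(local, min(copies - 1, len(local)))
    return chosen

print(place_replicas("us-east"))   # e.g. ['node-b2', 'node-a1', 'node-a2']
```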

Whatever the root cause, the result is the same – users who were given no choice but to store their contacts, calendars and other essential day-to-day data on Microsoft’s servers appear to have irretrievably lost it. Friends, family, acquaintances and loved ones – even (especially?) the boy/girl you met at the bar last night – may be gone for good. People will miss appointments, lose business deals and in the worst cases could face real hardship as a result (I’m guessing parole officers don’t take kindly to missed appointments with no contact!). The cost of this failure will (at least initially) be borne by the users, and yet there was nothing they could have done to prevent it short of choosing another service or manually transcribing their details.

The last hope for them is that Microsoft can somehow reverse the caching process in order to remotely retrieve copies from the devices (which are effectively dumb terminals) before they lose power; good luck with that. While synchronisation is hard to get right, having a single cloud-based “master” and only a local cache on the device (as opposed to a full, first-class-citizen copy) is a poor design decision. I have an iPhone (actually I have a 1G, 3G, 3GS and an iPod Touch) and they’re all synchronised via two MacBooks, and in turn to both a Time Machine backup and Mozy online backup. As if that’s not enough, all my contacts are synchronised with Google Apps’ Gmail over the air too, so I can take your number and pretty much immediately drop it in a beer without concern for data loss. Even this proprietary system protects me from such failures.

The moral of the story is that externalised risk is a real problem for cloud computing. Most providers [try to] avoid responsibility by way of terms of service that strip away users’ rights, but it’s a difficult problem to solve because enforcing liability for anything but gross negligence can exclude smaller players from the market. That is why users absolutely must have control over their data and be encouraged, if not forced, to take responsibility for it.

Open Cloud simply requires open formats and open APIs – that is to say, users must have access to their data in a transparent format. Even if it doesn’t make sense to maintain a local copy on the user’s computer, there’s nothing stopping providers from pushing it to a third party storage service like Amazon S3. In fact it makes a lot of sense for applications to be separated from storage entirely. We don’t expect our operating system to provide all the functionality we’ll ever need (or indeed, any of it) so we install third party applications which use the operating system to store data. What’s to stop us doing the same in the cloud, for example having Google Apps and Zoho both saving back to a common Amazon S3 store which is in turn replicated locally or to another cloud-based service like Rackspace Cloud Files?
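Here’s a hedged sketch of what that separation might look like in practice – the bucket name, key layout and contact data are invented for illustration, and boto3 is used simply because it’s a well-known S3 client:

```python
import json
import boto3

s3 = boto3.client("s3")

def export_user_data(user_id, contacts, bucket):
    """Push an open-format (JSON) export of the user's data to a bucket
    the user controls, decoupling the application from the storage."""
    body = json.dumps({"user": user_id, "contacts": contacts}, indent=2)
    s3.put_object(
        Bucket=bucket,                              # user-controlled bucket
        Key=f"exports/{user_id}/contacts.json",
        Body=body.encode("utf-8"),
        ContentType="application/json",
    )

# Hypothetical usage: mirror today's contacts to the user's own bucket.
export_user_data("samj", [{"name": "Alexis", "phone": "+33 1 23 45 67 89"}],
                 bucket="my-own-backup-bucket")
```

Because the export is plain JSON sitting in storage the user controls, any other application (or the user themselves) can replicate it elsewhere or restore from it.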

In any case perhaps it’s time for us to dust off and revisit the Cloud Computing Bill of Rights?

“Bare Metal” cloud infrastructure “compute” services arrive

Earlier in the year during the formation of the Open Cloud Computing Interface (OCCI) working group I described three types of cloud infrastructure “compute” services:

  • Physical Machines (“Bare Metal”) which are essentially dedicated servers provisioned on a utility basis (e.g. hourly), whether physically independent or just physically isolated (e.g. blades)
  • Virtual Machines, which nowadays use hypervisors to split the resources of a physical host amongst various guests, where both the host and each of the guests run a separate operating system instance. For more details on emulation vs virtualisation vs paravirtualisation see a KB article I wrote for Citrix a while back: CTX107587 Virtual Machine Technology Overview
  • OS Virtualisation (e.g. containers, zones, chroots) which is where a single instance of an operating system provides multiple isolated user-space instances.

While the overwhelming majority of cloud computing discussions today focus on virtual machines, the reason for making the distinction was so that the resulting API would be capable of dealing with all possibilities. The clouderati are now realising that there’s more to life than virtual machines and that the OS is like “a cancer that sucks energy (e.g. resources, cycles), needs constant treatment (e.g. patches, updates, upgrades) and poses significant risk of death (e.g. catastrophic failure) to any application it hosts”. That’s some good progress – now if only the rest of the commentators would quit referring to virtualisation as private cloud so we can focus on what’s important rather than maintaining the status quo.

Anyway, such cloud services didn’t exist at the time, but in France at least we did have providers like Dedibox and Kimsufi who would provision a fixed-configuration dedicated server for you pretty much on the spot, starting at €20/month (<€0.03/hr or ~$0.04/hr). I figured there was nothing theoretically stopping this being fully automated and exposed via a user (web) or machine (API) interface, in which case it would be indistinguishable from a service delivered via VM (except for a higher level of isolation and performance). Provided you’re billing as a utility (that is, users can consume resources as they need them and are billed only for what they use) rather than monthly or annually, and taking care of all the details “within” the cloud, there’s no reason this isn’t cloud computing. After all, as an end user I needn’t care if you’re providing your service using an army of monkeys, so long as you are. PCI compliance anyone?
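For what it’s worth, a fully automated bare-metal service needs nothing more exotic than something like the following. This is a hypothetical sketch – the endpoint and profile name are invented, and only the ~€0.03/hr figure comes from the paragraph above:

```python
import requests

API = "https://api.example-baremetal.example/v1"   # hypothetical provider

def provision(profile="dedibox-small"):
    """Request a fixed-configuration dedicated server, exactly as one would
    request a VM -- the caller can't tell the difference."""
    resp = requests.post(f"{API}/servers", json={"profile": profile})
    resp.raise_for_status()
    return resp.json()          # e.g. {"id": "srv-42", "state": "provisioning"}

def utility_bill(hours_used, rate_eur_per_hour=0.03):
    """Bill only for what was actually consumed, not per month."""
    return round(hours_used * rate_eur_per_hour, 2)

print(utility_bill(100))        # 100 hours at ~EUR 0.03/hr -> EUR 3.00
```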

Virtually all of the cloud infrastructure services people talk about today are based on virtual machines, and the market price for a reasonably capable one is $0.10/hr, or around $72.00 per month. That’s said to be 3–5x more than cost at “cloud scale” (think Amazon), so expect that price to drop as the market matures. Rackspace Cloud are already offering small Xen VMs for 1.5c/hr or ~$10/month. I won’t waste any more time talking about these offerings as everyone else already is. This will be a very crowded space, thanks in no small part to VMware’s introduction of vCloud (which they claim turns any web hoster into a cloud provider), but with the hypervisor well and truly commoditised I assure you there’s nothing to see here.

On the lightweight side of the spectrum, VPS providers are a dime a dozen. These guys generally slice Linux servers up into tens if not hundreds of accounts for only a few dollars a month and take care of little more than the (shared) kernel, leaving end users to install the distribution of their choice as root. Solaris has zones, and even Windows has MultiWin built in nowadays (that’s the technology, courtesy of Citrix, that allows multiple users each with their own GUI session to coexist on the same machine – it’s primarily used for Terminal Services and Fast User Switching, but applications and services can also run in their own context). This delivers most of the benefits of a virtual machine, only without the overhead and cost of running and managing multiple operating systems side by side. Unfortunately nobody’s really doing this in the cloud yet, but if they were you’d be able to get machines for tasks like mail relaying, spam filtering, DNS, etc. for literally a fraction of a penny per hour (VPSs start at <$5/month or around 0.7c/hr).

So the reason for my writing this post today is that SoftLayer this week announced the availability of “Bare Metal Cloud” starting at $0.15 per hour. I’m not going to give them any props for having done so, thanks to their disappointing attempt to trademark the obvious and generic term “bare metal cloud” and to unattractive hourly rates that are almost four times the price of the monthly packages once you take data allowances into account. I will however say that it’s good to see this prophecy (however predictable) fulfilled.

I sincerely hope that the attention will continue to move further away from overpriced and inefficient virtual machines and towards more innovative approaches to virtualisation.

Cloud Computing Crypto: GSM is dead. Long live GSM!

GSM, at least in its current form, is dead, and the GSMA‘s attempts to downplay serious vulnerabilities by claiming otherwise remind me of that rather famous Monty Python sketch about a dead parrot:

Fortunately consumers these days are savvy and have access to information with which to verify (or not) vendors’ claims about security. So when they get together and say things like “the researchers still would need to build a complex radio receiver to process the raw radio data”, the more cynical of us are able to dig up 18-month-old threads like this one, which concludes:

So it appears you might be able to construct a GSM sniffer from a USRP board and a bunch of free software, including a Wireshark patch. (It appears that one of the pieces of free software required is called “Linux” or “GNU/Linux”, depending on which side of that particular debate you’re on :-), i.e. it works by using Linux’s tunnel device to stuff packets into a fake network interface on which Wireshark can capture.

OK, so extracting the 1s and 0s from the airwaves and getting them into the most convenient (open source) framework we have for the dissection of live protocols is a problem long since solved. Not only are the schematics publicly available, but devices are commercially available online for around $1,000. One would have assumed that the GSMA knew this, and presumably they did but found it preferable to turn a blind eye to the inconvenient truth for the purposes of their release.

The real news though is in the cracking of the A5/1 encryption which purports to protect most users by keeping the voice channels “secure”. Conversely, the control information which keeps bad guys from stealing airtime is believed to remain safe for the time being. That is to say, our conversations are exposed while the carriers’ billing is secure – an “externalisation” of risk in that the costs are borne by the end users. You can bet that were the billing channels affected there would have been a scramble to widely deploy a fix overnight rather than this poor attempt at a cover-up.

The attack works by creating a 2TB rainbow table in advance, which allows one to simply look up a secret key rather than having to brute-force it. This should be infeasible even for A5/1’s 64-bit key, but “the network operators decided to pad the key with ten zeros to make processing faster, so it’s really a 54-bit key” and other weaknesses combine to make this possible. A fair bit of work goes into creating the table initially, but this only needs to be done once, and you can buy access to such tables as a service – as well as the tables themselves – for many common hashes (such as those used to protect Windows and Unix passwords, and no doubt GSM soon too!). The calculations themselves can be quite expensive, but advances like OpenCL in the recently released Mac OS X (Snow Leopard) can make things a lot better/faster/cheaper by taking advantage of extremely performant graphics processing units (GPUs).
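The precomputation trade-off is easy to demonstrate at toy scale. The sketch below is a plain lookup table over a tiny keyspace with a stand-in hash – not A5/1 and not a real rainbow table (which uses hash chains and reduction functions to compress storage) – but the principle is the same: pay the cost once, then every subsequent key recovery is effectively instant:

```python
import hashlib

# Toy 20-bit "keyspace" -- about a million keys. A5/1's effective 54-bit
# space is vastly larger, which is why the real tables run to ~2TB even
# with rainbow-chain compression.
KEYSPACE_BITS = 20

def toy_cipher(key):
    """Stand-in one-way function (NOT A5/1)."""
    return hashlib.sha256(key.to_bytes(8, "big")).hexdigest()[:16]

# Precompute once (the expensive part)...
table = {toy_cipher(k): k for k in range(2 ** KEYSPACE_BITS)}

# ...then every subsequent "crack" is a constant-time lookup.
observed = toy_cipher(123456)
print(table[observed])          # -> 123456
```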

Of course thanks to cloud computing you don’t even need to do the work yourself – you can just spin up a handful of instances on a service like Amazon EC2 and save the results onto Amazon S3/Amazon EBS. You can then either leave it there (at a cost of around $300/month for 2TB of storage) and use instances to interrogate the tables via a web service, or download it to a local 2TB drive (conveniently just hitting the market at ~$300 one-off).

Cloud storage providers could make the task even easier with services like public data sets, which bring multi-tenancy, in the form of de-duplication benefits, to common data sets. For example, if Amazon found two or more customers storing the same file they could link the two together and share the costs between all of them (they may well do this today, though if they do they keep the benefit for themselves). In the best case such benefits would be exposed to all users, in which case the cost of such “public domain” data would be rapidly driven down towards zero.
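A minimal sketch of how such de-duplication could work, assuming content-addressed storage and the ~$300/month figure from above (everything else here – the functions, the tenants, the blob – is invented for illustration):

```python
import hashlib
from collections import defaultdict

store = {}                      # content hash -> blob (stored exactly once)
owners = defaultdict(set)       # content hash -> tenants referencing it

def put(tenant, blob):
    """Store a blob once, no matter how many tenants upload it."""
    digest = hashlib.sha256(blob).hexdigest()
    store.setdefault(digest, blob)
    owners[digest].add(tenant)
    return digest

def monthly_cost(digest, dollars_per_month=300.0):
    """Split the single copy's cost across everyone referencing it."""
    return dollars_per_month / len(owners[digest])

rainbow_tables = b"...the same ~2TB public data set..."
d = put("alice", rainbow_tables)
put("bob", rainbow_tables)
print(monthly_cost(d))          # 150.0 -- each tenant pays half
```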

Ignoring A5/2 (which gives deliberately weakened protection for countries where encryption is restricted), there’s also a downgrade attack possible thanks to A5/0 (which gives no protection) and the tendency for handsets to happily transmit in the clear rather than refusing to transmit at all or at least giving a warning as suggested by the specifications. A man in the middle just needs to be the strongest signal in the area and they can negotiate an unencrypted connection while the user is none the wiser. This is something like how analog phones used to work in that there was no encryption at all and anyone with a radio scanner could trivially eavesdrop on [at least one side of] the conversation. This vulnerability apparently doesn’t apply where a 3G signal is available, in which case the man in the middle also needs to block it.

Fortunately there’s already a solution in the form of A5/3, only it’s apparently not being deployed:

A5/3 is indeed much more secure; not only is it based on the well known (and trusted) Kasumi algorithm, but it was also developed to encrypt more of the communication (including the phone numbers of those connecting together), making it much harder for ne’er-do-wells to work out which call to intercept. A5/3 was developed, at public expense, by the European Telecommunications Standards Institute (ETSI) and is mandated by the 3G standard, though can also be applied to 2.5G technologies including GPRS and EDGE.

That the GSMA consider a 2TB data set in any way a barrier to these attacks says a lot about their attitude to security, and to go as far as to compare it to a “20 kilometre high pile of books” is simply appalling to anyone who knows anything about security. Rainbow tables, cloud computing and advances in PC hardware put this attack well within the budget of individuals (~$1,000), let alone determined business and government funded attackers. Furthermore, groups like the GSM Software Project, having realised that “GSM analyzer[s] cost a sh*tload of money for no good reason”, are working to “build a GSM analyzer for less than $1000” so as to, among other things, “crack A5 and proof[sic] to the public that GSM is insecure”. Then there’s the GNU Radio guys who have been funded to produce the software to drive it.

Let’s not forget too that, as Steve Gibson observes in his recent Cracking GSM Cellphones podcast with Leo Laporte: “every single cellphone user has a handset which is able to decrypt GSM”. It’s no wonder then that Apple claim jailbreaking the iPhone supports terrorists and drug dealers, but at about the same price as an iPhone ($700 for the first generation USRP board) it’s a wonder anyone would bother messing with proprietary hardware when they can deal with open hardware AND software in the same price range. What’s most distressing though is that this is not news – according to Steve, an attack was published some six years ago:

There’s a precomputation attack. And it was published thoroughly, completely, in 2003. A bunch of researchers laid it all out. They said, here’s how we cracked GSM. We can either have – I think they had, like, a time-complexity tradeoff. You’d have to listen to two minutes of GSM cellphone traffic, and then you could crack the key that was used to encrypt this. After two minutes you could crack it in one second. Or if you listen to two seconds of GSM cellphone traffic, then you can crack it in two minutes. So if you have more input data, takes less time; less input data, more time. And they use then tables exactly like we were talking about, basically precomputation tables, the so-called two terabytes that the GSM Alliance was pooh-poohing and saying, well, you know, no one’s ever going to be able to produce this.

Fortunately we users can now take matters into our own hands by handling our own encryption, given those entrusted with doing it for us have long since been asleep at the wheel. I’ve got Skype on my MacBook and iPhone for example (tools like 3G Unrestrictor on a jailbroken iPhone allow you to break the digital shackles and use it as a real GSM alternative) and while this has built-in encryption (already proving a headache for the authorities) it is, like GSM, proprietary:

Everything about this is worrisome. I mean, from day one, the fact that they were keeping this algorithm, their cipher, a secret, rather than allowing it to be exposed publicly, tells you, I mean, it was like the first thing to worry about. We’ve talked often about the dangers of relying on security through obscurity. It’s not that some obscurity can’t also be useful. But relying on the obscurity is something you never want because nothing remains obscure forever.

We all know that open systems are more secure – for example, while SSL/TLS has had its fair share of flaws it can be configured securely and is far better than most proprietary alternatives. That’s why I’m most supportive of solutions like (but not necessarily) Phil Zimmermann‘s Zfone – an open source implementation of the open ZRTP specification (submitted for IETF standardisation). This could do the same for voice as his ironically named Pretty Good Privacy did for email many years ago (that is, those who do care about their privacy can have it). Unfortunately draft-zimmermann-avt-zrtp expired last week, but let’s hope that’s not the end of the road, as something urgently needs to be done about this. Here you can see it successfully encrypting a Google Talk connection (with video!):

Sure, there may be some performance and efficiency advantages to be had by integrating encryption into compression codecs, but I rather like the separation of duties – it’s unlikely a team of encryption experts will be good at audio and video compression, and vice versa.

Widespread adoption of such standards would also bring us one big step closer to data-only carriers that I predict will destroy the telco industry as we know it some time soon.

Amazon VPC trojan horse finds its mark: Private Cloud

Now that we’ve all had a chance to digest the Amazon Virtual Private Cloud announcement and the dust has settled, I’m joining the fray with a “scoop of interpretation“. Positioned as “a secure and seamless bridge between a company’s existing IT infrastructure and the AWS cloud”, the product is (like Google’s Secure Data Connector for App Engine, which preceded Amazon VPC by almost six months) quite simply a secure connection back to legacy infrastructure from the cloud – nothing more, nothing less. Here’s a diagram for those who prefer to visualise (Virtual Private Cloud.svg on Wikimedia Commons):

Notice that “private cloud” (at least in the sense that it is most often [ab]used today) is conspicuously absent. What Amazon and Google are clearly telling customers is that they don’t need their own “private cloud”. Rather, they can safely extend their existing legacy infrastructure into the [inter]cloud using VPN-like connections and all they need to do to get up and running is install the software provided or configure a new VPN connection (Amazon uses IPsec).
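For the curious, provisioning that IPsec connection is itself just a handful of API calls. Here’s a hedged sketch using the modern boto3 names (which post-date the original announcement); the public IP and BGP ASN are placeholders for your own VPN device:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Your side of the tunnel: the on-premises VPN device (placeholder values).
cgw = ec2.create_customer_gateway(
    BgpAsn=65000, PublicIp="203.0.113.12", Type="ipsec.1"
)["CustomerGateway"]

# Amazon's side of the tunnel.
vgw = ec2.create_vpn_gateway(Type="ipsec.1")["VpnGateway"]

# The IPsec connection that "extends" the legacy network into the cloud.
vpn = ec2.create_vpn_connection(
    CustomerGatewayId=cgw["CustomerGatewayId"],
    VpnGatewayId=vgw["VpnGatewayId"],
    Type="ipsec.1",
)["VpnConnection"]

print(vpn["VpnConnectionId"], vpn["State"])
```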

Remember, a VPN is the network you have when you’re not having a network – it behaves just like a “private network” only it’s virtual. Similarly a VPC is exactly that: a virtual “private cloud” – it behaves like a “private cloud” (in that it has a [virtual] perimeter) but users still get all the benefits of cloud computing – including trading capex for opex and leaving the details to someone else.

Also recall that the origin of the cloud was network diagrams where it was used to denote sections of the infrastructure that were somebody else’s concern (e.g. a telco). You just needed to poke your packets in one side and [hopefully] they would reappear at the other (much like the Internet). Cloud computing is like that too – everything within the cloud is somebody else’s concern, but if you install your own physical “private cloud” then that no longer holds true.

Of course the “private cloud” parade (unsurprisingly consisting almost entirely of vendors who peddle “private cloud” or their agents, often having some or all of their existing revenue streams under direct threat from cloud computing) were quick to jump on this and claim that Amazon’s announcement legitimised “private cloud”. Au contraire mes amis – from my [front row] seat the message was exactly the opposite. Rather than “legitimis[ing] private cloud” or “substantiating the value proposition” they completely undermined the “private cloud” position by providing a compelling “public cloud” based alternative. This is the mother of all trojan horses and even the most critical of commentators wheeled it right on in to the town square and paraded it to the world.

Upon hearing the announcement Christofer Hoff immediately claimed that Amazon had “peed on [our] fire hydrant” and Appistry’s Sam Charrington chimed in, raising him by claiming they had also “peed in the pool” ([ab]using one of my favourite analogies). Sam went on to say that despite having effectively defined the term Amazon’s product was not, in fact, “virtual private cloud” at all, calling into question the level of “logical isolation”. Reuven Cohen (another private cloud vendor) was more positive having already talked about it a while back, but his definition of VPC as “a method for partitioning a public computing utility such as EC2 into quarantined virtual infrastructure” is a little off the mark – services like EC2 are quarantined by default but granular in that they don’t enforce the “strong perimeter” characteristic of VPCs.

Accordingly I would (provisionally) define Virtual Private Cloud (VPC) as follows:

Virtual Private Cloud (VPC) is any private cloud existing within a shared or public cloud (i.e. the Intercloud).

This is derived from the best definition I could find for “Virtual Private Network (VPN)”.

“Twitter” Trademark in Trouble Too

Yesterday I apparently struck a nerve in revealing Twitter’s “Tweet” Trademark Torpedoed. The follow up commentary both on this blog and on Twitter itself was interesting and insightful, revealing that in addition to likely losing “tweet” (assuming you accept that it was ever theirs to lose) the recently registered Twitter trademark itself (#77166246) and pending registrations for the Twitter logo (#77721757, #77721751) are also on very shaky ground.

Trademarks 101

Before we get into the details of how this could happen, let’s start with some background. A trademark is one of three main types of intellectual property (the others being copyrights and patents) in which society grants a monopoly over a “source identifier” (e.g. a word, logo, scent, etc.) in return for some guarantee of quality (e.g. I know what I’m getting when I buy a bottle of black liquid bearing the Coke® branding). Anybody can claim to have a trademark, but generally they are registered, which makes the process of enforcing the mark much easier. The registration process itself is thus more of a sanity check – making sure everything is in order, fees are paid, the mark is not obviously broken (that is, unable to function as a source identifier) and, perhaps most importantly, that it doesn’t clash with other marks already issued.

Trademarks are also jurisdictional in that they apply to a given territory (typically a country, but also US states), but to make things easier it’s possible to use the Madrid Protocol to extend a valid trademark in one territory to any number of others (including the EU, where it is known as a “Community Trademark”). Of course if the first trademark fails (within a certain period of time) then those dependent on it are also jeopardised. Twitter have also filed applications using this process.

Moving right along, there are a number of different types of trademarks, starting with the strongest and working back:

  • Fanciful marks are created specifically to be trademarks (e.g. Kodak) – these are the strongest of all marks.
  • Arbitrary marks have a meaning but not in the context in which they are used as a trademark. We all know what an apple is but when used in the context of computers it is meaningless (which is how Apple Computer is protected, though they did get in trouble when they started selling music and encroached on another trademark in the process). Similarly, you can’t trademark “yellow bananas” but you’d probably get away with “blue bananas” or “cool bananas” because they don’t exist.
  • Suggestive marks hint at some quality or characteristic without describing the product (e.g. Coppertone for sun-tan lotion)
  • Descriptive marks describe some quality or characteristic of the product and are unregistrable in most trademark offices and unprotectable in most courts. “Cloud computing” was found to be both generic and descriptive by USPTO last year in denying Dell. Twitter is likely considered a descriptive trademark (but one could argue it’s now also generic).
  • Generic marks cannot be protected, as the name of a product or service cannot function as a source identifier (e.g. Apple in the context of fruits, but not in the context of computers and music)

Twitter

Twitter’s off to a bad start already in their selection of names – while Google is a deliberate misspelling of the word googol (suggesting the enormous number of items indexed), the English word twitter has a well established meaning that relates directly to the service Twitter, Inc. provides. It’s the best part of 700 years old too, derived around 1325–75 from ME twiteren (v.); akin to G zwitschern:

– verb (used without object)

1. to utter a succession of small, tremulous sounds, as a bird.
2. to talk lightly and rapidly, esp. of trivial matters; chatter.
3. to titter, giggle.
4. to tremble with excitement or the like; be in a flutter.

– verb (used with object)

5. to express or utter by twittering.

– noun

6. an act of twittering.
7. a twittering sound.
8. a state of tremulous excitement.

Although the primary meaning people associate with it these days is that of a bird, it cannot be denied that “twitter” also means “to talk lightly and rapidly, esp. of trivial matters; chatter“. The fact that it is now done over the Internet matters not, in the same way that one can “talk” or “chat” over it (and telephones for that matter) despite the technology not existing when the words were conceived. Had Twitter tried to obtain a monopoly over more common words like “chatter” and “chat” there’d have been hell to pay, but that’s not to say they should get away with it now.

Let’s leave the definition at that for now, as Twitter have managed to secure registration of their trademark (which does not imply that it is enforceable). The point is that this is already the weakest type of trademark, and some (including myself) would argue that it a) should never have been allowed and b) will be impossible to enforce. To make matters worse, Twitter itself has gained an entry in the dictionary as both a noun (“a website where people can post short messages about their current activities“) and a verb (“to write short messages on the Twitter website“), as well as the AP Stylebook for good measure. This could constitute “academic credibility” or “trademark kryptonite” depending on how you look at it.

Enforcement

This brings us to the more pertinent point, trademark enforcement, which can essentially be summed up as “use it or lose it”. As of today I have not been able to find any reference whatsoever, anywhere on twitter.com, to any trademark rights claimed by Twitter, Inc. Sure, they assert copyright (“© 2009 Twitter”) but that’s something different altogether – I have never seen this before and to be honest I can’t believe my eyes. I expect they will fix this promptly in the wake of this post by sprinkling disclaimers and [registered®] trademark (TM) and servicemark (SM) symbols everywhere, but the Internet Archive never lies, so once again it’s likely too little too late. If you don’t tell someone it’s a trademark then how are they supposed to avoid infringing it?

Terms of Service

The single reference to trademarks (but not “twitter” specifically) I found was in the terms of service (which are commendably concise):

We reserve the right to reclaim usernames on behalf of businesses or individuals that hold legal claim or trademark on those usernames.

That of course didn’t stop them suspending @retweet shortly after filing for the ill-fated “tweet” trademark themselves, but that’s another matter altogether. The important point is that they don’t claim trademark rights and so far as I can tell, never have.

Logo

To rub salt into the (gaping) wound they (wait for it, are you sitting down?) offer their high resolution logos for anyone to use, with no mention whatsoever of how they should and shouldn’t be used (“Download our logos“) – a huge no-no for trademarks, which must be associated with some form of quality control. Again there is no trademark claim, no ™ or ® symbols, and, for the convenience of invited infringers, no fewer than three different high quality source formats (PNG, Adobe Illustrator and Adobe Photoshop):

Advertising

Then there’s the advertising, oh the advertising. Apparently Twitter HQ didn’t get the memo about exercising extreme caution when using your trademark; woe betide the trademark holder who refers to her product or service as a noun or a verb, yet Twitter does both, even in 3rd-party advertisements (good luck trying to get an AdWords ad containing the word “Google”):

Internal Misuse

Somebody from Adobe or Google please explain to Twitter why it’s important to educate users that they don’t “google” or “photoshop”, rather they “search using Google®” and “edit using Photoshop®”. Here are some more gems from the help section:

  • Now that you’re twittering, find new friends or follow people you already know to get their twitter updates too.
  • Wondering who sends tweets from your area?
  • @username + message directs a twitter at another person, and causes your twitter to save in their “replies” tab.
  • FAV username marks a person’s last twitter as a favorite.
  • People write short updates, often called “tweets” of 140 characters or fewer.
  • Tweets with @username elsewhere in the tweet are also collected in your sidebar tab; tweets starting with @username are replies, and tweets with @username elsewhere are considered mentions.
  • Can I edit a tweet once I post it?
  • What does RT, or retweet, mean? RT is short for retweet, and indicates a re-posting of someone else’s tweet. This isn’t an official Twitter command or feature, but people add RT somewhere in a tweet to indicate that part of their tweet includes something they’re re-posting from another person’s tweet, sometimes with a comment of their own. Check out this great article on re-tweeting, written by a fellow Twitter user, @ruhanirabin. <- FAIL x 7

Domains

According to this domain search there are currently 6,263 domains using the word “twitter”, almost all in connection with microblogging. To put that number in perspective, if Twitter wanted to take action against these registrants, then given current UDRP rates for a single panelist we’re talking $9,394,500 in filing fees alone (or around 1.5 billion Nigerian naira if that’s not illustrative enough for you). That’s not including the cost of preparing the filings, representation, etc. that their lawyers (Fenwick & West LLP) would likely charge them.

If you (like Doug Champigny) happen to have been on the receiving end of one of these letters recently, you might just want to politely but firmly point them at the UDRP and have them prove, among other things, that you were acting in bad faith (don’t come crying to me if they do though – this post is just one guy’s opinion and IANAL, remember ;).

I could go on but I think you get the picture – Twitter has done such a poor job of protecting the Twitter trademark that they run the risk of losing it forever and becoming a law school textbook example of what not to do. There are already literally thousands of products and services [ab]using their brand, and while some have recently succumbed to the recent batch of legal threats, they may well have more trouble now that people know their rights and the problem is being actively discussed. Furthermore, were it not for being extremely permissive with the Twitter brand from the outset they arguably would not have had anywhere near as large a following as they do now. It is only with the dedicated support of the users and developers they are now actively attacking that they have got as far as they have.

The Problem: A Microblogging Monopoly

Initially it was my position that Twitter had built their brand and deserved to keep it, but that they had gone too far with “tweet”. Then, in the process of writing this story, I re-read the now infamous May The Tweets Be With You post that prompted the USPTO to reject their application hours later, and it changed my mind. Most of the media coverage took the money quote out of context, but here it is in its entirety (emphasis mine):

We have applied to trademark Tweet because it is clearly attached to Twitter from a brand perspective but we have no intention of “going after” the wonderful applications and services that use the word in their name when associated with Twitter.

Do you see what’s happening here? I can’t believe I missed it on the first pass. Twitter are happy for you to tweet to your heart’s content provided you use their service. That is, they have realised that outside of the network effects of having millions of users, all they really do is push 1s and 0s around (and poorly at that). They go on to say:

However, if we come across a confusing or damaging project, the recourse to act responsibly to protect both users and our brand is important.

Today’s batch of microblogging clients are hard-wired to Twitter’s servers and as a result (or vice versa) they have an effective microblogging monopoly. Twitter, Inc has every reason to be happy with that outcome and is naturally seeking to protect it – what better way than to have an officially sanctioned stick with which to beat anyone who dares stray from the path by allowing connections to competitors like identi.ca? That’s exactly what they mean by the “when associated with Twitter” language above, and by “confusing or damaging” they no doubt mean “confusing or damaging [to Twitter, Inc]”.

The Solution: Distributed Social Networking

Distributed social networking and open standards in general (in the traditional rather than the Microsoft sense) are set to change that, but not if the language society uses (and has used for hundreds of years) is granted under an official monopoly to Twitter, Inc – it’s bad enough that they effectively own the @ namespace when there are existing open standards for it. Just imagine if email were a centralised system and everything went through one [unreliable] service – it brings a new meaning to “email is down”! Well, that’s Twitter’s [now not so] secret strategy: to be the “pulse of the planet” (their words, not mine).

Don’t get me wrong – I think Twitter’s great and will continue to twitter and tweet as @samj so long as it’s the best microblogging platform around – but I don’t want to be forced to use it because it’s the only one there is. Twitter, Inc had ample chance to secure “twitter” as a trademark and so far as I am concerned they have long since missed it (despite securing dubious and likely unenforceable registrations). Now they need to compete on a level playing field and focus on being the best service there is.

Update: Before I get falsely accused of brand piracy let me clarify one important point: so far as I am concerned while Twitter can do what they like with their logo (despite continuing to give it away to the entire Internet no strings attached), the words “twitter” and “tweet” are fair game as they have been for the last 700+ years and will be for the next 700. From now on “twitter” for me means “generic microblog” and “tweet” means “microblog update”.

If I had a product interesting enough for Twitter, Inc to send me one of their infamous C&D letters I would waste no time whatsoever in scanning it, posting it here and making fun of them for it. I’m no thief but I am a fervent believer in open standards.

An obituary for Infrastructure as a Product (IaaP)

There’s been an interesting discussion in the Cloud Computing Use Cases group this week following a few people airing grievances about the increasingly problematic term “private cloud”. I thought it would be useful to share my response with you, in which I explain where cloud came from and why it is inappropriate to associate the term “cloud computing” with most (if not all) of the hardware products on the market today.

All is not lost however – where on-site hardware is deployed (and maintained by the provider) in the process of providing a service then the term “cloud computing” may be appropriate. That said, most of what we see in the space today is little more than the evolution of virtualisation, and ultimately box pushing.

Without further ado:

On Sat, Jul 11, 2009 at 2:35 PM, Khürt Williams <khurtwilliams@gmail.com> wrote:

I am not sure I even under what private cloud means given that the Cloud term was meant to refer to how the public Internet was represented on network diagrams.  If it is inside my firewall then how is it “Cloud”?

Amen. The evolution of virtualisation is NOT cloud computing.

A few decades ago network diagrams necessarily contained every node and link because that was the only way to form a connected graph. Then telcos took over the middle part of it and consumers used a cloud symbol to denote anything they didn’t [need to] care about… they just stuffed a packet into one side of the cloud and it would magically appear [most of the time] out of the other. Another way of looking at it (in light of the considerable complexity and cost) is “Here be dragons” – the same applies today, as managing infrastructure is both complex and costly.

Cloud computing is just that same cloud getting bigger, ultimately swallowing the servers and leaving only [part of] the clients on the outside (although with VDI nothing is sacred). Consumers now have the ability to consume computing resources on a utility basis, as they do electricity (you just pay for the connection and then use what you want). Clearly this is going to happen, and probably quicker than you might expect – I admit to being surprised when one of my first cloud consulting clients, Valeo, chose Google Apps for 30,000 users over legacy solutions back in 2007. Early adopters, as usual, will need to manage risk but will be rewarded with significant cost and agility advantages, as well as being immunised to an extent against “digital native” competitors.

You can be sure that when Thomas Edison rocked up 125 years or so ago with his electricity grid there were discussions very similar to those that are going on today. With generators (“Electricity as a Product”) you have to buy them, install them, fuel them, maintain them and ultimately replace them, which sustained a booming industry at the time. We all know how those conversations ended… Eastman Kodak is the only company I know of today still running their own coal fired power station (though we still use generators for remote sites and backup – this will likely also be the case with cloud). Everyone else consumes “Electricity as a Service”, paying a relatively insignificant connection fee and then consuming what they need by the kilowatt hour.

What we have today is effectively “Infrastructure as a Product” and what we’ll have tomorrow is “Infrastructure as a Service” (though I prefer the term “Infrastructure Services” and expect it to be just “infrastructure” again once we’ve been successful and there is no longer any point in differentiating).

Now if legacy vendors work out how to deliver products as services (for example, by using financing to translate capex into opex and providing a full maintenance and support service) then they may have some claim to the “cloud” moniker, but that’s not what I’m seeing today. Most of the “private cloud” offerings are about hardware, software and services (as was the case in the mainframe era) rather than a true utility (per hour) basis. Good luck competing with the likes of Google and Amazon while carrying the on-site handicap – I’m expecting the TCO of “private cloud” setups to average an order of magnitude or so more than their “public” counterparts (that is, $1/hr à la network.com rather than $0.10/hr à la Amazon EC2), irrespective of what McKinsey et al have to say on the subject.

In the context of the use cases, sure, on-premises or “internal” cloud rates a mention, but the “public/private” nomenclature is problematic for more reasons than I care to list. I personally call it “I can’t believe it’s not cloud”, but that’s not to say I leave it out of proofs of concept and pilots… I’m just careful about managing expectations. Ultimately the user and machine interfaces should be the demarcation point for such offerings and everything on the supplier side (including upfront expense) should be of no concern whatsoever to the user. I consider utility billing and the absence of capex to be absolute requirements for cloud computing and feel this ought to be addressed in any such document – suppliers might, for example, offer the complete solution at $1/hr with a 150 concurrent instance minimum (~$100k/month).
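The back-of-envelope behind that bracketed figure, assuming roughly 730 hours in a month:

```python
HOURS_PER_MONTH = 730           # ~24 * 365 / 12

def monthly_commitment(rate_per_hour, minimum_instances):
    """Minimum monthly spend implied by a per-hour rate and an instance floor."""
    return rate_per_hour * minimum_instances * HOURS_PER_MONTH

print(monthly_commitment(1.00, 150))    # 109500.0 -- roughly $100k/month
```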

Oh and if large enterprises want to try their hands at competing with the likes of Google and Amazon by building their own next generation datacenters then that’s fine by me, though I equate it to wanting to build your own coal-fired power station when you should be focusing on making widgets (and it should in any case be done in an isolated company/business unit). I imagine it won’t be long before shareholders will be able to string up directors for running their own infrastructure, as would be the case if they lost money over an extended outage at their own coal-fired power station when the grid was available.

NewsFlash: Trend Micro trademarks the Intercloud™

Don’t worry if you’ve never heard of Trend Micro‘s InterCloud Security Service, which was announced as a beta back on 25 September 2006 (for general availability in 2007):

I hadn’t either until I researched my recent Intercloud post and wrote the Intercloud Wikipedia article (having created the cloud computing Wikipedia article around this time last year). The Intercloud, in case you were wondering, is a global “cloud of clouds” built on top of the Internet, a global “network of networks” – if nothing else it’s a useful term for those of us working on cloud computing interoperability (see: An open letter to the community regarding “Open Cloud”).

Being the cynical type, I thought it prudent to check the US Patent & Trademark Office (USPTO) databases and, surprise, surprise, Trend Micro have pending trademark application #77018125 on the term “INTERCLOUD” in international classes 9 (hardware), 42 (software) and 45 (services). If your company, like mine, is a provider of cloud computing products and services then now’s the time to sit up and pay attention, as this word will almost certainly become part of your everyday vocabulary before too long (as is already the case for companies like Cisco)… much to the chagrin of those who consider it, along with cloud computing in general, just another buzzword rather than the life-changing paradigm shift it is.

Just like Dell’s misguided attempt to secure a monopoly over the term “cloud computing”, which I uncovered last year (see also: InformationWeek: Dell Seeks, May Receive ‘Cloud Computing’ Trademark), this application has proceeded to the Notice of Allowance phase, which essentially means that it is theirs for the taking… all they need to do now is file a Statement of Use within the coming month (or ask for another 6-month extension like the one they were granted in January of this year… strange for a product that appears all but abandoned, à la the infamous Psion Netbook – see: The Register: Blogger fights Psion’s claim to ‘netbook’ name).

Unless we manage to make enough noise to convince the USPTO to have a change of heart (as was the case following the uproar over Dell’s attempt on “cloud computing” – see: Dell Denied: ‘Cloud Computing’ both descriptive and generic), or convince Trend Micro to do the RightThing™ and put it out of its misery (which seems unlikely as Trend Micro, like most vendors, are getting into cloud computing in a big way – see: Trend Micro Bets Company Future on Cloud Computing Offering), they will likely succeed in removing this word from the public lexicon for their own exclusive use.

If, like me, you don’t like that idea either then speak up now or forever hold your peace as unlike patents, trademarks don’t expire so long as they’re in continuous use.

Gartner: VMware the next Novell? Cloud to save the world?

There have been two more-interesting-than-usual posts over at the Gartner blogs today:

Just a Thought; Will VMware become the next Novell?

“VMware owns the market, well above 90%, and continues to come out with more and more innovative products.  VMware has a loyal following of customers who see no reason to change direction – after all, the product works, the vision is sound, and the future is clear.  But lurking in the background is this little thing called hyper-V;  not as robust, or as tested as VMware, with almost no install base, and certainly not ready for prime time in most peoples minds.  However, it will be an integral part of Windows 7, Windows Server 2008 and Windows Server 7 in 2010.”

And here’s my response:

Thanks for an insightful post – I definitely think you’re onto something here, and it’s not the first time I’ve said it either.

The thing is that the hypervisor is already commoditised. Worse, it’s free, and there are various open source alternatives like Sun’s VirtualBox (which just released another major version yesterday). Then you’ve got Xen, KVM, etc. competing directly, as well as physical hardware management tools coming down from above and containers/VPSes eroding share from below. VMs may be all the rage today but the OS is overhead, so there are cloud platforms to think about too…

VMware’s main advantage is having a serious solution today which it can roll out to the large base of enterprise clients they have developed over the last decade. You can bet they’re busy making hay while the sun’s shining as it won’t be long before people realise they’re not the only show in town.

As you say it’s their market to keep, but I’m sure our enterprise clients will be happy to have a thriving competitive marketplace.

And the second:

The Cloud Will Save The World

“So, how does this all add up to the Cloud saving the world? My (admittedly clumsy) interpretation of Tainter is that as the world grows more complex, the only chance we have to head off the disintegration of modern society under the weight of complexity comes through technological leaps in the form of disruptive innovation. The hype around the Cloud provides some justification for the idea that it is disruptive. Yefim Natis and I (mostly Yefim) developed a research note in June that describes what we see as the Killer App – Application Platform-as-a-Service (APaaS) – on the horizon that will result in accelerated disruption.”

And my response:

Cloud computing is set to change the world at least as much as the Internet on which it is based did a few decades ago. Things we never would have imagined possible already are, and we’re just getting started.

That said, proponents of the precautionary principle will be fast to ask whether “disruptive innovation” is in fact “destructive innovation” and whether “accelerated disruption” is in fact “accelerated destruction”.

With accelerating change comes a raft of new risks – I for one would rather live in blissful ignorance than be interrupted by the discovery that the Large Hadron Collider was in fact capable of creating a black hole.

I, for one, welcome our new cloud computing overlords… now if only I had a spare $25k to sign up for the 9-week Graduate Student Program at the Singularity University which is “Preparing Humanity For Accelerating Technological Change“.