How Open Cloud could have saved Sidekick users’ skins

The cloud computing scandal of the week looks set to be the catastrophic loss of millions of Sidekick users’ data. This is an unfortunate and completely avoidable event that Microsoft’s Danger subsidiary and T-Mobile (along with the rest of the cloud computing community) will surely come to regret very soon.

There are plenty of theories as to what went wrong – the most credible being that a SAN upgrade was botched, possibly by a large outsourcing contractor, and that no backups were taken despite space being available (though presumably not on the same SAN!). Note that while most cloud services exceed the capacity/cost ceiling of SANs and therefore employ cheaper horizontal scaling options (like the Google File System), this is, or should I say was, a relatively small amount of data. As such there is no excuse whatsoever for not having reliable, off-line backups – particularly given that Danger is owned by Microsoft (previously considered one of the “big 4” cloud companies, even by me). It was a paid-for service too (~$20/month or $240/year?), which makes even the most expensive cloud offerings like Apple’s MobileMe look like a bargain (though if it’s any consolation, the fact that the service was paid for rather than free may well come back to bite them by way of the inevitable class action lawsuits).

“Real” cloud storage systems transparently ensure that multiple copies of data are automatically maintained on different nodes, at least one of which is ideally geographically independent. The fact that the term “SAN” appears in the conversation at all suggests that this was a legacy architecture, and one far more likely to fail. Legacy designs are more failure-prone in much the same way that yesterday’s aircraft were less safe than today’s and earlier electricity grids less reliable than current ones (Sidekick apparently predates Android & iPhone by some years after all). It’s hard to say with any real authority what is and what is not cloud computing though, beyond saying that “I know it when I see it, and this ain’t it”.
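To make the replication idea concrete, here is a minimal sketch of the placement rule described above: every write goes to several distinct nodes, and at least one of them sits in a different region from the primary. The `Node` class, the `place_replicas` and `write` functions, the region names and the in-memory dictionaries standing in for storage nodes are all my own illustrative assumptions, not how Danger, Google or anyone else actually does it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A hypothetical storage node; `region` is an assumed label such as a data centre."""
    name: str
    region: str


def place_replicas(nodes, copies=3):
    """Pick `copies` distinct nodes, at least one outside the primary's region."""
    if copies < 2:
        raise ValueError("need at least two copies for redundancy")
    if len(nodes) < copies:
        raise ValueError("not enough nodes for the requested number of copies")

    primary = nodes[0]
    # Choose the off-site replica first so the geographic guarantee always holds.
    remote = next((n for n in nodes if n.region != primary.region), None)
    if remote is None:
        raise ValueError("no node outside the primary's region")

    chosen = [primary, remote]
    for node in nodes:
        if len(chosen) == copies:
            break
        if node not in chosen:
            chosen.append(node)
    return chosen


def write(stores, nodes, key, value, copies=3):
    """Write `value` to every chosen replica; `stores` maps node name -> dict."""
    targets = place_replicas(nodes, copies)
    for node in targets:
        stores[node.name][key] = value
    return targets
```

With a rule like this, losing any single node (or a whole site) still leaves a readable copy elsewhere, which is exactly the property a lone SAN cannot offer.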

Whatever the root cause, the result is the same – users who were given no choice but to store their contacts, calendars and other essential day-to-day data on Microsoft’s servers appear to have irretrievably lost it. Friends, family, acquaintances and loved ones – even (especially?) the boy/girl you met at the bar last night – may be gone for good. People will miss appointments, lose business deals and in the most extreme cases could face real hardship as a result (for example, I’m guessing parole officers don’t take kindly to missed appointments with no contact!). The cost of this failure will (at least initially) be borne by the users, and yet there was nothing they could have done to prevent it short of choosing another service or manually transcribing their details.

The last hope for them is that Microsoft can somehow reverse the caching process in order to remotely retrieve copies from the devices (which are effectively dumb terminals) before they lose power; good luck with that. While synchronisation is hard to get right, having a single cloud-based “master” and a local cache on the device (as opposed to a full, first-class citizen copy) is a poor design decision. I have an iPhone (actually I have a 1G, 3G, 3GS and an iPod Touch) and they’re all synchronised together via two MacBooks and in turn to both a Time Machine backup and Mozy online backup. As if that’s not enough all my contacts are in sync with Google Apps’ Gmail over the air too so I can take your number and pretty much immediately drop it in a beer without concern for data loss. Even this proprietary system protects me from such failures.

The moral of the story is that externalised risk is a real problem for cloud computing. Most providers [try to] avoid responsibility by way of terms of service that strip away users’ rights. It’s a difficult problem to solve, though, because enforcing liability for anything but gross negligence can exclude smaller players from the market. That is why users absolutely must have control over their data and be encouraged, if not forced, to take responsibility for it.

Open Cloud simply requires open formats and open APIs – that is to say, users must have access to their data in a transparent format. Even if it doesn’t make sense to maintain a local copy on the user’s computer, there’s nothing stopping providers from pushing it to a third party storage service like Amazon S3. In fact it makes a lot of sense for applications to be separated from storage entirely. We don’t expect our operating system to provide all the functionality we’ll ever need (or indeed, any of it), so we install third party applications which use the operating system to store data. What’s to stop us doing the same in the cloud, for example having Google Apps and Zoho both saving back to a common Amazon S3 store which is in turn replicated locally or to another cloud-based service like Rackspace Cloud Files?
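As a rough illustration of separating the application from its storage, the sketch below exports contacts in an open format and pushes the same payload to more than one independent store. Everything in it is an assumption of mine for the sake of example: JSON stands in for any open format (vCard would be the obvious choice for contacts), and `DirectoryStore` is a hypothetical local stand-in for whatever `put`/`get` interface a real service such as Amazon S3 or Rackspace Cloud Files would sit behind.

```python
import json
from pathlib import Path


class DirectoryStore:
    """Hypothetical backend: a local directory standing in for any storage service."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, key, data: bytes):
        (self.root / key).write_bytes(data)

    def get(self, key) -> bytes:
        return (self.root / key).read_bytes()


def export_contacts(contacts):
    """Serialise contacts to an open, human-readable format (JSON here)."""
    return json.dumps(contacts, indent=2, sort_keys=True, ensure_ascii=False).encode("utf-8")


def backup(contacts, stores, key="contacts.json"):
    """Push the same open-format payload to every store, whoever runs it."""
    payload = export_contacts(contacts)
    for store in stores:
        store.put(key, payload)
    return payload


def restore(stores, key="contacts.json"):
    """Return the first readable copy, skipping stores that are missing or corrupt."""
    for store in stores:
        try:
            return json.loads(store.get(key).decode("utf-8"))
        except (OSError, ValueError):
            continue
    raise LookupError("no store holds a readable copy")
```

Because any application that speaks the same format can read the result, losing one provider (or one store) becomes an inconvenience rather than a catastrophe.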

In any case perhaps it’s time for us to dust off and revisit the Cloud Computing Bill of Rights?