Virtual Google Search Appliance is here…

I’ve been quiet of late as I’ve been busy racking up the frequent flier miles over the last month or two, but I’m back (albeit busy) and will endeavour to work through a backlog of posts, even if that means spending less time on them and leaving the Pulitzer Prize to someone else. While I wait for the download to finish, I thought I’d let you know about today’s announcement of a Google Search [Virtual] Appliance (which I’ve been hanging out for, under NDA, since 2006!):

Ever wanted to write code against Google search technology, test your apps, and see how it all integrates into your development environment without having to pay a thing? If you’re an IT administrator, you’ll have that chance with the new virtual edition of the Google Search Appliance. The Google Search Appliance virtual edition is for non-commercial, development purposes only, and gives developers the opportunity to test against the features of the physical Google Search Appliance.

The Google Search Appliance virtual edition provides a free test bed for the Google Search Appliance – our solution for securely searching enterprise content behind the corporate firewall – helping ensure a smooth transition to the production-ready hardware. If your organization is considering adopting an enterprise search solution, the virtual edition platform gives your team the flexibility to build applications against the Google Search Appliance, try different configuration scenarios, explore proofs-of-concept and test the APIs supported by Google enterprise search technology. As part of testing with the virtual edition, you can:

These features might come in handy, particularly if your existing environment contains the array of legacy systems, databases, servers and integration architecture typical of most large organizations. And because it’s free, your boss might give you an extra week’s vacation just for trying it out (don’t quote us on that). You can download Google Search Appliance virtual edition software onto any server that is supported by VMWare virtualization. To learn more and get started, click here. And since we always love feedback, feel free to drop by our developer community or send your thoughts to

Well, the download is almost done, but I’m not holding my breath: it wants 3GB of RAM, and I’ve only got 2GB because I didn’t have the patience to wait for Apple to custom build a 4GB MacBook for me the other week. I wonder what it would take to get it up and running on a large instance of EC2?

Update: It works (albeit slowly), and it looks surprisingly standard (Linux 2.6.20 – CentOS 5 I think); maybe EC2’s not out of the realm of possibility after all:

Update 2: Having kicked the tires for a while, I’m already thinking about the possibilities. Now that the GSA has broken free of its shackles to expensive, proprietary hardware, the world is its oyster, and while the license prohibits production use, that’s an administrative rather than a technical hurdle. Two obvious (if optional) hurdles remain: locking down the licensing (currently the MAC address is mashed up and digitally signed along with various feature and URL count restrictions, but MAC addresses are malleable with virtual machines) and ensuring performance meets acceptable standards on uncontrollable (virtual) hardware. That said, expect to see something happen in this area, as the competition is already offering free, downloadable search solutions; indeed, I wouldn’t be surprised if there were already virtual GSAs in production.
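To make the licensing point concrete, here is a rough sketch of the general idea of a MAC-bound, signed license. It is emphatically not the GSA's actual format or algorithm: the function names, the field names, the key and the use of an HMAC (standing in for a proper public-key signature, which the Python standard library doesn't provide) are all assumptions made purely for illustration.

```python
import hashlib
import hmac
import json

# Hypothetical vendor secret. A real scheme would more likely use a
# public-key signature so the appliance never holds the signing key.
VENDOR_KEY = b"example-signing-key"


def normalize_mac(mac: str) -> str:
    """Lower-case a MAC address and use ':' separators."""
    digits = mac.lower().replace("-", "").replace(":", "")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def issue_license(mac: str, features: list, max_urls: int) -> dict:
    """Bundle the MAC with the restrictions and sign the whole payload."""
    payload = {
        "mac": normalize_mac(mac),
        "features": sorted(features),
        "max_urls": max_urls,
    }
    blob = json.dumps(payload, sort_keys=True).encode()
    signature = hmac.new(VENDOR_KEY, blob, hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature}


def verify_license(lic: dict, current_mac: str) -> bool:
    """Accept only an untampered license whose MAC matches this machine."""
    blob = json.dumps(lic["payload"], sort_keys=True).encode()
    expected = hmac.new(VENDOR_KEY, blob, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, lic["signature"]):
        return False
    return lic["payload"]["mac"] == normalize_mac(current_mac)


# Illustrative values only: the MAC, feature name and URL count are made up.
lic = issue_license("00-0C-29-AB-CD-EF", ["search"], 50000)
licensed_vm_ok = verify_license(lic, "00:0c:29:ab:cd:ef")
other_vm_ok = verify_license(lic, "00:0c:29:00:00:01")
```

The signature stops anyone from editing the feature list or URL count, but the check only compares against whatever MAC the machine reports. Any virtual machine configured with the licensed MAC passes, which is exactly why MAC binding is a weak lock on virtual hardware.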

I’d really like to see Google supported for Australian Online Solutions’ upcoming CloudSearch product, so getting it up and running on EC2 would be nice, even if only to prove the concept. Assuming there are no non-standard kernel hacks, migration shouldn’t be that hard, and even if there were, they would have to be released under the terms of the GPL per my thus far unanswered public request. That said, user-selectable kernels (AKIs) and ramdisks (ARIs) on Amazon’s EC2 are currently only available to Amazon and a select few others, so said modifications (if any) would have to be injected via a loadable module for now.

Watch this space…