Google is moving the goalposts significantly in the market for big data tools, at least for organizations that can work with its canned tools and are willing to trust the search giant with their data. After some time in a limited preview, Google has opened Google BigQuery to the public. Developers can query up to 100GB of data per month for free, and can store up to 2TB of data without having to contact sales at all, which sets a very low bar for working with big data.
Google's BigQuery is a Platform-as-a-Service (PaaS) for working with "massive datasets" that can be in the billions of rows. It has a SQL-like query language, and promises to analyze large data sets "in seconds." Note that organizations that want a Google-hosted SQL database can tap the Cloud SQL offering.
Commodity Big Data
What's most interesting about BigQuery is that it provides big-data analytics in a completely hosted offering. Organizations don't have to build out the hardware for a big-data infrastructure. They don't need to worry about setting up Hadoop or any other software. It's big data available instantly, and at a fairly affordable price.
Google is charging for storage and for the queries processed. Storage is priced at $0.12 per GB per month, up to 2TB. This is more or less the same price as Google Cloud Storage, except there's no drop in price after the 1TB tier.
Queries are priced at $0.035 per GB processed, with limits of 1,000 queries per day and 20TB of data processed per day. Note that this pricing applies after the 100GB/month free tier. And you're charged only for the data in the columns a query processes, not for the entire table.
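To make the pricing concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the prices and the free tier quoted above; the function names, the sample table, and its column sizes are hypothetical, and the sketch assumes the free 100GB is simply subtracted from the month's processed total before billing.

```python
# Prices and free tier as quoted in the article; names are illustrative only.
STORAGE_USD_PER_GB_MONTH = 0.12
QUERY_USD_PER_GB = 0.035
FREE_QUERY_GB_PER_MONTH = 100


def storage_cost(stored_gb):
    """Monthly storage charge in USD for stored_gb gigabytes."""
    return stored_gb * STORAGE_USD_PER_GB_MONTH


def query_cost(processed_gb_this_month):
    """Monthly query charge in USD, after subtracting the free tier."""
    billable_gb = max(0, processed_gb_this_month - FREE_QUERY_GB_PER_MONTH)
    return billable_gb * QUERY_USD_PER_GB


def processed_gb(column_sizes_gb, columns_in_query):
    """GB billed for one query: only the columns the query reads count."""
    return sum(column_sizes_gb[name] for name in columns_in_query)


if __name__ == "__main__":
    # Hypothetical table: column name -> size in GB.
    table = {"url": 40, "timestamp": 8, "user_agent": 25, "status": 2}
    per_query = processed_gb(table, ["timestamp", "status"])  # 10 GB, not 75
    monthly = per_query * 110  # assume 110 such queries in a month
    print(f"storage: ${storage_cost(sum(table.values())):.2f}")
    print(f"queries: ${query_cost(monthly):.2f}")
```

The point of the last function is that a query touching two narrow columns is billed for those columns alone, so selecting only the columns you need keeps costs down.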
To work with data, you have three options: a browser-based query tool, a Python command-line tool, and a REST API.
As an off-the-shelf service, it's going to be a bit less flexible than what developers could get out of a tool built with Hadoop, Hive, etc. However, it is likely to be effective for quite a few organizations and developers who need big-data tools quickly and can work within the limitations of BigQuery. Data journalists, for example, might find BigQuery quite useful in working with home-grown data sets rather than having to build out their own query tools.
There's also the matter of data privacy. The Terms of Service (ToS) give each party full control of its own intellectual property, so Google should have no rights to the data being studied using BigQuery. Nevertheless, any organization that wants to keep its data private is going to think twice before putting it into Google's BigQuery.
Flexibility and privacy aside, this is going to fill a niche very handily. There's a lot of nonsensitive data that developers might want to crunch, without having to create their own big-data toolset. BigQuery looks like a decent solution for situations when a commodity tool will fit the bill.