Twitter’s Open Source Big Data Tool Comes to the Cloud Courtesy of Nodeable

Usually when we think of a pivot, we think of a company that has decided to drop its core offering and market a different product or service. Obvious Corporation put Odeo up for sale and focused on Twitter. Burbn shuttered its location check-in service and became Instagram. But Nodeable’s pivot isn’t that sort of pivot.

Today Nodeable launched a new service called StreamReduce, a cloud-hosted real-time big data analytics product. StreamReduce is based on the same architecture as Nodeable’s existing IT operations monitoring tool. The company is keeping its current service, but is expanding its scope by marketing beyond its current base of developers and system administrators.

At the heart of StreamReduce is Storm, a real-time analytics engine originally developed at BackType, a company that Twitter acquired last year. After the acquisition, Twitter allowed lead developer Nathan Marz to finish the project and open source it. Twitter is now using Storm internally.

StreamReduce is essentially Storm hosted in the cloud, with a few extras such as connectors to Apache Hadoop. Nodeable CEO Dave Rosenberg explains that Storm is meant to complement, not replace, Hadoop. Hadoop is great for running analytics on huge data sets that you’ve already collected, but it’s not good for processing streams of incoming data. That’s where Storm and StreamReduce come in.
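
To make the batch-versus-stream distinction concrete, here is a minimal Python sketch. It uses neither Hadoop nor Storm, and the function names `batch_count` and `stream_count` are hypothetical, chosen only for illustration. The first function needs the whole data set up front, while the second updates its answer as each event arrives.

```python
from collections import Counter
from typing import Iterable, Iterator


def batch_count(events: Iterable[str]) -> Counter:
    """Batch style: the data set is already collected, so tally it in one pass."""
    return Counter(events)


def stream_count(events: Iterable[str]) -> Iterator[Counter]:
    """Streaming style: keep a running tally and emit a snapshot after every event."""
    running: Counter = Counter()
    for event in events:
        running[event] += 1
        # Copy so that each emitted snapshot is not changed by later events.
        yield running.copy()


if __name__ == "__main__":
    # Illustrative event names only; not taken from any real log.
    events = ["login", "error", "login"]

    print("batch result:", batch_count(events))

    for snapshot in stream_count(iter(events)):
        print("running result:", snapshot)
```

In the streaming version an up-to-date answer is available after every event, which is roughly the property that real-time engines aim to provide at much larger scale.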

Storm isn’t the only project trying to solve the big data streaming problem. Apache S4 is an open source project originally developed by Yahoo that provides similar functionality, and HStreaming offers a proprietary product that adds real-time capabilities to Hadoop. But Storm is the project that seems to be gaining the most traction. For example, the contact management startup FullContact chose Storm over other options.

“We wanted to try open source first, as it keeps our options wide open should we want to change technologies. That ruled out HStreaming,” explains FullContact CTO Dan Lynn. “S4 was very interesting, but I didn’t get the impression that it had captured the enthusiasm of the developer community as well as Storm.” Nodeable chose to use Storm as its base for similar reasons.

Nodeable launched last year as a challenger to Splunk, the big data company that went public earlier this year. Splunk sells an on-premises tool for collecting and analyzing large data sets. It’s become best known for handling machine-generated data, mostly system log files from servers, but it can be used for pretty much any data set.

 

Read more at TechCrunch