Folks in the Big Data and Hadoop communities have been getting increasingly interested in Apache Spark, an open source cluster computing framework for data analytics, originally developed in the AMPLab at UC Berkeley. According to Apache, Spark can run programs up to 100 times faster than Hadoop MapReduce in memory, and up to ten times faster on disk. When crunching large data sets, those are big performance differences.
In a recent OStatic interview, Eucalyptus cloud originator Rich Wolski cited Spark and other technologies that compete with MapReduce as very interesting. Databricks and Typesafe have now released survey results that bolster the case that Spark usage is on the rise. Here are the details.