Report Shows Apache Spark Gaining Momentum

Folks in the Big Data and Hadoop communities have been getting increasingly interested in Apache Spark, an open source cluster computing framework for data analytics originally developed in the AMPLab at UC Berkeley. According to Apache, Spark can run programs up to 100 times faster than Hadoop MapReduce in memory, and up to ten times faster on disk. When crunching large data sets, those are big performance differences.

In OStatic’s recent interview with Eucalyptus cloud originator Rich Wolski, he cited Spark and other technologies that compete with MapReduce as very interesting. Databricks and Typesafe have now released survey results that bolster the case that Spark usage is on the rise. Here are the details.
