Build a Hadoop Cluster in AWS in Minutes


Check out this process that will let you get a Hadoop cluster up and running on AWS in two easy steps.

I use Apache Hadoop to process huge data loads. Setting up Hadoop on a cloud provider such as AWS involves spinning up a group of EC2 instances, configuring the nodes to talk to each other, installing software, editing the configuration files on the master and data nodes, and starting services.
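
To give a feel for just one of those steps, the config files, here is a minimal Python sketch that renders a `core-site.xml` and a `workers` file from a list of hostnames. It is an illustration under stated assumptions, not the tooling described in this article: the function names, the example hostnames, and port 9000 are hypothetical choices, and the `workers` file name applies to Hadoop 3 (Hadoop 2 calls it `slaves`).

```python
from typing import Sequence
from xml.sax.saxutils import escape


def render_core_site(master_host: str, port: int = 9000) -> str:
    """Return a minimal core-site.xml that points clients at the NameNode.

    The port is an assumption; use whatever your NameNode listens on.
    """
    return (
        '<?xml version="1.0"?>\n'
        "<configuration>\n"
        "  <property>\n"
        "    <name>fs.defaultFS</name>\n"
        f"    <value>hdfs://{escape(master_host)}:{port}</value>\n"
        "  </property>\n"
        "</configuration>\n"
    )


def render_workers(data_nodes: Sequence[str]) -> str:
    """Return the workers file contents: one data node hostname per line."""
    return "".join(f"{host}\n" for host in data_nodes)


if __name__ == "__main__":
    # Hypothetical hostnames, for illustration only.
    master = "hadoop-master.example.internal"
    data_nodes = [f"hadoop-data-{i}.example.internal" for i in range(1, 4)]
    print(render_core_site(master))
    print(render_workers(data_nodes))
```

Even this small piece has to be repeated and kept consistent on every node, which is where doing it by hand starts to hurt.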

The whole setup was a good candidate for automation, and I wanted that automation to answer these questions:

  • How do I build the cluster in minutes (as opposed to hours and maybe even days for a large number of data nodes)?

