May 21, 2013

Linux Ranks Among Top Skills for Big Data Jobs

The open source Apache Hadoop framework sits firmly at the center of the Big Data trend, so it only makes sense that Hadoop skills are among those most sought after by hiring managers in search of Big Data talent.

It's not just Hadoop skills they're looking for, however. In fact, it's Hadoop plus an assortment of other, related skills that many recruiters are hoping to find, according to a new report from IT careers site Dice, which analyzed the searches conducted by hiring managers using its new Open Web sourcing tool.

“Hadoop may have been named after a stuffed toy elephant, but it’s not child’s play to recruiters and hiring managers looking for 'Big Data' talent,” wrote Howard Lee, chief architect of Dice's Open Web, in a “Mad Skills” report on Monday. “It’s big game, but not the only game hiring managers are searching to meet their recruiting needs.”

Some 4.4 million IT jobs are expected to be created globally by 2015 to support Big Data, Gartner predicts.

'More Than $100,000'

Java is by far the most commonly sought skill paired with Hadoop, which makes perfect sense given that Hadoop itself is Java-based. Next in line among specific skills, however, is NoSQL, Dice reported, adding that “professionals with Hadoop and NoSQL experience pulled in more than $100,000 on average” in a recent Dice Salary Survey.

Further down the list are MapReduce, Pig and Linux, as well as Python, Hive and Scala.

“Since the inception of Hadoop, it has been optimized to run primarily on the Linux operating system,” Dice's Lee told Linux.com. “Although more recently supporting Windows, familiarity with Linux is a major plus in getting Hadoop up and running.”

In fact, “Linux has been a major element of many of the largest deployments of big data platforms and is often the first operating system supported with new big data software releases,” he added. “System administrators need to understand the technical challenges involved, and learn how to optimize their Linux environments for processing massive data sets.”

'Very Complementary'

Indeed, “it makes sense that Linux expertise and experience is in high demand around Big Data,” agreed Jay Lyman, senior analyst for enterprise software at 451 Research.

“Many, if not most, of today's database, data clustering, data management and data analysis technologies and tools are open source, including MySQL, PostgreSQL and NoSQL databases, Memcached, Cassandra and of course, Hadoop,” Lyman told Linux.com. “There are technical reasons in source code and flexibility and also cultural reasons in collaboration and openness that make Linux skills very complementary to leveraging these Big Data technologies.”