Large enterprises are spending heavily to get the latest Hadoop and other big data infrastructure tools, but it turns out their IT teams are far from prepared to use those tools once they are in place.
That’s one observation from Jeremy Howard, president and chief scientist of Kaggle, which uses crowdsourcing to provide statistical and data analytics services for clients.
“A lot of companies don’t know how to find data scientists, and don’t understand data science,” Howard explained. “These enterprise companies can’t implement a proper data analytical solution because they have no data talent.”
Part of the problem is an overall lack of big data skills in the United States. In May 2011, the McKinsey Global Institute laid out the numbers: “By 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions.”
Howard sees the problem reflected in his company’s clientele. Initially, Kaggle worked with smaller, highly capitalized startups, but now finds itself working with larger enterprise companies.
Startups Do Big Data Better Than Enterprises
The startups, it turns out, are much better equipped to handle big data than the enterprises.
“The startups are usually much closer to the data they’re analyzing,” Howard explained. “They know their stuff, and that knowledge is more centralized within a smaller organization.”
Enterprises, in contrast, are much broader, and that close knowledge of the data is spread much more widely across the organization, he said.