Live from Apache Big Data: A 5-Point System for Data Project Success [Video]


It takes a village to make data projects work – and the most important members of the team may not have anything to do with the data science itself.

That’s been the experience of Amy Gaskins, a data scientist with more than 10 years of experience as a senior intelligence analyst supporting various agencies within the United States Intelligence Community and Department of Defense. She delivered the final keynote this morning at the Apache Big Data conference in Vancouver.

Gaskins took the crowd through three diverse projects she’s worked on – with the Department of Defense, MetLife and the National Oceanic and Atmospheric Administration (NOAA) – and highlighted the successes and failures of each project.

And in each one, it was the subject matter experts (SMEs), not the data scientists, who were the glue holding the whole project together.

“It’s the non-data SMEs that prevent IT and business from fighting each other,” Gaskins said. “It’s like magic, and I don’t say that lightly.”

SMEs are just one piece of a five-point system that Gaskins believes gives data projects the best chance of success. The full system is:

  • Buy-in – Needed from senior leadership, middle management and the workers themselves.

  • Urgency – Everyone needs to understand that there is an existential threat to the business if the project isn’t done.

  • Transparency – People inside and outside the organization need to know what’s being done, and why. This also means the work can be repeated.

  • Non-data-science SMEs – These are the people who know the gritty details of how things work in their field, and how things actually get done.

  • Psychological safety – Your cross-functional team members must be able to trust each other.

“All of these facets need to be continuously tended,” Gaskins said. “It’s a system, and any part of the system can collapse at any time.”

Of the three projects she highlighted from personal experience, two succeeded by hitting all five marks, and each relied heavily on the subject matter experts involved in the process.

She worked with the 43rd Sustainment Brigade in Afghanistan to learn more about corruption in the southern part of the country, looking to stem the amount of financial aid that found its way into the hands of the Taliban.

Gaskins said the unit was wildly understaffed and had to recruit everyone from machine gunners to truck drivers to help with the process. So her team created a training system to bring new people on board, and was able to “train soldiers, combine the new data with existing sources, analyze it, report it up and out, learn from it and then train others.”

At MetLife’s Dubai office, her team created an automated insurance fraud prevention solution that delivered a 400 percent return on investment. Yet again, a crucial piece was the SMEs – in this case, insurance claims adjusters.

“One of the really critical things to getting this done was understanding the knowledge in each of the claims adjusters’ heads,” she said. “We wanted to make sure we gathered that knowledge; they had to be part of the hypotheses process.”

The third project, opening and commercializing NOAA’s weather data, suffered from a lack of buy-in and urgency from the government agency’s political leadership, according to Gaskins. But the scientists were all-in on the effort to open up the data and drove the successes the project did have.

“It was a team of volunteers, so it was a team of people very passionate about getting it done,” Gaskins said. “It was an egalitarian style team with no titles, which allowed everyone to make decisions very easily. We were open, transparent, and this made the team really safe. [The participants] said it was unlike any other government team they’d worked on.”