Data Wrangling at Slack


For a company like Slack that strives to be as data-driven as possible, understanding how our users use our product is essential.

The Data Engineering team at Slack works to provide an ecosystem to help people in the company quickly and easily answer questions about usage, so they can make better and data informed decisions: Based on a team’s activity within its first week, what is the probability that it will upgrade to a paid team?” or “What is the performance impact of the newest release of the desktop app?”

The Dream

We knew when we started building this system that we would need flexibility in choosing the tools to process and analyze our data. Sometimes the questions being asked involve a small amount of data and we want a fast, interactive way to explore the results. Other times we are running large aggregations across longer time series and we need a system that can handle the sheer quantity of data and help distribute the computation across a cluster. Each of our tools would be optimized for a specific use case, and they all needed to work together as an integrated system.

Read more at Slack Engineering