Amundsen: one year later (Lyft Engineering)

October 8, 2020


On October 30, 2019, we officially open sourced Amundsen, our solution to metadata catalog and data discovery challenges. Ten months later, Amundsen joined the Linux Foundation AI (LF AI) as an incubation project.

In almost every modern data-driven company, each interaction with the platform is powered by data. As data resources keep growing, it becomes increasingly difficult to understand what data resources exist, how to access them, and what information they contain without relying on tribal knowledge. A poor understanding of data leads to bad data quality, low productivity, duplicated work, and, most importantly, a lack of trust in the data. The complexity of managing a fragmented data landscape is not unique to Lyft; it is a common problem throughout the industry.

In a nutshell, Amundsen is a data discovery and metadata platform that improves the productivity of data analysts, data scientists, and engineers when they interact with data. It indexes data resources (tables, dashboards, users, etc.) and powers a PageRank-style search based on usage patterns (e.g. highly queried tables show up earlier than less-queried tables), so these users can address their data needs faster.
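To make the idea of usage-based ranking concrete, here is a minimal sketch in Python. It is not Amundsen's implementation and is not a true PageRank computation: it simply filters tables by a search term and orders the matches by a usage signal. The `Table` class, the `query_count` field, and the `score` and `search` functions are all hypothetical names introduced for illustration.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Table:
    name: str
    description: str
    query_count: int  # hypothetical usage signal, e.g. number of recent queries


def score(table: Table, term: str) -> float:
    """Return 0.0 for a non-matching table, else a usage-weighted score."""
    text = f"{table.name} {table.description}".lower()
    if term.lower() not in text:
        return 0.0
    # log1p dampens the influence of extremely popular tables (an assumption,
    # chosen so that popularity helps without fully dominating the ranking).
    return 1.0 + math.log1p(table.query_count)


def search(tables: List[Table], term: str) -> List[str]:
    """Return names of matching tables, most heavily used first."""
    scored = [(score(t, term), t.name) for t in tables]
    matches = [(s, name) for s, name in scored if s > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in matches]


# Illustrative catalog; the table names and counts are made up.
catalog = [
    Table("rides_daily", "daily aggregate of rides", 1200),
    Table("rides_raw", "raw rides events", 40),
    Table("drivers", "driver profiles", 900),
]
results = search(catalog, "rides")
print(results)
```

In this toy catalog, both `rides_daily` and `rides_raw` match the term, and the more frequently queried `rides_daily` is listed first. A production system would typically delegate both matching and scoring to a search engine rather than scanning a list in memory.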
