Grant is a data scientist turned developer with over a decade of experience building data products at companies including Capital One and Walmart. He is currently a Founding Engineer at Stemma and a maintainer of Amundsen, the leading open source data catalog. Grant loves to help others get more value out of their data and previously founded Tree Schema, the first product-led data catalog.
Data catalogs have been overly focused on data users while shunning the needs of software engineers and, specifically, data engineers. The core features in all data catalogs (metadata capture, tagging, and lineage, to name a few) are skewed toward a UI-based search and discovery paradigm. Fundamentally, these capabilities support data users but offer relatively little value for data creators (data engineers, software developers), which has led to the data catalog becoming purely reactive software that scales with the number of users and not the amount of data. This talk will discuss how pre-existing data sets within the data catalog can support operational use cases and enable the data creator to be more efficient.