Joe Hellerstein on Data Wrangling and Reproducibility (Determined Podcast Series)
It is well known that machine learning is powered by data. Unfortunately, the raw data we would like to use to train models is often created and stored in a form that is not machine-consumable. As part of our Determined Podcast Series, Craig and I recently spoke with Joe Hellerstein, a computer science professor at UC Berkeley, a leading researcher in the databases community, and co-founder of the data wrangling company Trifacta. Joe talked about the unique challenges of data preparation and how it blends ideas from software engineering and media editing.
44 min