37 min

3x26: DataOps - Putting the Data in Data Science
Utilizing Tech Season 6 - Utilizing AI

The quality of an AI application depends on the quality of the data that feeds it. Sunil Samel joins Frederic Van Haren and Stephen Foskett to discuss DataOps and the importance of data quality. Data-centric AI requires attention to every aspect of the data pipeline, from storing, transporting, and understanding the data to controlling access and cost. We must look at the data needed to train our models, think about the desired outcomes, and consider the sources and pipeline needed to get that result. We must also decide how to define quality: Do we need a variety of data sources? Should we reject some data? How does the modality of the data change this definition? Is there bias in what is included and excluded? Data pipelines are usually simple: they ingest and store data from the source, slice and prepare it, and present it for processing. But DataOps recognizes that the pipeline can get very complicated, and that it requires an understanding of all these steps as well as adaptation from development to production.

Three Questions:


Frederic: Do you think we should expect another AI winter?
Stephen: When will we see a full self-driving car that can drive anywhere, any time?
Mike O'Malley, Seneca Global: Can you give an example where an AI algorithm went terribly wrong and gave a result that clearly wasn’t correct?

Guests and Hosts


Sunil Samel, VP of Products at Akridata. Connect with Sunil on LinkedIn or email him at sunil.samel@akridata.com.
Frederic Van Haren, Founder at HighFens Inc., Consultancy & Services. Connect with Frederic on Highfens.com or on Twitter at @FredericVHaren.
Stephen Foskett, Publisher of Gestalt IT and Organizer of Tech Field Day. Find Stephen’s writing at GestaltIT.com and on Twitter at @SFoskett.

Date: 3/29/2022 Tags: @SFoskett, @FredericVHaren

