37 min

EP18 - Best Practices for Modernizing Your Hadoop Workloads to AWS with Dremio
Gnarly Data Waves by Dremio

    • Technology

Many organizations turned to HDFS to address the challenge of storing growing volumes of semi-structured and unstructured data. However, Hadoop never managed to replace the data warehouse for enterprise-grade business intelligence and reporting, and most teams ended up with separate monolithic architectures, including data lakes and data warehouses, with siloed data and analytic workloads. That is why data teams are increasingly considering a data lakehouse architecture that combines the flexibility and scalability of data lake storage with the data management, data governance, and enterprise-grade analytic performance of the data warehouse.

In this episode, Jorge A. Lopez, Product Specialist for Analytics at AWS, and Dremio's Jeremiah Morrow discuss best practices for modernizing analytic workloads from Hadoop to an open data lakehouse architecture, including:

- Choosing the right storage solution for your data lakehouse, and which features and functionality, such as performance, scalability, reliability, and more, you should be evaluating.
- Specific steps and best practices for gradually shifting on-premises workloads to a cloud data lakehouse while ensuring business continuity.
- Consolidating data silos to achieve a complete view of your customer and operational data before, during, and after migration.

See all upcoming episodes: https://www.dremio.com/gnarly-data-wa...

Connect with us!
Twitter: https://bit.ly/30pcpE1
LinkedIn: https://bit.ly/2PoqsDq
Facebook: https://bit.ly/2BV881V
Community Forum: https://bit.ly/2ELXT0W
Github: https://bit.ly/3go4dcM
Blog: https://bit.ly/2DgyR9B
Questions?: https://bit.ly/30oi8tX
Website: https://bit.ly/2XmtEnN

#datalakehouse #data #analytics #datawarehouse #datalake #dataengineers #dataarchitects #governance #infrastructure #dremiocloud #dremiotestdrive #openlakehouse #opendatalakehouse #gnarlydatawaves #apacheiceberg #dremioarctic #datamesh #metadata #modernization #datasharing #migration #ETL #datasilos #selfservice #compliance #dataascode #branches #tags #optimized #automates #datamovement #clustering #metrics #filtering #partitioning #sorting #tableformat #metastore #ApacheArrow #nessie #sonar #dremiosonar #optimization #automaticdata #aws #scalability
