DataTalks.Club

DataTalks.Club - the place to talk about data!

  1. 6D AGO

    Foundations of Analytics Engineer Role: Skills, Scope, and Modern Practices - Juan Manuel Perafan

    In this talk, Juan Manuel Perafan, analytics engineer and author of Fundamentals of Analytics Engineering, shares his professional journey from studying psychological research in Colombia to becoming one of the first analytics engineers in the Netherlands. We explore the evolution of the role, the shift toward engineering rigor in data modeling, and how the landscape of tools like dbt and Databricks is changing the way teams work. You’ll learn about: - The fundamental differences between traditional BI engineering and modern analytics engineering. - How to bridge the gap between business stakeholders and technical data infrastructure. - The technical "glue" that connects Python and SQL for robust data pipelines. - The importance of automated testing (generic vs. singular tests) to prevent "silent" data failures. - Strategies for modeling messy, fragmented source data into a unified "business reality." - The current state of the "Lakehouse" paradigm and how it impacts storage and compute costs. - Expert advice on navigating the dbt ecosystem and its emerging competitors.
Links: - DE Course: https://github.com/DataTalksClub/data-engineering-zoomcamp - Luma: https://luma.com/0uf7mmup TIMECODES: 0:00 Juan’s psychological research and transition to data 4:36 Riding the wave: The early days of analytics engineering 7:56 Breaking down the gap between analysts and engineers 11:03 The art of turning business reality into clean data 16:25 Why data engineering is about safety, not just speed 20:53 Reimagining data modeling in the modern era 26:53 To split or not to split: Finding the right team roles 30:35 Python, SQL, and the technical toolkit for success 38:41 How to stop manually testing your data dashboards 46:34 Bringing software engineering rigor to data workflows 49:50 Must-read books and resources for mastering the craft 55:42 The future of dbt and the shifting tool landscape 1:00:29 Deciphering the lakehouse: Warehousing in the cloud 1:11:16 Pro-tips for starting your data engineering journey 1:14:40 The big debate: Databricks vs. Snowflake 1:18:28 Why every data professional needs a local community This talk is designed for data analysts looking to level up their engineering skills, data engineers interested in the business-logic layer, and data leaders trying to structure their teams more effectively. It is particularly valuable for those preparing for the Data Engineering Zoomcamp or anyone looking to transition into an Analytics Engineering role.
Connect with Juan - Linkedin - https://www.linkedin.com/in/jmperafan/ - Website - https://juanalytics.com/ Connect with DataTalks.Club: - Join the community - https://datatalks.club/slack.html - Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ - Check other upcoming events - https://lu.ma/dtc-events - GitHub: https://github.com/DataTalksClub - LinkedIn - https://www.linkedin.com/company/datatalks-club/ - Twitter - https://twitter.com/DataTalksClub - Website - https://datatalks.club/

    1h 24m
  2. FEB 6

    AI Engineering: Skill Stack, Agents, LLMOps, and How to Ship AI Products - Paul Iusztin

    In this episode of DataTalks.Club, Paul Iusztin, founding AI engineer and author of the LLM Engineer’s Handbook, breaks down the transition from traditional software development to production-grade AI engineering. We explore the essential skill stack for 2026, the shift from "PoC purgatory" to shipping real products, and why the future of the field belongs to the full-stack generalist. You’ll learn about: - Why the role is evolving into the "new software engineer" and how to own the full product lifecycle. - Identifying when to use traditional ML (like XGBoost) over LLMs to avoid over-engineering. - The architectural shift from fine-tuning to mastering data pipelines and semantic search. - Reliable agentic workflows. - How to use coding assistants like Claude and Cursor to act as an architect rather than just a coder. - Why human-in-the-loop evaluation is the most critical bottleneck in shipping reliable AI. - How to build a "Second Brain" portfolio project that proves your end-to-end engineering value. Links: - Course link: https://academy.towardsai.net/courses/agent-engineering?ref=b3ab31 - Decoding AI Magazine: https://www.decodingai.com/ TIMECODES: 00:00 From code to cars: Paul’s journey to AI 07:08 Deep learning and the autonomous driving challenge 12:09 The transition to global product engineering 15:13 Survival guide: Data science vs. AI engineering 22:29 The full-stack AI engineer skill stack 29:12 Mastering RAG and knowledge management 32:27 The generalist edge: Learning with AI 42:21 Technical pillars for shipping AI products 54:05 Portfolio secrets and the "second brain" 58:01 The future of the LLM engineer’s handbook This talk is designed for software engineers, data scientists, and ML engineers looking to move beyond proof-of-concepts and master the engineering rigor of shipping AI products in a production environment. It is particularly valuable for those aiming for founding or lead AI roles in startups.
Connect with Paul - Linkedin - https://www.linkedin.com/in/pauliusztin/ - Website - https://www.pauliusztin.ai/ Connect with DataTalks.Club: - Join the community - https://datatalks.club/slack.html - Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ - Check other upcoming events - https://lu.ma/dtc-events - GitHub: https://github.com/DataTalksClub - LinkedIn - https://www.linkedin.com/company/datatalks-club/ - Twitter - https://twitter.com/DataTalksClub - Website - https://datatalks.club/

    1h 7m
  3. JAN 9

    Applying ML: An Ongoing Personal Journey

    In this talk, Rileen, a Senior Computational Biologist and Cancer Data Scientist, shares his professional journey from physics and computer science to cutting-edge cancer genomics and applied machine learning. From his early work in alternative splicing models to deep learning in medical imaging, Rileen explains how biology, data science, and AI intersect to transform cancer research. TIMECODES: 00:00 Rileen's Career Journey and Education 06:14 Understanding Alternative Splicing in Computational Biology 10:56 Modeling Alternative Splicing with Machine Learning 14:52 Model Error Analysis and Transition to Cancer Research 18:37 What Is Cancer? Mutational Theory Explained 21:45 Cancer Treatments and Causes 24:57 Cancer Genomics and Tumor Models 28:59 Comparing Cell Lines and Tumor Samples (Multi-omics Analysis) 32:32 Machine Learning Applications in Cancer Research 35:38 Deep Learning for Medical Imaging and Pathology 39:17 Data Privacy and Applied ML Course Projects 42:50 Learning Outcomes and Future Plans 46:36 Industry Experience in Pharmaceutical Research 50:14 Day in the Life of a Computational Biologist 55:02 Advice for Current ML Students 58:40 Project Management and Challenges in Genomics 1:02:23 Public Data Sets and Cancer Research in Germany Connect with Rileen: - Twitter - https://x.com/RileenSinha - Linkedin - https://www.linkedin.com/in/rileen-sinha-a644692/ - Github - https://github.com/Optimistix Connect with DataTalks.Club: - Join the community - https://datatalks.club/slack.html - Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ - Check other upcoming events - https://lu.ma/dtc-events - GitHub: https://github.com/DataTalksClub - LinkedIn - https://www.linkedin.com/company/datatalks-club/ - Twitter - https://twitter.com/DataTalksClub - Website - https://datatalks.club/

    1h 5m
  4. 12/12/2025

    Building Pet Health Tech: ML, Sensors, and Dog Behavior Data

    In this session, Sofya shares her journey building a pet-tech startup that blends machine learning, sensor data, and canine behavior analytics. She walks through her path from early programming explorations to launching a health monitoring device designed around anomaly detection and long-term behavioral baselines. TIMECODES: 00:00 Sofya's pet tech startup with machine learning, sensor data, and behavior pattern analytics 10:00 Journey from programming hobby to full-time software development career 17:20 Career growth after skipping university and building practical experience 24:07 Puppy adoption story and family influence on pet-focused innovation 32:16 Dog health monitoring framed as anomaly detection in real-world machine learning 37:05 Collecting canine data with emphasis on sleep patterns and cycle tracking 43:35 Establishing a dog's normal baseline through long-term data observation 49:34 Startup funding through personal savings and early-stage bootstrapping 55:28 Finding cofounders and collaborators through meetups and coworking communities 59:48 Closing insights on Sofya's educational path and early device prototypes Connect with Sofya - Website - https://www.fit-tails.com/ - Linkedin - https://www.linkedin.com/in/sofya-yulpatova/ Connect with DataTalks.Club: - Join the community - https://datatalks.club/slack.html - Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ - Check other upcoming events - https://lu.ma/dtc-events - GitHub: https://github.com/DataTalksClub - LinkedIn - https://www.linkedin.com/company/datatalks-club/ - Twitter - https://twitter.com/DataTalksClub - Website - https://datatalks.club/

    1h 1m
  5. 11/28/2025

    From Full-Time Mom to Head of Data and Cloud - Xia He-Bleinagel

    In this talk, Xia He-Bleinagel, Head of Data & Cloud at NOW GmbH, shares her remarkable journey from studying automotive engineering across Europe to leading modern data, cloud, and engineering teams in Germany. We dive into her transition from hands-on engineering to leadership, how she balanced family with career growth, and what it really takes to succeed in today’s cloud, data, and AI job market. TIMECODES: 00:00 Studying Automotive Engineering Across Europe 08:15 How Andrew Ng Sparked a Machine Learning Journey 11:45 Import–Export Work as an Unexpected Career Boost 17:05 Balancing Family Life with Data Engineering Studies 20:50 From Data Engineer to Head of Data & Cloud 27:46 Building Data Teams & Tackling Tech Debt 30:56 Learning Leadership Through Coaching & Observation 34:17 Management vs. IC: Finding Your Best Fit 38:52 Boosting Developer Productivity with AI Tools 42:47 Succeeding in Germany’s Competitive Data Job Market 46:03 Fast-Track Your Cloud & Data Career 50:03 Mentorship & Supporting Working Moms in Tech 53:03 Cultural & Economic Factors Shaping Women’s Careers 57:13 Top Networking Groups for Women in Data 1:00:13 Turning Domain Expertise into a Data Career Advantage Connect with Xia - Linkedin - https://www.linkedin.com/in/xia-he-bleinagel-51773585/ - Github - https://github.com/Data-Think-2021 - Website - https://datathinker.de/ Connect with DataTalks.Club: - Join the community - https://datatalks.club/slack.html - Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ - Check other upcoming events - https://lu.ma/dtc-events - GitHub: https://github.com/DataTalksClub - LinkedIn - https://www.linkedin.com/company/datatalks-club/ - Twitter - https://twitter.com/DataTalksClub - Website - https://datatalks.club/

    1h 2m
  6. 11/28/2025

    From Black-Box Systems to Augmented Decision-Making - Anusha Akkina

    In this talk, Anusha Akkina, co-founder of Auralytix, shares her journey from working as a Chartered Accountant and Auditor at Deloitte to building an AI-powered finance intelligence platform designed to augment, not replace, human decision-making. Together with host Alexey from DataTalks.Club, she explores how AI is transforming finance operations beyond spreadsheets—from tackling ERP limitations to creating real-time insights that drive strategic business outcomes. TIMECODES: 00:00 Building trust in AI finance and introducing Auralytix 02:22 From accounting roots to auditing at Deloitte and Paraxel 08:20 Moving to Germany and pivoting into corporate finance 11:50 The data struggle in strategic finance and the need for change 13:23 How Auralytix was born: bridging AI and financial compliance 17:15 Why ERP systems fail finance teams and how spreadsheets fill the gap 24:31 The real cost of ERP rigidity and lessons from failed transformations 29:10 The hidden risks of spreadsheet dependency and knowledge loss 37:30 Experimenting with ChatGPT and coding the first AI finance prototype 43:34 Identifying finance’s biggest pain points through user research 47:24 Empowering finance teams with AI-driven, real-time decision insights 50:59 Developing an entrepreneurial mindset through strategy and learning 54:31 Essential resources and finding the right AI co-founder Connect with Anusha - Linkedin - https://www.linkedin.com/in/anusha-akkina-acma-cgma-56154547/ - Website - https://aurelytix.com/ Connect with DataTalks.Club: - Join the community - https://datatalks.club/slack.html - Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ - Check other upcoming events - https://lu.ma/dtc-events - GitHub: https://github.com/DataTalksClub - LinkedIn - https://www.linkedin.com/company/datatalks-club/ - Twitter - https://twitter.com/DataTalksClub - 
Website - https://datatalks.club/

    1h 3m
  7. 11/28/2025

    Qdrant 2025 Conference Interviews

    At Qdrant Conference, builders, researchers, and industry practitioners shared how vector search, retrieval infrastructure, and LLM-driven workflows are evolving across developer tooling, AI platforms, analytics teams, and modern search research. Andrey Vasnetsov (Qdrant) explained how Qdrant was born from the need to combine database-style querying with vector similarity search—something he first built during the COVID lockdowns. He highlighted how vector search has shifted from an ML specialty to a standard developer tool and why hosting an in-person conference matters for gathering honest, real-time feedback from the growing community. Slava Dubrov (HubSpot) described how his team uses Qdrant to power AI Signals, a platform for embeddings, similarity search, and contextual recommendations that support HubSpot’s AI agents. He shared practical use cases like look-alike company search, reflected on evaluating agentic frameworks, and offered career advice for engineers moving toward technical leadership. Marina Ariamnova (SumUp) presented her internally built LLM analytics assistant that turns natural-language questions into SQL, executes queries, and returns clean summaries—cutting request times from days to minutes. She discussed balancing analytics and engineering work, learning through real projects, and how LLM tools help analysts scale routine workflows without replacing human expertise. Evgeniya (Jenny) Sukhodolskaya (Qdrant) discussed the multi-disciplinary nature of DevRel and her focus on retrieval research. She shared her work on sparse neural retrieval, relevance feedback, and hybrid search models that blend lexical precision with semantic understanding—contributing methods like Mini-COIL and shaping Qdrant’s search quality roadmap through end-to-end experimentation and community education. 
Speakers: - Andrey Vasnetsov: Co-founder & CTO of Qdrant, leading the engineering and platform vision behind a developer-focused vector database and vector-native infrastructure. Connect: https://www.linkedin.com/in/andrey-vasnetsov-75268897/ - Slava Dubrov: Technical Lead at HubSpot working on AI Signals (embedding models, similarity search, and context systems for AI agents). Connect: https://www.linkedin.com/in/slavadubrov/ - Marina Ariamnova: Data Lead at SumUp, managing analytics and financial data workflows while prototyping LLM tools that automate routine analysis. Connect: https://www.linkedin.com/in/marina-ariamnova/ - Evgeniya (Jenny) Sukhodolskaya: Developer Relations Engineer at Qdrant specializing in retrieval research, sparse neural methods, and educational ML content. Connect: https://www.linkedin.com/in/evgeniya-sukhodolskaya/

    52 min
  8. 10/24/2025

    How to Build and Evaluate AI systems in the Age of LLMs - Hugo Bowne-Anderson

    In this talk, Hugo Bowne-Anderson, an independent data and AI consultant, educator, and host of the podcasts Vanishing Gradients and High Signal, shares his journey from academic research and curriculum design at DataCamp to advising teams at Netflix, Meta, and the US Air Force. Together, we explore how to build reliable, production-ready AI systems, from prompt evaluation and dataset design to embedding agents into everyday workflows. You’ll learn about: - How to structure teams and incentives for successful AI adoption - Practical prompting techniques for accurate timestamp and data generation - Building and maintaining evaluation sets to avoid “prompt overfitting” - Cost-effective methods for LLM evaluation and monitoring - Tools and frameworks for debugging and observing AI behavior (Logfire, Braintrust, Arize Phoenix) - The evolution of AI agents, from simple RAG systems to proactive, embedded assistants - How to escape “proof of concept purgatory” and prioritize AI projects that drive business value - Step-by-step guidance for building reliable, evaluable AI agents This session is ideal for AI engineers, data scientists, ML product managers, and startup founders looking to move beyond experimentation into robust, scalable AI systems. Whether you’re optimizing RAG pipelines, evaluating prompts, or embedding AI into products, this talk offers actionable frameworks to guide you from concept to production.
LINKS - Escaping POC Purgatory: Evaluation-Driven Development for AI Systems - https://www.oreilly.com/radar/escaping-poc-purgatory-evaluation-driven-development-for-ai-systems/ - Stop Building AI Agents - https://www.decodingai.com/p/stop-building-ai-agents - How to Evaluate LLM Apps Before You Launch - https://www.youtube.com/watch?si=90fXJJQThSwGCaYv&v=TTr7zPLoTJI&feature=youtu.be - My Vanishing Gradients Substack - https://hugobowne.substack.com/ - Building LLM Applications for Data Scientists and Software Engineers - https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=datatalksclub TIMECODES: 00:00 Introduction and Expertise 04:04 Transition to Freelance Consulting and Advising 08:49 Restructuring Teams and Incentivizing AI Adoption 12:22 Improving Prompting for Timestamp Generation 17:38 Evaluation Sets and Failure Analysis for Reliable Software 23:00 Evaluating Prompts: The Cost and Size of Gold Test Sets 27:38 Software Tools for Evaluation and Monitoring 33:14 Evolution of AI Tools: Proactivity and Embedded Agents 40:12 The Future of AI is Not Just Chat 44:38 Avoiding Proof of Concept Purgatory: Prioritizing RAG for Business Value 50:19 RAG vs. Agents: Complexity and Power Trade-Offs 56:21 Recommended Steps for Building Agents 59:57 Defining Memory in Multi-Turn Conversations
Connect with Hugo - Twitter - https://x.com/hugobowne - Linkedin - https://www.linkedin.com/in/hugo-bowne-anderson-045939a5/ - Github - https://github.com/hugobowne - Website - https://hugobowne.github.io/ Connect with DataTalks.Club: - Join the community - https://datatalks.club/slack.html - Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ - Check other upcoming events - https://lu.ma/dtc-events - GitHub: https://github.com/DataTalksClub - LinkedIn - https://www.linkedin.com/company/datatalks-club/ - Twitter - https://twitter.com/DataTalksClub - Website - https://datatalks.club/

    1h 2m

Ratings & Reviews

5 out of 5 (7 Ratings)
