GroundZero AI Talks

Himanshu Dubey

Your friendly neighborhood creative space shaping the frontier of tech, with occasional conversations and notes.

  1. The Delta of Intelligence is Human Data | Curtis Northcutt (Director of AI Research, Handshake)

    6 days ago

    Curtis Northcutt joins us on GroundZero. He is Director of AI Research at Handshake, founder of Cleanlab (acquired by Handshake), and the inventor of confident learning, which he developed during his PhD at MIT. In this conversation, we dig into why the delta between model generations is almost entirely human data, the four quadrants of the AI market, how expert data labeling is deeply misunderstood, the long tail of human knowledge that AI still can't touch, RL environments, coding benchmarks, and the nuances of the human data market.

    PS: Handshake’s revenue has risen to nearly $1 billion, up from $550 million in January and $5 million a year ago.

    TIMESTAMPS
    00:00:00 - Intro
    00:02:08 - The Snowman Effect: Growing Up in Rural Kentucky to 7 Years at MIT
    00:09:50 - Confident Learning, Two Systems of Intelligence & Why Classical ML Still Matters
    00:20:37 - The Origin of Cleanlab & the Handshake Acquisition
    00:28:00 - How Does GPT-5 Become GPT-6?
    00:31:20 - The Fastest Growing Data Lab: What Makes Handshake Different?
    00:35:32 - The Difference Between Good Data & Bad Data
    00:38:17 - New Kinds of Data, IA & AI
    00:42:04 - Scaling Coding Benchmarks & Efficiency in Long-Horizon Tasks
    00:49:12 - Pre-Training & Misconceptions Around Human Data
    00:57:13 - Open Questions: Taste, Personality & Quick Fire
    01:05:15 - Advice to Your 20-Year-Old Self: Skill & Obsession

    1 hr 8 min
  2. The Story of Dhravya Shah | 20yo raised $3M to build SuperMemory, Dhravya Shah

    6 days ago

    Dhravya Shah is the founder of Supermemory.

    TIMESTAMPS
    00:00:00 - Teaser
    00:01:39 - Introduction
    00:02:42 - What is SuperMemory? Explaining the Product
    00:04:43 - Coolest Use Cases & Customer Stories
    00:07:48 - Early Days: Growing Up in Mumbai & Learning to Code
    00:09:24 - First Success: Discord Bot & Twitter Screenshot Tool Acquisition
    00:13:11 - The IIT Story: Myth vs Reality
    00:14:38 - The 40-Week Building Streak
    00:17:37 - Learning Strategy & Resources
    00:20:18 - From AnyContext to SuperMemory: The Origin Story
    00:21:16 - Failed Projects & Lessons Learned
    00:25:04 - Getting Attacked & Accidentally Joining Cloudflare
    00:26:30 - Relationship Support & Building While in College
    00:27:57 - How to Sell Your Projects & Acquisitions
    00:29:23 - Working at Mem0 & Differences with SuperMemory
    00:33:51 - Cloudflare Experience & Working with CEO Dane Knecht
    00:36:00 - The Fundraising Journey: From Buildspace to a $3M Round
    00:40:51 - Why Skip Y Combinator?
    00:42:16 - O-1 Visa Story: Becoming “Officially Extraordinary”
    00:44:14 - Being a Solo Founder: Challenges & Benefits
    00:47:20 - Hiring Philosophy & Team Culture at SuperMemory
    00:51:46 - India vs Bay Area: Ecosystem Differences
    00:53:10 - Vision vs Profit: What Matters at the Early Stage
    00:54:26 - Thoughts on Joining College
    00:55:18 - What’s Next for SuperMemory (Local-First & Nova)
    00:57:38 - Advice for Aspiring Builders & Students
    00:59:00 - Closing Thoughts

    1 hr 42 min
  3. Model is the Product | Common Corpus, Mid-Training, Open Science | Pierre-Carl Langlais, Pleias

    6 days ago

    Pierre-Carl Langlais (aka Alexander Doria) is Co-founder of Pleias. We discussed pre-training recipes, Common Corpus, mid-training, agentic systems, good post-training, and everything AI.

    TIMESTAMPS
    00:00:00 - TEASER
    00:01:12 - INTRO
    00:02:03 - Who is Alexander Doria [Pierre-Carl Langlais]?
    00:04:10 - Early career: From humanities to AI research
    00:07:50 - Meeting influential people in computational humanities
    00:10:00 - How the idea of Pleias came about
    00:13:30 - Building Pleias: Infrastructure and compute challenges in Europe
    00:17:06 - Team structure and work culture at Pleias
    00:19:06 - What is "open science" and why it matters
    00:21:53 - Big announcement: OpenSynthetic initiative
    00:25:25 - Synthetic data experiments and surprising results
    00:28:11 - "The Model is the Product" - explained
    00:31:56 - Implications for companies building on top of models
    00:35:25 - Differentiation in a world of shared base models
    00:38:40 - Common Corpus: Origins and development
    00:44:12 - The lack of open, legally clear datasets
    00:47:03 - Anthropic's use of Common Corpus for mechanistic interpretability
    00:50:20 - What makes good post-training?
    00:54:00 - Reasoning under 400M parameters in SLMs
    00:56:35 - Generalist scaling is stalling - where are the diminishing returns?
    00:59:40 - Will specialization always win over scale?
    01:02:00 - Opinionated and task-specialized models
    01:06:29 - How inference cost drops change monetization models
    01:09:12 - New value layers beyond token marketplaces
    01:11:38 - Major technical obstacles to embedding workflows in models
    01:13:40 - How smaller labs can compete on training infrastructure
    01:15:36 - Should startups raise capital for AI training?
    01:17:16 - What new capabilities do models need for orchestration?
    01:19:50 - Designing verifier functions for agentic models
    01:22:17 - RL in domains with weak or delayed rewards
    01:24:50 - Multi-step training loops: Draft, verify, refine, backtrack
    01:26:38 - The scarcity of agentic data and bootstrapping solutions
    01:29:32 - Making agent training tractable at scale
    01:31:44 - What is mid-training and why it matters
    01:34:55 - Deployment, use cases, and hybrid model architectures
    01:37:37 - Human-in-the-loop for regulated domains
    01:39:48 - Advice for startups positioning in this transition
    01:41:58 - Europe's structural challenges in AI
    01:45:52 - Tokenizers: The overlooked competitive frontier
    01:49:59 - Training LLMs on personal data and dead languages
    01:52:12 - World models and JEPA architectures
    01:53:50 - Building agentic systems: Stack and RL environments
    01:55:34 - The art of training good RL models
    01:58:49 - Trivia: Underrated habits and mindsets in research
    02:00:09 - AI Twitter community and its impact
    02:01:40 - Advice for folks starting in AI research
    02:03:27 - Final thoughts and wrap-up

    2 hr 5 min
