AI + a16z

a16z

Artificial intelligence is changing everything from art to enterprise IT, and a16z is watching all of it with a close eye. This podcast features discussions with leading AI engineers, founders, and experts, as well as our general partners, about where the technology and industry are heading.

  1. AI's Unsung Hero: Data Labeling and Expert Evals

    4 DAYS AGO

    Labelbox CEO Manu Sharma joins a16z Infra partner Matt Bornstein to explore the evolution of data labeling and evaluation in AI, from early supervised learning to today’s sophisticated reinforcement learning loops. Manu recounts Labelbox’s origins in computer vision, and then how the shift to foundation models and generative AI changed the game. The value moved from pre-training to post-training and, today, models are trained not just to answer questions, but to assess the quality of their own responses. Labelbox has responded by building a global network of “aligners”: top professionals from fields like coding, healthcare, and customer service, who label and evaluate data used to fine-tune AI systems. The conversation also touches on Meta’s acquisition of Scale AI, underscoring how critical data and talent have become in the AGI race.

    Here's a sample of Manu explaining how Labelbox was able to transition from one era of AI to another:

    "It took us some time to really understand that the world is shifting from building AI models to renting AI intelligence. A vast number of enterprises around the world are no longer building their own models; they're actually renting base intelligence and adding on top of it to make that work for their company. And that was a very big shift.

    "But then the even bigger opportunity was the hyperscalers and the AI labs that are spending billions of dollars of capital developing these models and data sets. We really ought to go and figure out and innovate for them. For us, it was a big shift from the DNA perspective because Labelbox was built with a hardcore software-tools mindset. Our go-to-market, engineering, and product and design teams operated like software companies.

    "But I think the hardest part for many of us, at that time, was to just make the decision that we're just going to go try it and do it. And nothing is better than that: 'Let's just go build an MVP and see what happens.'"

    Follow everyone on X: Manu Sharma, Matt Bornstein

    Check out everything a16z is doing with artificial intelligence here, including articles, projects, and more podcasts.

    47 min
  2. Beyond Leaderboards: LMArena’s Mission to Make AI Reliable

    30 MAY

    LMArena cofounders Anastasios N. Angelopoulos, Wei-Lin Chiang, and Ion Stoica sit down with a16z general partner Anjney Midha to talk about the future of AI evaluation. As benchmarks struggle to keep up with the pace of real-world deployment, LMArena is reframing the problem: what if the best way to test AI models is to put them in front of millions of users and let them vote? The team discusses how Arena evolved from a research side project into a key part of the AI stack, why fresh and subjective data is crucial for reliability, and what it means to build a CI/CD pipeline for large models.

    They also explore:

    - Why expert-only benchmarks are no longer enough.
    - How user preferences reveal model capabilities, and their limits.
    - What it takes to build personalized leaderboards and evaluation SDKs.
    - Why real-time testing is foundational for mission-critical AI.

    Follow everyone on X: Anastasios N. Angelopoulos, Wei-Lin Chiang, Ion Stoica, Anjney Midha

    Timestamps

    0:04 - LLM evaluation: From consumer chatbots to mission-critical systems
    6:04 - Style and substance: Crowdsourcing expertise
    18:51 - Building immunity to overfitting and gaming the system
    29:49 - The roots of LMArena
    41:29 - Proving the value of academic AI research
    48:28 - Scaling LMArena and starting a company
    59:59 - Benchmarks, evaluations, and the value of ranking LLMs
    1:12:13 - The challenges of measuring AI reliability
    1:17:57 - Expanding beyond binary rankings as models evolve
    1:28:07 - A leaderboard for each prompt
    1:31:28 - The LMArena roadmap
    1:34:29 - The importance of open source and openness
    1:43:10 - Adapting to agents (and other AI evolutions)

    1h 42m
  3. Who's Coding Now? AI and the Future of Software Development

    16 MAY

    In this episode of the a16z AI podcast, a16z Infra partners Guido Appenzeller, Matt Bornstein, and Yoko Li explore how generative AI is reshaping software development. From its potential as a new high-level programming abstraction to its current practical impacts, they discuss whether AI coding tools will redefine what it means to be a developer.

    Why has coding emerged as one of AI's most powerful use cases? How much can AI truly boost developer productivity, and will it fundamentally change traditional computer science education? Guido, Yoko, and Matt dive deep into these questions, addressing the dynamics of "vibe coding," the enduring role of formal programming languages, and the critical challenge of managing non-deterministic behavior in AI-driven applications.

    Among other things, they discuss:

    - The enormous market potential of AI-generated code, projected to deliver trillions in productivity gains.
    - How "prompt-based programming" is evolving from Stack Overflow replacements into sophisticated development assistants.
    - Why formal languages like Python and Java are here to stay, even as natural language interactions become common.
    - The shifting landscape of programming education, and why understanding foundational abstractions remains essential.
    - The unique complexities of integrating AI into enterprise software, from managing uncertainty to ensuring reliability.

    45 min
  4. MCP Co-Creator on the Next Wave of LLM Innovation

    2 MAY

    In this episode of AI + a16z, Anthropic's David Soria Parra, who created MCP (Model Context Protocol) along with Justin Spahr-Summers, sits down with a16z's Yoko Li to discuss the project's inception, exciting use cases for connecting LLMs to external sources, and what's coming next for the project.

    If you're unfamiliar with the wildly popular MCP project, this edited passage from their discussion is a great starting point to learn:

    David: "MCP tries to enable building AI applications in such a way that they can be extended by everyone else that is not part of the original development team through these MCP servers, and really bring the workflows you care about, the things you want to do, to these AI applications. It's a protocol that just defines how whatever you are building as a developer for that integration piece, and that AI application, talk to each other.

    "It's a very boring specification, but what it enables is hopefully ... something that looks like the current API ecosystem, but for LLM interactions."

    Yoko: "I really love the analogy with the API ecosystem, because it gives people a mental model of how the ecosystem evolves ... Before, you may have needed a different spec to query Salesforce versus query HubSpot. Now you can use similarly defined API schema to do that.

    "And then when I saw MCP earlier in the year, it was very interesting in that it almost felt like a standard interface for the agent to interface with LLMs. It's like, 'What are the set of things that the agent wants to execute on that it has never seen before? What kind of context does it need to make these things happen?' When I tried it out, it was just super powerful and I no longer have to build one tool per client. I now can build just one MCP server, for example, for sending emails, and I use it for everything on Cursor, on Claude Desktop, on Goose."

    Learn more:

    - A Deep Dive Into MCP and the Future of AI Tooling
    - What Is an AI Agent?
    - Benchmarking AI Agents on Full-Stack Coding
    - Agent Experience: Building an Open Web for the AI Era

    Follow everyone on X: David Soria Parra, Yoko Li

    54 min
