Open Source Startup Podcast

E181: Why Multimodal Is the Future of AI Data Workloads

Chang She is Co-Founder & CEO of LanceDB, the multimodal lakehouse platform. Their open source data format, Lance, has over 5K stars on GitHub. It is a modern columnar data format for ML and LLMs, implemented in Rust.

LanceDB has raised $41M from investors including Theory Ventures, CRV, and Essence VC.

In this episode, we dig into:

  • Early focus: autonomous vehicles; solved real-time analysis limits with the Lance format → 9,000% performance gain.

  • Multimodal AI taking off (vision, audio, text); Midjourney & Runway as pioneers; audio now a major category.

  • How they built trust through open source.

  • Integrated workflows (data prep + search + embedding) going beyond vector DBs; education needed to show full value.

  • Cloud/serverless launch in 2023–24 enabled seamless local-to-production use.

  • Future bets: audio infra, robotics, spatial reasoning; vector DBs risk irrelevance if they don’t evolve.