Scaling AI for an Immersive 3D Platform with 77 Million Daily Active Users w/ Anupam Singh & Maria Kazandjieva #191
We had a blast at ELC Annual 2024, so we wanted to bring our podcast listeners some of the best highlights from popular sessions! This episode features one of the ELC Annual sessions with Anupam Singh (VP of AI & Growth Engineering @ Roblox) & Maria Kazandjieva (Co-Founder @ Graft), as they discuss building AI/ML models at a massive scale. Anupam shares how Roblox – an immersive 3D platform with more than 77 million daily active users – scaled from zero to nearly 200 different AI models. They discuss strategies for deciding when to use open source vs. creating proprietary models; how to operationalize your models for 24/7 use; the importance of data pipelines; current and future challenges to keep in mind when creating / scaling AI models; and answer some questions from the live Q&A.
ABOUT ANUPAM SINGH
Anupam leads Roblox's AI & Growth engineering teams, which provide the infrastructure for high throughput AI services for safety, recommendations, and assistants. Before Roblox, Anupam was chief customer officer at Cloudera, where he led product, engineering, and field teams for Data Warehousing products. Anupam has co-founded two companies in the Big Data space, acquired by Cloudera and Marketshare, respectively. Anupam built his database expertise on the SQL Query Optimizer teams at Oracle, Sybase (now SAP), and Informix (now IBM). He graduated from Pune University in India and holds patents in the areas of automatic SQL performance tuning, object databases, and resilient query execution.
"The journey always starts with, 'Let's pick a model and first decide whether we want to build our own model or we want to use one of the open source ones.' The next step is, 'Do you want to do it on public cloud?' Roblox has 24 data centers worldwide and two massive data centers in America. We have hundreds of thousands of CPUs that we could use and so for us, it's very important to decide, 'Do we really need a large model? Can you take the 700 billion model, make it into a 7 billion parameter model, and magically get it to run on the CPU?'"
- Anupam Singh
ABOUT MARIA KAZANDJIEVA
Maria is a co-founder and an engineering leader at Graft, an early-stage AI startup. Prior to that, Maria worked at Netflix, where her team earned two Emmy awards for technical achievement. She holds a PhD in Computer Science from Stanford University. Outside of work, you can find Maria kickboxing & trail running, baking & eating carbs, or relaxing with a non-fiction book and her two feline supurrvisors, Foosball and Gemma.
SHOW NOTES:
- How Roblox is being powered by AI (0:30)
- The process of scaling AI models from zero to 200 @ Roblox (2:34)
- Examples of Roblox starting with open source vs. building its own model (5:06)
- What AI models are doing in terms of safety for children (7:12)
- Strategies for deciding to use open source vs. building a proprietary model (11:19)
- Why Roblox is choosing to open source some of their own models (13:06)
- How to operationalize / engineer AI models for 24/7 use at scale (14:20)
- The importance of data pipelines in the AI journey (16:18)
- Current / future challenges as Roblox continues to scale its models (19:52)
- Tips for identifying use cases where implementing & scaling AI can be helpful (22:21)
- Audience Q&A: How do you make decisions when you’re lacking specific measurements / quantities? (24:29)
- When you deploy a model, how do you ensure confidence in its performance? (27:36)
- Recommendations for allocating / estimating the budget for a model (29:03)
- Anupam’s insights on maintaining so many models effectively (31:03)
- How do you
Information
- Frequency: Updated Weekly
- Published: October 9, 2024 at 4:00 AM UTC
- Length: 38 min
- Rating: Clean