1 hr 8 min

Ian Osband
TalkRL: The Reinforcement Learning Podcast

Ian Osband is a research scientist at OpenAI (formerly at DeepMind and Stanford) working on decision making under uncertainty.
We spoke about: 
- Information theory and RL 
- Exploration, epistemic uncertainty and joint predictions 
- Epistemic Neural Networks and scaling to LLMs 
Featured References

Reinforcement Learning, Bit by Bit
Xiuyuan Lu, Benjamin Van Roy, Vikranth Dwaracherla, Morteza Ibrahimi, Ian Osband, Zheng Wen

From Predictions to Decisions: The Importance of Joint Predictive Distributions
Zheng Wen, Ian Osband, Chao Qin, Xiuyuan Lu, Morteza Ibrahimi, Vikranth Dwaracherla, Mohammad Asghari, Benjamin Van Roy

Epistemic Neural Networks
Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy

Approximate Thompson Sampling via Epistemic Neural Networks
Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy

Additional References
- Thesis defence, Ian Osband
- Homepage, Ian Osband
- Epistemic Neural Networks at Stanford RL Forum
- Behaviour Suite for Reinforcement Learning, Osband et al 2019
- Efficient Exploration for LLMs, Dwaracherla et al 2024
