Deep Reinforcement Learning in the Real World with Anna Goldie | ODSC's Ai X Podcast


In this episode, you’ll explore the field of deep reinforcement learning and the ways it influences the real world with Anna Goldie, Senior Staff Research Scientist at Google DeepMind.

In her current role, Anna works on Large Language Model (LLM) research for Gemini and Bard. Before that, she worked on reinforcement learning for LLMs and retrieval-augmented LLMs at Anthropic, and she co-founded and led the ML for Systems team in Google Brain.

During this wide-ranging discussion, you’ll learn about her contributions to the field of reinforcement learning and how we can effectively leverage reinforcement learning for real-world applications going forward.
Sponsored by: https://odsc.com/
Find more ODSC lightning interviews, webinars, live trainings, certifications, and bootcamps here – https://aiplus.training/

Topics:
1. Professional journey and the key moments
2. Core principles of deep reinforcement learning
3. Deep reinforcement learning for chip design vs traditional approaches
4. Key complexities in modern chip design and how deep reinforcement learning can address these complexities
5. Google’s TPUs (Tensor Processing Units), built specifically to accelerate machine learning workloads
6. The potential of deep reinforcement learning in computer systems and other domains within Google DeepMind
7. Deep reinforcement learning use in Large Language Models (LLMs)
8. Reinforcement Learning from Human Feedback (RLHF), designing effective rewards and providing feedback at scale
9. Scalable supervision techniques for efficiently gathering feedback that aligns LLMs with human preferences
10. Implementing the Constitutional AI framework, where AI models are guided by a set of foundational principles or 'constitutional' directives
11. How Retrieval-Augmented Generation (RAG) systems improve the accuracy and relevance of responses compared to standard LLMs, even those with large context windows
12. How “RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval” compares to traditional RAG approaches that retrieve short, contiguous chunks
13. Hierarchical summaries with RAPTOR
14. LLM Finetuning With Low-Rank Adaptation (LoRA)
15. Google's Gemini 1.5, a next-generation LLM with a mixture-of-experts architecture
16. CALM—Composition to Augment Language Models
17. How dual undergraduate degrees in computer science and linguistics from MIT have contributed to her innovative work in machine learning
18. Constitutional AI at Anthropic https://www.anthropic.com/news/claudes-constitution
19. What is the best way to follow your work?
20. Keynote address at ODSC East in mid-April.


SHOW NOTES
More about Anna Goldie
https://www.linkedin.com/in/adgoldie/
https://www.annagoldie.com/


More about Constitutional AI at Anthropic
https://www.anthropic.com/news/claudes-constitution
Constitutional AI: Harmlessness from AI Feedback
https://arxiv.org/pdf/2212.08073.pdf
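For a flavor of how the supervised stage of Constitutional AI works, here is a minimal sketch of the critique-and-revision loop described in the paper. It is not Anthropic's implementation: the llm() callable and the single example principle are placeholders for whatever chat-model API and constitution you use.

```python
# Hypothetical stand-in: llm(prompt) -> str can wrap any chat-model API.
PRINCIPLES = [
    "Choose the response that is most helpful, honest, and harmless.",
]

def constitutional_revision(llm, user_prompt: str) -> str:
    """Supervised stage of Constitutional AI: critique a draft against each
    principle, then revise it; revised answers are later used for fine-tuning."""
    response = llm(user_prompt)
    for principle in PRINCIPLES:
        critique = llm(
            f"Critique the response below according to this principle: {principle}\n\n"
            f"Prompt: {user_prompt}\nResponse: {response}"
        )
        response = llm(
            f"Rewrite the response so it addresses the critique.\n\n"
            f"Critique: {critique}\nOriginal response: {response}"
        )
    return response
```

In the full recipe, a second (RLAIF) stage then trains a preference model from AI-generated comparisons guided by the same principles.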


More about Anna’s Paper
RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval
https://openreview.net/pdf?id=GN921JHCRw
The official code implementation of RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval
https://github.com/parthsarthi03/raptor
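As a rough illustration of the idea, the sketch below builds RAPTOR-style levels by clustering chunk embeddings and summarizing each cluster, then retrieves over all levels at once ("collapsed tree" retrieval). It is heavily simplified relative to the paper and the official repo: k-means stands in for the Gaussian-mixture clustering, and embed / summarize are placeholders for your embedding model and summarization LLM.

```python
from typing import Callable, List
import numpy as np
from sklearn.cluster import KMeans

def build_raptor_tree(chunks: List[str],
                      embed: Callable[[List[str]], np.ndarray],
                      summarize: Callable[[List[str]], str],
                      branching: int = 3,
                      max_levels: int = 3) -> List[List[str]]:
    """Cluster the current level's texts, summarize each cluster, and repeat,
    producing coarser and coarser levels above the raw chunks."""
    levels = [chunks]
    current = chunks
    for _ in range(max_levels):
        if len(current) <= 1:
            break
        n_clusters = max(1, len(current) // branching)
        labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(embed(current))
        # One abstractive summary per cluster becomes a node at the next level up
        current = [summarize([c for c, l in zip(current, labels) if l == k])
                   for k in range(n_clusters)]
        levels.append(current)
    return levels

def retrieve(query: str, levels: List[List[str]],
             embed: Callable[[List[str]], np.ndarray], top_k: int = 5) -> List[str]:
    """'Collapsed tree' retrieval: rank nodes from every level by cosine similarity."""
    nodes = [n for level in levels for n in level]
    q, m = embed([query])[0], embed(nodes)
    sims = m @ q / (np.linalg.norm(m, axis=1) * np.linalg.norm(q) + 1e-9)
    return [nodes[i] for i in np.argsort(-sims)[:top_k]]
```

Because retrieval ranges over both leaf chunks and higher-level summaries, queries that need broad, multi-chunk context can be answered without stitching together many short contiguous passages.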


More about Large Language Models
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
https://arxiv.org/abs/2305.18290
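As a pointer to what the method actually optimizes, here is a minimal PyTorch sketch of the DPO loss, assuming you already have summed per-sequence log-probabilities from the policy and a frozen reference model; the variable names are illustrative rather than taken from any official code.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization loss. Each argument is a 1-D tensor of
    summed log-probabilities for the chosen / rejected responses under the
    trainable policy or the frozen reference model."""
    # Log-ratios of policy vs. reference for preferred and dispreferred responses
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    # Push the preferred response to become relatively more likely than the rejected one
    logits = beta * (chosen_ratio - rejected_ratio)
    return -F.logsigmoid(logits).mean()

# Toy usage with random numbers standing in for real log-probabilities
torch.manual_seed(0)
loss = dpo_loss(torch.randn(4), torch.randn(4), torch.randn(4), torch.randn(4))
print(loss)
```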
LLM Finetuning With Low-Rank Adaptation (LoRA)
https://lightning.ai/pages/community/article/lora-llm
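The core of LoRA is small enough to show directly: freeze a pretrained linear layer and learn a low-rank additive update. The sketch below is a generic PyTorch illustration (not the code from the linked article), with the rank and alpha values chosen arbitrarily.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update (W + B @ A)."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # freeze the pretrained weights
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scaling = alpha / rank          # common scaling convention

    def forward(self, x):
        # The update starts at zero because lora_b is zero-initialized
        return self.base(x) + (x @ self.lora_a.T @ self.lora_b.T) * self.scaling

# Wrap an existing projection; only the two small low-rank matrices are trainable
layer = LoRALinear(nn.Linear(512, 512), rank=8)
out = layer(torch.randn(2, 512))
print(out.shape)  # torch.Size([2, 512])
```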


CALM—Composition to Augment Language Models
https://arxiv.org/pdf/2401.02412.pdf