172: Transformers and Large Language Models
Intro topic: Is WFH actually WFC?
News/Links:
- Falsehoods Junior Developers Believe about Becoming Senior
- https://vadimkravcenko.com/shorts/falsehoods-junior-developers-believe-about-becoming-senior/
- Pure Pursuit (a minimal path-tracking sketch follows this list)
- Tutorial with python code: https://wiki.purduesigbots.com/software/control-algorithms/basic-pure-pursuit
- Video example: https://www.youtube.com/watch?v=qYR7mmcwT2w
- PID Without a PhD (a minimal PID loop sketch also follows this list)
- https://www.wescottdesign.com/articles/pid/pidWithoutAPhd.pdf
- Google releases Gemma
- https://blog.google/technology/developers/gemma-open-models/
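
As promised above, a minimal sketch of the pure-pursuit idea from the SIGBots tutorial: chase a lookahead point on the path and steer with curvature proportional to its lateral offset. The waypoint format and function names are illustrative assumptions, and the lookahead search below is a simplification of the tutorial's line-circle intersection.

```python
import math

def find_lookahead_point(path, x, y, lookahead):
    """Return the first waypoint at least `lookahead` away from (x, y).

    Simplification: the tutorial intersects the lookahead circle with each
    path segment, which gives a smoother goal point than the nearest waypoint.
    """
    for px, py in path:
        if math.hypot(px - x, py - y) >= lookahead:
            return px, py
    return path[-1]  # near the end of the path, chase the final waypoint

def pure_pursuit_curvature(path, x, y, heading, lookahead):
    """Curvature (1/radius) of the arc from the robot to the lookahead point."""
    gx, gy = find_lookahead_point(path, x, y, lookahead)
    dx, dy = gx - x, gy - y
    # Lateral offset of the goal in the robot's frame (heading from the +x axis).
    lateral = -math.sin(heading) * dx + math.cos(heading) * dy
    # Classic pure-pursuit relation: curvature = 2 * lateral / lookahead^2.
    return 2.0 * lateral / (lookahead ** 2)
```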
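And a textbook PID loop in the spirit of the "PID Without a PhD" article; the gains and anti-windup clamp below are illustrative, not values from the paper.

```python
class PID:
    """Proportional-integral-derivative controller, sampled every dt seconds."""

    def __init__(self, kp, ki, kd, integral_limit=None):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None
        self.integral_limit = integral_limit  # optional anti-windup clamp

    def update(self, setpoint, measurement, dt):
        error = setpoint - measurement
        self.integral += error * dt
        if self.integral_limit is not None:
            self.integral = max(-self.integral_limit,
                                min(self.integral, self.integral_limit))
        # No previous sample yet -> skip the derivative term on the first call.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Example with made-up gains: drive a heater toward 70 degrees.
pid = PID(kp=2.0, ki=0.5, kd=0.1, integral_limit=10.0)
power = pid.update(setpoint=70.0, measurement=65.0, dt=0.1)
```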
Book of the Show
- Patrick: The Eye of the World by Robert Jordan (Wheel of Time)
- https://amzn.to/3uEhg6v
- Jason: How to Make a Video Game All By Yourself
- https://amzn.to/3UZtP7b
Patreon Plug https://www.patreon.com/programmingthrowdown?ty=h
Tool of the Show
- Patrick: Stadia Controller Wifi to Bluetooth Unlock
- https://stadia.google.com/controller/index_en_US.html
- Jason: FUSE and SSHFS
- https://www.digitalocean.com/community/tutorials/how-to-use-sshfs-to-mount-remote-file-systems-over-ssh
Topic: Transformers and Large Language Models
- How neural networks store information
- Latent variables
- Transformers
- Encoders & Decoders
- Attention Layers
- History
- RNN
- Vanishing Gradient Problem (a numeric demo follows this list)
- LSTM
- Short-term dependencies (gradient can explode), long-term dependencies (gradient vanishes)
- RNN
- Differentiable algebra
- Key-Query-Value
- Self Attention (a key-query-value sketch follows this list)
- History
- Self-Supervised Learning & Forward Models
- Human Feedback
- Reinforcement Learning from Human Feedback
- Direct Preference Optimization (pairwise ranking; a loss sketch follows this list)
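
To make the outline's vanishing-gradient point concrete: backprop through an RNN multiplies the gradient by (roughly) the recurrent weight at every time step, so it shrinks or blows up geometrically. A toy demo in plain Python:

```python
# Backprop through T steps of a linear RNN multiplies the gradient by the
# recurrent weight at each step: slightly < 1 vanishes, slightly > 1 explodes.
for w in (0.9, 1.1):
    grad = 1.0
    for _ in range(100):  # 100 time steps
        grad *= w
    print(f"recurrent weight {w}: gradient after 100 steps ~ {grad:.3e}")
# weight 0.9 -> ~2.7e-05 (vanishes); weight 1.1 -> ~1.4e+04 (explodes)
```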
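The key-query-value mechanics of self-attention fit in a few lines of NumPy. A single-head sketch, with illustrative dimensions and random weights standing in for learned projections:

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a token sequence x."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v       # project tokens into Q, K, V spaces
    scores = q @ k.T / np.sqrt(k.shape[-1])   # how much each token attends to each other
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v                         # blend values by attention weight

rng = np.random.default_rng(0)
seq_len, d_model, d_head = 4, 8, 8
x = rng.normal(size=(seq_len, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_head)) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (4, 8): one attention-mixed representation per token
```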
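And a sketch of the pairwise-ranking loss behind Direct Preference Optimization, assuming you already have summed log-probabilities of the chosen and rejected completions under both the policy being trained and a frozen reference model. Variable names and the beta value are illustrative:

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Pairwise DPO loss for one (chosen, rejected) completion pair."""
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log sigmoid(margin): small when the policy prefers the chosen answer
    # more strongly than the reference model does.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

print(dpo_loss(-10.0, -12.0, -11.0, -11.5))  # policy already favors the chosen answer
```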
Information
- Frequency: Monthly
- Published: 11 March 2024 at 15:00 UTC
- Length: 1h 26m
- Episode: 172
- Rating: Clean