1 hr 26 min

172: Transformers and Large Language Models Programming Throwdown

- How To

Intro topic: Is WFH actually WFC?
News/Links:
- Falsehoods Junior Developers Believe about Becoming Senior
  - https://vadimkravcenko.com/shorts/falsehoods-junior-developers-believe-about-becoming-senior/
- Pure Pursuit
  - Tutorial with python code: https://wiki.purduesigbots.com/software/control-algorithms/basic-pure-pursuit
  - Video example: https://www.youtube.com/watch?v=qYR7mmcwT2w
- PID without a PHD
  - https://www.wescottdesign.com/articles/pid/pidWithoutAPhd.pdf
- Google releases Gemma
  - https://blog.google/technology/developers/gemma-open-models/
Book of the Show
- Patrick: The Eye of the World by Robert Jordan (Wheel of Time)
  - https://amzn.to/3uEhg6v
- Jason: How to Make a Video Game All By Yourself
  - https://amzn.to/3UZtP7b
Patreon Plug https://www.patreon.com/programmingthrowdown?ty=h

Tool of the Show
- Patrick: Stadia Controller Wifi to Bluetooth Unlock
  - https://stadia.google.com/controller/index_en_US.html
- Jason: FUSE and SSHFS
  - https://www.digitalocean.com/community/tutorials/how-to-use-sshfs-to-mount-remote-file-systems-over-ssh
Topic: Transformers and Large Language Models
- How neural networks store information
  - Latent variables
- Transformers
  - Encoders & Decoders
- Attention Layers
  - History
    - RNN
    - Vanishing Gradient Problem
    - LSTM
      - Short term (gradient explodes), Long term (gradient vanishes)
  - Differentiable algebra
  - Key-Query-Value
  - Self Attention
- Self-Supervised Learning & Forward Models
- Human Feedback
  - Reinforcement Learning from Human Feedback
  - Direct Preference Optimization (Pairwise Ranking)
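
For listeners who want something concrete to go with the Key-Query-Value and Self Attention items above, here is a minimal sketch of single-head scaled dot-product self-attention in plain Python. It is an illustration only, not code from the episode: the function names, the toy token vectors, and the identity projection matrices are all assumptions made for the example, and a real model would learn the projections and use a tensor library.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    tokens: list of embedding vectors, one per position.
    w_q, w_k, w_v: hypothetical projection matrices (learned in a real model).
    Returns (outputs, weights), one row per input position.
    """
    queries = [matvec(w_q, t) for t in tokens]
    keys = [matvec(w_k, t) for t in tokens]
    values = [matvec(w_v, t) for t in tokens]
    scale = math.sqrt(len(keys[0]))

    outputs, all_weights = [], []
    for q in queries:
        # Score this query against every key, then normalize to weights.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # The output is the attention-weighted average of the value vectors.
        out = [
            sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Toy input (assumed for illustration): three 2-d "tokens", identity projections.
identity = [[1.0, 0.0], [0.0, 1.0]]
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens, identity, identity, identity)
```

Each row of `weights` sums to 1 and says how much that position attends to every other position. Because every step is ordinary multiplication, addition, and exponentiation, the whole lookup stays differentiable, which is what lets it be trained with gradient descent.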

★ Support this podcast on Patreon ★
