Getting Simple

#71: Alex O'Connor — Transformers, Generative AI, and the Deep Learning Revolution

Researcher and machine learning manager Alex O’Connor on the latest trends in generative AI: language and image models, prompt engineering, latent space, fine-tuning, tokenization, textual inversion, adversarial attacks, and more.

Alex O’Connor earned his PhD in Computer Science from Trinity College Dublin. He was a postdoctoral researcher and funded investigator at the ADAPT Centre for digital content, first at TCD and later at DCU. In 2017, he joined Pivotus, a fintech startup, as Director of Research. For the past few years, Alex has been Senior Manager for Data Science & Machine Learning at Autodesk, leading a team that delivers machine learning for e-commerce, including personalization and natural language processing.

Favorite quotes

  • “None of these models can read.”
  • “Art in the future may not be good, but it will be prompt.” (via Mastodon)

Books

  • Machine Learning Systems Design by Chip Huyen
  • Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow by Aurélien Géron

Papers

  • The Illustrated Transformer by Jay Alammar
  • Attention Is All You Need by Vaswani et al. (Google Brain)
  • Transformers: a Primer by Justin Seonyong Lee

Links

  • Alex on Mastodon ★
  • Training DreamBooth Multimodal Art on HuggingFace by @akhaliq
  • NeurIPS
  • arXiv.org: the preprint server where most machine learning papers first appear
  • Nono’s Discord
  • Suggestive Drawing: Nono’s master’s thesis
  • Crungus: a fictional character from Stable Diffusion’s latent space

Machine learning models

  • Stable Diffusion (a short text-to-image code sketch follows this list)
  • Arcane-style fine-tuned Stable Diffusion model ★
  • Imagen
  • DALL-E
  • CLIP
  • GPT and ChatGPT
  • BERT, ALBERT & RoBERTa
  • BLOOM
  • word2vec
  • Mubert and Google’s MusicLM
  • t-SNE and UMAP: Dimensionality reduction techniques
  • char-rnn
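
Stable Diffusion and its fine-tuned variants turn a text prompt into an image. As a rough illustration of how that looks in code, here is a minimal sketch that assumes the Hugging Face diffusers library, the runwayml/stable-diffusion-v1-5 checkpoint, and a GPU; these specifics are illustrative choices, not taken from the episode.

    # Minimal text-to-image sketch with Stable Diffusion via diffusers.
    # The checkpoint, prompt, and parameters are illustrative assumptions.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        torch_dtype=torch.float16,
    )
    pipe = pipe.to("cuda")  # assumes a CUDA GPU; use "cpu" and float32 otherwise

    prompt = "an isometric watercolor sketch of a tiny greenhouse"
    image = pipe(prompt, num_inference_steps=30, guidance_scale=7.5).images[0]
    image.save("greenhouse.png")

The guidance_scale knob trades adherence to the prompt against variety in the output, one of the levers prompt engineering plays with.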

Sites

  • TensorFlow Hub
  • HuggingFace Spaces ★
  • DreamBooth
  • Jasper AI
  • Midjourney
  • Distill.pub ★

Concepts

  • High-performance computing (HPC)
  • Transformers and Attention
  • Sequence transformers
  • Quadratic growth of attention with sequence length
  • Super resolution
  • Recurrent neural networks (RNNs)
  • Long short-term memory networks (LSTMs)
  • Gated recurrent units (GRUs)
  • Bayesian classifiers
  • Machine translation
  • Encoder-decoder
  • Gradio
  • Tokenization ★ (a short code sketch of tokenization and embeddings follows this list)
  • Embeddings ★
  • Latent space
  • The distributional hypothesis
  • Textual inversion ★
  • Pretrained models
  • Zero-shot learning
  • Mercator projection
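
Tokenization and embeddings come up repeatedly in the episode: a model never reads raw text, it sees subword tokens mapped to vectors. Here is a minimal sketch of both steps, assuming the Hugging Face transformers library and the bert-base-uncased checkpoint (choices made here for illustration, not taken from the episode).

    # Tokenize a sentence and inspect the per-token embeddings BERT produces.
    # Model name and example sentence are illustrative assumptions.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")

    text = "None of these models can read."
    print(tokenizer.tokenize(text))  # subword pieces, e.g. ['none', 'of', 'these', ...]

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # One 768-dimensional vector per token, including the [CLS] and [SEP] markers.
    print(outputs.last_hidden_state.shape)  # torch.Size([1, number_of_tokens, 768])

Those per-token vectors are the embeddings; stacking many of them is what gives rise to the latent spaces discussed in the episode.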

People mentioned

  • Ted Underwood (UIUC)
  • Chip Huyen
  • Aurélien Géron

Chapters

  • 00:00 · Introduction
  • 00:40 · Machine learning
  • 02:36 · Spam and scams
  • 15:57 · Adversarial attacks
  • 20:50 · Deep learning revolution
  • 23:06 · Transformers
  • 31:23 · Language models
  • 37:09 · Zero-shot learning
  • 42:16 · Prompt engineering
  • 43:45 · Training costs and hardware
  • 47:56 · Open contributions
  • 51:26 · BERT and Stable Diffusion
  • 54:42 · Tokenization
  • 59:36 · Latent space
  • 01:05:33 · Ethics
  • 01:10:39 · Fine-tuning and pretrained models
  • 01:18:43 · Textual inversion
  • 01:22:46 · Dimensionality reduction
  • 01:25:21 · Mission
  • 01:27:34 · Advice for beginners
  • 01:30:15 · Books and papers
  • 01:34:17 · The lab notebook
  • 01:44:57 · Thanks

I'd love to hear from you.

Submit a question about this or any previous episodes.

Join the Discord community. Meet other curious minds.

If you enjoy the show, would you please consider leaving a short review on Apple Podcasts/iTunes? It takes less than 60 seconds and really helps.

Show notes, transcripts, and past episodes at gettingsimple.com/podcast.

Thanks to Andrea Villalón Paredes for editing this interview.
The songs Sleep and A Loop to Kill For are by Steve Combs, licensed under CC BY 4.0.

Follow Nono

Twitter.com/nonoesp

Instagram.com/nonoesp

Facebook.com/nonomartinezalonso

YouTube.com/nonomartinezalonso