273 episodes

A daily show about AI made by AI: news, announcements, and research from arXiv, mixed in with some fun. Hosted by Giovani Pete Tizzano, an overly hyped AI enthusiast; Robert, an often unimpressed analyst; Olivia, an overly online reader; and Belinda, a witty research expert.

GPT Reviews Earkind

    • News

    Open LLM Upgrades 🆕 // Gemma 2 Performance 💎 // SeaKR's Self-aware Learning 🧠

    HuggingFace has upgraded the Open LLM Leaderboard to v2, adding new benchmarks and improving the evaluation suite for easier reproducibility.

    Gemma 2, a new addition to the Gemma family of lightweight open models, delivers the best performance for its size and offers competitive alternatives to models that are 2-3× bigger.

    SeaKR is a new adaptive RAG method that re-ranks retrieved knowledge based on the LLM's self-aware uncertainty, outperforming existing adaptive RAG approaches at generating text with relevant and accurate information.
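
    The paper's exact uncertainty estimator is more involved, but the core loop can be sketched in a few lines: retrieve only when the model's own uncertainty is high, then keep the snippet that lowers that uncertainty the most. Here `generate` is a hypothetical hook returning per-token log-prob distributions plus the decoded answer, and the entropy threshold is an assumption:

```python
import math
from typing import Callable, List, Optional, Tuple

def token_entropy(logprobs: List[float]) -> float:
    """Shannon entropy of one next-token distribution (log-probs in)."""
    return -sum(math.exp(lp) * lp for lp in logprobs)

def answer_uncertainty(per_token: List[List[float]]) -> float:
    """Mean per-token entropy: a simple stand-in for 'self-aware' uncertainty."""
    return sum(token_entropy(lp) for lp in per_token) / len(per_token)

def seakr_style_answer(
    question: str,
    snippets: List[str],
    generate: Callable[[str, Optional[str]], Tuple[List[List[float]], str]],
    threshold: float = 1.0,
) -> str:
    per_token, answer = generate(question, None)
    if answer_uncertainty(per_token) < threshold:
        return answer                           # confident: skip retrieval entirely
    scored = []
    for snip in snippets:
        pt, ans = generate(question, snip)      # condition on each retrieved snippet
        scored.append((answer_uncertainty(pt), ans))
    return min(scored, key=lambda s: s[0])[1]   # keep evidence that lowers uncertainty most
```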

    Step-DPO is a new method that improves long-chain mathematical reasoning in LLMs by applying preference optimization to individual reasoning steps rather than to whole answers, achieving impressive results on math benchmarks.
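
    The loss below is the standard DPO objective; Step-DPO's twist is applying it to pairs of individual reasoning steps rather than complete solutions. The log-probability tensors here are toy values standing in for summed step log-probs under the policy (`pi_*`) and a frozen reference model (`ref_*`):

```python
import torch
import torch.nn.functional as F

def stepwise_dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO objective applied per reasoning step: each tensor holds
    the summed log-prob of one candidate *step*, not a whole answer."""
    logits = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    return -F.logsigmoid(logits).mean()

# Toy batch of 4 (correct step, incorrect step) preference pairs.
pi_c  = torch.tensor([-2.1, -1.8, -3.0, -2.5], requires_grad=True)
pi_r  = torch.tensor([-2.0, -2.6, -2.9, -3.1], requires_grad=True)
ref_c = torch.tensor([-2.2, -2.0, -3.1, -2.7])
ref_r = torch.tensor([-1.9, -2.4, -2.8, -3.0])

loss = stepwise_dpo_loss(pi_c, pi_r, ref_c, ref_r)
loss.backward()   # gradients nudge the policy toward the correct steps
print(float(loss))
```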

    Contact:  sergi@earkind.com

    Timestamps:

    00:34 Introduction

    01:21 HuggingFace Updates Open LLM Leaderboard

    03:19 Gemma 2: Improving Open Language Models at a Practical Size

    04:16 From bare metal to a 70B model: infrastructure set-up and scripts

    05:21 Fake sponsor

    07:11 SeaKR: Self-aware Knowledge Retrieval for Adaptive Retrieval Augmented Generation

    08:47 Simulating Classroom Education with LLM-Empowered Agents

    10:16 Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs

    12:31 Outro

    • 13 min
    OpenAI Voice Delay ⏰ // Evolution-Simulating language model 🦕 // Multi-granularity vision flow 🌉

    OpenAI's advanced Voice Mode for ChatGPT Plus users has been delayed, but the company is taking a cautious approach to ensure safety and reliability.

    ESM3 is a language model that can simulate 500 million years of evolution, making biology programmable and opening up possibilities for medicine, biology research, and clean energy.

    R2R is an open-source project on GitHub that offers a comprehensive, state-of-the-art retrieval-augmented generation system, making RAG accessible to any developer who wants to try it out.
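
    R2R's own API isn't reproduced here; the sketch below is just the generic retrieve-then-prompt loop that any RAG system builds on, with a hash-seeded random vector standing in for a real embedding model:

```python
import hashlib
import numpy as np

def embed(text: str) -> np.ndarray:
    """Deterministic stand-in embedding; a real system would call a
    sentence-encoder model here."""
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")
    v = np.random.default_rng(seed).standard_normal(64)
    return v / np.linalg.norm(v)

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by cosine similarity to the query embedding."""
    q = embed(query)
    return sorted(docs, key=lambda d: float(embed(d) @ q), reverse=True)[:k]

def rag_prompt(query: str, docs: list[str]) -> str:
    """Stuff the top-k passages into the prompt an LLM would answer from."""
    context = "\n\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```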

    MG-LLaVA is a new multi-modal large language model that enhances visual processing capabilities by incorporating a multi-granularity vision flow, including low-resolution, high-resolution, and object-centric features.
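
    The paper's actual fusion machinery is richer, but the general shape of a multi-granularity pipeline — several visual streams projected into the language model's embedding space and concatenated as extra tokens — can be mocked up in a few lines (all dimensions here are made up):

```python
import torch
import torch.nn as nn

class MultiGranularityFusion(nn.Module):
    """Toy fusion of three visual streams (low-res, high-res, object crops)
    into one token sequence for the language model."""
    def __init__(self, d_vis: int = 256, d_llm: int = 512):
        super().__init__()
        self.proj_low  = nn.Linear(d_vis, d_llm)
        self.proj_high = nn.Linear(d_vis, d_llm)
        self.proj_obj  = nn.Linear(d_vis, d_llm)

    def forward(self, low, high, objects):
        # Each input: (batch, tokens, d_vis); concatenate along the sequence.
        tokens = [self.proj_low(low), self.proj_high(high), self.proj_obj(objects)]
        return torch.cat(tokens, dim=1)   # (batch, total_tokens, d_llm)

fusion = MultiGranularityFusion()
low, high, obj = (torch.randn(1, n, 256) for n in (16, 64, 8))
print(fusion(low, high, obj).shape)       # torch.Size([1, 88, 512])
```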

    Contact:  sergi@earkind.com

    Timestamps:

    00:34 Introduction

    01:36 OpenAI Delays ChatGPT Voice Mode

    03:27 ESM3 Simulating 500 million years of evolution with a language model

    04:38 Rag to Riches

    06:00 Fake sponsor

    08:11 MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning

    09:49 Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon

    11:13 Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models

    13:02 Outro

    • 14 min
    Apple-Meta Partnership Fails Due to Privacy 🍎 // AI Meets Quantum Computing ⚛️ // Record Labels Sue AI Startups 🎵

    Apple and Meta's failed partnership due to privacy concerns

    IBM's integration of AI technology into quantum computing

    Record labels suing AI startups for training on copyrighted material

    Research papers on improving multimodal understanding, reinforcement learning, and automated software engineering

    Contact:  sergi@earkind.com

    Timestamps:

    00:34 Introduction

    02:07 Apple shelved the idea of integrating Meta’s AI models over privacy concerns, report says

    03:25 IBM Develops The AI-Quantum Link

    05:25 Record Labels Sue Two Startups for Training AI Models on Their Songs

    06:50 Fake sponsor

    08:42 Long Context Transfer from Language to Vision

    10:27 WARP: On the Benefits of Weight Averaged Rewarded Policies

    12:11 BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions

    13:55 Outro

    • 15 min
    Safe Superintelligence Inc. 👍 // Massive Supercomputer Partnership 💻 // Claude 3.5 Sonnet Launch 🚀

    Safe Superintelligence Inc. has launched with the goal of building a safe superintelligence that won't turn on humanity.

    Dell, Nvidia, and Super Micro Computer are partnering with xAI and Elon Musk to build a massive supercomputer that could use up to 100,000 Nvidia H100 GPUs, potentially making it 4x larger than the biggest existing AI clusters.

    Anthropic has launched Claude 3.5 Sonnet, the first model in its latest family, which outperforms competitor models and even Anthropic's own Claude 3 Opus on a wide range of evaluations.

    The papers discussed in this episode explore the decision boundaries of large language models, auto-optimized training hyperparameters for IR models, and thinking step-by-step across modalities using whiteboard-of-thought. These findings could have important implications for the future development of AI.
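
    The last of those, Whiteboard-of-Thought, has the model emit drawing code, render it, and then reason over the resulting image. A minimal sketch of the render step, assuming matplotlib and a trusted sandbox for the model-written code:

```python
import io
import matplotlib
matplotlib.use("Agg")          # headless rendering
import matplotlib.pyplot as plt

def render_whiteboard(drawing_code: str) -> bytes:
    """Execute model-written matplotlib code and return the PNG it drew.
    In Whiteboard-of-Thought, this image is fed back to a multimodal
    model so it can reason visually over its own sketch."""
    namespace = {"plt": plt}
    exec(drawing_code, namespace)          # assumes a trusted sandbox
    buf = io.BytesIO()
    plt.savefig(buf, format="png")
    plt.close("all")
    return buf.getvalue()

# Toy drawing the model might emit for a spatial-reasoning question.
code = "plt.plot([0, 1, 1], [0, 0, 1], marker='o')"
png = render_whiteboard(code)
print(len(png), "bytes of whiteboard image")
```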

    Contact:  sergi@earkind.com

    Timestamps:

    00:34 Introduction

    01:40 Ilya Sutskever Launches Safe Superintelligence Inc.

    03:04 Dell joins forces with Nvidia, Grok, xAI and Elon Musk

    04:23 Anthropic Launches Claude 3.5 Sonnet

    06:10 Fake sponsor

    08:16 Probing the Decision Boundaries of In-context Learning in Large Language Models

    09:47 Prompts as Auto-Optimized Training Hyperparameters: Training Best-in-Class IR Models from Scratch with 10 Gold Labels

    11:05 Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities

    12:54 Outro

    • 14 min
    DeepMind's AI Soundtracks 🎥 // Challenges of Training AI Clusters ⚡ // Large Language Model Factual Knowledge 🤯

    Google DeepMind's new AI tool that generates video soundtracks by combining text prompts with visual content.

    Challenges of building large training AI clusters, including power, network topology, and reliability.

    How large language models acquire factual knowledge during pretraining and their probabilistic reasoning capabilities.

    LLARVA's vision-action instruction tuning that enhances robot learning.

    Contact:  sergi@earkind.com

    Timestamps:

    00:34 Introduction

    01:47 Google DeepMind’s new AI tool uses video pixels and text prompts to generate soundtracks

    03:31 100,000 H100 Clusters: Power, Network Topology, Ethernet vs InfiniBand, Reliability, Failures, Checkpointing

    05:22 Large language model data pipelines and Common Crawl (WARC/WAT/WET)

    06:47 Fake sponsor

    08:20 How Do Large Language Models Acquire Factual Knowledge During Pretraining?

    10:01 What Are the Odds? Language Models Are Capable of Probabilistic Reasoning

    11:22 LLARVA: Vision-Action Instruction Tuning Enhances Robot Learning

    13:06 Outro

    • 14 min
    TikTok's AI-Generated Avatars 🌎 // NVIDIA's Synthetic Data 🧪 // Cohere's Generative Models 🤖

    TikTok is expanding its Symphony ad suite with AI-generated avatars of creators and paid actors, as well as a global translation tool for multi-language support.

    NVIDIA has released an open synthetic data generation pipeline for training large language models, which could benefit industries that rely on natural language processing.
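
    NVIDIA's pipeline has many more moving parts, but the generate-then-filter loop at the heart of most synthetic-data recipes looks roughly like this; `generate` and `reward_model` are hypothetical callables standing in for an instruct model and a reward model:

```python
from dataclasses import dataclass

@dataclass
class Sample:
    prompt: str
    response: str
    reward: float = 0.0

def synthesize(prompts, generate, reward_model, keep_ratio=0.5):
    """Generic generate-then-filter loop: draft responses with an instruct
    model, score them with a reward model, keep only the best fraction."""
    pool = [Sample(p, generate(p)) for p in prompts]
    for s in pool:
        s.reward = reward_model(s.prompt, s.response)
    pool.sort(key=lambda s: s.reward, reverse=True)
    return pool[: max(1, int(len(pool) * keep_ratio))]
```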

    Cohere's latest generative models, Command R and R+, can automate and streamline complex business workflows, saving time and increasing efficiency.
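
    Cohere's SDK specifics aside, a multi-step tool-use loop reduces to this: at each step the model either names a tool to call (with arguments) or returns a final answer, and tool results are appended to the transcript for the next step. A minimal, hypothetical sketch:

```python
def run_agent(question, llm, tools, max_steps=5):
    """Minimal multi-step tool-use loop. `llm` is a hypothetical callable
    that reads the transcript and returns an (action, payload) tuple:
    either ("final", answer) or (tool_name, kwargs_dict)."""
    transcript = [("user", question)]
    for _ in range(max_steps):
        action, payload = llm(transcript)
        if action == "final":
            return payload
        result = tools[action](**payload)      # e.g. tools["search"](query=...)
        transcript.append((action, result))    # feed the result back to the model
    return "step budget exhausted"
```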

    XLand-100B is a large-scale dataset for in-context reinforcement learning, providing a challenging benchmark for researchers in the field. CountGen addresses the challenge of controlling the number of depicted objects in text-to-image generation, while MM-NIAH is the first benchmark specifically designed to test how well existing multimodal large language models comprehend long multimodal documents.

    Contact:  sergi@earkind.com

    Timestamps:

    00:34 Introduction

    01:23 TikTok ads may soon contain AI-generated avatars of your favorite creators

    02:59 NVIDIA Releases Open Synthetic Data Generation Pipeline for Training Large Language Models

    04:43 Automating Complex Business Workflows with Cohere: Multi-Step Tool Use in Action

    06:17 Fake sponsor

    08:22 XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning

    10:23 Make It Count: Text-to-Image Generation with an Accurate Number of Objects

    11:58 Needle In A Multimodal Haystack

    13:37 Outro

    • 14 min

Top Podcasts In News

Trend Topic
Podbee Media
Global News Podcast
BBC World Service
Aposto Altı Otuz
Aposto Radyo
Yeni Haller
Wand Media Network
The Intelligence from The Economist
The Economist
Mesele Ekonomi
Mesele Ekonomi

You Might Also Like

This Day in AI Podcast
Michael Sharkey, Chris Sharkey
Practical AI: Machine Learning, Data Science
Changelog Media
Last Week in AI
Skynet Today
The AI Daily Brief (Formerly The AI Breakdown): Artificial Intelligence News and Analysis
Nathaniel Whittemore
The AI Podcast
NVIDIA
The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
Sam Charrington