GPT Reviews Earkind
-
- News
-
A daily show about AI made by AI: news, announcements, and research from arXiv, mixed in with some fun. Hosted by Giovani Pete Tizzano, an overly hyped AI enthusiast; Robert, an often unimpressed analyst; Olivia, an overly online reader; and Belinda, a witty research expert.
-
Investments Pay Off for MSFT 💰 // Apple's Language Models 🍎 // Improved Language Search 🔎
Microsoft's investment in AI is paying off, with a 17% jump in revenue and a 20% increase in profit for the first three months of the year.
Apple has released eight small AI language models aimed at on-device use, using a "layer-wise scaling strategy" to improve performance and transparency.
Multi-Head Mixture-of-Experts is a new approach to address issues with Sparse Mixtures of Experts, outperforming existing models on three different tasks.
Stream of Search (SoS) is a new technique for teaching language models to search, resulting in improved search accuracy and the ability to solve previously unsolved problems.
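As a rough sketch of the Multi-Head Mixture-of-Experts idea covered in this episode (split each token into sub-tokens, route each sub-token to an expert independently, then merge back), here is a toy numpy version. Top-1 routing, the layer sizes, and the random weights are illustrative assumptions, not details from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def mh_moe_layer(x, experts, gate_w, num_heads):
    """Toy Multi-Head MoE forward pass with top-1 routing (illustrative).

    x:        (tokens, d_model) input activations
    experts:  list of (d_head, d_head) expert weight matrices
    gate_w:   (d_head, num_experts) router weights
    """
    tokens, d_model = x.shape
    d_head = d_model // num_heads
    # 1. Split every token into num_heads smaller sub-tokens.
    sub = x.reshape(tokens * num_heads, d_head)
    # 2. Route each sub-token independently to its top-1 expert.
    choice = (sub @ gate_w).argmax(axis=1)
    out = np.empty_like(sub)
    for e, w in enumerate(experts):
        mask = choice == e
        out[mask] = sub[mask] @ w
    # 3. Merge the processed sub-tokens back into full tokens.
    return out.reshape(tokens, d_model)

d_model, num_heads, num_experts = 8, 4, 3
d_head = d_model // num_heads
experts = [rng.normal(size=(d_head, d_head)) for _ in range(num_experts)]
gate_w = rng.normal(size=(d_head, num_experts))
x = rng.normal(size=(5, d_model))
y = mh_moe_layer(x, experts, gate_w, num_heads)
print(y.shape)
```

The point of the sub-token split is that different parts of one token's representation can be served by different experts, which the paper argues improves expert utilization.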
Contact: sergi@earkind.com
Timestamps:
00:34 Introduction
01:28 Microsoft Reports Rising Revenues as A.I. Investments Bear Fruit
03:14 Apple releases eight small AI language models aimed at on-device use
05:00 Fake sponsor
07:01 Multi-Head Mixture-of-Experts
08:43 Achieving >97% on GSM8K: Deeply Understanding the Problems Makes LLMs Perfect Reasoners
10:30 Stream of Search (SoS): Learning to Search in Language
12:31 Outro
-
Meta's Stock Plunge 💸 // TSMC's A16 Process 🚀 // Instruction Hierarchy Boosting LLMs 📈
Meta's aggressive AI investments have caused a 13% plunge in its stock, threatening to wipe almost $163 billion off its market value.
TSMC's new A16 manufacturing process promises to outperform its predecessor, N2P, with up to a 10% higher clock rate at the same voltage and 15-20% lower power consumption at the same frequency and complexity.
The Instruction Hierarchy proposes a data-generation method for teaching hierarchical instruction-following behavior, which drastically increases LLMs' robustness against attacks.
SPLATE is a lightweight adaptation of the ColBERTv2 model that improves the efficiency of late interaction retrieval, particularly for running ColBERT on CPU environments.
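SPLATE builds on ColBERTv2's late-interaction retrieval, whose core scoring step is MaxSim: each query token embedding is matched against its best document token embedding, and the per-token maxima are summed. A toy numpy sketch of that scoring step (random embeddings for illustration; this is the standard ColBERT-style operation, not SPLATE's sparse adaptation itself):

```python
import numpy as np

def maxsim_score(q_emb, d_emb):
    """ColBERT-style late interaction: for each query token, take its
    maximum dot product with any document token, then sum over query tokens.

    q_emb: (num_query_tokens, dim); d_emb: (num_doc_tokens, dim).
    Rows are assumed L2-normalized, so dot product == cosine similarity.
    """
    sims = q_emb @ d_emb.T          # (query_tokens, doc_tokens) similarity matrix
    return sims.max(axis=1).sum()   # best document match per query token

def normalize(m):
    return m / np.linalg.norm(m, axis=1, keepdims=True)

rng = np.random.default_rng(0)
q = normalize(rng.normal(size=(4, 16)))               # 4 query tokens
docs = [normalize(rng.normal(size=(n, 16))) for n in (12, 30)]
scores = [maxsim_score(q, d) for d in docs]
print(scores)
```

Because every query token is compared against every document token, this step is what makes late interaction expensive, and why efficiency work like SPLATE targets it.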
Contact: sergi@earkind.com
Timestamps:
00:34 Introduction
01:27 Meta’s stock plunges on ‘aggressive’ AI spending plans
02:49 TSMC unveils 1.6nm process technology with backside power delivery, rivals Intel's competing design
04:48 tiny-gpu
05:59 Fake sponsor
07:35 The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions
08:43 A Reproducibility Study of PLAID
10:18 SPLATE: Sparse Late Interaction Retrieval
12:00 Outro
-
Perplexity's Funding 🦄 // NVIDIA acquires Run:ai 🏎️ // Llama-3 on LM Leaderboard 🧐
Perplexity becomes an AI unicorn with a new $63 million funding round.
NVIDIA acquires Run:ai, an Israeli startup that provides Kubernetes-based workload management and orchestration software for AI computing resources.
The Llama-3 language model reaches the top five of the LM arena leaderboard.
New AI research papers explore efficient language models, LLMs that can read your minds, and mixtures of experts.
Contact: sergi@earkind.com
Timestamps:
00:34 Introduction
01:44 Perplexity becomes an AI unicorn with new $63 million funding round
03:20 NVIDIA to Acquire GPU Orchestration Software Provider Run:ai
05:24 Llama 3 on top-5 of LM arena leaderboard
06:48 Fake sponsor
08:54 OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework
10:32 SnapKV: LLM Knows What You are Looking for Before Generation
12:37 Multi-Head Mixture-of-Experts
14:37 Outro
-
Phi-3 from Microsoft 💻 // SoftBank Invests $1B in Nvidia 🤑 // HuggingFace's FineWeb Dataset 🌐
Microsoft has launched its smallest AI model yet, the Phi-3 Mini, designed to be cheaper to run than its larger counterparts.
SoftBank plans to invest nearly $1 billion in Nvidia's chips to bolster its computing facilities and develop its own generative AI, giving Japan a strong domestic player in the AI space.
HuggingFace has released FineWeb, a dataset of more than 15 trillion tokens of cleaned and deduplicated English web data from CommonCrawl; models trained on it outperform those trained on other commonly used high-quality web datasets.
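FineWeb's cleaning pipeline includes fuzzy, MinHash-based deduplication. As a minimal illustration of the MinHash idea only (not HuggingFace's actual pipeline; the shingle size, signature length, and salted-hash scheme here are arbitrary choices):

```python
import hashlib

def shingles(text, n=3):
    """Set of overlapping n-word shingles from a document."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def minhash(sh, num_perm=64):
    """MinHash signature: per 'permutation' (salted hash), keep the
    minimum hash value over all shingles."""
    return [
        min(int.from_bytes(
                hashlib.blake2b(f"{seed}:{s}".encode(), digest_size=8).digest(),
                "big")
            for s in sh)
        for seed in range(num_perm)
    ]

def est_jaccard(a, b):
    """Fraction of matching signature slots estimates Jaccard similarity."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

doc1 = "the quick brown fox jumps over the lazy dog"
doc2 = "the quick brown fox jumps over the sleepy dog"   # near-duplicate
doc3 = "completely unrelated text about language models and datasets"

s1, s2, s3 = (minhash(shingles(d)) for d in (doc1, doc2, doc3))
print(est_jaccard(s1, s2), est_jaccard(s1, s3))
```

Near-duplicates share most shingles, so their signatures agree in most slots; unrelated documents agree in almost none, which lets a pipeline drop near-duplicates without pairwise full-text comparison.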
The papers discussed in this episode cover topics such as extending embedding models for long context retrieval, automating graphic design using large multimodal models, and Microsoft's innovative approach to training the Phi-3 Mini AI model.
Contact: sergi@earkind.com
Timestamps:
00:34 Introduction
01:35 Microsoft launches Phi-3, its smallest AI model yet
03:10 SoftBank will reportedly invest nearly $1 billion in AI push, tapping Nvidia’s chips
05:11 HuggingFace Releases FineWeb: 15 Trillion tokens to train on
06:02 Fake sponsor
08:15 Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
09:42 LongEmbed: Extending Embedding Models for Long Context Retrieval
11:04 Graphic Design with Large Multimodal Model
12:53 Outro
-
Google's Re-Org 🤖 // Electric Atlas Robot ⚡ // Large Language Models in Theorem Proving 🔍
Google merges Android, Chrome, and hardware divisions to deliver higher quality products and experiences for users and partners, with a focus on AI innovation.
Boston Dynamics introduces the electric Atlas robot: designed for real-world applications, it is stronger, more dexterous, and more agile than its predecessors.
"Towards Large Language Models as Copilots for Theorem Proving in Lean" explores using large language models to assist humans in theorem proving.
"AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation" introduces a framework that leverages large language models to generate web crawlers that handle diverse and changing web environments more efficiently.
Contact: sergi@earkind.com
Timestamps:
00:34 Introduction
01:32 Google merges the Android, Chrome, and hardware divisions
03:02 New Atlas Robot from Boston Dynamics
05:01 Karpathy on Llama 3
06:19 Fake sponsor
08:14 Towards Large Language Models as Copilots for Theorem Proving in Lean
09:47 AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation
11:21 Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models
12:58 Outro
-
Meta Announces Llama 3 🤖 // Microsoft's $1.5B Investment 💰 // Dynamic Text Animation 🎥
Meta announces the release of Llama 3, their new open-source language model with improved reasoning and instruction-following capabilities.
Microsoft invests $1.5 billion in UAE-based AI firm G42; concerns over the firm's China links required negotiations with the Biden administration.
Researchers present "Dynamic Typography," an automated text-animation scheme that deforms letters to convey semantic meaning and infuses them with movement based on user prompts.
The AI Safety Benchmark from MLCommons is a tool to assess the safety risks of AI systems that use chat-tuned language models, covering 7 of the 13 hazard categories identified by the working group.
Contact: sergi@earkind.com
Timestamps:
00:34 Introduction
01:47 Meta Announces Llama 3
03:11 Microsoft invests $1.5B in UAE AI firm
04:59 Randar: A Minecraft exploit that uses LLL lattice reduction to crack server RNG
06:23 Fake sponsor
08:08 Dynamic Typography: Bringing Text to Life via Video Diffusion Prior
09:38 Introducing v0.5 of the AI Safety Benchmark from MLCommons
11:15 BLINK: Multimodal Large Language Models Can See but Not Perceive
13:05 Outro
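The Randar exploit mentioned in the timestamps reportedly uses LLL lattice reduction to recover RNG state from partial observations. As a much simpler illustration of the underlying weakness, here is a toy sketch that brute-forces the 16 hidden low bits of a java.util.Random-style 48-bit LCG given two consecutive 32-bit outputs; the LCG constants match java.util.Random, but the "server" setup and everything else is illustrative, not Randar's actual method:

```python
# java.util.Random LCG constants: state' = (state * MULT + ADD) mod 2^48
MULT = 0x5DEECE66D
ADD = 0xB
MASK = (1 << 48) - 1

def next_state(s):
    return (s * MULT + ADD) & MASK

def next_int(s):
    """Advance the LCG and emit the top 32 bits of the 48-bit state,
    like java.util.Random.next(32)."""
    s = next_state(s)
    return s >> 16, s

def crack(out1, out2):
    """Recover the full internal state after the first output by
    brute-forcing the 16 hidden low bits and checking against out2."""
    for low in range(1 << 16):
        cand = (out1 << 16) | low
        if next_state(cand) >> 16 == out2:
            return cand
    return None

# Simulate a "server" RNG, observe two outputs, then crack the state.
state = 0x123456789ABC & MASK
o1, state = next_int(state)
o2, state = next_int(state)
recovered = crack(o1, o2)
print(hex(recovered))
```

With the full state recovered, every future output is predictable; lattice techniques like LLL extend this to the harder case where only a few bits of each output are observable.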
Customer Reviews
Clumsy but with potential!
It’s surprisingly informative and entertaining, and although it’s sometimes hit-or-miss, I’m looking forward to seeing where this evolves.