
Is GPT-5.1 Really an Upgrade? But Models Can Auto-Hack Govts, so … there’s that
A lot just got released in the last 36 hours, and it will all affect hundreds of millions of people. 10 details you would miss if you just read the headlines, from GPT 5.1 regressions, to how Claude hacked Govt Agencies, to SIMA 2, and Musical Turing Tests.
https://assemblyai.com/aiexplained
Chapters:
00:00 - Introduction
00:56 - GPT 5.1 Smarter?
01:47 - Some Regressions
03:22 - Sycophancy?
05:22 - Claude Auto-Hacking
06:16 - Jailbreaking through Granularity
08:22 - This Will be Re-used
09:30 - Hallucinating Hacker
09:57 - Surprisingly Neutral Tone
12:18 - SIMA 2
14:10 - Alpha Parallels
17:24 - AI Music
GPT 5.1 Announcement: https://openai.com/index/gpt-5-1/
System Card: https://cdn.openai.com/pdf/4173ec8d-1229-47db-96de-06d87147e07e/5_1_system_card.pdf
Benchmarks: https://openai.com/index/gpt-5-1-for-developers/
Simple Bench: https://lmcouncil.ai/benchmarks
Auto-Hacking: https://x.com/AnthropicAI/status/1989033793190277618
https://www.anthropic.com/news/disrupting-AI-espionage
Report: https://assets.anthropic.com/m/ec212e6566a0d47/original/Disrupting-the-first-reported-AI-orchestrated-cyber-espionage-campaign.pdf
Sima 2 Announcement: https://deepmind.google/blog/sima-2-an-agent-that-plays-reasons-and-learns-with-you-in-virtual-3d-worlds/
https://x.com/amoufarek/status/1988986075331858693
Scepticism: https://www.technologyreview.com/2025/11/13/1127921/google-deepmind-is-using-gemini-to-train-agents-inside-goat-simulator-3/
Voyager: https://voyager.minedojo.org/
Reuters Music: https://www.reuters.com/legal/litigation/are-you-listening-bots-survey-shows-ai-music-is-virtually-undetectable-2025-11-12/
信息
- 节目
- 频率一周一更
- 发布时间2025年11月14日 UTC 17:00
- 长度18 分钟
- 季4
- 单集2
- 分级儿童适宜