9 OCT.
S1, E113
13 MIN

9th October - AI News Daily - Google's Gemini 2.5 Unleashes Browser Automation, Reshaping Agent Capabilities

🌍 INAI • The Open AI Hub

The Intelligence Atlas → the world’s most comprehensive, open hub of AI knowledge. 2 Million+ tools, models, agents, tutorials & daily news—free for all, updated every day.

https://github.com/inai-sandy/inAI-wiki

TOP HIGHLIGHTS

Google's Gemini 2.5 introduces "computer use" capabilities for browser automation, bringing agent automation to the mainstream
AMD secures multi-billion-dollar GPU deal with OpenAI while Nvidia tightens direct sales, intensifying AI compute competition
Security concerns emerge with first malicious MCP server discovery and Figma MCP vulnerability
CoreWeave launches Serverless RL with Weights & Biases integration to simplify agent training
Disney and Universal sue Midjourney over character imagery, escalating copyright debates

NEW TOOLS & FRAMEWORKS

Microsoft unifies AutoGen and Semantic Kernel into enterprise-ready Agent Framework
Anthropic releases Petri for open-source LLM auditing
Google's Opal no-code app builder expands to 15 countries
Stripe adds model pricing and usage tracking APIs
Python 3.14 makes the GIL-free (free-threaded) build officially supported; Pydantic 2.12 adds Python 3.14 support

LLM INNOVATIONS

Ling-1T debuts trillion-parameter open-source reasoner
Samsung's 7M-parameter Tiny Recursive Model outperforms larger systems
AI21's Jamba Reasoning 3B offers efficient reasoning trade-offs
Alibaba releases Qwen3 Omni multimodal model and Qwen Image Edit
LiquidAI demonstrates on-device reasoning for iPhone 17 Pro

RESEARCH HIGHLIGHTS

Drax achieves SOTA speech recognition with discrete flow matching
ModernVBERT outperforms larger models through architecture innovation
Multi-vector embeddings improve retrieval precision
CAIS updates "Humanity's Last Exam" to rolling benchmark
VChain introduces chain-of-visual-thought for video reasoning
Research shows quantization resilience must be built into training

INDUSTRY & POLICY DEVELOPMENTS

USPTO pilots AI-assisted prior-art discovery for patent applications
Google faces DOJ scrutiny over Gemini integration in core services
Hidden Unicode payload attacks affect some LLMs, including Gemini-class models

PRACTICAL RESOURCES

Step-by-step RAG implementation guide for beginners
Guide on when to parse vs. extract in document workflows
Strategies for Sora 2 guardrails and watermarking
Prompt optimization techniques for agent reliability
Privacy best practices for biometric data handling

DEMOS & APPLICATIONS

Intercom showcases LangGraph powering Fin_ai customer support
Pika's Predictive Video enables prompt-to-clip creation
Sora-powered "viral video recreator" teased
Seedream mobile agent enables on-device image generation
Cristiano Ronaldo reportedly used Perplexity AI for speech preparation

THOUGHT-PROVOKING DISCUSSIONS

JEPAs may bridge generative and contrastive learning
Quality over quantity emphasized for RL training signals
Studies show sycophantic AI undermines relationship repair
LLM checks identify 80M+ inconsistent Wikipedia facts
Industry consolidation raises concerns about AI infrastructure access
Sora's upside-down exploit highlights evaluation gaps

Show: AI News Daily
Frequency: Daily
Published: October 9, 2025 at 01:30 UTC
Length: 13 min
Season: 1
Episode: 113
Rating: All audiences
