
9th October - AI News Daily - Google's Gemini 2.5 Unleashes Browser Automation, Reshaping Agent Capabilities
🌍 INAI • The Open AI Hub
The Intelligence Atlas → the world’s most comprehensive, open hub of AI knowledge. 2 Million+ tools, models, agents, tutorials & daily news—free for all, updated every day.
https://github.com/inai-sandy/inAI-wiki
TOP HIGHLIGHTS
- Google's Gemini 2.5 introduces "computer use" capabilities for browser automation, bringing agent automation to the mainstream
- AMD secures multi-billion-dollar GPU deal with OpenAI while Nvidia tightens direct sales, intensifying AI compute competition
- Security concerns emerge with first malicious MCP server discovery and Figma MCP vulnerability
- CoreWeave launches Serverless RL with Weights & Biases integration to simplify agent training
- Disney and Universal sue Midjourney over character imagery, escalating copyright debates
NEW TOOLS & FRAMEWORKS
- Microsoft unifies AutoGen and Semantic Kernel into enterprise-ready Agent Framework
- Anthropic releases Petri for open-source LLM auditing
- Google's Opal no-code app builder expands to 15 countries
- Stripe adds model pricing and usage tracking APIs
- Python 3.14 makes the free-threaded (GIL-free) build officially supported; Pydantic 2.12 adds Python 3.14 support
LLM INNOVATIONS
- Ling-1T debuts trillion-parameter open-source reasoner
- Samsung's 7M-parameter Tiny Recursive Model outperforms larger systems
- AI21's Jamba Reasoning 3B offers efficient reasoning trade-offs
- Alibaba releases Qwen3 Omni multimodal model and Qwen Image Edit
- LiquidAI demonstrates on-device reasoning for iPhone 17 Pro
RESEARCH HIGHLIGHTS
- Drax achieves SOTA speech recognition with discrete flow matching
- ModernVBERT outperforms larger models through architecture innovation
- Multi-vector embeddings improve retrieval precision
- CAIS updates "Humanity's Last Exam" to rolling benchmark
- VChain introduces chain-of-visual-thought for video reasoning
- Research shows quantization resilience must be built into training
INDUSTRY & POLICY DEVELOPMENTS
- USPTO pilots AI-assisted prior-art discovery for patent applications
- Google faces DOJ scrutiny over Gemini integration in core services
- Hidden Unicode payload attacks affect some LLMs, including Gemini-class models
PRACTICAL RESOURCES
- Step-by-step RAG implementation guide for beginners
- Guide on when to parse vs. extract in document workflows
- Strategies for Sora 2 guardrails and watermarking
- Prompt optimization techniques for agent reliability
- Privacy best practices for biometric data handling
DEMOS & APPLICATIONS
- Intercom showcases LangGraph powering its Fin AI customer support agent
- Pika's Predictive Video enables prompt-to-clip creation
- Sora-powered "viral video recreator" teased
- Seedream mobile agent enables on-device image generation
- Cristiano Ronaldo reportedly used Perplexity AI for speech preparation
THOUGHT-PROVOKING DISCUSSIONS
- JEPAs may bridge generative and contrastive learning
- Quality over quantity emphasized for RL training signals
- Studies show sycophantic AI undermines relationship repair
- LLM checks identify 80M+ inconsistent Wikipedia facts
- Industry consolidation raises concerns about AI infrastructure access
- Sora's upside-down exploit highlights evaluation gaps
Information
- Show
- Frequency: Daily
- Published: 9 October 2025 at 01:30 UTC
- Duration: 13 min
- Season: 1
- Episode: 113
- Rating: All audiences