AI News Daily

9th October - AI News Daily - Google's Gemini 2.5 Unleashes Browser Automation, Reshaping Agent Capabilities

🌍 INAI • The Open AI Hub

The Intelligence Atlas → the world’s most comprehensive, open hub of AI knowledge. 2 Million+ tools, models, agents, tutorials & daily news—free for all, updated every day.

https://github.com/inai-sandy/inAI-wiki

TOP HIGHLIGHTS

  • Google's Gemini 2.5 introduces "computer use" capabilities for browser automation, bringing agent automation to the mainstream
  • AMD secures multibillion-dollar GPU deal with OpenAI while Nvidia tightens direct sales, intensifying AI compute competition
  • Security concerns emerge with first malicious MCP server discovery and Figma MCP vulnerability
  • CoreWeave launches Serverless RL with Weights & Biases integration to simplify agent training
  • Disney and Universal sue Midjourney over character imagery, escalating copyright debates

NEW TOOLS & FRAMEWORKS

  • Microsoft unifies AutoGen and Semantic Kernel into enterprise-ready Agent Framework
  • Anthropic releases Petri for open-source LLM auditing
  • Google's Opal no-code app builder expands to 15 countries
  • Stripe adds model pricing and usage tracking APIs
  • Python 3.14 officially supports the free-threaded (GIL-free) build; Pydantic 2.12 adds Python 3.14 support

LLM INNOVATIONS

  • Ling-1T debuts trillion-parameter open-source reasoner
  • Samsung's 7M-parameter Tiny Recursive Model outperforms larger systems
  • AI21's Jamba Reasoning 3B offers efficient reasoning trade-offs
  • Alibaba releases Qwen3 Omni multimodal model and Qwen Image Edit
  • LiquidAI demonstrates on-device reasoning for iPhone 17 Pro

RESEARCH HIGHLIGHTS

  • Drax achieves SOTA speech recognition with discrete flow matching
  • ModernVBERT outperforms larger models through architecture innovation
  • Multi-vector embeddings improve retrieval precision
  • CAIS updates "Humanity's Last Exam" to rolling benchmark
  • VChain introduces chain-of-visual-thought for video reasoning
  • Research shows quantization resilience must be built into training

INDUSTRY & POLICY DEVELOPMENTS

  • USPTO pilots AI-assisted prior-art discovery for patent applications
  • Google faces DOJ scrutiny over Gemini integration in core services
  • Hidden Unicode payload attacks affect some LLMs, including Gemini-class models

PRACTICAL RESOURCES

  • Step-by-step RAG implementation guide for beginners
  • Guide on when to parse vs. extract in document workflows
  • Strategies for Sora 2 guardrails and watermarking
  • Prompt optimization techniques for agent reliability
  • Privacy best practices for biometric data handling

DEMOS & APPLICATIONS

  • Intercom showcases LangGraph powering its Fin AI customer support agent
  • Pika's Predictive Video enables prompt-to-clip creation
  • Sora-powered "viral video recreator" teased
  • Seedream mobile agent enables on-device image generation
  • Cristiano Ronaldo reportedly used Perplexity AI for speech preparation

THOUGHT-PROVOKING DISCUSSIONS

  • JEPAs may bridge generative and contrastive learning
  • Quality over quantity emphasized for RL training signals
  • Studies show sycophantic AI undermines relationship repair
  • LLM checks identify 80M+ inconsistent Wikipedia facts
  • Industry consolidation raises concerns about AI infrastructure access
  • Sora's upside-down exploit highlights evaluation gaps
