## Short Segments

Google Research enhances enterprise search with Agentic RAG, tackling multi-hop queries for more accurate results.

Today, we're diving into Google's latest addition to the Gemini Enterprise Agent Platform, which aims to solve a common problem in enterprise search: handling complex, multi-source queries. And later, we'll explore Microsoft's new MAI-Transcribe-1.5, a speech-to-text model that promises faster and more accurate transcription across 43 languages.

Google Research has introduced a new agentic RAG framework, now part of the Gemini Enterprise Agent Platform. This innovation powers Cross-Corpus Retrieval, currently in public preview, and addresses a known failure mode in enterprise search. Traditional single-step RAG systems struggle with multi-source, multi-hop queries, often returning incomplete answers.

Google's Agentic RAG framework plans, reasons, and interacts with data sources iteratively, improving dependability and accuracy. It includes a sufficient context check before generating responses, increasing accuracy on factuality datasets by up to 34%. This multi-agent architecture functions like an organized research department, with specialized roles enhancing the search process. The result is a more reliable and accurate enterprise search experience, particularly for complex queries that require information from multiple sources.

## Feature Story

Microsoft's MAI-Transcribe-1.5 sets a new standard in multilingual speech-to-text technology, offering unprecedented accuracy and speed.

Last week, Microsoft AI unveiled MAI-Transcribe-1.5, the latest iteration of its in-house speech-to-text model. This model is designed to handle 43 languages, including diverse accents and noisy environments, making it a robust tool for production transcription workloads.

MAI-Transcribe-1.5 is an automatic speech recognition model that converts audio into text.
Unlike many transcription services that rely on third-party base models, Microsoft built this model entirely in-house. It's integrated into various Microsoft products, such as Copilot, Teams, GitHub, and Dynamics 365 Contact Centre, and is available on Microsoft's Foundry platform.

The model's accuracy is measured by Word-Error-Rate (WER), with a lower WER indicating fewer transcription errors. Microsoft reports that MAI-Transcribe-1.5 achieves best-in-class WER across 43 languages on the FLEURS benchmark, a standard benchmark for multilingual transcription. On the Artificial Analysis leaderboard, it posts a WER of 2.4%, placing it third among competitors. Together, these results highlight the model's strength in both accuracy and language coverage.

One of the significant advancements in MAI-Transcribe-1.5 is its expanded language support. The model now covers 43 languages, up from 25, without sacrificing accuracy. This expansion includes 18 new languages, with a focus on South Asian languages like Bengali, Tamil, and Telugu. This broad coverage makes the model particularly valuable for global enterprises and multilingual environments.

In addition to its accuracy, MAI-Transcribe-1.5 is up to five times faster than competing models such as Gemini 3.1 Flash and ScribeV2 on the Artificial Analysis leaderboard. This speed, combined with its accuracy, positions it as a leading choice for enterprises needing efficient and reliable transcription services.

For businesses, this means more accessible and accurate transcription capabilities, reducing the time and cost associated with manual transcription. The integration of MAI-Transcribe-1.5 into Microsoft's suite of products also means that users can expect seamless transcription services across various platforms, enhancing productivity and communication.

Looking ahead, the introduction of MAI-Transcribe-1.5 could set a new benchmark for speech-to-text technology, encouraging further innovation in the field.
As enterprises continue to seek efficient ways to manage and analyze audio data, models like MAI-Transcribe-1.5 will play a crucial role in meeting these demands.

In summary, Microsoft's MAI-Transcribe-1.5 offers a significant leap forward in speech-to-text technology, providing faster, more accurate, and more comprehensive transcription services. As it becomes more widely adopted, it could transform how businesses handle audio data, making transcription more accessible and efficient than ever before.
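
For listeners who want to see what the Word-Error-Rate figures quoted in this story actually measure, here is a minimal sketch of the standard definition: the number of word-level substitutions, deletions, and insertions needed to turn the transcript into the reference, divided by the number of words in the reference. The function name `word_error_rate` and the simple lowercase-and-split normalization are assumptions made for illustration. Benchmarks such as FLEURS and leaderboards such as Artificial Analysis apply their own text normalization, so this will not reproduce their published numbers.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: normalization here is just lowercasing and
    whitespace splitting, which is an assumption, not a benchmark rule.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into nothing costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    hypothesis = "the cat sat on mat"
    print(f"WER: {word_error_rate(reference, hypothesis):.3f}")
```

In this toy example the transcript drops one word out of six, so the WER is one sixth, or roughly 16.7%. Because insertions count as errors, WER can exceed 100% when a transcript contains many extra words.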