Two Voice Devs

Mark and Allen

Mark and Allen talk about the latest news in the VoiceFirst world from a developer point of view.

  1. Episode 253 - The Future of Voice? Exploring Gemini 2.5's TTS Model

    AUG 29

    In this episode of Two Voice Devs, Mark and Allen dive into the new experimental Text-to-Speech (TTS) model in Google's Gemini 2.5. They explore its capabilities, from single-speaker to multi-speaker audio generation, and discuss how it's a significant leap from the old days of SSML. They also touch on how this new technology can be integrated with LangChainJS to create more dynamic and natural-sounding voice applications. Is this the return of voice as the primary interface for AI? (A rough single-speaker code sketch follows this episode's listing.)

    [00:00:00] Introduction
    [00:00:45] Google's new experimental TTS model for Gemini
    [00:01:55] Demo of single-speaker TTS in Google's AI Studio
    [00:03:05] Code walkthrough for single-speaker TTS
    [00:04:30] Lack of fine-grained control compared to SSML
    [00:05:15] Using text cues to shape the TTS output
    [00:06:20] Demo of multi-speaker TTS with a script
    [00:09:50] Code walkthrough for multi-speaker TTS
    [00:11:30] The model is tuned for TTS, not general conversation
    [00:12:10] Using a separate LLM to generate a script for the TTS model
    [00:13:30] Code walkthrough of the two-function approach with LangChainJS
    [00:16:15] LangChainJS integration details
    [00:19:00] Is Speech Markdown still relevant?
    [00:21:20] Latency issues with the current TTS model
    [00:22:00] Caching strategies for TTS
    [00:23:30] Voice as the natural UI for AI
    [00:25:30] Outro

    #Gemini #TTS #VoiceAI #VoiceFirst #AI #Google #LangChainJS #LLM #Developer #Podcast

    26 min
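As a companion to the TTS discussion above, here is a minimal sketch of a single-speaker call against the Gemini API, assuming the `@google/genai` Node SDK. The model name, voice name, and response handling follow the public documentation rather than the episode's own code, so treat them as assumptions.

```typescript
// Minimal single-speaker TTS sketch (assumes the @google/genai Node SDK).
// The model id and voice name are illustrative and may change while the feature is experimental.
import { GoogleGenAI } from "@google/genai";
import { writeFileSync } from "node:fs";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY ?? "" });

async function speak(text: string): Promise<void> {
  const response = await ai.models.generateContent({
    model: "gemini-2.5-flash-preview-tts", // experimental TTS model
    contents: text,
    config: {
      responseModalities: ["AUDIO"],
      speechConfig: {
        voiceConfig: { prebuiltVoiceConfig: { voiceName: "Kore" } }, // one of the prebuilt voices
      },
    },
  });

  // The audio comes back as base64-encoded PCM in the first candidate part.
  const data = response.candidates?.[0]?.content?.parts?.[0]?.inlineData?.data;
  if (data) writeFileSync("episode-intro.pcm", Buffer.from(data, "base64"));
}

// Plain-text cues (not SSML) shape the delivery, as discussed in the episode.
speak("Say cheerfully: Welcome to another episode of Two Voice Devs!").catch(console.error);
```

Multi-speaker generation, as demoed in the episode, uses the same call shape with a multi-speaker voice configuration and a script that labels each speaker's lines.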
  2. Episode 252 - GPT-5 First Look: Evolution, Not Revolution

    AUG 15

    Join Allen and Mark as they take a first look at the newly released GPT-5 from OpenAI. They dive into the details of what's new, what's changed, and what's missing, frequently comparing it to other models like Google's Gemini. From the new mini and nano models to the pricing wars with competitors, they cover the landscape of the latest LLM offerings. They also discuss the new features for developers, including verbosity settings and constrained outputs with context-free grammars, and what this means for the future of AI development. Is GPT-5 the leap forward everyone was expecting, or a sign that the rapid pace of AI evolution is starting to plateau? Tune in to find out! (A sketch of schema-constrained output follows this episode's listing.)

    [00:00:00] Introduction and the hype around GPT-5
    [00:01:00] Overview of GPT-5, mini, and nano models
    [00:02:00] The new "thinking" model and smart routing
    [00:03:00] Simplifying models for developers
    [00:04:00] Reasoning levels vs. Gemini's "thinking budget"
    [00:06:00] Pricing wars and new models
    [00:07:00] OpenAI's new open source models
    [00:08:00] New verbosity setting for developers
    [00:09:00] Constrained outputs and context-free grammars
    [00:12:00] Using LLMs to translate to well-defined data structures
    [00:14:00] Reducing hallucinations and medical applications
    [00:16:00] Knowledge cutoff dates for the new models
    [00:18:00] Coding with GPT-5 and IDE integration
    [00:19:00] More natural conversations with ChatGPT
    [00:21:00] Missing audio and image modalities vs. Gemini
    [00:22:00] Community reaction to the GPT-5 release
    [00:24:00] The future of LLMs: Maturing and plateauing
    [00:26:00] The need for better developer tools and agentic computing

    #GPT5 #OpenAI #LLM #AI #ArtificialIntelligence #Developer #TechTalk #Podcast #AIDevelopment #MachineLearning #FutureOfAI #AGI #GoogleGemini #TwoVoiceDevs

    28 min
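To make the constrained-output discussion above concrete, here is a rough sketch of asking a model for a well-defined data structure using the openai Node SDK's JSON Schema response format. The context-free-grammar feature mentioned in the episode is a newer generalization of this idea with its own API and is not shown here; the model id and schema below are illustrative.

```typescript
// Sketch: constraining model output to a well-defined data structure with a
// JSON Schema response format (openai Node SDK). Model id and schema are illustrative.
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

async function extractAppointment(note: string): Promise<unknown> {
  const completion = await client.chat.completions.create({
    model: "gpt-5-mini", // assumed model id; use whatever model is available to you
    messages: [
      { role: "system", content: "Extract the appointment details from the user's note." },
      { role: "user", content: note },
    ],
    response_format: {
      type: "json_schema",
      json_schema: {
        name: "appointment",
        strict: true,
        schema: {
          type: "object",
          properties: {
            patient: { type: "string" },
            date: { type: "string", description: "ISO 8601 date" },
            reason: { type: "string" },
          },
          required: ["patient", "date", "reason"],
          additionalProperties: false,
        },
      },
    },
  });

  // With strict mode, the returned content is guaranteed to match the schema.
  return JSON.parse(completion.choices[0].message.content ?? "{}");
}

extractAppointment("Maria needs a follow-up on March 3rd for her refill review.")
  .then(console.log)
  .catch(console.error);
```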
  3. Episode 251 - AI Agents: Frameworks and Concepts

    AUG 12

    Join Mark and Allen in this episode of Two Voice Devs as they explore the fascinating world of AI agents. They break down what agents are, how they work, and what sets them apart from earlier AI technologies. The discussion covers key concepts like "context engineering" and the essential components of an agentic system, including prompts, RAG, memory, tools, and structured outputs. Using a practical example of a prescription management chatbot for veterans, they demonstrate how agents can handle complex tasks. They compare various frameworks for building agents, specifically focusing on OpenAI's Agent SDK (for TypeScript) and Microsoft's Semantic Kernel (for C#). They also touch on other popular frameworks like LangGraph and Google's Agent Development Kit. Tune in for a detailed comparison of how OpenAI's Agent SDK and Microsoft's Semantic Kernel handle state, tools, and the overall agent lifecycle, and learn what the future holds for these intelligent systems. (A small Agents SDK sketch follows this episode's listing.)

    [00:00:00] - Introduction
    [00:01:02] - What is an AI Agent?
    [00:03:12] - Context Engineering and its components
    [00:06:02] - The role of the Agent Controller
    [00:08:01] - Agent Mode vs. Agent AI
    [00:09:36] - Use Case: Prescription Management Chatbot
    [00:13:42] - Handling Large Lists of Data
    [00:16:15] - Tools and State Management
    [00:21:05] - Filtering and Searching with Tools
    [00:27:08] - Displaying Information and Iterating through lists
    [00:30:10] - The power of LLMs in Agentic Systems
    [00:35:18] - Sub-agents and the future of agentic systems
    [00:38:25] - Comparing different Agent Frameworks
    [00:39:00] - Wrap up

    #AIAgents #TwoVoiceDevs #ContextEngineering #OpenAIAgentSDK #SemanticKernel #LangGraph #GoogleADK #LLMs #GenAI #AI #Developer #Podcast #TypeScript #CSharp

    39 min
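As a rough illustration of the prompts-plus-tools pattern discussed above, here is a sketch using the OpenAI Agents SDK for TypeScript (the `@openai/agents` package). The prescription tool is a made-up stand-in for the episode's veterans' prescription chatbot, and the exports shown are written from memory of the SDK's documentation, so verify them against the current package.

```typescript
// Sketch: a tiny agent with one tool, in the spirit of the prescription chatbot
// discussed in the episode. Assumes the @openai/agents package; the data is made up.
import { Agent, run, tool } from "@openai/agents";
import { z } from "zod";

// Stand-in tool: a real system would query the prescription backend here.
const listPrescriptions = tool({
  name: "list_prescriptions",
  description: "List the active prescriptions for a patient by id.",
  parameters: z.object({ patientId: z.string() }),
  execute: async ({ patientId }) =>
    JSON.stringify([{ patientId, drug: "lisinopril", doseMg: 10, refillsLeft: 2 }]),
});

const agent = new Agent({
  name: "Prescription Assistant",
  instructions:
    "Help the user review their prescriptions. Use tools to look up data; never guess.",
  tools: [listPrescriptions],
});

async function main(): Promise<void> {
  const result = await run(agent, "Which of my prescriptions still have refills?");
  console.log(result.finalOutput);
}

main().catch(console.error);
```

The episode's comparison with Semantic Kernel centers on how each framework wires up the same pieces: instructions, tools, and state.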
  4. Episode 249 - Cracking Copilot and the Mysteries of Microsoft 365

    JUL 24

    In this episode, guest host Andrew Connell, a Microsoft MVP of 21 years, joins Allen to unravel the complexities of Microsoft's AI strategy, particularly within the enterprise. They explore the world of Microsoft 365 Copilot, distinguishing it from the broader AI landscape and consumer tools like ChatGPT. Andrew provides an insider's look at how Copilot functions within a secure, private "enclave," leveraging a "Semantic Index" of your organization's data to provide relevant, contextual answers. The conversation then shifts to the developer experience. Discover the different ways developers can extend and customize Copilot, from low-code solutions in Copilot Studio to creating powerful "declarative agents" with JSON and even building "custom engine agents" where you can bring your own models and infrastructure. If you've ever wondered what Microsoft's AI story is for businesses and internal developers, this episode provides a comprehensive and honest overview.

    Timestamps:
    [00:00:01] - Introducing guest host Andrew Connell
    [00:00:54] - What is a Microsoft 365 developer?
    [00:01:40] - Andrew's journey into the Microsoft ecosystem
    [00:05:00] - 21 years as a Microsoft MVP
    [00:06:15] - Enterprise Cloud vs. Developer Cloud
    [00:08:06] - Microsoft's AI focus for the enterprise
    [00:10:57] - What is Microsoft 365 Copilot?
    [00:13:07] - How Copilot ensures data privacy with a "secure enclave"
    [00:14:58] - Understanding the Semantic Index
    [00:16:31] - Is Copilot a Retrieval Augmented Generation (RAG) system?
    [00:17:23] - Responsible AI in the Copilot stack
    [00:19:19] - The developer story for extending Copilot
    [00:22:43] - Building declarative agents with JSON and YAML
    [00:25:05] - Using actions and tools with agents
    [00:27:00] - How agents are deployed via Microsoft Teams
    [00:32:48] - Where does Copilot actually run?
    [00:36:20] - Key takeaways from Microsoft Build
    [00:41:20] - The spectrum of development: low-code to full-code
    [00:43:00] - Full control with Custom Engine Agents
    [00:49:30] - Where to find Andrew Connell online

    Hashtags:
    #Microsoft #AI #Copilot #Microsoft365 #Azure #SharePoint #MicrosoftTeams #MVP #Developer #Podcast #Tech #EnterpriseSoftware #CloudComputing #ArtificialIntelligence #Agents #LowCode #NoCode #RAG

    52 min
  5. Episode 248 - AI Showdown: Gemini CLI vs. Claude Code CLI

    JUL 17

    Join Allen Firstenberg and guest host Isaac Johnson, a Google Developer Expert with a deep background in DevOps and SRE, as they dive into the world of command-line AI assistants. In this episode, they compare and contrast two powerful tools: Anthropic's Claude Code CLI and Google's Gemini CLI. Isaac shares his journey from coding with Fortran in the 90s to becoming a GDE, and explains why he often prefers the focused, context-aware power of a CLI tool over crowded IDE integrations. They discuss the pros and cons of each approach, from ease of use and learning curves to the critical importance of using version control as a safety net. The conversation then gets practical with a live demo where both Claude and Gemini are tasked with generating system architecture diagrams for a real-world project. Discover the differences in speed, cost, output, and user experience. Plus, learn how to customize Gemini's behavior with `GEMINI.md` files and explore fascinating use cases beyond just writing code, including podcast production, image generation, and more. (An illustrative `GEMINI.md` example follows this episode's listing.)

    [00:00:30] - Introducing the topic: AI assistants in the command line.
    [00:01:00] - Guest Isaac Johnson's extensive background in tech.
    [00:03:00] - Why use a CLI tool instead of an IDE plugin?
    [00:07:30] - Pro Tip: Always use Git with AI coding tools!
    [00:09:30] - The cost of AI: Comparing Claude's and Gemini's pricing.
    [00:12:15] - The benefits of Gemini CLI being open source.
    [00:17:30] - Live Demo: Claude Code CLI generates a system diagram.
    [00:21:30] - Live Demo: Gemini CLI tackles the same task.
    [00:27:30] - Customizing your AI with system prompts (`GEMINI.md`).
    [00:31:30] - Beyond Code: Using CLI tools for podcasting and media generation.
    [00:40:30] - Where to find and connect with Isaac Johnson.

    #AI #DeveloperTools #CLI #Gemini #Claude #GoogleCloud #Anthropic #TwoVoiceDevs #TechPodcast #SoftwareDevelopment #DevOps #SRE #AIassistant #Coding #Programming #FirebaseStudio #Imagen #Veo

    42 min
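Building on the `GEMINI.md` customization mentioned above, here is a small example of the kind of project-level context file the Gemini CLI reads. The contents are entirely illustrative; only the file name and the idea of per-project instructions come from the episode.

```markdown
# GEMINI.md: illustrative project context for the Gemini CLI

## About this repository
- A Node.js service with infrastructure definitions under `infra/`.

## Conventions for the assistant
- Prefer TypeScript with strict mode; avoid adding new runtime dependencies.
- When asked for architecture diagrams, emit Mermaid so they can be committed with the docs.
- Propose a plan before editing files, and keep changes small enough to review.
```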
  6. Episode 247 - Apple's AI Gets Serious

    JUL 10

    John Gillilan, our official Apple correspondent, returns to Two Voice Devs to unpack the major announcements from Apple's latest Worldwide Developer Conference (WWDC). After failing to ship the ambitious "Apple Intelligence" features promised last year, how did Apple address the elephant in the room? We dive deep into the new "Foundation Models Framework," which gives developers unprecedented access to on-device LLMs. We explore how features like structured data output with the "Generable" macro, "Tools" for app integration, and trainable "Adapters" are changing the game for developers. We also touch on the revamped speech-to-text, "Visual Intelligence," "Swift Assist" in Xcode, and the mysterious "Private Cloud Compute." Join us as we analyze Apple's AI strategy, the internal reorgs shaping their product future, and the competitive landscape with Google and OpenAI.

    [00:00:00] Welcome back, John Gillilan!
    [00:01:00] What was WWDC like from an insider's perspective?
    [00:06:00] Apple's big miss: What happened to last year's AI promises?
    [00:12:00] The new Foundation Models Framework
    [00:16:00] Structured data output with the "Generable" macro
    [00:19:00] Extending the LLM with "Tools"
    [00:22:00] Fine-tuning with trainable "Adapters"
    [00:28:00] Modernized on-device Speech-to-Text
    [00:29:00] "Visual Intelligence" and app integration
    [00:32:00] The powerful "call model" block in Shortcuts
    [00:36:00] Swift Assist and BYO-Model in Xcode
    [00:39:00] Inside Apple's big AI reorg
    [00:42:00] The Jony Ive / OpenAI hardware mystery
    [00:45:00] How Apple, Google, and OpenAI will compete and collaborate

    #Apple #WWDC #AI #AppleIntelligence #FoundationModels #LLM #OnDeviceAI #Swift #iOSDev #Developer #TechPodcast #TwoVoiceDevs #Siri #SwiftAssist #OpenAI #GoogleGemini #GoogleAndroid

    49 min
