The Future of Voice AI

Davit Baghdasaryan

In the Future of Voice AI series of interviews, I ask three questions to my guests: - What problems do you currently see in Enterprise Voice AI? - How does your company solve these problems? - What solutions do you envision in the next 5 years? voice-ai-newsletter.krisp.ai

  1. 3 DAYS AGO

    Inside the Data: The State of Voice in CX Unpacked | Peter Ryan (Ryan Strategic Advisory)

    This episode’s guest is Peter Ryan, President and Principal Analyst at Ryan Strategic Advisory. Peter Ryan is recognized as one of the world’s leading experts in CX and BPO. Throughout his career, Peter has advised CX outsourcers, contact center clients, national governments, and industry associations on strategic matters like vertical market penetration, service delivery, best practices in technology deployment, and offshore positioning.

    Ryan Strategic Advisory provides market insight, brand development initiatives, and actionable data for organizations in the customer experience services ecosystem. With two decades of experience, Ryan Strategic Advisory supports outsourcing operators, technology providers, industry associations, and economic development agencies.

    Takeaways
    * The hype cycle around AI has made it hard for CX leaders to separate real progress from inflated promises.
    * Adoption of voice AI is moving from concept to mainstream, driven by accuracy, latency improvements, and reliability.
    * Customers care most about issue resolution, not whether the agent sounds robotic or perfectly human.
    * One bad phone experience, often caused by language or accent misunderstandings, can permanently lose a customer.
    * Nearly half of surveyed enterprises are already using AI-powered voice translation, showing trust in its growing value.
    * About a quarter are experimenting with or adopting AI accent conversion, a big leap from just a few years ago.
    * Accent technology is not just for customers; it reduces agent stress and helps retain frontline workers.
    * Better agent retention directly lowers costs tied to recruiting, training, and high attrition.
    * Frontline agents are often more enthusiastic about accent technology than executives, because it eases real pain in daily calls.
    * CX leaders see accent and translation tools as a way to improve loyalty by making communication effortless across borders.
    * Latency in AI responses is no longer the barrier it once was; customers tolerate small delays if accuracy is high.
    * The biggest risk with AI in CX is overpromising; pragmatic, real-world use cases drive adoption faster than hype.
    * Failed AI deployments are often rolled back, especially with voice bots that don’t meet expectations.
    * Real-world case studies are becoming essential for buyers to justify investments in a tight economic climate.
    * CX voice AI adoption has followed a clear path: noise cancellation first, then accent tools, now translation at scale.
    * The next wave of adoption depends on showing measurable business outcomes rather than futuristic demos.
    * AI in CX today is compared to Pentium processors in the 90s: a turning point that accelerates everything once it matures.
    * Companies that promise realistically and deliver consistently will win long-term trust in a crowded AI market.
    * The real test of AI in CX isn’t novelty; it’s whether it helps customers resolve issues faster, cheaper, and with less friction.

    Check out last week’s article to dive deeper into the data discussed in this episode.

    23 min
  2. 21 AUG

    Voice AI for Frontline Workers | Assaf Asbag (Chief Product & Technology Officer at aiOla)

    This episode’s guest is Assaf Asbag, Chief Product & Technology Officer at aiOla. Assaf Asbag is the CPTO at aiOla, leading AI-driven product innovation and enterprise solutions. He previously served as VP of AI at Playtika, where he built the AI division into a key growth engine. Assaf’s background includes advanced algorithm work at Applied Materials and leadership across engineering and data science teams. He holds B.Sc. and M.Sc. degrees in Electrical and Computer Engineering with a focus on machine learning from Ben-Gurion University, making him a recognized expert in AI and technology strategy.

    aiOla's patented models and technology support over 100 languages and discern jargon, abbreviations, and acronyms, demonstrating a low error rate even in noisy environments. aiOla's purpose-built technology converts manual processes in critical industries into data-driven, paperless, AI-powered workflows through cutting-edge speech recognition.

    Takeaways
    * Turning spoken language into structured data in noisy, multilingual, and jargon-heavy environments is the real differentiator for enterprise voice AI.
    * Standard ASR models fail in frontline industries due to heavy accents, domain-specific vocabulary, and constant background noise.
    * Zero-shot keyword spotting from large jargon lists without fine-tuning can drastically cut setup time for specialized speech recognition.
    * Building proprietary, noise-heavy training datasets is essential for robust ASR performance in the real world.
    * Synthetic data generation that blends realistic noise with text-to-speech can cheaply scale model adaptation for niche environments.
    * Real-time processing is critical to making voice the primary human–technology interface, especially for operational workflows.
    * Voice AI has massive untapped potential among the world’s billion-plus frontline workers, far beyond current call center focus.
    * Incomplete or missing documentation is a hidden cost that voice-first tools can solve by capturing richer, structured information on the spot.
    * Effective enterprise AI solutions often require both a core product and flexible integration layers (SDK, API, or full app).
    * Trustworthy AI for voice will require guardrails, watermarking, bias detection, and context-aware filtering.
    * The next leap in conversational AI will be personalized, real-time adaptive systems rather than today’s generic emotion mimicking.
    * Designing for multimodal interaction (voice, text, UI) will be as important as model accuracy for user adoption.
    * AI revolutions historically create more jobs than they displace, but require new roles in monitoring, reliability, and context engineering.
    * Future speech AI should emulate human listening: diagnosing issues, correcting in real-time, and adapting based on cues like pace, volume, and accent.

    22 min
  3. 7 AUG

    What to expect in 2025 | Jack Piunti (GTM Lead for Communications at ElevenLabs)

    This episode’s guest is Jack Piunti, GTM Lead for Communications at ElevenLabs. Jack oversees go-to-market strategy across CPaaS, CCaaS, UCaaS, and customer experience. With a strong background in consultative technology partnerships and startup growth, Jack brings deep expertise in AI-driven communications. Prior to ElevenLabs, he spent six years at Twilio, helping shape enterprise adoption of real-time voice technologies. He is passionate about the future of connected applications and the role of AI in transforming how we communicate.

    ElevenLabs is a voice AI company offering ultra-realistic text-to-speech, speech-to-text, voice cloning, multilingual dubbing, and conversational AI tools. Founded in 2022, it enables creators and developers to build voice apps and generate lifelike, emotionally rich speech in 70+ languages. Its latest models support expressive cues and multi-speaker dialogue.

    Takeaways
    * Most AI failures in conversation don't come from the language model, but from inaccurate speech-to-text at the start.
    * Bad transcription of critical details like names or codes breaks the entire user experience and can’t easily be recovered.
    * Accurate speech-to-text is now a make-or-break factor for building reliable AI agents.
    * Voice will soon replace typing as the main way humans interact with machines because it's more natural and efficient.
    * Enterprises don’t want to stitch together multiple AI vendors; they want end-to-end platforms that simplify the stack and reduce latency.
    * Demos often look impressive, but very few companies can scale real-time voice tech reliably in production environments.
    * AI voice agents that sound expressive aren't enough; turn-taking and accuracy are still bigger challenges.
    * Most companies ignore accessibility in AI, but modeling things like stuttering actually improves agent behavior.
    * Streaming speech and voice models will unlock more lifelike, responsive AI agents, and it’s coming fast.
    * Audio AI needs deep expertise beyond AI, including sound engineering and context-aware modeling of human speech.
    * There’s a growing trend of AI companies going beyond voice to control the full audio experience, including music and sound effects.
    * The way voice models are trained is fundamentally different from language models and requires much cleaner training data.
    * Many agentic AI builders today are forced to cobble together solutions from different vendors, which creates delay and complexity.
    * True real-time voice AI must handle language switching, emotional cues, and speech disfluencies automatically to feel natural.

    26 min
  4. 31 JUL

    Solve First, Then Automate | Bryce Cressy (VP of Strategic Solutions at Nutun)

    This episode’s guest is Bryce Cressy, VP of Strategic Solutions at Nutun, where he leads innovation, AI integration, and process optimization across global CX and collections programs. With deep expertise in partnerships and outsourcing, he helps clients futureproof their contact center operations by combining human talent with transformative technology. Based in South Africa, Bryce is a vocal advocate for the region’s rise as a high-skill BPO hub, and works closely with enterprise leaders in the US and UK to design tailored, tech-forward customer experiences.

    Nutun is a global BPO headquartered in South Africa, specializing in customer experience and debt collection services for clients in the US, UK, Australia, and beyond. With 30 years of industry experience and a strong foundation in collections, Nutun blends skilled human talent with cutting-edge AI to deliver high-impact, scalable solutions. Nutun is redefining offshore CX by combining local expertise, robust infrastructure, and a commitment to continuous innovation.

    Takeaways
    * AI only works when solving specific, targeted problems; using it as a blanket solution guarantees failure.
    * The term "agentic AI" is being overused without a shared definition, creating more confusion than clarity.
    * South Africa's time zone, infrastructure, educated talent pool, and English fluency give it a global CX advantage.
    * Contact center jobs are now aspirational in South Africa, offering career paths from agent to executive.
    * Voice still dominates support channels, but without Voice AI, BPOs risk becoming obsolete.
    * Escalation design is the most critical aspect of Voice AI adoption; bad handoffs will break customer trust.
    * Voice bots should never trap customers in AI-only loops without access to a human.
    * Companies afraid of AI hallucinations start with agent-assist tools, not bots; it's a low-risk entry point.
    * Clear audio is make-or-break for AI accuracy, especially in noisy environments like collections.
    * IVR menus are outdated; conversational routing with AI voice agents is the new standard.
    * Smart BPOs are flipping the model, letting humans hand off to bots for routine tasks, not the other way around.
    * Voice AI isn't just a cost play; it's a CX differentiator that drives loyalty and efficiency.
    * Many vendors sound the same; what matters is whether their tech solves a real, measurable problem.
    * AI voice agents won't kill human support; they will triage it, handling volume while preserving empathy.
    * Customers need to know a human is always available or they'll lose confidence in the brand.
    * The future of BPOs lies in combining process consulting with selective, surgical AI integration.

    15 min
  5. 24 JUL

    Voice is becoming a top channel again | Sharang Sharma (Vice President at Everest Group)

    This episode’s guest is Sharang Sharma, Vice President at Everest Group, where he leads research and advisory in Business Process Services with a focus on customer experience management. He works closely with CX leaders to help them navigate digital transformation, AI adoption, sourcing, and operational strategy. Sharang brings deep insight into how technologies like voice AI, accent neutralization, and translation are reshaping global support models.

    Everest Group is a global research and advisory firm headquartered in Dallas. The company provides deep insights into business process services, technology, and customer experience, helping organizations navigate innovation and operational transformation across industries.

    Takeaways
    * AI is now setting the standard for customer experience, not just helping it.
    * Voice is becoming a top channel again because of real-time AI improvements.
    * Accent and language tools are changing what it means to deliver global CX.
    * Companies used to ignore voice, but AI is making it faster, smarter, and easier to scale.
    * CX is where AI is being tested the most because it needs to be accurate and cost-effective at large scale.
    * After years of shifting away from voice, AI is bringing it back as a preferred support option.
    * Voice AI helps companies hire globally by making accents less of a barrier.
    * Translation AI is still early and not yet as reliable, much as accent tech was a year ago.
    * Accuracy is a bigger issue than speed when it comes to using translation AI in real-time calls.
    * Companies shouldn’t expect perfect AI, but should ask if it’s better than the other options.
    * What counts as good CX now is shifting toward clarity, empathy, and smarter service.
    * Working with AI to support humans is more reliable right now than using bots alone.
    * People often think CX work is dull, but it depends on human connection and emotion.
    * Small improvements from AI are adding up to major gains in customer experience.
    * AI is growing in CX not just because of the tech, but because it opens cheaper and wider hiring options.

    14 min
  6. 17 JUL

    The CX Knowledge Crisis | Justin Robbins (Founder & Principal Analyst at Metric Sherpa)

    This episode’s guest is Justin Robbins, Founder and Principal Analyst at Metric Sherpa. Justin Robbins is the founder of Metric Sherpa, an independent analyst firm helping CX leaders, investors, and solution providers cut through the noise and make confident decisions. With a career spanning both the front lines and boardrooms of CX, Justin brings clarity to a fast-changing market. He’s built frontline support teams, advised global brands, and knows firsthand what drives real impact, from the operations floor to executive strategy. Through his research, insights, and content, Justin equips businesses not just with information, but with the direction they need to act.

    Metric Sherpa is an analyst firm built on the belief that CX insights should lead to action. The firm was born out of real-world experience in customer operations and a need for clarity in a crowded, fast-moving space. Metric Sherpa helps solution providers, investors, and business leaders find the meaningful signal in the noise, translating market trends into decisions that matter.

    Takeaways
    * Metric Sherpa focuses on turning insights into clear decisions and actions.
    * Most CX data isn’t useful until it’s made actionable for business decisions.
    * There’s a major disconnect between what leaders think customers want and what customers actually experience.
    * AI deployment is exposing how broken and outdated most companies’ knowledge bases really are.
    * Critical knowledge still lives in employees’ heads, with no scalable way to capture or share it.
    * Poor knowledge directly causes AI agents to fail and lose companies money.
    * AI is finally forcing organizations to treat knowledge management as a core function, not an afterthought.
    * Companies rely on a few internal champions to maintain knowledge, which collapses when they leave.
    * Automatically generating knowledge from conversations is possible, but requires human oversight to ensure accuracy.
    * AI can accelerate documentation by drafting knowledge articles that humans refine.
    * Many leaders claim they’ll reinvest AI savings, but higher-ups often prioritize headcount cuts instead.
    * Organizational workload won’t decrease with AI; new complexity and tasks will quickly fill the gap.
    * More efficient support channels could increase usage, driving higher inbound volumes.
    * Most contact centers are understaffed due to poor visibility into agent productivity and shrinkage.
    * Frontline agents are starting to take on new roles coaching and guiding AI systems in real time.

    16 min
  7. 10 JUL

    The role of empathy in AI and CX | James Bednar (VP of Product and Innovation at TTEC)

    This episode’s guest is James Bednar, VP of Product and Innovation at TTEC, where he leads strategy at the intersection of CX operations and emerging tech. With a background in cognitive science and over 20 years in the industry, he brings a human-centered lens to AI innovation, helping global brands scale meaningful customer experiences.

    TTEC is a leading global CX technology and services innovator for AI-enabled digital CX solutions. Serving iconic and disruptive brands, TTEC's outcome-based solutions span the entire enterprise, touch every virtual interaction channel, and improve each step of the customer journey. Founded in 1982, TTEC’s employees operate on six continents and bring technology and humanity together to deliver happy customers and differentiated business results.

    Takeaways
    * Over the past few years, the contact center industry has shifted from people-based services to tech-dominated solutions.
    * Empathy was once the top CX priority but has taken a backseat to speed and automation in recent years, especially post-COVID.
    * Post-COVID impatience has changed customer expectations: people now prioritize fast resolution over emotional connection.
    * AI may never feel empathy like a human, but it can simulate enough of it to meet rising expectations for speed.
    * Younger generations are building emotional trust with AI, even consulting it for major life decisions more than their parents.
    * The idea that empathy is a uniquely human trait may no longer hold up as AI gets more advanced and socially accepted.
    * Most contact center training still forces fake empathy through scripts, which can backfire and hurt customer trust.
    * Disingenuous empathy can be worse than showing no empathy at all.
    * Real customer satisfaction comes more from issue resolution than emotional expression alone.
    * AI may be better than humans at consistently identifying emotional cues in conversations, even if it’s not perfect yet.
    * Traditional QA processes still rely on subjective human judgments of empathy, which lack consistency and scalability.
    * Trust plays a major role in perceived empathy; users trust AI more when it provides helpful and consistent answers.
    * The industry may need new KPIs to measure the genuineness or effectiveness of empathy, especially in AI-led interactions.
    * The balance between empathy and speed is evolving, and AI might soon outperform humans in delivering both at scale.
    * TTEC breaks down empathy’s ROI in the contact center in the age of AI in their latest report: “Is Empathy Overrated?”

    14 min
  8. 3 JUL

    CX AI: What the Data Says | Jordan Zivoder (Quantitative Research Lead at Customer Management Practice)

    This CCW episode features guest Jordan Zivoder, Quantitative Research Lead at Customer Management Practice (CMP Research). Jordan Zivoder, who has 10 years of experience in Market Research and Voice of the Customer, leads quantitative research and analysis for CMP Research, Customer Management Practice’s dedicated independent insights and research product. With a primary focus on empowering executives to make data-driven decisions, Jordan combines expertise in survey research with machine learning to deliver unparalleled understanding of the customer and employee experience.

    CMP Research delivers unlimited advisory support, diagnostic tools, and data-driven insights to help customer contact & CX executives optimize experience, technology, and operations, while enabling solution providers with go-to-market strategies and customer insights, all powered by the organization behind Customer Contact Week.

    Takeaways
    * Rising cost pressures are shifting priorities toward automation and self-service instead of hiring, changing how leaders approach customer support.
    * AI is helping agents do better work faster. Companies can boost performance without replacing people.
    * One bad self-service or bot experience can damage customer trust and stall long-term adoption.
    * Even as AI gets smarter, customers still expect clear access to a human; over-automation risks breaking trust.
    * Leaders and agents have different views on what matters most. Closing that gap is key to strong performance and retention.
    * Executives overestimate the impact of culture while agents care more about good managers, flexibility, and career growth.
    * Internal tools like Agent Assist are a safer way to test AI performance and reduce risk before deploying customer-facing automation.
    * AI only works well if the data behind it is accurate and up to date. Bad information leads to poor results and failed launches.
    * Contact centers are rich in conversation data, but few use it well. Those who miss this opportunity fall behind.
    * The best teams feed call data into AI tools to fill knowledge gaps and continuously improve performance.
    * New AI tools can detect missing knowledge and automatically update content, creating a self-improving feedback loop.
    * AI adoption forces companies to treat knowledge management as a core priority, not an afterthought.
    * AI’s value is not just in automating conversations but in creating systems that help both bots and humans improve over time.

    14 min

