Bot Nirvana | AI & Automation Podcast

Nandan Mullakara

4.6 (11)
TECHNOLOGY
UPDATED BIWEEKLY

Bot Nirvana is a podcast on all things Intelligent Automation. We cover RPA, AI, Process Intelligence, Process Mining, and a host of other tools and techniques for intelligent automation.

SEP 18

Agentic Process Automation (APA)

In this episode, we explore Agentic Process Automation (APA), a paradigm that could revolutionize digital automation by harnessing the power of AI agents. The discussion focuses on the ProAgent system as an example of APA. In APA, AI-driven agents can analyze, decide, and execute complex tasks with minimal human intervention. We'll unpack this automation concept, which shows the potential of AI agents through its approach to workflow construction and execution.

Key Topics Covered
- Introduction to Agentic Process Automation (APA)
- Comparison between traditional Robotic Process Automation (RPA) and APA
- ProAgent: A prime example of APA implementation
- Key innovations of ProAgent:
  - Agentic workflow construction
  - Agentic workflow execution
- Types of agents in ProAgent:
  - Data agents
  - Control agents
- Case study: Using ProAgent with Google Sheets for business line management
- Potential impacts and implications of APA on work and decision-making
- Future developments and considerations for APA technology

This episode was generated using Google NotebookLM, drawing insights from the paper "ProAgent: From Robotic Process Automation to Agentic Process Automation".

Stay ahead in your AI journey with Bot Nirvana AI Mastermind.

Podcast Transcript

All right, everyone. Buckle up, because today's deep dive is going to be a wild ride through the future of automation. We're talking way beyond those basic "schedule this" kind of tasks. Yeah, we're diving headfirst into the realm where AI takes the wheel and handles the thinking for us. Oh, yeah, the thinking part. Yeah. If you could give your computer a really complex task, something that needs analysis, decision-making, maybe even a dash of creativity, that's what we're talking about. And right now, your typical automation tools, they would hit a wall. Hard. They're great at following those rigid step-by-step instructions. Like robots. Exactly. But when it comes to anything that requires actual brain power. Still got to do it ourselves.

Well, that's where this research paper we're diving into today comes in. It's all about something called agentic process automation, or APA for short. And let me tell you, this stuff has the potential to completely change the game.

OK, for those of us who haven't dedicated our lives to the art of automation, give us the lowdown. What is APA, and why is it such a big deal? Think about your current automation workhorse, RPA, robotic process automation. It's like that super reliable assistant who never complains but needs very specific instructions for every single step. Right. Amazing at those repetitive tasks, but needs you to hold their hand through every decision point. Exactly. Now, imagine that same assistant, but with a secret weapon, an AI sidekick whispering genius solutions in their ear. OK, now you're talking. That's APA in a nutshell. We're giving RPA a massive intelligence boost. So instead of just blindly following pre-programmed rules, we're talking about automation that can actually think. You got it. APA introduces the idea of agents, which are basically AI helpers embedded directly into the workflow. These agents can analyze data, make judgment calls based on that analysis, and even generate things like reports, all without a human meticulously laying out each step. So it's not just about automating tasks anymore. It's about automating the intelligence behind those tasks. You're catching on quickly. And this paper focuses on a system called ProAgent as a prime example of APA in action.

All right, lay it on us. What is ProAgent? So ProAgent really highlights the potential of APA with two key innovations-- agentic workflow construction and agentic workflow execution. OK, so those are some pretty hefty terms. Can you break those down for us? Let's start with how ProAgent constructs workflows. What makes it so revolutionary? Well, with your traditiona
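To make the episode's split between data agents and control agents more concrete, here is a minimal Python sketch. It is not ProAgent's code: the names `data_agent`, `control_agent`, `escalate`, `log`, and `run_workflow` are hypothetical, and simple keyword rules stand in for the LLM calls a real agent would make. The point is only the shape: a fixed, RPA-style loop in which agents supply the judgment calls.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def data_agent(row: Row) -> Row:
    """Data agent: handles the fuzzy data-processing step.

    Assumption: a real agent would ask an LLM; a keyword rule stands in here.
    """
    is_business = "enterprise" in row["description"].lower()
    return {**row, "line": "business" if is_business else "consumer"}


def control_agent(row: Row) -> str:
    """Control agent: decides which branch of the workflow runs next."""
    return "escalate" if row["line"] == "business" else "log"


def escalate(row: Row) -> str:
    # Hypothetical action for rows the control agent routes to a human.
    return f"escalated {row['name']}"


def log(row: Row) -> str:
    # Hypothetical action for rows that only need to be recorded.
    return f"logged {row['name']}"


BRANCHES: Dict[str, Callable[[Row], str]] = {"escalate": escalate, "log": log}


def run_workflow(rows: List[Row]) -> List[str]:
    """Fixed, RPA-style loop; the two agents fill in the judgment calls."""
    results = []
    for row in rows:
        labeled = data_agent(row)
        branch = control_agent(labeled)
        results.append(BRANCHES[branch](labeled))
    return results
```

Calling `run_workflow` on a list of rows with `name` and `description` fields returns one result string per row. Swapping the keyword rules for model calls would leave the surrounding loop unchanged, which is roughly the idea the hosts describe as giving RPA an "intelligence boost".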

11 min
SEP 18

OCR 2.0

In this podcast, we dive into the new concept of OCR 2.0 - the future of OCR with LLMs. We explore how this new approach addresses the limitations of traditional OCR by introducing a unified, versatile system capable of understanding various visual languages. We discuss the GOT (General OCR Theory) model, which utilizes a smaller, more efficient language model. The podcast highlights GOT's performance across multiple benchmarks, its ability to handle real-world challenges, and its capacity to preserve complex document structures. We also examine the potential implications of OCR 2.0 for future human-computer interactions and visual information processing across diverse fields.

Key Points
- Traditional OCR vs. OCR 2.0
  - Current OCR limitations (multi-step process, prone to errors)
  - OCR 2.0: A unified, end-to-end approach
- Principles of OCR 2.0
  - End-to-end processing
  - Low cost and accessibility
  - Versatility in recognizing various visual languages
- GOT (General OCR Theory) Model
  - Uses a smaller, more efficient language model (Qwen)
  - Trained on diverse visual languages (text, math formulas, sheet music, etc.)
- Training Innovations
  - Data engines for different visual languages
  - E.g. LaTeX for mathematical formulas
- Performance and Capabilities
  - State-of-the-art results on standard OCR benchmarks
  - Outperforms larger models in some tests
  - Handles real-world challenges (blurry images, odd angles, different lighting)
- Advanced Features
  - Formatted document OCR (preserving structure and layout)
  - Fine-grained OCR (precise text selection)
  - Generalization to untrained languages

This episode was generated using Google NotebookLM, drawing insights from the paper "General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model".

Stay ahead in your AI journey with Bot Nirvana AI Mastermind.

Podcast Transcript

All right, so we're diving into the future of OCR today. Really interesting stuff. Yeah, and you know how sometimes you just scan a document, you just want the text, you don't really think twice about it. Right, right. But this paper, General OCR Theory, towards OCR 2.0 via a unified end-to-end model. Catchy title. I know, right? But it's not just the title, they're proposing this whole new way of thinking about OCR. OCR 2.0 as they call it. Exactly, it's not just about text anymore. Yeah, it's really about understanding any kind of visual information, like humans do. So much bigger. It's a really ambitious goal.

Okay, so before we get ahead of ourselves, let's back up for a second. Okay. How does traditional OCR even work? Like when you and I scan a document, what's actually going on? Well, it's kind of like, imagine an assembly line, right? First, the system has to figure out where on the page the actual text is. Find it. Right, isolate it. Then it crops those bits out. Okay. And then it tries to recognize the individual letters and words. So it's like a multi-step? Yeah, it's a whole process. And we've all been there, right? When one of those steps goes wrong. Oh, tell me about it. And you get that OCR output that's just… Gibberish, total gibberish. The worst.

And the paper really digs into this. They're saying that whole assembly line approach, it's not just prone to errors, it's just clunky. Yeah, very inefficient. Like different fonts can throw it off. Right. Different languages, forget it. Oh yeah, if it's not basic printed text, OCR 1.0 really struggles. It's like it doesn't understand the context. Yeah, exactly. It's treating information like it's just a bunch of isolated letters, instead of seeing the bigger picture, you know, the relationships between them. It doesn't get the human element of it. It's missing that human touch, that understanding of how we visually organize information. And that's a problem. A big one. Especially now, when we're just like drowning in visual information everywhere you look. It's true, we need someth
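The "assembly line" the hosts describe (find the text, crop it, recognize the characters) can be sketched as a toy Python pipeline. This is an illustration under loud assumptions, not how any real OCR engine or the GOT model works: the "page" is just a list of strings, and `detect_regions`, `crop`, `recognize`, and `KNOWN_GLYPHS` are hypothetical names invented for this sketch.

```python
from typing import List, Tuple

Page = List[str]  # toy "image": each string is one row of the page

# Assumption: the recognizer only knows plain lowercase letters and spaces.
KNOWN_GLYPHS = set("abcdefghijklmnopqrstuvwxyz ")


def detect_regions(page: Page) -> List[Tuple[int, int, int]]:
    """Stage 1: find (row, start, end) spans that contain any ink."""
    regions = []
    for r, row in enumerate(page):
        stripped = row.strip()
        if stripped:
            start = row.index(stripped[0])
            regions.append((r, start, start + len(stripped)))
    return regions


def crop(page: Page, regions: List[Tuple[int, int, int]]) -> List[str]:
    """Stage 2: cut each detected span out of the page."""
    return [page[r][start:end] for r, start, end in regions]


def recognize(crops: List[str]) -> List[str]:
    """Stage 3: character-by-character lookup; unknown glyphs become '?'."""
    return [
        "".join(ch if ch in KNOWN_GLYPHS else "?" for ch in text.lower())
        for text in crops
    ]


def ocr_pipeline(page: Page) -> List[str]:
    """Chain the three stages; a mistake in any stage flows into the next."""
    return recognize(crop(page, detect_regions(page)))
```

For the page `["  hello world  ", "", " E=mc² "]`, `ocr_pipeline` returns `["hello world", "e?mc?"]`: plain text survives, while the formula degrades because the last stage looks at isolated characters with no sense of the structure around them. An end-to-end model, as described in the episode, would replace all three stages with a single step.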

11 min
AUG 9

JP Morgenthal

JP Morgenthal (JP) is a seasoned expert in applied AI and automation. With over 20 years of experience as a Chief Technology Officer (CTO) and Solution Architect, JP has been a driving force behind digital transformation for Fortune 1000 companies. His expertise spans IT architecture, cloud strategies, and large-scale system implementations. Currently, JP is the Vice President of Solution Engineering at CafeX Communications, following prominent roles as CTO of Automation Anywhere and App Services at DXC. In this episode, we delve into the convergence of various automation technologies like RPA, BPM, iPaaS, and AI. JP shares insights on the influence of new AI advancements, including Large Language Models (LLMs) and AI agents, and explores the future trends in intelligent automation. Join us as we unpack these topics, offering a glimpse into how these innovations reshape the technological landscape.

More information and Links:
- More about JP Morgenthal: https://jpmorgenthal.com/
- Connect with JP Morgenthal: linkedin.com/in/jpmorgenthal/
- Visit Nandan on the web at nandan.info

29 min
MAR 8

Decoding the Gen AI secrets

Why is Generative AI the talk of the tech world? Today, we break down the basics and explore its vast applications beyond text generation, from drug discovery to logistics. Subscribe for premium content on mastering AI. Stay ahead in your AI journey with Bot Nirvana AI Mastermind.

6 min
11/21/2023

Shreekant Mandvikar

Shreekant Mandvikar is an Intelligent Automation expert who has helped 20+ customers on their Intelligent Automation journey. He currently leads Intelligent Automation initiatives at Ally Financial. In this chat, we discuss his Intelligent Automation journey and learnings, interesting use cases, tracking value, Gen AI, and more.

More information and Links:
- More about Shreekant: shreekantmandvikar.com
- Connect with Shreekant: linkedin.com/in/shreekant-mandvikar
- Visit Nandan on the web at nandan.info

29 min
10/06/2023

Andy Thurai

Andy is VP and principal analyst at Constellation Research. He is an accomplished IT executive having served in leadership roles at major companies like IBM, Intel, and Oracle. He is an expert in AI, AIOps, Observability, Cloud, and other enterprise software. I have read many of his articles in publications such as Forbes and Harvard Business Review. In this chat, we talk about the emergence of ChatGPT, Microsoft vs. Google AI, AI ethics, and the future of AI. More information and Links: Connect with Andy: linkedin.com/in/andythurai HBR article Andy co-authored: hbr.org/2022/09/ai-isnt-ready-to-make-unsupervised-decisions Visit Nandan on the web at nandan.info

36 min
09/26/2023

Workfellow

Workfellow is an AI-powered process intelligence solution designed for uninterrupted process analysis and operational excellence. In this chat with CEO Kustaa Kivelä, we talk about Kustaa’s journey and the frustrations that led to him founding Workfellow. We also talk about the options available in the process intelligence space, and the impact of generative AI. More information and Links: Connect with Kustaa Kivelä: linkedin.com/in/kustaakivela Visit Nandan on the web at nandan.info

27 min
09/15/2023

Sarah Burnett (KYP ai)

Sarah is a distinguished industry analyst advising enterprises on Intelligent automation technologies, sourcing, and market trends. She is a frequent speaker, an active blogger, and is named in Computer Weekly's Most Influential Women in Technology Hall of Fame. In this chat, we talk about the impact of AI on automation, Process mining, Task mining, and what KYP AI calls Productivity mining. We discuss a few emerging use cases with Generative AI and Automation. More information and Links: Connect with Sarah: linkedin.com/in/sarahburnett Sarah’s book: sarah-burnett.com/the-autonomous-enterprise Visit Nandan on the web at nandan.info

28 min


4.6 out of 5 (11 Ratings)

Great source of information on AI & Automation

Aug 9

JPMorgenthal

I've been subscribed to Bot Nirvana's newsletter for a few years now and have found it and the podcasts to be a great way to stay abreast of the fast moving automation and AI industries. I recently completed a podcast with Bot Nirvana regarding convergence within the automation space and the impact of AI.

Great relevant Automation content

12/12/2022

AhmdZ1216

Nandan has brought together a great cohort of thought leaders and produces very informative practical content that is usable and informative. Must listen for all in the automation space.

This is a must

04/15/2022

dixbaby

Great engaging content - have been following Nandan for a few years and he’s a thought leader in the space.

Very cool. Keep it up.

01/26/2021

RamkyKV

Very interesting topic and relevant. Congrats.

Creator: Nandan Mullakara
Years Active: 2020 - 2024
Episodes: 46
Rating: Clean
Show Website: Bot Nirvana | AI & Automation Podcast
