AI Engineering Podcast

Tobias Macey

This show is your guidebook to building scalable and maintainable AI systems. You will learn how to architect AI applications, how to apply AI to your work, and what considerations are involved in building or customizing new models. Everything that you need to know to deliver real impact and value with machine learning and artificial intelligence.

  1. 2 days ago

    From Probabilistic to Trustworthy: Building Orion, an Agentic Analytics Platform

    Summary

    In this episode of the AI Engineering Podcast Lucas Thelosen and Drew Gillson talk about Orion, their agentic analytics platform that delivers proactive, push-based insights to business users through asynchronous thinking with rich organizational context. Lucas and Drew share their approach to building trustworthy analysis by grounding in semantic layers, fact tables, and quality-assurance loops (a minimal sketch of such a loop follows these notes), as well as their focus on accuracy through parallel test-time compute and evolving from probabilistic steps to deterministic tools. They discuss the importance of context engineering, multi-agent orchestration, and security boundaries for enterprise deployments, and share lessons learned on consistency, tool design, user change management, and the emerging role of "AI manager" as a career path. The conversation highlights the future of AI knowledge workers collaborating across organizations and tools while simplifying UIs and raising the bar on actionable, trustworthy analytics.

    Announcements
    - Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems.
    - When ML teams try to run complex workflows through traditional orchestration tools, they hit walls. Cash App discovered this with their fraud detection models - they needed flexible compute, isolated environments, and seamless data exchange between workflows, but their existing tools couldn't deliver. That's why Cash App relies on Prefect. Now their ML workflows run on whatever infrastructure each model needs across Google Cloud, AWS, and Databricks. Custom packages stay isolated. Model outputs flow seamlessly between workflows. Companies like Whoop and 1Password also trust Prefect for their critical workflows. But Prefect didn't stop there. They just launched FastMCP - production-ready infrastructure for AI tools. You get Prefect's orchestration plus instant OAuth, serverless scaling, and blazing-fast Python execution. Deploy your AI tools once, connect to Claude, Cursor, or any MCP client. No more building auth flows or managing servers. Prefect orchestrates your ML pipeline. FastMCP handles your AI tool infrastructure. See what Prefect and FastMCP can do for your AI workflows at aiengineeringpodcast.com/prefect today.
    - Your host is Tobias Macey and today I'm interviewing Lucas Thelosen and Drew Gillson about their experiences building an agentic analytics platform and the challenges of ensuring accuracy to build trust.

    Interview
    - Introduction
    - How did you get involved in machine learning?
    - Can you describe what Orion is and the story behind it?
    - Business analytics is a field that requires a high degree of accuracy and detail because of the potential for substantial impact on the business (positive and negative). These are areas where generative AI has struggled to deliver consistently. What was your process for building confidence in your ability to achieve that threshold before committing to the path you are on now?
    - There are numerous ways that generative AI can be incorporated into the process of designing, building, and delivering analytical insights. How would you characterize the different strategies with which data teams and vendors have approached that problem?
    - What do you see as the organizational benefits of moving to a push-based model for analytics?
    - Can you describe the system architecture of Orion?
    - Agentic design patterns are still in the early days of being developed and proven out. Can you give a breakdown of the approach that you are using?
    - How do you think about the responsibility boundaries, communication paths, temporal patterns, etc. across the different agents?
    - Tool use is a key component of agentic architectures. What is your process for identifying, developing, validating, and securing the tools that you provide to your agents?
    - What are the boundaries and extension points that you see when building agentic systems? What are the opportunities for using e.g. the A2A protocol for managing agentic hand-offs?
    - What is your process for managing the experimentation loop for changes to your models, data, prompts, etc. as you iterate on your product?
    - What are some of the ways that you are using the agents that power your system to identify and act on opportunities for self-improvement?
    - What are the most interesting, innovative, or unexpected ways that you have seen Orion used?
    - What are the most interesting, unexpected, or challenging lessons that you have learned while working on Orion?
    - When is an agentic approach the wrong choice?
    - What do you have planned for the future of Orion?

    Contact Info
    - Lucas: LinkedIn
    - Drew: LinkedIn

    Parting Question
    - From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

    Closing Announcements
    - Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
    - Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
    - If you've learned something or tried out a project from the show then tell us about it! Email hosts@aiengineeringpodcast.com with your story.
    - To help other people find the show please leave a review on iTunes and tell your friends and co-workers.

    Links
    - Gravity
    - Orion
    - Data Engineering Podcast Episode
    - Site Reliability Engineering
    - Anthropic Claude Sonnet 4.5
    - A2A (Agent2Agent) Protocol
    - Simon Willison
    - AI Lethal Trifecta
    - Behavioral Science
    - Grounded Theory
    - LLM as a Judge
    - RLHF == Reinforcement Learning from Human Feedback

    The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra / CC BY-SA 3.0.
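    The "quality-assurance loops" and "LLM as a Judge" ideas referenced above are commonly implemented as a judge model gating draft outputs against source facts. Below is a minimal, hedged sketch of that general pattern using the OpenAI Python client; the model names, prompts, and PASS/FAIL rubric are illustrative assumptions, not Orion's actual implementation.

    ```python
    # Minimal LLM-as-a-judge quality gate: draft an insight, then only
    # release it once a judge model confirms every claim is grounded in
    # the source facts. Models, prompts, and rubric are assumptions.
    from openai import OpenAI

    client = OpenAI()

    JUDGE_SYSTEM = (
        "You review analytical summaries. Reply PASS only if every claim "
        "in the draft is supported by the provided facts; otherwise reply "
        "FAIL followed by a one-line reason."
    )

    def judge(facts: str, draft: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # any capable judge model
            messages=[
                {"role": "system", "content": JUDGE_SYSTEM},
                {"role": "user", "content": f"Facts:\n{facts}\n\nDraft:\n{draft}"},
            ],
        )
        return resp.choices[0].message.content or "FAIL empty response"

    def insight_with_qa(facts: str, question: str, max_attempts: int = 3) -> str:
        """Generate a draft insight, retrying until the judge passes it."""
        for _ in range(max_attempts):
            draft = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=[{"role": "user",
                           "content": f"Facts:\n{facts}\n\nQuestion: {question}"}],
            ).choices[0].message.content
            if draft and judge(facts, draft).startswith("PASS"):
                return draft
        raise RuntimeError("No draft passed the quality gate")
    ```

    Running several such drafts in parallel and keeping only those that pass is one plausible way to spend the "parallel test-time compute" mentioned in the summary.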

    1 hr 12 min
  2. October 7

    Building Production-Ready AI Agents with Pydantic AI

    Summary

    In this episode of the AI Engineering Podcast Samuel Colvin, creator of Pydantic and founder of Pydantic Inc, talks about Pydantic AI - a type-safe framework for building structured AI agents in Python (a minimal usage sketch follows these notes). Samuel explains why he built Pydantic AI to bring FastAPI-like ergonomics and production-grade engineering to agents, focusing on strong typing, minimal abstractions, reliability, observability, and stability. He explores the evolving agent ecosystem, patterns for single vs. many agents, graphs vs. durable execution, and how Pydantic AI approaches structured I/O, tool calling, and MCP with type safety in mind. Samuel also shares insights on design trade-offs, model-provider churn, schema unification, safe code execution, security gaps, and the importance of open standards and OpenTelemetry for observability.

    Announcements
    - Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems.
    - When ML teams try to run complex workflows through traditional orchestration tools, they hit walls. Cash App discovered this with their fraud detection models - they needed flexible compute, isolated environments, and seamless data exchange between workflows, but their existing tools couldn't deliver. That's why Cash App relies on Prefect. Now their ML workflows run on whatever infrastructure each model needs across Google Cloud, AWS, and Databricks. Custom packages stay isolated. Model outputs flow seamlessly between workflows. Companies like Whoop and 1Password also trust Prefect for their critical workflows. But Prefect didn't stop there. They just launched FastMCP - production-ready infrastructure for AI tools. You get Prefect's orchestration plus instant OAuth, serverless scaling, and blazing-fast Python execution. Deploy your AI tools once, connect to Claude, Cursor, or any MCP client. No more building auth flows or managing servers. Prefect orchestrates your ML pipeline. FastMCP handles your AI tool infrastructure. See what Prefect and FastMCP can do for your AI workflows at aiengineeringpodcast.com/prefect today.
    - Your host is Tobias Macey and today I'm interviewing Samuel Colvin about the Pydantic AI framework for building structured AI agents.

    Interview
    - Introduction
    - How did you get involved in machine learning?
    - Can you describe what Pydantic AI is and the story behind it?
    - What are the core use cases and capabilities that you are focusing on with Pydantic AI?
    - The agent SDK landscape has been incredibly crowded and volatile since the introduction of LangChain and LlamaIndex. Can you give your summary of the current state of the ecosystem?
    - What are the broad categories that you use when evaluating the various frameworks?
    - Beyond the volatility of the frameworks, there is also a rapid pace of evolution in the different styles/patterns of agents. What are the patterns and integrations that Pydantic AI is best suited for?
    - Can you describe the overall design/architecture of the Pydantic AI framework?
    - How have the design and scope evolved since you first started working on it?
    - For someone who wants to build a sophisticated, production-ready AI agent with Pydantic AI, what is your recommended path from idea to deployment?
    - What are the elements of the framework that help engineers across those different stages of the lifecycle?
    - What are some of the key learnings that you gained from all of your efforts on Pydantic that have been most helpful in developing and promoting Pydantic AI?
    - What are some of the new and exciting failure modes that agentic applications introduce as compared to web/mobile/scientific/etc. applications?
    - What are the most interesting, innovative, or unexpected ways that you have seen Pydantic AI used?
    - What are the most interesting, unexpected, or challenging lessons that you have learned while working on Pydantic AI?
    - When is Pydantic AI the wrong choice?
    - What do you have planned for the future of Pydantic AI?

    Contact Info
    - GitHub
    - LinkedIn

    Parting Question
    - From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

    Closing Announcements
    - Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
    - Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
    - If you've learned something or tried out a project from the show then tell us about it! Email hosts@aiengineeringpodcast.com with your story.
    - To help other people find the show please leave a review on iTunes and tell your friends and co-workers.

    Links
    - Pydantic
    - Pydantic AI
    - Pydantic Inc
    - Pydantic Logfire
    - OpenAI Agents
    - Google ADK
    - LangChain
    - LlamaIndex
    - CrewAI
    - Durable Execution
    - Temporal
    - MCP == Model Context Protocol
    - Claude Code
    - TypeScript
    - Gemini Structured Output
    - OpenAI Structured Output
    - Dottxt Outlines SDK
    - smolagents
    - LiteLLM
    - OpenRouter
    - OpenAI Responses API
    - FastAPI
    - SQLModel
    - AI SDK (JavaScript)
    - LangGraph
    - NextJS
    - Pyodide
    - AI Elements frontend component library

    The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra / CC BY-SA 3.0.
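    To make the "type-safe, structured agents" framing concrete, here is a minimal Pydantic AI sketch. The model string, schema, and toy tool are assumptions for illustration, and the API shown (output_type, run_sync, .output) reflects recent pydantic-ai releases; consult the docs for your installed version.

    ```python
    # A minimal Pydantic AI agent: the response is parsed and validated
    # against a Pydantic model, and the agent can call a typed tool
    # instead of guessing. Sketch only -- names and schema are illustrative.
    from pydantic import BaseModel
    from pydantic_ai import Agent

    class CityInfo(BaseModel):
        city: str
        country: str
        population: int

    agent = Agent(
        "openai:gpt-4o",       # any supported "provider:model" string
        output_type=CityInfo,  # output is validated against this schema
        system_prompt="Answer with facts about the requested city.",
    )

    @agent.tool_plain
    def country_of(city: str) -> str:
        """Toy lookup tool the model may call for grounding."""
        return {"Paris": "France", "Tokyo": "Japan"}.get(city, "unknown")

    result = agent.run_sync("Tell me about Tokyo")
    print(result.output)  # a validated CityInfo instance, not raw text
    ```

    The design point this illustrates is the FastAPI-like ergonomics discussed in the episode: schemas and type hints, rather than framework-specific abstractions, carry the structure.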

    51 min
  3. September 28

    From GPUs to Workloads: Flex AI’s Blueprint for Fast, Cost‑Efficient AI

    Summary

    In this episode of the AI Engineering Podcast Brijesh Tripathi, CEO of Flex AI, talks about revolutionizing AI engineering by removing DevOps burdens through "workload as a service". Brijesh shares his expertise from leading AI/HPC architecture at Intel and deploying supercomputers like Aurora, highlighting how access friction and idle infrastructure slow progress. He discusses Flex AI's innovative approach to simplifying heterogeneous compute, standardizing on consistent Kubernetes layers, and abstracting inference across various accelerators, allowing teams to iterate faster without wrestling with drivers, libraries, or cloud-by-cloud differences. Brijesh also shares insights into Flex AI's strategies for lifting utilization, protecting real-time workloads, and spanning the full lifecycle from fine-tuning to autoscaled inference, all while keeping complexity at bay.

    Announcements
    - Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems.
    - When ML teams try to run complex workflows through traditional orchestration tools, they hit walls. Cash App discovered this with their fraud detection models - they needed flexible compute, isolated environments, and seamless data exchange between workflows, but their existing tools couldn't deliver. That's why Cash App relies on Prefect. Now their ML workflows run on whatever infrastructure each model needs across Google Cloud, AWS, and Databricks. Custom packages stay isolated. Model outputs flow seamlessly between workflows. Companies like Whoop and 1Password also trust Prefect for their critical workflows. But Prefect didn't stop there. They just launched FastMCP - production-ready infrastructure for AI tools. You get Prefect's orchestration plus instant OAuth, serverless scaling, and blazing-fast Python execution. Deploy your AI tools once, connect to Claude, Cursor, or any MCP client. No more building auth flows or managing servers. Prefect orchestrates your ML pipeline. FastMCP handles your AI tool infrastructure. See what Prefect and FastMCP can do for your AI workflows at aiengineeringpodcast.com/prefect today.
    - Your host is Tobias Macey and today I'm interviewing Brijesh Tripathi about FlexAI, a platform offering a service-oriented abstraction for AI workloads.

    Interview
    - Introduction
    - How did you get involved in machine learning?
    - Can you describe what FlexAI is and the story behind it?
    - What are some examples of the ways that infrastructure challenges contribute to friction in developing and operating AI applications?
    - How do those challenges contribute to issues when scaling new applications/businesses that are founded on AI?
    - There are numerous managed services and deployable operational elements for operationalizing AI systems. What are some of the main pitfalls that teams need to be aware of when determining how much of that infrastructure to own themselves?
    - Orchestration is a key element of managing the data and model lifecycles of these applications. How does your approach of "workload as a service" help to mitigate some of the complexities in the overall maintenance of that workload?
    - Can you describe the design and architecture of the FlexAI platform?
    - How has the implementation evolved from when you first started working on it?
    - For someone who is going to build on top of FlexAI, what are the primary interfaces and concepts that they need to be aware of?
    - Can you describe the workflow of going from problem to deployment for an AI workload using FlexAI?
    - One of the perennial challenges of making a well-integrated platform is that there are inevitably pre-existing workloads that don't map cleanly onto the assumptions of the vendor. What are the affordances and escape hatches that you have built in to allow partial/incremental adoption of your service?
    - What are the elements of AI workloads and applications that you are explicitly not trying to solve for?
    - What are the most interesting, innovative, or unexpected ways that you have seen FlexAI used?
    - What are the most interesting, unexpected, or challenging lessons that you have learned while working on FlexAI?
    - When is FlexAI the wrong choice?
    - What do you have planned for the future of FlexAI?

    Contact Info
    - LinkedIn

    Parting Question
    - From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

    Links
    - Flex AI
    - Aurora Supercomputer
    - CoreWeave
    - Kubernetes
    - CUDA
    - ROCm
    - Tensor Processing Unit (TPU)
    - PyTorch
    - Triton
    - Trainium
    - ASIC == Application Specific Integrated Circuit
    - SOC == System On a Chip
    - Loveable
    - FlexAI Blueprints
    - Tenstorrent

    The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra / CC BY-SA 3.0.

    55 min
  4. September 20

    Right-Sizing AI: Small Language Models for Real-World Production

    Summary

    In this episode of the AI Engineering Podcast Steven Huels, Vice President of AI Engineering & Product Strategy at Red Hat, talks about the practical applications of small language models (SLMs) for production workloads. He discusses how SLMs offer a pragmatic choice due to their ability to fit on single enterprise GPUs and provide model-selection trade-offs. The conversation covers self-hosting vs. using API providers (a minimal self-hosting sketch follows these notes), the organizational capabilities needed for running production-grade LLMs, and the importance of guardrails and automated evaluation at scale. They also explore the rise of agentic systems and service-oriented approaches powered by smaller models, highlighting advances in customization and deployment strategies. Steven shares real-world examples and looks to the future of agent cataloging, continuous retraining, and resource efficiency in AI engineering.

    Announcements
    - Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems.
    - When ML teams try to run complex workflows through traditional orchestration tools, they hit walls. Cash App discovered this with their fraud detection models - they needed flexible compute, isolated environments, and seamless data exchange between workflows, but their existing tools couldn't deliver. That's why Cash App relies on Prefect. Now their ML workflows run on whatever infrastructure each model needs across Google Cloud, AWS, and Databricks. Custom packages stay isolated. Model outputs flow seamlessly between workflows. Companies like Whoop and 1Password also trust Prefect for their critical workflows. But Prefect didn't stop there. They just launched FastMCP - production-ready infrastructure for AI tools. You get Prefect's orchestration plus instant OAuth, serverless scaling, and blazing-fast Python execution. Deploy your AI tools once, connect to Claude, Cursor, or any MCP client. No more building auth flows or managing servers. Prefect orchestrates your ML pipeline. FastMCP handles your AI tool infrastructure. See what Prefect and FastMCP can do for your AI workflows at aiengineeringpodcast.com/prefect today.
    - Your host is Tobias Macey and today I'm interviewing Steven Huels about the benefits of small language models for production workloads.

    Interview
    - Introduction
    - How did you get involved in machine learning?
    - Language models are available in a wide range of sizes, measured both in terms of parameters and disk space. What are your heuristics for deciding what qualifies as a "small" vs. "large" language model?
    - What are the corresponding heuristics for when to use a small vs. large model?
    - The predominant use case for small models is in self-hosted contexts, which requires a certain amount of organizational sophistication. What are some helpful questions to ask yourself when determining whether to implement a model-serving stack vs. relying on hosted options?
    - What are some examples of "small" models that you have seen used effectively?
    - The buzzword right now is "agentic" for AI-driven workloads. How do small models fit in the context of agent-based workloads?
    - When and where should you rely on larger models?
    - When speaking of small models, one of the common requirements for making them truly useful is to fine-tune them for your problem domain and organizational data. How has the complexity and difficulty of that operation changed over the past ~2 years?
    - Serving models requires several operational capabilities beyond raw inference serving. What are the other infrastructure and organizational investments that teams should be aware of as they embark on that path?
    - What are the most interesting, innovative, or unexpected ways that you have seen small language models used?
    - What are the most interesting, unexpected, or challenging lessons that you have learned while working on operationalizing inference and model customization?
    - When is a small or self-hosted language model the wrong choice?
    - What are your predictions for the near future of small language model capabilities/availability?

    Contact Info
    - LinkedIn

    Parting Question
    - From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

    Closing Announcements
    - Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
    - Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
    - If you've learned something or tried out a project from the show then tell us about it! Email hosts@aiengineeringpodcast.com with your story.
    - To help other people find the show please leave a review on iTunes and tell your friends and co-workers.

    Links
    - RedHat AI Engineering
    - Generative AI
    - Predictive AI
    - ChatGPT
    - QLoRA
    - HuggingFace
    - vLLM
    - OpenShift AI
    - Llama Models
    - DeepSeek
    - GPT-OSS
    - Mistral
    - Mixture of Experts (MoE)
    - Qwen
    - InstructLab
    - SFT == Supervised Fine Tuning
    - LoRA

    The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra / CC BY-SA 3.0.
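    As a companion to the self-hosting discussion, here is a minimal offline-inference sketch using vLLM (linked above). The model ID and sampling settings are assumptions; a production deployment would add an inference server, batching policy, metrics, and guardrails.

    ```python
    # Self-hosting a small language model with vLLM's offline inference API.
    # Sketch only -- the model ID and sampling parameters are assumptions.
    from vllm import LLM, SamplingParams

    # An ~8B-parameter instruct model is a common "fits on a single
    # enterprise GPU" choice in the sense discussed in the episode.
    llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")

    params = SamplingParams(temperature=0.2, max_tokens=256)
    outputs = llm.generate(
        ["Summarize the tradeoffs between small and large language models."],
        params,
    )

    for out in outputs:
        print(out.outputs[0].text)
    ```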

    51 min
  5. September 13

    AI Agents and Identity Management

    Summary

    In this episode of the AI Engineering Podcast Julianna Lamb, co-founder and CTO of Stytch, talks about the complexities of managing identity and authentication in agentic workflows. She explores the evolving landscape of identity management in the context of machine learning and AI, highlighting the importance of flexible compute environments and seamless data exchange. The conversation covers the implications of AI agents for identity management, including granular permissions, the OAuth protocol (a minimal token-request sketch follows these notes), and adapting systems for agentic interactions. Julianna also discusses rate limiting, persistent identity, and evolving standards for managing identity in AI systems. She emphasizes the need to experiment with AI agents and prepare systems for integration to stay ahead in the rapidly advancing AI landscape.

    Announcements
    - Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems.
    - When ML teams try to run complex workflows through traditional orchestration tools, they hit walls. Cash App discovered this with their fraud detection models - they needed flexible compute, isolated environments, and seamless data exchange between workflows, but their existing tools couldn't deliver. That's why Cash App relies on Prefect. Now their ML workflows run on whatever infrastructure each model needs across Google Cloud, AWS, and Databricks. Custom packages stay isolated. Model outputs flow seamlessly between workflows. Companies like Whoop and 1Password also trust Prefect for their critical workflows. But Prefect didn't stop there. They just launched FastMCP - production-ready infrastructure for AI tools. You get Prefect's orchestration plus instant OAuth, serverless scaling, and blazing-fast Python execution. Deploy your AI tools once, connect to Claude, Cursor, or any MCP client. No more building auth flows or managing servers. Prefect orchestrates your ML pipeline. FastMCP handles your AI tool infrastructure. See what Prefect and FastMCP can do for your AI workflows at aiengineeringpodcast.com/prefect today.
    - Your host is Tobias Macey and today I'm interviewing Julianna Lamb about the complexities of managing identity and auth in agentic workflows.

    Interview
    - Introduction
    - How did you get involved in machine learning?
    - The term "identity" is very overloaded. Can you start by giving your definition in the context of technical systems?
    - What are some of the different ways that AI agents intersect with identity?
    - We have decades of experience and effort in building identity infrastructure for the internet. What are the most significant ways in which that is insufficient for agent-based use cases?
    - I have heard anecdotal references to the ways in which AI agents lead to a proliferation of "identities". How would you characterize the magnitude of the difference in scale between human-powered identity, deterministic automation (e.g. bots or bot-nets), and AI agents?
    - The other major element of establishing and verifying "identity" is how that intersects with permissions or authorization. What are the major shortcomings of our existing investment in managing and auditing access and control once you are within a system?
    - How does that get amplified with AI agents?
    - Typically authentication has been done at the perimeter of a system. How does that architecture change when accounting for AI agents?
    - How does that get complicated by where the agent originates? (e.g. external agents interacting with a third-party system vs. internal agents operated by the service provider)
    - What are the concrete steps that engineering teams should be taking today to start preparing their systems for agentic use cases (internal or external)?
    - How do agentic capabilities change the means of protecting against malicious bots? (e.g. bot detection, defensive agents, etc.)
    - What are the most interesting, innovative, or unexpected ways that you have seen authn/authz/identity addressed for AI use cases?
    - What are the most interesting, unexpected, or challenging lessons that you have learned while working on identity/auth(n|z) systems?
    - What are your predictions for the future of identity as the adoption and sophistication of AI systems progress?

    Contact Info
    - LinkedIn

    Parting Question
    - From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

    Closing Announcements
    - Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
    - Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
    - If you've learned something or tried out a project from the show then tell us about it! Email hosts@aiengineeringpodcast.com with your story.
    - To help other people find the show please leave a review on iTunes and tell your friends and co-workers.

    Links
    - Stytch
    - AI Agent
    - Machine To Machine Authentication
    - API Authentication
    - MCP == Model Context Protocol
    - OAuth
    - Identity Provider
    - OAuth Scopes
    - OAuth 2.1
    - Captcha
    - RBAC == Role-Based Access Control
    - ABAC == Attribute-Based Access Control
    - ReBAC == Relationship-Based Access Control
    - Google Zanzibar
    - Idempotence
    - Dynamic Client Registration
    - Large Action Models
    - Claude Code

    The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra / CC BY-SA 3.0.
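    To ground the machine-to-machine authentication thread, here is a hedged sketch of the OAuth client-credentials grant with narrow scopes for an agent. The token endpoint, credentials, and scope names are hypothetical placeholders, not Stytch's API.

    ```python
    # Hedged sketch: an AI agent obtaining a short-lived, narrowly scoped
    # machine-to-machine token via the OAuth client-credentials grant.
    # The endpoint, client credentials, and scopes are hypothetical.
    import requests

    TOKEN_URL = "https://auth.example.com/oauth2/token"  # hypothetical IdP

    def get_agent_token(client_id: str, client_secret: str) -> str:
        resp = requests.post(
            TOKEN_URL,
            data={
                "grant_type": "client_credentials",
                # Least-privilege scopes, per the episode's point about
                # granular permissions for agents.
                "scope": "reports:read invoices:read",
            },
            auth=(client_id, client_secret),
            timeout=10,
        )
        resp.raise_for_status()
        return resp.json()["access_token"]

    # The bearer token is then attached to each downstream call, e.g.:
    #   requests.get(url, headers={"Authorization": f"Bearer {token}"})
    ```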

    54 min
  6. September 4

    Revolutionizing Production Systems: The Resolve AI Approach

    Summary

    In this episode of the AI Engineering Podcast, CEO of Resolve AI Spiros Xanthos shares his insights on building agentic capabilities for operational systems. He discusses the limitations of traditional observability tools and the need for AI agents that can reason through complex systems to provide actionable insights and solutions. The conversation highlights the architecture of Resolve AI, which integrates with existing tools to build a comprehensive understanding of production environments (a minimal instrumentation sketch follows these notes), and emphasizes the importance of context and memory in AI systems. Spiros also touches on the evolving role of AI in production systems, the potential for AI to augment human operators, and the need for continuous learning and adaptation to fully leverage these advancements.

    Announcements
    - Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems.
    - Your host is Tobias Macey and today I'm interviewing Spiros Xanthos about architecting agentic capabilities for operational challenges with managing production systems.

    Interview
    - Introduction
    - How did you get involved in machine learning?
    - Can you describe what Resolve AI is and the story behind it?
    - We have decades of experience as an industry in managing operational complexity. What are the critical failures in capabilities that you are addressing with the application of AI?
    - Given the existing capabilities of dedicated platforms (e.g. Grafana, PagerDuty, Splunk, etc.), what is your reasoning for building a new system vs. a new feature of an existing operational product?
    - Over the past couple of years the industry has developed a growing number of agent patterns. What was your approach in evaluating and selecting a particular approach for your product?
    - One of the complications of building any platform that supports the operational needs of engineering teams is the complexity of integrating with their technology stack. This is doubly true when building an AI system that needs rich context. What are the core primitives that you are relying on to build a robust offering?
    - How are you managing the learning process for your systems to allow for iterative discovery and improvement?
    - What are your strategies for personalizing those discoveries to a given customer and operating environment?
    - One of the interesting challenges in agentic systems is managing the user experience for human-in-the-loop and machine-to-human handoffs in each direction. How are you thinking about that, especially given the criticality of the systems that you are interacting with?
    - As more of the code that is running in production environments is co-developed with AI, what impact do you anticipate on the overall operational resilience of the systems being monitored?
    - One of the challenges of working with LLMs is the cold-start problem, where every conversation starts from scratch. How are you approaching the overall problem of context engineering and ensuring that you are consistently providing the necessary information for the model to be effective in its role?
    - What are the most interesting, innovative, or unexpected ways that you have seen Resolve AI used?
    - What are the most interesting, unexpected, or challenging lessons that you have learned while working on Resolve AI?
    - When is Resolve AI the wrong choice?
    - What do you have planned for the future of Resolve AI?

    Contact Info
    - LinkedIn

    Parting Question
    - From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

    Closing Announcements
    - Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
    - Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
    - If you've learned something or tried out a project from the show then tell us about it! Email hosts@aiengineeringpodcast.com with your story.
    - To help other people find the show please leave a review on iTunes and tell your friends and co-workers.

    Links
    - Resolve AI
    - Splunk
    - OpenTelemetry
    - Splunk Observability
    - Context Engineering
    - Grafana
    - Kubernetes
    - PagerDuty

    The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra / CC BY-SA 3.0.
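    Because the approach described depends on rich signals from existing observability tooling (OpenTelemetry is linked above), here is a hedged sketch of emitting a span whose structured attributes an AI troubleshooting agent could later correlate. The service, attribute, and helper names are hypothetical, and exporter configuration is omitted for brevity.

    ```python
    # Hedged sketch: an OpenTelemetry span carrying structured attributes
    # that make the trace machine-readable context for later analysis.
    # Service, attribute, and helper names are hypothetical placeholders.
    from opentelemetry import trace

    tracer = trace.get_tracer("checkout-service")  # hypothetical service

    def charge_card(order_id: str, amount_cents: int) -> None:
        """Hypothetical stand-in for a real payment-provider client."""

    def handle_payment(order_id: str, amount_cents: int) -> None:
        with tracer.start_as_current_span("handle_payment") as span:
            # Structured attributes, not free-text logs, are what let an
            # agent correlate this trace with deploys, alerts, and metrics.
            span.set_attribute("order.id", order_id)
            span.set_attribute("payment.amount_cents", amount_cents)
            try:
                charge_card(order_id, amount_cents)
            except Exception as exc:
                span.record_exception(exc)
                raise
    ```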

    51 min
  7. August 26

    Designing Scalable AI Systems with FastMCP: Challenges and Innovations

    Summary

    In this episode of the AI Engineering Podcast Jeremiah Lowin, founder and CEO of Prefect Technologies, talks about the FastMCP framework and the design of MCP servers. Jeremiah explains the evolution of FastMCP, from its initial creation as a simpler alternative to the MCP SDK to its current role in facilitating the deployment of AI tools (a minimal server sketch follows these notes). The discussion covers the complexities of designing MCP servers, the importance of context engineering, and the potential pitfalls of overwhelming AI agents with too many tools. Jeremiah also highlights the importance of simplicity and incremental adoption in software design, and shares insights into the future of MCP and the broader AI ecosystem. The episode concludes with a look at the challenges of authentication and authorization in AI applications and the exciting potential of MCP as a protocol for the future of AI-driven business logic.

    Announcements
    - Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems.
    - Your host is Tobias Macey and today I'm interviewing Jeremiah Lowin about the FastMCP framework and how to design and build your own MCP servers.

    Interview
    - Introduction
    - How did you get involved in machine learning?
    - Can you start by describing what MCP is and its purpose in the ecosystem of AI applications?
    - What is FastMCP and what motivated you to create it?
    - Recognizing that MCP is relatively young, how would you characterize the landscape of MCP frameworks?
    - What are some of the stumbling blocks on the path to building a well-engineered MCP server?
    - What are the potential ramifications of poorly designed and implemented MCP implementations?
    - In the overall context of an AI-powered/agentic application, what are the tradeoffs of investing in the MCP protocol? (e.g. engineering effort, process isolation, tool creation, auth(n|z), etc.)
    - In your experience, what are the architectural patterns that you see in MCP implementation and usage?
    - There are a multitude of MCP servers available for a variety of use cases. What are the key factors that someone should use to evaluate their viability for a production use case?
    - Can you give an overview of the key characteristics of FastMCP and why someone might select it as their implementation target for a custom MCP server?
    - How have the design, scope, and goals of the project evolved since you first started working on it?
    - For someone who is using FastMCP as the framework for creating their own AI tools, what are some of the design considerations or best practices that they should be aware of?
    - What are some of the ways that someone might consider integrating FastMCP into their existing Python-powered web applications (e.g. FastAPI, Django, Flask, etc.)?
    - As you continue to invest your time and energy into FastMCP, what is your overall goal for the project?
    - What are the most interesting, innovative, or unexpected ways that you have seen FastMCP used?
    - What are the most interesting, unexpected, or challenging lessons that you have learned while working on FastMCP?
    - When is FastMCP the wrong choice?
    - What do you have planned for the future of FastMCP?

    Contact Info
    - LinkedIn
    - GitHub

    Parting Question
    - From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

    Closing Announcements
    - Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
    - Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
    - If you've learned something or tried out a project from the show then tell us about it! Email hosts@aiengineeringpodcast.com with your story.
    - To help other people find the show please leave a review on iTunes and tell your friends and co-workers.

    Links
    - FastMCP
    - FastMCP Cloud
    - Prefect
    - Model Context Protocol (MCP)
    - AI Tools
    - FastAPI
    - Python Decorator
    - WebSockets
    - SSE == Server-Sent Events
    - Streamable HTTP
    - OAuth
    - MCP Gateway
    - MCP Sampling
    - Flask
    - Django
    - ASGI
    - MCP Elicitation
    - AuthKit
    - Dynamic Client Registration
    - smolagents
    - Large Action Models
    - A2A

    The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra / CC BY-SA 3.0.
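    For a concrete starting point with the framework discussed here, below is a minimal FastMCP server. The tool and resource are illustrative, and while the decorator-based API mirrors FastMCP's documented usage, you should verify details against the docs for your installed version.

    ```python
    # A minimal MCP server built with FastMCP: one tool and one resource,
    # exposed over the default stdio transport. Names are illustrative.
    from fastmcp import FastMCP

    mcp = FastMCP("Demo Server")

    @mcp.tool()
    def add(a: int, b: int) -> int:
        """Add two numbers."""
        return a + b

    @mcp.resource("config://version")
    def version() -> str:
        """A static resource that MCP clients can read."""
        return "1.0.0"

    if __name__ == "__main__":
        # stdio is what locally launched MCP clients typically expect;
        # HTTP-based transports are also available for remote servers.
        mcp.run()
    ```

    The episode's caution about overwhelming agents with too many tools applies directly here: each decorated function becomes part of the model's decision space, so small, well-described toolsets tend to behave better than sprawling ones.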

    1 hr 14 min
  8. August 23

    Proactive Monitoring in Heavy Industry: The Role of AI and Human Curiosity

    Summary

    In this episode of the AI Engineering Podcast Dr. Tara Javidi, CTO of KavAI, talks about developing AI systems for proactive monitoring in heavy industry. Dr. Javidi shares her background in mathematics and information theory, influenced by Claude Shannon's work, and discusses her approach to curiosity-driven AI that mimics human curiosity to improve data collection and predictive analytics. She explains how KavAI's platform uses generative AI models to enhance industrial monitoring by addressing informational blind spots and reducing reliance on human oversight. The conversation covers the architecture of KavAI's systems, integrating AI with existing workflows, building trust with operators, and the societal impact of AI in preventing environmental catastrophes, ultimately highlighting the future potential of information-centric AI models.

    Announcements
    - Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems.
    - Your host is Tobias Macey and today I'm interviewing Dr. Tara Javidi about building AI systems for proactive monitoring of physical environments for heavy industry.

    Interview
    - Introduction
    - How did you get involved in machine learning?
    - Can you describe what KavAI is and the story behind it?
    - What are some of the current state-of-the-art applications of AI/ML for monitoring and accident prevention in industrial environments?
    - What are the shortcomings of those approaches?
    - What are some examples of the types of harm that you are focused on preventing or mitigating with your platform?
    - Your site mentions that you have created a foundation model for physical awareness. What are some examples of the types of predictive/generative capabilities that your model provides?
    - A perennial challenge when building any digital model of a physical system is the lack of absolute fidelity. What are the key sources of information acquisition that you rely on for your platform?
    - In addition to your foundation model, what are the other systems that you incorporate to perform analysis and catalyze action?
    - Can you describe the overall system architecture of your platform?
    - What are some of the ways that you are able to integrate learnings across industries and environments to improve the overall capacity of your models?
    - What are the most interesting, innovative, or unexpected ways that you have seen KavAI used?
    - What are the most interesting, unexpected, or challenging lessons that you have learned while working on KavAI?
    - When is KavAI/Physical AI the wrong choice?
    - What do you have planned for the future of KavAI?

    Contact Info
    - LinkedIn

    Parting Question
    - From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

    Links
    - KavAI
    - Information Theory
    - Claude Shannon

    The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra / CC BY-SA 3.0.

    41 min

Ratings & Reviews

4.3
out of 5
6 Ratings

About

This show is your guidebook to building scalable and maintainable AI systems. You will learn how to architect AI applications, how to apply AI to your work, and what considerations are involved in building or customizing new models. Everything that you need to know to deliver real impact and value with machine learning and artificial intelligence.

You Might Also Like