Domesticating AI

SoyPete Tech

Domesticating AI is a bi-weekly podcast about practical AI for developers. We cover self-hosted models, local AI, homelabs, hardware, agents, security, and reliability so software engineers can build

- Miriah Peterson: Software engineer, Go educator, and community builder focused on production-first AI. Runs SoyPete Tech (streams + writing + open-source).
- Matt Sharp: AI Engineer/Strategist, co-author of LLMs in Production, MLOps practitioner. Writes The Data Pioneer.
- Chris Brousseau: NLP practitioner, co-author of LLMs in Production, VP of AI at VEOX. You can find him as IMJONEZZ.

Episodes

  1. May 8

    You’re Using AI Wrong: Build the System, Not Just the Prompt /w Lexi Pasi

    Recorded: April 14, 2026

    🧾 Episode Summary

    Most people using AI today are still users. They open ChatGPT, call an API, and get an answer. And honestly… it works. But that’s not the same as building with AI.

    In this episode of Domesticating AI, we break down the difference between AI users and AI practitioners, and why that shift matters if you want reliable systems.

    We’re joined by Alexandra “Lexi” Pasi, PhD, CEO of Lucidity Sciences, to talk about what it actually means to own the system around AI:

    - why calling an API is still user behavior
    - what changes when you build the harness
    - how agent systems actually fail (loops, cost, drift)
    - why switching models isn’t a reliability strategy
    - how to add layers: constraints, validation, and control flow
    - why engineering discipline matters more with AI, not less

    If you’ve built your first AI agent, workflow, or coding loop, this is the “now what?” episode.

    👤 Guest: Alexandra “Lexi” Pasi, PhD

    Alexandra Pasi is the CEO of Lucidity Sciences, where she works at the intersection of mathematics, machine learning, and real-world system design. She holds a PhD in Mathematics from Baylor University and specializes in building analytical and algorithmic systems that bring structure to complex, uncertain environments.

    - 🔗 LinkedIn: https://www.linkedin.com/in/alexandrapasi/
    - 🔗 Lucidity Sciences: https://luciditysciences.com

    🔗 Topics & Links Mentioned

    - Google TurboQuant (LLM compression research): https://research.google/blog/turboquant-redefining-ai-efficiency-with-extreme-compression/
    - Anthropic Claude Mythos Preview (security-focused model): https://red.anthropic.com/2026/mythos-preview/
    - Project Glasswing (Anthropic security initiative): https://www.anthropic.com/glasswing
    - Karpathy Autoresearch (self-improving training loop): https://github.com/karpathy/autoresearch
    - Kitaru (durable agent execution framework): https://github.com/zenml-io/kitaru

    🔔 Follow & Support

    - Subscribe on YouTube
    - Follow on Spotify & Apple Podcasts
    - Support the show on Patreon: 👉 https://patreon.com/DomesticatingAIPodcast

    Keep your AI on a leash.
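The failure modes named in the summary (loops, cost) and the suggested layers (constraints, validation, control flow) can be illustrated with a small harness. This is a minimal sketch and not anything shown in the episode: `run_with_harness`, `HarnessLimits`, `RunResult`, and the `step` and `validate` callables are all hypothetical names, and `step` stands in for whatever model or agent call you actually make.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class HarnessLimits:
    max_steps: int = 8           # constraint: hard stop for runaway loops
    max_cost_usd: float = 0.50   # constraint: hard stop for runaway spend


@dataclass
class RunResult:
    output: Optional[str]
    steps: int
    cost_usd: float
    stopped_because: str


def run_with_harness(
    step: Callable[[str], tuple[str, float]],
    validate: Callable[[str], bool],
    task: str,
    limits: HarnessLimits = HarnessLimits(),
) -> RunResult:
    """Call `step` until `validate` accepts its output or a limit trips.

    `step` is a hypothetical stand-in for one model/agent call: it takes the
    current prompt and returns (output_text, cost_in_usd).
    """
    seen: set[str] = set()
    cost = 0.0
    prompt = task
    for n in range(1, limits.max_steps + 1):
        output, step_cost = step(prompt)
        cost += step_cost
        # The budget is checked first, so an over-budget run never "succeeds".
        if cost > limits.max_cost_usd:
            return RunResult(None, n, cost, "cost budget exceeded")
        # Validation layer: only accepted output leaves the harness.
        if validate(output):
            return RunResult(output, n, cost, "validated")
        # Control flow: an exact repeat of a rejected output counts as a loop.
        if output in seen:
            return RunResult(None, n, cost, "loop detected")
        seen.add(output)
        prompt = f"{task}\n\nPrevious attempt was rejected:\n{output}"
    return RunResult(None, limits.max_steps, cost, "step limit reached")
```

The point of the sketch is that every exit path is explicit and owned by your code rather than by the model. Swapping in a different model changes `step` but leaves the limits and the validator in place.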

    43 min
  2. April 24

    Hacking AI: Why Most AI Systems Are Insecure by Default

    Hosts: Miriah Peterson, Matt Sharp, Chris Brousseau
    Recorded: April 2026
    Status: Released

    Most AI systems today are designed to be helpful, not secure. In this episode, we break down how AI systems actually get exploited in production:

    - a real supply chain attack on a widely used AI dependency
    - prompt injection and why it still works
    - image-based (multimodal) exploits
    - tool and agent abuse

    If you’re building AI, especially at a startup, you are the security team.

    Supply chain attack

    A widely used AI dependency was compromised via a malicious .pth file:

    - executes automatically when Python starts
    - no import required
    - targets credentials, SSH keys, and environment variables

    👉 Just installing the package was enough. This highlights a critical reality: Your AI system is only as secure as your dependencies.

    Prompt injection

    - Models cannot distinguish between instructions and data
    - External content can override system behavior
    - Still one of the most common AI vulnerabilities
    - 🔗 https://learnprompting.org/docs/prompt_hacking/injection

    Image-based (multimodal) exploits

    - Hidden instructions embedded in images
    - AI interprets images differently than humans
    - Expands the attack surface significantly
    - 🔗 https://arxiv.org/abs/2306.11698

    Tool and agent abuse

    - AI systems can take real-world actions via tools
    - Prompt injection → API calls, data leaks, unintended execution
    - Agents amplify risk through autonomy and retries

    If you’re building AI systems today:

    - separate instructions from data
    - limit tool permissions
    - treat outputs as untrusted
    - validate everything before execution

    - AI systems have an internet-sized attack surface
    - Supply chain attacks bypass all AI safeguards
    - Prompt injection is a fundamental problem
    - AI doesn’t fail safely; it fails wherever your system is weakest

    - LiteLLM incident: https://github.com/BerriAI/litellm/issues/24512
    - Attack breakdown: https://futuresearch.ai/blog/litellm-pypi-supply-chain-attack/
    - LLM attack techniques: https://llm-attacks.org/
    - OWASP LLM Top 10: https://owasp.org/www-project-top-10-for-large-language-model-applications/
    - Gandalf challenge: https://gandalf.lakera.ai/

    We’ve launched a Patreon for Domesticating AI 🎉 Get:

    - early access to episodes
    - behind-the-scenes content
    - bloopers and uncut moments

    👉 https://patreon.com/DomesticatingAIPodcast

    🎥 YouTube: https://youtu.be/HTTxE7Y1sko

    What’s the weirdest way an AI system has broken for you?

    Keep your AI on a leash.
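The .pth mechanism described in these notes comes from Python's standard `site` module: at interpreter startup it reads every `.pth` file in the site-packages directories and executes any line that begins with `import` followed by a space or tab. The sketch below lists such lines so they can be reviewed. It is an illustrative audit helper, not a tool from the episode or the incident write-ups, and the function names are hypothetical. A hit is not proof of compromise, since legitimate packages (for example setuptools and editable installs) also use executable `.pth` lines.

```python
from __future__ import annotations

import site
import sysconfig
from pathlib import Path


def executable_pth_lines(pth_file: Path) -> list[str]:
    """Return the lines of a .pth file that Python would execute at startup.

    The `site` module executes lines starting with "import" plus a space or
    tab; other non-comment lines are only treated as paths to add to sys.path.
    """
    text = pth_file.read_text(encoding="utf-8", errors="replace")
    return [ln for ln in text.splitlines() if ln.startswith(("import ", "import\t"))]


def scan_site_dirs(dirs: list[Path]) -> dict[Path, list[str]]:
    """Map each .pth file containing executable lines to those lines."""
    findings: dict[Path, list[str]] = {}
    for d in dirs:
        if not d.is_dir():
            continue  # e.g. the user site directory may not exist
        for pth in sorted(d.glob("*.pth")):
            hits = executable_pth_lines(pth)
            if hits:
                findings[pth] = hits
    return findings


def default_site_dirs() -> list[Path]:
    """Site-packages for the current interpreter plus the per-user site dir."""
    dirs = {
        Path(sysconfig.get_paths()["purelib"]),
        Path(site.getusersitepackages()),
    }
    return sorted(dirs)
```

Calling `scan_site_dirs(default_site_dirs())` inside a virtual environment shows what will run on every interpreter start in that environment. Diffing the result before and after installing a new dependency is one cheap way to notice an unexpected startup hook.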

    43 min
  3. April 10

    Coding with AI: Vibe Coding vs Real Engineering (with Tyler Folkman)

    AI can write code, but that doesn’t mean you should trust it. In this episode of Domesticating AI, we’re joined by Tyler Folkman (author of The AI Architect) to break down how engineers are actually using AI to build software, and why most people are still just vibe coding.

    🧠 What We Cover

    - Vibe coding vs real engineering
    - Reasoning models vs coding models
    - How to plan and prompt AI effectively
    - When to let AI take the wheel (and when not to)
    - Local vs cloud coding agents
    - Token costs vs owning hardware

    🔗 Links & Resources

    Guest

    - Tyler Folkman: The AI Architect

    Models & Tools

    - Anthropic: https://www.anthropic.com
    - OpenAI: https://openai.com
    - Ollama: https://ollama.com
    - MiniMax-M2.5: https://ollama.com/library/minimax-m2.5
    - GLM-5: https://ollama.com/library/glm-5

    Articles / Mentions

    - AmpCode Chronicle: https://ampcode.com/chronicle
    - Andrej Karpathy on Context Engineering: https://x.com/karpathy
    - “Human in the Loop is Tired”

    🎧 About the Podcast

    Domesticating AI is a bi-weekly podcast about practical AI for developers. We help you brace the feral open-source AI landscape so you can tame it instead of getting dragged by it.

    📬 Contact

    contact@domesticatingai.com

    🔥 Follow

    - Spotify: https://open.spotify.com/show/2WsAR4fvcXzp3vVZGVlkE2
    - Apple Podcasts: https://podcasts.apple.com/us/podcast/domesticating-ai/id1873338950

    👇 Join the Discussion

    Are you vibe coding or engineering with AI? Let us know your setup.

    Keep your AI on a leash.

    40 min
  4. March 27

    Securing Your Homelab: AI Infrastructure, Access Control & Why Docker Isn’t Isolation

    Recording Date: February 27, 2026
    Hosts: Miriah Peterson, Matt Sharp, Chris Brousseau

    Running AI locally is easier than ever. Running it securely is another story.

    In this episode of Domesticating AI, we break down the moment every homelab builder hits: The second you move from one machine to two machines… access becomes your first real engineering problem.

    We explore the real architecture questions behind self-hosting AI:

    - Why a dedicated machine isn’t a sandbox
    - Why Docker alone isn’t isolation
    - How homelabs evolve from Plex servers to AI infrastructure
    - The blast radius problem with local agents
    - Why networking and access control matter more than model size

    We also discuss the surge in local AI hardware demand and the risks of running powerful agents on machines with unrestricted access. Whether you're running OpenClaw, Ollama, a NAS, Postgres, or a home automation stack, the same rule applies: Infrastructure without containment is just risk waiting to happen.

    📰 News Discussed

    - Mac Mini Shortages from Local AI Demand: High-memory Mac Minis are seeing long shipping delays as developers rush to build local AI systems. https://www.tomshardware.com/tech-industry/artificial-intelligence/openclaw-fueled-ordering-frenzy-creates-apple-mac-shortage-delivery-for-high-unified-memory-units-now-ranges-from-6-days-to-6-weeks
    - OpenClaw Security Discussion: Marketplace plugins and execution boundaries are becoming a growing security concern in agent systems. https://www.linkedin.com/posts/matthewsharp_i-use-to-do-nothing-but-post-about-clean-activity-7432832983339999232-iR04
    - OpenClaw Security Concerns (Referenced): Overview of risks around agent plugin ecosystems and execution boundaries. https://conscia.com/blog/the-openclaw-security-crisis/

    🧰 Tools & Technologies Mentioned

    - Tailscale: Private mesh networking used to securely access homelabs. https://tailscale.com
    - OpenClaw: Local AI coding agent framework. https://openclaw.ai
    - Ollama: Local LLM runtime used for running models on personal machines. https://ollama.com

    🏗 Topics Covered

    - Why people actually build homelabs
    - Plex, NAS, and home automation as infrastructure entry points
    - AI workloads vs dev workloads
    - Why long-running services shouldn’t live on your laptop
    - Networking architecture for homelabs
    - RBAC-style access control between machines
    - Secrets management mistakes developers make
    - Containment and blast-radius thinking for AI agents
    - Tailscale and private mesh networking

    ⚡ Lightning Round

    Each host answers:

    - If I had $0
      - What I would run
      - What I would avoid
    - If I had $1K
      - What machine I’d buy
      - How I’d isolate workloads
    - If I had $5K
      - How I’d segment infrastructure
      - What monitoring I’d deploy
      - What I would never expose to the internet

    🎙 Hosts

    - Miriah Peterson: Staff Data Engineer, content creator, and founder of SoyPete Tech. Miriah focuses on practical AI systems, Go infrastructure, and self-hosted AI engineering. She is also a Google Developer Expert in Go and organizer of Go West Conf. https://soypete.tech
    - Matt Sharp: AI engineer and co-author of LLMs in Production. Matt focuses on applied AI systems, local model infrastructure, and developer-focused AI tooling.
    - Chris Brousseau: Software engineer and AI practitioner focused on practical applications of machine learning and developer infrastructure.

    🤝 Sponsors

    Domesticating AI is supported by the SoyPete Tech community. If you enjoy the show:

    - Subscribe on YouTube
    - Follow on Spotify
    - Join the Discord community
    - Share the episode with another engineer building with AI

    More content and tutorials: https://soypetech.substack.com

    30 min
  5. February 13

    From “Inference Box” to Dev Rig: What NVIDIA DGX Spark Actually Is | Ep 2

    Everyone keeps calling NVIDIA DGX Spark an “inference box”… but in practice it behaves more like a dev rig. In Ep 2 of Domesticating AI, we break down what Spark is actually good for (AI development + fine-tuning) vs what it isn’t (a magical drop-in inference server). We also dig into why unified memory changes the local-AI experience, the “gateway stack” (Ollama + Open WebUI), when you outgrow turnkey UIs, and how homelab economics + networking decisions shape what you should run at home.

    In this episode

    - Training vs inference (and why “inference server” gets misused)
    - Unified memory: what it changes for model loading + workflows
    - Ollama + Open WebUI as the fastest on-ramp for local AI
    - Fine-tuning workflows (QLoRA/Unsloth-style) and where Spark shines
    - Homelab reality: Docker “recipes,” troubleshooting, and collaboration
    - Safer remote access: Tailscale
    - Cloud vs home economics (when cloud is cheaper… and when it explodes)

    Links & Resources

    NVIDIA / DGX Spark

    - DGX Spark: https://www.nvidia.com/en-us/products/workstations/dgx-spark/
    - Build hub / recipes: https://build.nvidia.com/spark
    - NIM on Spark playbook: https://build.nvidia.com/spark/nim-llm

    Local AI runners + UIs

    - Ollama: https://ollama.com/
    - Open WebUI (GitHub): https://github.com/open-webui/open-webui
    - Open WebUI docs: https://docs.openwebui.com/
    - llama.cpp: https://github.com/ggml-org/llama.cpp
    - LM Studio: https://lmstudio.ai/
    - vLLM: https://github.com/vllm-project/vllm
    - Jan: https://jan.ai/

    Fine-tuning + workflows

    - Unsloth: https://github.com/unslothai/unsloth

    Image generation tools (mentioned)

    - ComfyUI: https://github.com/Comfy-Org/ComfyUI
    - AUTOMATIC1111 SD WebUI: https://github.com/AUTOMATIC1111/stable-diffusion-webui

    Networking / Remote access

    - Tailscale: https://tailscale.com/

    Cloud GPU alternatives (mentioned)

    - Runpod pricing: https://www.runpod.io/pricing
    - Modal pricing: https://modal.com/pricing

    Hosts

    Miriah Peterson (Host): Miriah Peterson is a software engineer, Go educator, and community builder focused on production-first AI: treating LLM systems like real software with real users. She runs SoyPete Tech (streams + writing + open-source projects) and stays active in the Utah dev community through meetups and events, with a practical focus on shipping local and cloud AI systems.

    Connect:

    - SoyPete Tech (YouTube): https://www.youtube.com/@SoyPete_Tech
    - SoyPete Tech (Substack): https://soypetetech.substack.com/
    - LinkedIn: https://www.linkedin.com/in/miriah-peterson-35649b5b/

    Matt Sharp (Host): Matt Sharp is an AI Engineer and Strategist for a tech consulting firm and co-author of LLMs in Production. He’s a recovering data scientist and MLOps expert with 10+ years of experience operationalizing ML systems in production. Matt also teaches a graduate-level MLOps-in-production course at Utah State University as an adjunct professor. You can find him on Substack (Data Pioneer), LinkedIn, and on his other podcast, the Learning Curve.

    Connect:

    - Data Pioneer (Substack): https://thedatapioneer.substack.com/

    Chris Brousseau (Host): Chris Brousseau is a linguist by training and an NLP practitioner by trade, with a career spanning linguistically informed NLP, modern LLM systems, and MLOps practices. He’s co-author of LLMs in Production and is currently VP of AI at VEOX. You can find him as IMJONEZZ (two Z’s) on YouTube, GitHub, and on LinkedIn.

    Connect:

    - YouTube (IMJONEZZ): https://www.youtube.com/channel/UCPtkaw_x97yP4WevW7axk0g
    - LinkedIn: https://www.linkedin.com/in/chris-brousseau/en

    📘 LLMs in Production (Matt Sharp & Chris Brousseau): https://www.manning.com/books/llms-in-production

    43 min
  6. January 30

    Your First AI at Home

    Domesticating AI, S01E01: Your First AI at Home
    Hosts: Miriah Peterson, Matt Sharp, Chris Brousseau

    This episode is your practical on-ramp to running AI at home: why inference engines matter, what to install first, and how to make “local AI” feel stable instead of fragile. The hosts start with a hardware + market reality check (tinygrad’s tinybox-style “AI server appliance” idea and the ongoing memory/RAM crunch), then break down what an inference engine actually does, how popular runtimes compare (llama.cpp, vLLM, Ollama, TGI), and a sane starter workflow for getting from “downloaded a model” to “usable local AI.”

    - Inference engines are the “runtime”: model loading, tokenization, KV cache/context handling, and the serving layer.
    - Pick your engine based on your goal: tinkering (llama.cpp) vs serving throughput (vLLM/TGI) vs it-just-works packaging (Ollama).
    - You don’t need a brand-new rig to start, but RAM/VRAM constraints will shape everything.
    - Use leaderboards as a hint, then validate with your own small eval prompts that match your workload.
    - If you’re exposing anything beyond your LAN: reverse proxy + TLS + don’t casually open ports.

    - 0:00 Intro + host chaos + what the show is
    - 1:08 News: tinygrad / “AI server appliance” thinking (tinybox vibes)
    - 2:44 News: RAM prices + the memory crunch for builders
    - 8:26 Main: building your first AI at home (why now)
    - 8:49 What is an inference engine?
    - 12:30 Engines compared: llama.cpp vs vLLM vs Ollama vs TGI
    - 15:42 Do you need to buy a new computer? (CPU vs GPU realities)
    - 25:32 Models for home: fit-to-hardware, quantization, context
    - 34:37 Leaderboards vs evals: picking models you can trust
    - 44:00 Community + meetups + where to follow
    - 45:22 Outro: “Keep your AI on a leash”

    News / context

    - Tom’s Hardware: TinyBox production + multi-GPU appliance concept (Tom's Hardware)
    - Reuters: AI-driven memory shortage / supply-chain crunch (Reuters)
    - IDC: 2026 device impacts from the memory shortage (IDC)

    Inference engines

    - llama.cpp (GGML org) (GitHub)
    - vLLM OpenAI-compatible server (docs.vllm.ai)
    - Ollama docs (quickstart) (Ollama Documentation)
    - Hugging Face Text Generation Inference (TGI) (GitHub)

    - Miriah Peterson: Software engineer, Go educator, and community builder focused on production-first AI. Runs SoyPete Tech (streams + writing + open-source).
    - Matt Sharp: AI Engineer/Strategist, co-author of LLMs in Production, MLOps practitioner. Writes The Data Pioneer. (thedatapioneer.substack.com)
    - Chris Brousseau: NLP practitioner, co-author of LLMs in Production, VP of AI at VEOX. You can find him as IMJONEZZ. (veox.ai)

    - SoyPete Tech (YouTube): (youtube.com)
    - SoyPete Tech (Substack): (soypetetech.substack.com)
    - Matt’s Substack (The Data Pioneer): (thedatapioneer.substack.com)
    - Chris on YouTube (IMJONEZZ): (youtube.com)
    - LLMs in Production (book): (Manning Publications)
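The advice in these notes to validate with "your own small eval prompts that match your workload" can be as simple as a handful of prompt and expectation pairs run against each candidate model. The following is a minimal sketch under stated assumptions: `EvalCase`, `run_evals`, and `rank_models` are hypothetical names, the substring check is a deliberately crude scoring rule, and `generate` is a placeholder for however you call your local runtime (the notes do not prescribe a client).

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    prompt: str
    must_contain: list[str]  # strings a correct answer is expected to include


def run_evals(generate: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Return the fraction of cases whose output contains every required string.

    `generate` is a hypothetical stand-in for one call to a local model:
    prompt in, completion text out.
    """
    if not cases:
        raise ValueError("need at least one eval case")
    passed = 0
    for case in cases:
        output = generate(case.prompt).lower()
        if all(s.lower() in output for s in case.must_contain):
            passed += 1
    return passed / len(cases)


def rank_models(
    models: dict[str, Callable[[str], str]],
    cases: list[EvalCase],
) -> list[tuple[str, float]]:
    """Score each named model on the same cases, best score first."""
    scores = [(name, run_evals(fn, cases)) for name, fn in models.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

A dozen cases drawn from real tasks (your log formats, your code style, your support questions) is usually enough to tell whether a leaderboard favorite actually fits your workload. Substring matching will miss paraphrased correct answers, so treat the score as a rough filter rather than a measurement.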

    42 min
