Ship It Weekly - DevOps, SRE, and Platform Engineering News

Teller's Tech - DevOps SRE Podcast

Ship It Weekly is a short, practical recap of what actually matters in DevOps, SRE, and platform engineering. Each episode, your host Brian Teller walks through the latest outages, releases, tools, and incident writeups, then translates them into “here’s what this means for your systems” instead of just reading headlines. Expect a couple of main stories with context, a quick hit of tools or releases worth bookmarking, and the occasional segment on on-call, burnout, or team culture. This isn’t a certification prep show or a lab walkthrough. It’s aimed at people who are already working in the space and want to stay sharp without scrolling status pages and blogs all week. You’ll hear about things like cloud provider incidents, Kubernetes and platform trends, Terraform and infrastructure changes, and real postmortems that are actually worth your time. Most episodes are 10–25 minutes, so you can catch up on the way to work or between meetings. Every now and then there will be a “special” focused on a big outage or a specific theme, but the default format is simple: what happened, why it matters, and what you might want to do about it in your own environment. If you’re the person people DM when something is broken in prod, or you’re building the platform everyone else ships on top of, Ship It Weekly is meant to be in your rotation.

  1. Ship It Conversations: Yvonne Young on Linux Foundations, Mentorship, and Getting Job Ready in Cloud

    1D AGO

    This is a guest conversation episode of Ship It Weekly (separate from the weekly news recaps). In this Ship It: Conversations episode I talk with Yvonne Young, a cloud and Linux mentor active in the CloudWhistler community. We talk about the real path into cloud and DevOps, why Linux still matters as a foundation, what “job ready” actually means, and why focus, consistency, and business thinking matter more than chasing every new tool.

    Highlights
    - Linux fundamentals still matter because so much of cloud and infra work sits on top of Linux
    - What “job ready” really means: prepare for both technical and behavioral interviews, know the basics, and show how you learn when you don’t know something
    - Why so many juniors stall out by trying to learn everything instead of picking a direction
    - Why daily reps beat cramming: short, consistent practice keeps skills fresh better than marathon study sessions
    - How Yvonne thinks about certifications, including why hands-on certs like RHCSA stand out
    - Hands-on practice ideas: break things on purpose, troubleshoot, fix services, inspect ports, and use the help files
    - Why tools matter less than the business problem they solve
    - Using Vault as an example of solving real issues like secret sprawl, rotation, and centralized access
    - How to think about cloud learning: pick one provider, learn the concepts, and map your path to the kinds of companies you want to work for
    - Why mentorship and community matter, especially for juniors trying not to waste time or head in the wrong direction
    - What seniors can do better: better onboarding, real availability, and giving juniors an actual lifeline when they get stuck

    Yvonne’s links
    - LinkedIn: https://www.linkedin.com/in/yvonne-young

    Stuff mentioned
    - Ali Sohail on LinkedIn: https://www.linkedin.com/in/alisohailit/
    - Tech With Engineers on LinkedIn: https://uk.linkedin.com/company/tech-with-engineers
    - CloudWhistler community / training: training.cloudwhistler.com
    - Vault: https://www.hashicorp.com/en/products/vault
    - OpenBao: https://openbao.org/

    More episodes + details: https://shipitweekly.fm

    31 min
  2. AWS Bahrain/UAE Data Center Issues Amid Iran Strikes, ArgoCD vs Flux GitOps Failures, GitHub Actions Hackerbot-Claw Attacks (Trivy), RoguePilot Codespaces Prompt Injection, Block “AI Remake” Layoffs, Claude Code Security

    3D AGO

    This week on Ship It Weekly, Brian looks at how the boundary of ops keeps expanding. We cover AWS flagging issues in Bahrain/UAE amid Iran strikes, ArgoCD vs Flux and why ArgoCD can get stuck in failed sync states, GitHub Actions being exploited at scale (plus Trivy’s incident), RoguePilot prompt injection meeting real credentials in Codespaces, Block’s “AI remake” layoffs, and Anthropic’s Claude Code Security for defenders.

    Lightning round: DeepSeek model access geopolitics, Vercel’s agentic security boundaries, a KEV CVE to patch, an MCP-atlassian SSRF-to-RCE chain, and Claude Cowork scheduled tasks.

    Links
    - AWS Bahrain/UAE (Reuters) https://www.reuters.com/world/middle-east/amazon-cloud-unit-flags-issues-bahrain-uae-data-centers-amid-iran-strikes-2026-03-02/
    - ArgoCD to Flux https://hai.wxs.ro/migrations/argocd-to-flux/
    - GitHub Actions exploitation https://www.stepsecurity.io/blog/hackerbot-claw-github-actions-exploitation
    - Trivy incident https://github.com/aquasecurity/trivy/discussions/10265
    - RoguePilot https://thehackernews.com/2026/02/roguepilot-flaw-in-github-codespaces.html
    - Block layoffs (WSJ) https://www.wsj.com/business/jack-dorseys-block-to-lay-off-4-000-employees-in-ai-remake-28f0d869
    - Claude Code Security https://www.anthropic.com/news/claude-code-security
    - DeepSeek (Reuters) https://www.reuters.com/world/china/deepseek-withholds-latest-ai-model-us-chipmakers-including-nvidia-sources-say-2026-02-25/
    - Agentic boundaries https://vercel.com/blog/security-boundaries-in-agentic-architectures
    - CISA KEV https://www.cisa.gov/news-events/alerts/2026/03/03/cisa-adds-two-known-exploited-vulnerabilities-catalog
    - mcp-atlassian CVE https://arcticwolf.com/resources/blog-uk/cve-2026-27825-critical-unauthenticated-rce-and-ssrf-in-mcp-atlassian/
    - Claude Cowork tasks https://support.claude.com/en/articles/13854387-schedule-recurring-tasks-in-cowork

    More: https://shipitweekly.fm

    18 min
  3. Cloudflare BYOIP BGP Withdrawals, Clerk’s Postgres Query-Plan Flip Outage, and AWS Kiro Permissions Lessons (Grafana Privesc + runc CVEs)

    FEB 27

    This week on Ship It Weekly, Brian covers three “automation meets reality” stories that every DevOps, SRE, and platform team can learn from. Cloudflare accidentally withdrew customer BYOIP prefixes due to a buggy cleanup task, Clerk got knocked over by a Postgres auto-analyze query plan flip, and AWS responded to reports about its internal Kiro tooling by framing the incident as misconfigured access controls. Plus: a quick EKS node monitoring update, and a tight security lightning round.

    Links
    - Cloudflare BYOIP outage postmortem https://blog.cloudflare.com/cloudflare-outage-february-20-2026/
    - Clerk outage postmortem (Feb 19, 2026) https://clerk.com/blog/2026-02-19-system-outage-postmortem
    - AWS outage report (Reuters) https://www.reuters.com/business/retail-consumer/amazons-cloud-unit-hit-by-least-two-outages-involving-ai-tools-ft-says-2026-02-20/
    - AWS response on Kiro + access controls https://www.aboutamazon.com/news/aws/aws-service-outage-ai-bot-kiro
    - EKS Node Monitoring Agent (open source) https://aws.amazon.com/about-aws/whats-new/2026/02/amazon-eks-node-monitoring-agent-open-source/
    - Grafana CVE-2026-21721 https://grafana.com/security/security-advisories/cve-2026-21721/
    - runc CVEs (AWS-2025-024) https://aws.amazon.com/security/security-bulletins/rss/aws-2025-024/
    - GitLab patch releases https://about.gitlab.com/releases/2025/11/26/patch-release-gitlab-18-6-1-released/
    - Atlassian Feb 2026 security bulletin https://confluence.atlassian.com/security/security-bulletin-february-17-2026-1722256046.html
    - Human story: SRE Is Anti-Transactional (ACM Queue) https://queue.acm.org/detail.cfm?id=3773094

    More episodes and show notes at https://shipitweekly.fm
    On Call Briefs at: https://oncallbrief.com

    18 min
  4. Ship It Conversations: Mike Lady on Day Two Readiness + Guardrails in the AI Era

    FEB 24

    This is a guest conversation episode of Ship It Weekly (separate from the weekly news recaps). In this Ship It: Conversations episode I talk with Mike Lady (Senior DevOps Engineer, distributed systems) from Enterprise Vibe Code on YouTube. We talk day two readiness, guardrails/quality gates, and why shipping safely matters even more now that AI can generate code fast.

    Highlights
    - Day 0 vs Day 1 vs Day 2 (launching vs operating and evolving safely)
    - What teams look like without guardrails (“hope is not a strategy”)
    - Why guardrails speed you up long-term (less firefighting, more predictable delivery)
    - Day-two audit checklist: source control/branches/PRs, branch protection, CI quality gates, secrets/config, staging→prod flow
    - AI agents: they’ll “lie, cheat, and steal” to satisfy the goal unless you gate them
    - Multi-model reviews (Claude/Gemini/Codex) as different perspectives
    - AI in prod: start read-only (logs/traces), then earn trust slowly

    Mike’s links
    - YouTube: https://www.youtube.com/@EnterpriseVibeCode
    - Site: https://www.enterprisevibecode.com/
    - LinkedIn: https://www.linkedin.com/in/mikelady/

    Stuff mentioned
    - Vibe Coding (Gene Kim + Steve Yegge): https://www.simonandschuster.com/books/Vibe-Coding/Gene-Kim/9781966280026
    - Beads (agent memory/issue tracker): https://github.com/steveyegge/beads
    - Gas Town (agent orchestration): https://github.com/steveyegge/gastown
    - AGENTS.md (agent instructions file): https://agents.md/
    - OpenAI Codex: https://openai.com/codex/

    More episodes + details: https://shipitweekly.fm

    35 min
  5. GitHub Agentic Workflows, Gentoo Leaves GitHub, Argo CD 3.3 Upgrade Gotcha, AWS Config Scope Creep

    FEB 20

    This week on Ship It Weekly, Brian hits five stories where the “defaults” are shifting under ops teams. GitHub is bringing Agentic Workflows into Actions, Gentoo is migrating off GitHub to Codeberg, Argo CD upgrades are forcing Server-Side Apply in some paths, AWS Config quietly expanded coverage again, and EC2 nested virtualization is now possible on virtual instances.

    Links
    - YouTube episodes https://www.youtube.com/watch?v=tuuLlo2rbI0&list=PLYLi5KINFnO7dVMbhsJQTKRFXfSSwPmuL&pp=sAgC
    - OnCallBrief https://oncallbrief.com
    - Teller’s Tech Substack https://tellerstech.substack.com/
    - GitHub Agentic Workflows (preview) https://github.blog/changelog/2026-02-13-github-agentic-workflows-are-now-in-technical-preview/
    - Gentoo moves to Codeberg https://www.theregister.com/2026/02/17/gentoo_moves_to_codeberg_amid/
    - Argo CD upgrade guide: 3.2 -> 3.3 (SSA) https://argo-cd.readthedocs.io/en/latest/operator-manual/upgrading/3.2-3.3/
    - AWS Config: 30 new resource types https://aws.amazon.com/about-aws/whats-new/2026/02/aws-config-new-resource-types
    - EC2 nested virtualization (virtual instances) https://aws.amazon.com/about-aws/whats-new/2026/02/amazon-ec2-nested-virtualization-on-virtual/
    - GitHub status page update https://github.blog/changelog/2026-02-13-updated-status-experience/
    - GitHub Actions: early Feb updates https://github.blog/changelog/2026-02-05-github-actions-early-february-2026-updates/
    - Runner min version enforcement extended https://github.blog/changelog/2026-02-05-github-actions-self-hosted-runner-minimum-version-enforcement-extended/
    - Open Build Service postmortem https://openbuildservice.org/2026/02/02/post-mortem/
    - Human story: AI SRE vs incident management https://surfingcomplexity.blog/2026/02/14/lots-of-ai-sre-no-ai-incident-management/

    More episodes and show info on https://shipitweekly.fm

    19 min
  6. Special: OpenClaw Security Timeline and Fallout: CVE-2026-25253 One-Click Token Leak, Malicious ClawHub Skills, Exposed Agent Control Panels, and Why Local AI Agents Are a New DevOps/SRE Control Plane (OpenAI Hires Founder)

    FEB 17

    In this Ship It Weekly special, Brian breaks down the OpenClaw situation and why it’s bigger than “another CVE.” OpenClaw is a preview of what platform teams are about to deal with: autonomous agents running locally, wired into real tools, real APIs, and real credentials. When the trust model breaks, it’s not just data exposure. It’s an operator compromise.

    We walk through the recent timeline: mass internet exposure of OpenClaw control panels, CVE-2026-25253 (a one-click token leak that can turn your browser into the bridge to your local gateway), a skills marketplace that quickly became a malware delivery channel, and the Moltbook incident showing how “agent content” becomes a new supply chain problem. We close with the signal that agents are going mainstream: OpenAI hiring the OpenClaw creator.

    Chapters
    1. What OpenClaw Actually Is
    2. The Situation in One Line
    3. Localhost Is Not a Boundary (The CVE Lesson)
    4. Exposed Control Panels (How “Local” Went Public)
    5. The Marketplace Problem (Skills Are Supply Chain)
    6. The Ecosystem Spills (Agent Platforms Leaking Real Data)
    7. Minimum Viable Safety for Local Agents
    8. The Plot Twist (OpenAI Hires the Creator)

    Links from this episode
    - Censys exposure research https://censys.com/blog/openclaw-in-the-wild-mapping-the-public-exposure-of-a-viral-ai-assistant
    - GitHub advisory (CVE-2026-25253) https://github.com/advisories/GHSA-g8p2-7wf7-98mq
    - NVD entry https://nvd.nist.gov/vuln/detail/CVE-2026-25253
    - Koi Security: ClawHavoc / malicious skills https://www.koi.ai/blog/clawhavoc-341-malicious-clawedbot-skills-found-by-the-bot-they-were-targeting
    - Moltbook leak coverage (Reuters) https://www.reuters.com/legal/litigation/moltbook-social-media-site-ai-agents-had-big-security-hole-cyber-firm-wiz-says-2026-02-02/
    - OpenClaw security docs https://docs.openclaw.ai/gateway/security
    - OpenAI hire coverage (FT) https://www.ft.com/content/45b172e6-df8c-41a7-bba9-3e21e361d3aa

    More information and past episodes on https://shipitweekly.fm

    19 min
  7. When guardrails break prod: GitHub “Too Many Requests” from legacy defenses, Kubernetes nodes/proxy GET RCE, HCP Vault resilience in an AWS regional outage, and PCI DSS scope creep

    FEB 13

    This week on Ship It Weekly, Brian hits four stories where the guardrails become the incident.

    - GitHub had “Too Many Requests” caused by legacy abuse protections that outlived their moment. Takeaway: controls need owners, visibility, and a retirement plan.
    - Kubernetes has a nasty edge case where nodes/proxy GET can turn into command execution via WebSocket behavior. If you’ve ever handed out “telemetry” RBAC broadly, go audit it.
    - HashiCorp shared how HCP Vault handled a real AWS regional disruption: control plane wobbled, Dedicated data planes kept serving. Control plane vs data plane separation paying off.
    - AWS expanded its PCI DSS compliance package with more services and the Asia Pacific (Taipei) region. Scope changes don’t break prod today, but they turn into evidence churn later if you don’t standardize proof.

    Human story: “reasonable assurance” turning into busywork.

    Links
    - GitHub: When protections outlive their purpose (legacy defenses + lifecycle) https://github.blog/engineering/infrastructure/when-protections-outlive-their-purpose-a-lesson-on-managing-defense-systems-at-scale/
    - Kubernetes nodes/proxy GET → RCE (analysis) https://grahamhelton.com/blog/nodes-proxy-rce
    - OpenFaaS guidance / mitigation notes https://www.openfaas.com/blog/kubernetes-node-proxy-rce/
    - HCP Vault resilience during real AWS regional outages https://www.hashicorp.com/blog/how-resilient-is-hcp-vault-during-real-aws-regional-outages
    - AWS: Fall 2025 PCI DSS compliance package update https://aws.amazon.com/blogs/security/fall-2025-pci-dss-compliance-package-available-now/
    - GitHub Actions: self-hosted runner minimum version enforcement extended https://github.blog/changelog/2026-02-05-github-actions-self-hosted-runner-minimum-version-enforcement-extended/
    - Headlamp in 2025: Project Highlights (SIG UI) https://kubernetes.io/blog/2026/01/22/headlamp-in-2025-project-highlights/
    - AWS Network Firewall Active Threat Defense (MadPot) https://aws.amazon.com/blogs/security/real-time-malware-defense-leveraging-aws-network-firewall-active-threat-defense/
    - Reasonable assurance turning into busywork (r/sre) https://www.reddit.com/r/sre/comments/1qvwbgf/at_what_point_does_reasonable_assurance_turn_into/

    More episodes + details: https://shipitweekly.fm

    16 min

Ratings & Reviews

5 out of 5 (9 Ratings)
