CloudChat

Carl and Brandon

Conversations about building software and designing architecture in the cloud natively.

  1. Local‑First Lifeboats: Architecting for Post‑EOL Usability

    February 2

    Episode 0030 - Local‑First Lifeboats: Architecting for Post‑EOL Usability

    This episode is about designing for the last day, not just the launch day. Carl kicks off with the Bose SoundTouch situation: a vendor moves toward EOL on a cloud-tethered API, users push back, and the outcome (at least in spirit) becomes a blueprint we wish were more common. The idea is to keep the hardware useful by enabling local control paths and leaning on protocols that already work without your cloud. From there we broaden the conversation to the bigger problem: products and services that do something totally reasonable on a LAN suddenly need a round trip to the internet just to respond to a button press.

    Carl and Brandon talk through concrete "this actually happened" examples and what good looks like. Belkin's Wemo sunset email is a solid reference: clear dates, repeated notices, and a reality check that local APIs and ecosystems like HomeKit and Matter can keep devices working even when a vendor endpoint is shut off. We contrast that with the messier side of the industry: thermostats and other home gear that still function locally but lose their main value when the cloud connection is removed, and cloud-only platforms like Stadia where "no backend" means "hard stop" (with the one bright spot being things like refunds and a final firmware update to unlock a controller for normal Bluetooth use).

    On the builder side, we get practical about how to retire things without surprising your users. We cover technical signaling (Deprecation and Sunset headers), the need for human-friendly comms beyond "put it in the docs," and the architecture patterns that make "minimum viable offline" real: local-first state, local discovery and control surfaces, and fallbacks that do not require re-pairing or re-auth when identity systems go away.

    We also touch on SaaS escrow and continuity as a way to build trust (especially for startups) and close with a simple gut check: if your cloud disappeared tonight, what can your users still do tomorrow morning?

    Links

    News and examples we discussed
      Bose is open-sourcing its old smart speakers instead of bricking them | The Verge
      Belkin Wemo cloud service end-of-support notice
      Google Stadia - Strategy change and shutdown (2021–2023) | Wikipedia
      Google Stadia controller Bluetooth mode help article

    API deprecation and shutdown mechanics
      Deprecation HTTP response header (RFC 9745)
      Sunset HTTP response header (RFC 8594)

    Smart-home protocols and "local-first" connectivity
      Matter (Connectivity Standards Alliance)
      Thread protocol overview (Thread Group)
      Multicast DNS (mDNS) (RFC 6762)

    Tools and patterns
      Local-first software (Ink & Switch)
      Strangler Fig Application pattern (Martin Fowler)
      Automerge (CRDT) - GitHub
      Yjs (CRDT) - GitHub

    Contracts and continuity
      SaaS escrow overview (Escrow London)
      SaaS escrow overview (PRAXIS Escrow)
      Software escrow overview (EscrowTech)

    Other links of interest
      Microsoft Modern Lifecycle Policy
      EU Right to Repair overview (European Commission)

    Visit us at:
      twitter.com/CloudChatTech
      discord.cloudchat.tech
      cloudchatpodcast@gmail.com
      linkedin.com/company/cloudchat
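
    The technical signaling mentioned in this episode (Deprecation and Sunset headers) can be sketched roughly as follows. This is a minimal illustration, not anything shown in the episode: the function names and the dates used with them are hypothetical, and it assumes timezone-aware datetimes. The Deprecation value uses the "@" plus Unix seconds date form from RFC 9745, and the Sunset value uses an HTTP-date as in RFC 8594.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Dict, Optional


def retirement_headers(deprecated_at: datetime, sunset_at: datetime) -> Dict[str, str]:
    """Build response headers announcing a deprecation and a planned shutdown.

    Both arguments are assumed to be timezone-aware datetimes.
    """
    if sunset_at < deprecated_at:
        # Sanity check: shutting down before deprecating makes no sense.
        raise ValueError("sunset must not be earlier than deprecation")
    return {
        # Deprecation: Structured Fields date, "@" followed by Unix seconds.
        "Deprecation": f"@{int(deprecated_at.timestamp())}",
        # Sunset: HTTP-date in GMT.
        "Sunset": format_datetime(sunset_at.astimezone(timezone.utc), usegmt=True),
    }


def days_until_sunset(headers: Dict[str, str], now: datetime) -> Optional[int]:
    """Client side: whole days left before the Sunset date, or None if absent."""
    value = headers.get("Sunset")
    if value is None:
        return None
    return (parsedate_to_datetime(value) - now).days
```

    A client that logs or alerts on days_until_sunset gets a machine-readable warning, which complements (but does not replace) the human-friendly comms discussed in the episode.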

    1 hr 3 min
  2. New Year's ☁️ Resolutions

    January 5

    Episode 0029 - New Year's ☁️ Resolutions

    "In 2026, your cloud is not allowed to have the same incidents for the same reasons as last year."

    Carl and Brandon treat this episode like a retrospective (the kind any good agile team would run), but instead of talking about sprint tickets, they write a New Year's resolution list on behalf of your cloud team. The format is simple: Stop, Start, Keep. These are small, opinionated constraints that change day-to-day habits, not vague wishes about "better reliability, security, and cost."

    The Stop list hits the repeat-incident patterns: single-region "global" apps, treating infrastructure-as-code as optional (and living in the portal), mystery ownership with no clear tags or escalation path, one-off production fix scripts that never get documented, dashboards that are always green while users are hurting, and "temporary" exceptions that turn into permanent risk.

    The Start list is the muscle-building: run realistic failover/incident drills, measure change and recovery (DORA-style signals and MTTR, not just uptime), budget reliability and cost together, treat internal platforms like products with golden paths, standardize secrets and identity, and add a regular "delete day" so old environments and artifacts do not drag into the new year.

    The Keep list is what compounds: automate repetitive toil, invest in observability tied to real user flows, keep blameless postmortems with concrete follow-ups, and keep platform/SRE work visible so it does not get squeezed out by features.

    We hope you and your team are able to embrace some of these resolutions in the coming year, and hope that listening to more CloudChat is at the top of your list. Happy New Year everybody!

    Links
      DORA: What is DevOps?
      Site Reliability Engineering (SRE Book)
      Azure Well-Architected Framework
      AWS Well-Architected Framework
      Google Cloud Architecture Framework
      Azure Bicep documentation
      Terraform documentation
      Azure Key Vault overview
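
    "Measure change and recovery" from the Start list can be made concrete with a little arithmetic. The sketch below is an illustration only: the Incident record and both function names are hypothetical, and real DORA reporting would pull these numbers from deployment and incident tooling rather than hand-built lists.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Incident:
    detected_at: datetime
    restored_at: datetime


def mean_time_to_restore(incidents: List[Incident]) -> timedelta:
    """Average time from detection to restoration across incidents."""
    if not incidents:
        raise ValueError("need at least one incident")
    total = sum((i.restored_at - i.detected_at for i in incidents), timedelta())
    return total / len(incidents)


def change_failure_rate(deployments: int, failed_deployments: int) -> float:
    """Share of deployments that caused a failure needing remediation."""
    if deployments <= 0:
        raise ValueError("deployments must be positive")
    return failed_deployments / deployments
```

    Tracked period over period, these two numbers say more about repeat incidents than an uptime percentage alone.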

    1 hr 3 min
  3. Respect My (DNS) Awe-Thor-Ih-TAY!!

    December 1, 2025

    Episode 0028 - Respect My (DNS) Awe-Thor-Ih-TAY!!

    Your cloud is humming along, then an edge breaks. What lever do you actually still have to steer users? In this episode, Carl and Brandon dig into DNS as a control plane and why "it is always DNS" keeps being true in 2025. DNS was designed for a slower internet with long TTLs and infrequent changes, but we now treat it like a real-time steering wheel for global failover. That mismatch shows up in outages where the backend is fine but nobody can resolve the hostname that front doors, CDNs, and APIs live behind.

    We unpack how TTL and caching really work (including negative caching and serve-stale), why modern edge products like Azure Front Door and Cloudflare can still turn into global single points of failure, and how DNS-based load balancers actually behave when you flip weights or priorities.

    From there we move into patterns and mitigations. We walk through hub-and-spoke vs mesh topologies and where public vs private DNS sit in each, plus concrete strategies for what to do when your edge is broken: bypass patterns, equivalent services, and multi-product designs that let you route around a failing front door. We also hit the observability side so "it is DNS" becomes a graph and an alert instead of a guess in a war room.

    We close with a look at emerging record types like SVCB/HTTPS and how they may help you advertise alternate endpoints and protocol hints without building another fragile tower of CNAMEs.

    Links

    DNS Fundamentals
      RFC 1034: Domain Names - Concepts and Facilities
      RFC 1035: Domain Names - Implementation and Specification
      RFC 2308: Negative Caching of DNS Queries
      RFC 8767: Serving Stale Data to Improve DNS Resiliency

    DNS Load Balancing and Edge Services
      Azure Traffic Manager documentation
      Azure DNS alias records
      Amazon Route 53 health checks and failover
      Cloudflare Load Balancing
      Akamai Global Traffic Management

    Azure, AWS, and Cloudflare Outage Reading
      Azure Front Door service documentation
      AWS DynamoDB and Route 53 service health history
      Cloudflare status history

    Architectures and Private DNS
      Azure Private DNS zones
      Azure DNS Private Resolver
      Azure Virtual WAN DNS guidance

    Emerging DNS Records and HTTP/3
      Service binding (SVCB) and HTTPS resource records
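
    To make the TTL and serve-stale behavior from this episode easier to picture, here is a toy resolver cache. It is a simplified sketch in the spirit of serve-stale, not a real resolver: the class, the resolve callback (assumed to return an address and a TTL, and to raise LookupError on failure), and the one-day default stale window are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class CacheEntry:
    address: str
    expires_at: float  # seconds on the caller's clock


class StaleServingCache:
    """Caches answers for their TTL and serves expired ones if upstream fails."""

    def __init__(self, resolve: Callable[[str], Tuple[str, float]],
                 max_stale_seconds: float = 86400.0) -> None:
        self._resolve = resolve
        self._max_stale = max_stale_seconds
        self._entries: Dict[str, CacheEntry] = {}

    def lookup(self, name: str, now: float) -> str:
        entry = self._entries.get(name)
        if entry is not None and now < entry.expires_at:
            return entry.address  # still fresh, no upstream query
        try:
            address, ttl = self._resolve(name)
        except LookupError:
            # Upstream is failing: fall back to the expired answer, but only
            # while it is within the allowed stale window.
            if entry is not None and now < entry.expires_at + self._max_stale:
                return entry.address
            raise
        self._entries[name] = CacheEntry(address, now + ttl)
        return address
```

    The same structure shows why a change to weights or priorities is not instant: until expires_at passes, the cache never asks upstream again.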

    1 hr 5 min
  4. Whoops, No VM's!!!

    November 3, 2025

    Episode 0027 - Whoops, No VM's!!!

    You've planned for redundancy, scaling, and failover, but what happens when the cloud itself runs out of space? In this episode, Carl and Brandon untangle capacity (what the provider physically or logically has available in a region or zone) versus quota (the soft limit on what you can consume). Mixing the two leads to painful surprises during scale events and failovers.

    We talk through how capacity shortfalls show up in real life (zones that are full, SKUs that vary by location, and limited supply for GPU-heavy instances) and the patterns that help: design for multiple zones and regions, add retry and fallback logic with flexible SKUs, balance spot with on-demand, and hold a baseline with reservations or time-bound commitments.

    We close on the business side: the price of headroom, when commitments make sense, and simple pipeline and monitoring checks so "no capacity" errors fail fast instead of 30 minutes into a deploy.

    Links
      AWS Auto Scaling allocation strategies
      AWS EC2 Capacity Reservations
      AWS insufficient capacity guidance
      AWS Savings Plans
      AWS Service Quotas
      Azure On-demand Capacity Reservations
      Azure quotas overview
      Azure region pairs
      Azure subscription and service limits
      Azure VM allocation failures
      Azure VM Scale Sets orchestration modes (Flexible)
      GCP Compute Engine Reservations
      GCP quota alerts and monitoring
      GCP Regional Managed Instance Groups
      GCP resource availability errors
      Google Cloud quotas overview
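
    The "retry and fallback logic with flexible SKUs" pattern, along with the capacity versus quota distinction, might look something like the sketch below. Everything here is hypothetical: the two exception types, the allocate callback, and the zone and SKU names stand in for whatever a real provider SDK exposes. Treating a quota error as non-retryable is also a simplifying assumption, since quotas are often scoped per region.

```python
from typing import Callable, Iterable, List, Tuple


class InsufficientCapacity(Exception):
    """The provider has no room for this SKU in this zone right now."""


class QuotaExceeded(Exception):
    """The account's own limit was hit; trying elsewhere will not help here."""


def allocate_with_fallback(
    candidates: Iterable[Tuple[str, str]],
    allocate: Callable[[str, str], str],
) -> str:
    """Try (zone, sku) pairs in preference order until one has capacity."""
    tried: List[Tuple[str, str]] = []
    for zone, sku in candidates:
        try:
            return allocate(zone, sku)
        except InsufficientCapacity:
            # Capacity problem: move on to the next zone or SKU.
            tried.append((zone, sku))
    # QuotaExceeded is deliberately not caught, so it fails fast.
    raise InsufficientCapacity(f"no capacity in any of: {tried}")
```

    Running a check like this early in a pipeline is one way to get the fail-fast behavior the episode closes on.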

    51 min
  5. Are Your Cloud Costs Too Damn High???

    October 6, 2025

    Episode 0026 - Are Your Cloud Costs Too Damn High???

    Cloud cost optimization is about designing systems that perform efficiently without wasting money. In this episode, Carl and Brandon break down how AWS, Azure, and Google Cloud help teams rightsize compute, manage storage tiers, and control networking costs. They talk through savings plans, spot instances, lifecycle management, and data transfer strategies that keep performance high and waste low.

    The discussion then moves into monitoring, automation, and FinOps culture, where budgets, policies, and shared accountability make optimization stick. They cover dashboards, tagging, auto-shutdown routines, and partner-led programs that unlock funding and deeper discounts. Real-world stories from enterprises and startups highlight one key truth: cost management is not a cleanup exercise; it is an ongoing habit that keeps cloud architectures both efficient and sustainable.

    Links
      AWS: Well-Architected Framework – Cost Optimization pillar
      AWS: How to Use AWS Well-Architected with Trusted Advisor for Cost Optimization
      AWS: AWS Savings Plans
      AWS: Amazon EC2 Spot Instances
      Azure: Microsoft Cost Management + Billing (overview)
      Azure: Quickstart: Start using Cost Analysis
      Azure: Common cost analysis uses in Cost Management
      Azure: Control Azure spending and manage bills (learning path)
      GCP: Create, edit, or delete budgets and budget alerts (Cloud Billing)
      GCP: Cloud Billing Budget API overview
      GCP: Committed Use Discounts (Compute)
      GCP: Understand your bill – pricing & billing (Google Developers)
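
    The trade-off behind savings plans and other commitments comes down to simple arithmetic: committed hours are paid for whether or not they are used. The sketch below uses made-up rates and a made-up 25% discount purely for illustration; the function names are hypothetical and no real provider pricing is implied.

```python
def breakeven_utilization(discount: float) -> float:
    """Fraction of committed hours you must use for the commitment to pay off."""
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    return 1.0 - discount


def monthly_cost(hours_used: float, on_demand_rate: float,
                 committed_hours: float = 0.0, discount: float = 0.0) -> float:
    """Cost of a month with an optional hourly commitment.

    Committed hours are billed at the discounted rate even if unused;
    usage above the commitment is billed at the on-demand rate.
    """
    committed_cost = committed_hours * on_demand_rate * (1.0 - discount)
    overflow_hours = max(0.0, hours_used - committed_hours)
    return committed_cost + overflow_hours * on_demand_rate
```

    With a 25% discount on 100 committed hours at a rate of 1.0, using 75 hours costs the same as pure on-demand, using fewer costs more, and using more costs less.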

    59 min
  6. The Sound of Security

    September 8, 2025

    Episode 0025 - The Sound of Security

    Security is more than a feature; it's a pillar of the Well-Architected Framework. In this episode, Carl and Brandon explore how AWS, Azure, and GCP approach security across identity and access, infrastructure defense, data protection, monitoring, governance, and the shared responsibility model. They compare tools and practices like IAM, RBAC, and conditional access; network firewalls, WAFs, and DDoS protection; encryption at rest and in transit; and incident detection and automated remediation.

    The conversation also dives into security testing, drift detection with IaC, compliance posture, and how policy enforcement differs across the big three. The episode closes with a reminder that cloud security is always shared and never finished.

    Links
      AWS: Well-Architected Framework – Security pillar
      AWS: Identity and Access Management (IAM)
      AWS: AWS Shield and WAF
      AWS: Amazon Macie
      AWS: Amazon GuardDuty
      AWS: AWS Config
      Azure: Azure Well-Architected Framework – Security
      Azure: Microsoft Entra ID (Azure AD)
      Azure: Azure Role-Based Access Control (RBAC)
      Azure: Azure Key Vault
      Azure: Defender for Cloud
      Azure: Microsoft Sentinel
      Google Cloud: Google Cloud Architecture Framework – Security
      Google Cloud: IAM overview
      Google Cloud: Cloud Armor
      Google Cloud: Cloud KMS
      Google Cloud: Data Loss Prevention (DLP) API
      Google Cloud: Security Command Center
      Google Cloud: Assured Workloads
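
    At its core, "drift detection with IaC" means comparing what the code declares against what is actually deployed. The sketch below shows only that comparison on flat dictionaries; the function name and the setting names in the usage are hypothetical, and real tools work against provider APIs and nested resource models.

```python
from typing import Any, Dict, Tuple


def detect_drift(declared: Dict[str, Any],
                 actual: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return settings whose deployed value differs from the declared one.

    Each result maps a setting name to (declared_value, actual_value);
    None marks a setting that is absent on that side.
    """
    drift: Dict[str, Tuple[Any, Any]] = {}
    for key in sorted(set(declared) | set(actual)):
        want = declared.get(key)
        have = actual.get(key)
        if want != have:
            drift[key] = (want, have)
    return drift
```

    An empty result means the deployed resource still matches its declaration; anything else is a candidate for an alert or an automated remediation.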

    1 hr 7 min
  7. Operating Excellently

    August 4, 2025

    Episode 0024 - Operating Excellently

    Operational excellence goes beyond uptime; it's about building and operating cloud systems with discipline, automation, and continuous improvement. Carl and Brandon break down what operational excellence really means, drawing a distinction between striving for perfection and building resilient, adaptable systems. They discuss how principles from AWS, Azure, and GCP converge around key practices like repeatable automation, structured change management, and process validation.

    The episode dives into real-world strategies for automation, incident readiness, and observability, including where and how to insert gates, use feature flags, and integrate infrastructure as code across cloud platforms. From avoiding certificate-induced outages to catching misconfigurations early, the key theme is consistency at scale. The discussion also emphasizes the cultural side: why shared ownership, retrospectives, and iterative postmortems matter just as much as tooling.

    Links
      Ansible: Ansible community documentation
      AWS Docs: Amazon CloudWatch documentation overview
      AWS Docs: Operational Excellence whitepaper
      AWS Docs: Prescriptive Guidance: Operational Excellence
      AWS Docs: Using CloudWatch dashboards and alarms
      AWS Docs: Well‑Architected Framework – Operational Excellence pillar
      AWS: Getting started with Amazon CloudWatch
      Google Cloud: Continuously improve and innovate
      Google Cloud: Manage incidents and problems
      Google Cloud: Operational Excellence pillar overview
      Google Cloud: Operational readiness & performance using CloudOps
      HashiCorp Docs: Terraform configuration language reference
      HashiCorp Docs: Terraform documentation
      Microsoft Docs: Automation of tasks with PowerShell in Power Platform
      Microsoft Learn: Azure Automation documentation
      Microsoft Learn: Azure Monitor documentation
      Microsoft Learn: Operational Excellence maturity model
      Microsoft Learn: Operational Excellence overview & quickstart
      Microsoft Learn: Operational Excellence principles (maturity model, practices)
      Microsoft Learn: PowerShell documentation
      PowerShell Universal Docs: PowerShell Universal platform guide
      Red Hat Docs: Ansible Automation Platform guide
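
    One small gate that helps avoid the certificate-induced outages mentioned in this episode is a check on how many days a certificate has left. The sketch below is an assumption-laden illustration: the function names and the 30-day threshold are hypothetical, and the not_after string is assumed to be in the format Python's ssl module reports for a certificate's notAfter field.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Whole days left before a certificate's notAfter time.

    not_after looks like "Jan  5 09:34:43 2018 GMT"; now must be
    timezone-aware.
    """
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after),
                                     tz=timezone.utc)
    return (expires - now).days


def certificate_gate(not_after: str, now: datetime, min_days: int = 30) -> bool:
    """Return True if the certificate has at least min_days of life left."""
    return days_until_expiry(not_after, now) >= min_days
```

    Wired into a scheduled job or a deployment pipeline, a False result becomes an early alert instead of an outage.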

    53 min
  8. Turbocharged: Mastering Performance in Cloud Architecture

    July 7, 2025

    Episode 0023 - Turbocharged: Mastering Performance in Cloud Architecture

    Cloud performance is one of those terms that everyone agrees matters but that often means different things depending on who you ask. Is it latency? Is it autoscaling? Is it picking the right SKU size? We cover the fundamentals of designing for performance in the cloud: how to select the right compute options, when to scale up or out, and what it takes to reduce latency across global workloads. We explore autoscaling strategies, observability tooling, cost tradeoffs, and real-world tuning stories, and we wrap with a cheat sheet of optimization tools across AWS, Azure, and GCP.

    Performance isn't just about throwing more cores or RAM at a problem. It's a set of design choices you make continuously, and those choices affect cost, scalability, and user experience. Use the principles and tools in your cloud provider to experiment, monitor, and improve.

    Producer's note: we encountered some technical issues during recording, so apologies for the audio quality in some parts. The content is still solid, and we hope you find it valuable!

    Links
      AWS Trusted Advisor
      AWS Well-Architected Framework
      Azure Advisor
      Azure Well-Architected Framework – Performance
      Cloud Load Balancing (GCP)
      GCP Architecture Framework
      GCP Recommender
      PerfKit Benchmarker
      SLOs and SLIs (Google SRE Workbook)

    52 min

Ratings and Reviews

5
out of 5
5 ratings

About

Conversations about building software and designing architecture in the cloud natively.