As The Geek Learns

The Geek

0.0 (0)
Technology
Updated Weekly

Tools and training for IT professionals. Join James Cruce—a systems engineer with 25+ years managing enterprise VMware infrastructure—for practical PowerCLI tutorials, automation tips, and lessons from the trenches. Much to learn, there always is. astgl.com

Jun 10 · Bonus

Your DNS Changed and Nobody Told You. Here's the Nightly-Diff Pattern That Catches It.

It was a Tuesday at 2:17 PM, and the marketing team's contact form was returning 502s. Not 404. Not a timeout. A clean 502, which means something was answering, just not the thing it was supposed to be. An hour in, I'd checked the app server logs, restarted the nginx process twice, confirmed the SSL cert was valid, and pinged our cloud provider's status page like it owed me money. Everything looked fine everywhere I looked. Then, almost by accident, I ran `dig +short www.company.com CNAME` and saw a hostname I didn't recognize. Something like `legacy-assets.decommissioned-vendor-name.com`. Vendor had been off the account for four months. The CNAME had quietly repointed to their infrastructure during the migration wind-down, sat there untouched, and then their old infrastructure finally went dark. Nobody changed our DNS intentionally. Nobody got notified when it happened. We found out when a sales rep tried to submit a lead form. That was the day I stopped trusting that "nothing changed in DNS" was a statement anyone could actually verify. Why DNS Is the Silent-Failure Layer of Every Infrastructure DNS is configuration. It's just not a configuration you can store in your repo, lint on a commit, or review in a pull request. It lives in a registrar panel or a DNS provider dashboard, updated by humans who may or may not be following a change-control process, and it's completely invisible until something breaks. Every other layer of your stack has some kind of drift detection built in these days. Config management tools track the desired state of your servers. Container orchestrators know what's supposed to be running. Infrastructure-as-code tools will tell you if something drifted from the Terraform state. DNS gets none of that by default. You get a text field in a web UI, a change that takes effect whenever the TTL expires, and exactly zero notifications. The operational pattern most teams rely on is "we'll know when it breaks." And they're right. They will know. They'll know at 2 PM on a Tuesday when a customer reports it, after a sales lead gets lost, after the support team has spent 45 minutes ruling out everything else. The detection mechanism is user reports, which is among the worst possible monitoring strategies. There's also a subtler problem. The change usually isn't malicious. It's not a security incident, at least not at first. It's a vendor cleanup, a platform migration, someone at a partner org tidying up their infrastructure without realizing your CNAME still pointed at them. It's the kind of change that feels harmless to whoever made it and catastrophic to whoever depends on it. The fix isn't complicated. What you need is a declared source of truth for what your DNS should look like, a way to compare that against what it actually looks like right now, and something that runs that comparison regularly enough to catch drift before users do. That's the pattern. The implementation fits in a bash script. As The Geek Learns is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber. The Pattern, the Four States, and a Wrapper You Can Use Today The idea is straightforward: declare your expected DNS state once in a YAML file, then run a script nightly that queries your authoritative nameservers and compares what it finds against what you declared. Any gap between the two gets reported. The baseline file is the key piece. It's not generated. You write it manually, and that act of writing it is itself useful, because it forces you to actually look up what each record currently is and decide "yes, that's correct." Once it exists, it becomes your source of truth. Commit it to your repo. Update it when you make a legitimate DNS change. The baseline is always what you intend, and the script is always asking whether reality matches. When the diff runs, every record type for every domain you declared lands in one of four states: MATCH means the live record matches the baseline exactly. This is the quiet result. Nothing to do. NEW means a record exists in live DNS that isn't in your baseline. It could be a vendor auto-adding a TXT verification record. It could be someone provisioning a new subdomain. It could be something you should care about. The script surfaces it; you decide. MISSING means your baseline declared a record that doesn't exist in live DNS anymore. An A record that was decommissioned without cleaning up. An MX record that got deleted. A CNAME that was removed when a vendor migrated their platform. DRIFT means the baseline and live DNS both have records for a type, but the values don't match. This is the Tuesday-afternoon scenario: the CNAME target changed, the IP behind an A record flipped, the SPF policy was modified. NEW and MISSING and DRIFT all mean something in your environment changed without you being told. The script exits nonzero when any of those occur, which makes it trivially composable with cron, alerting pipelines, or anything else that reads exit codes. Here's a minimal working bash wrapper you can adapt right now. It keeps the dependencies to just `dig` and `bash`, uses a simple shell-array for your expected records instead of parsing YAML, and is short enough to read in under five minutes: #!/usr/bin/env bash # dns-check.sh # WHAT: Minimal DNS drift checker - declare expected records, diff against live DNS # WHY: Catches silent DNS changes before they become incidents # Usage: ./dns-check.sh # Add to cron: 0 2 * * * /path/to/dns-check.sh || echo "DNS DRIFT DETECTED" | mail -s "DNS Alert" you@example.com set -euo pipefail # ── CONFIGURATION ──────────────────────────────────────────────────────────── # Authoritative resolver to query against (use your domain's actual nameserver) # WHY: Querying authoritative NS catches changes before they propagate to resolvers RESOLVER="8.8.8.8" # Declare expected records as: "domain|TYPE|expected_value" # Get current values with: dig +short example.com A # Run once to populate, then treat this as your source of truth EXPECTED_RECORDS=( "example.com|A|93.184.216.34" "www.example.com|CNAME|example.com.cdn.cloudflare.net" "example.com|MX|10 mail.example.com" "example.com|TXT|v=spf1 include:_spf.google.com ~all" ) # ── DIFF ENGINE ────────────────────────────────────────────────────────────── DRIFT_FOUND=0 for record in "${EXPECTED_RECORDS[@]}"; do # Parse the declared record into its three parts domain="${record%%|*}" rest="${record#*|}" rtype="${rest%%|*}" expected="${rest#*|}" # Query live DNS at the authoritative resolver # WHY: +short gives us clean output; @resolver pins which nameserver answers actual=$(dig +short "@${RESOLVER}" "${domain}" "${rtype}" 2>/dev/null \ | sort \ | tr '\n' '|' \ | sed 's/\.$//g; s/|$//') # Normalize expected for comparison (sort, strip trailing dots) expected_norm=$(printf '%s\n' "${expected}" \ | sort \ | tr '\n' '|' \ | sed 's/\.$//g; s/|$//') # Compare and classify the result if [[ -z "${actual}" ]]; then # Record existed in baseline but dig returned nothing: MISSING printf "MISSING %s %s (expected: %s)\n" "${domain}" "${rtype}" "${expected}" DRIFT_FOUND=1 elif [[ "${actual}" != "${expected_norm}" ]]; then # Record exists but value changed: DRIFT printf "DRIFT %s %s\n expected: %s\n actual: %s\n" \ "${domain}" "${rtype}" "${expected}" "${actual}" DRIFT_FOUND=1 else # Values match: MATCH (silent - no output unless you add --verbose logic) : # nothing to report fi done # Exit nonzero on any drift - composable with cron, alerting, CI checks if [[ "${DRIFT_FOUND}" -eq 1 ]]; then printf "\nDrift detected. Review records above.\n" >&2 exit 1 fi printf "All %d declared records match live DNS.\n" "${#EXPECTED_RECORDS[@]}" exit 0 Save that, drop your actual records into `EXPECTED_RECORDS`, and run it once to confirm it sees what you expect. Then add it to your crontab: # Run nightly at 2 AM, email on drift 0 2 * * * /path/to/dns-check.sh || echo "DNS drift detected on $(hostname)" | mail -s "[ALERT] DNS Drift" you@example.com The "NEW record exists in live DNS" state isn't in this minimal version, since detecting it requires knowing which record types to scan for beyond what you declared. The four-state model handles that fully once you know which types to watch, which is what the complete kit covers. For a first pass, catching MISSING and DRIFT gets you most of the value. A few practical notes. Use `dig +short` rather than `dig` without `+short` or you'll spend time parsing the human-readable output format. Always query a specific nameserver with `@resolver` rather than relying on your local resolver, since caching can hide drift for hours. The MX record normalization is worth being careful about: `dig +short` returns the priority prefix as part of the value (`10 mail.example.com`), so your expected strings need to include it exactly that way. And commit the script alongside your baseline declaration. If the baseline lives in the repo, you get history, diffs, and code review for DNS changes as a side effect. What Else Lives in the Full Kit The script above covers the core pattern. The full DNS Drift Detector kit is what you reach for once you've outgrown the wrapper. The main `dns-drift-detector.sh` handles all five record types: A, AAAA, CNAME, MX, and TXT. That last group matters more than it seems. TXT records are where SPF policies live, where DKIM selectors sit, where domain verification tokens accumulate. Quiet SPF drift can break your email deliverability for days before anyone notices. DKIM drift means legitimate mail starts hitting spam folders. These aren't hypothetica
Jun 8

I Secured My AI Agent With a 7-Layer Threat Model

I have an autonomous AI agent running on my Mac Studio. It has full shell access, reads my calendar, manages my tasks, and sends iMessages on my behalf. It runs 24/7 as a background service. If that sentence doesn’t make you slightly nervous, you haven’t been paying attention. In February 2026, researchers found over 135,000 OpenClaw instances exposed to the public internet. A coordinated attack called ClawHavoc planted over a thousand malicious plugins in the community registry. Nine CVEs have been disclosed, including remote code execution. I needed to take security seriously. Not “I changed the default password” seriously. Threat-model seriously. MAESTRO: Seven Layers of Things That Can Go Wrong The Cloud Security Alliance published a framework called MAESTRO—a 7-layer threat model specifically designed for agentic AI systems. Ken Huang mapped it directly to OpenClaw’s codebase, identifying 35+ specific threats across every layer of the stack. Here are the seven layers, translated from security-paper language into “things that could actually ruin your day”: Layer 1: Foundation Models: Someone sends your agent a crafted message that hijacks its behavior. Prompt injection. Jailbreaks. System prompt leakage. Your agent does what an attacker tells it to instead of what you told it to. Layer 2: Data Operations: Your credentials are stored in plaintext JSON files. Your session logs contain every conversation forever. A malicious skill injects code through your workspace. Layer 3: Agent Frameworks: The agent misuses its own tools. It runs shell commands it shouldn’t. It spawns sessions without authorization. It escalates its own privileges. Layer 4: Deployment & Infrastructure: Your gateway is exposed to the network. Someone brute-forces the WebSocket token. A reverse proxy misconfiguration bypasses authentication entirely. Layer 5: Evaluation & Observability: Nobody’s watching the agent for anomalous behavior. There’s no audit trail. Logs can be tampered with. If the agent starts acting weird, nothing catches it. Layer 6: Security & Compliance: Your DM policy is misconfigured. Anyone can message the agent. Pairing codes can be brute-forced. Identity can be spoofed across channels. Layer 7: Agent Ecosystem: A malicious plugin gets installed. A legitimate plugin’s npm dependency gets compromised. The skill registry serves poisoned packages. The critical attack chain MAESTRO identifies: compromise the gateway (Layer 4) → access the session store (Layer 2) → poison conversation history (Layer 1) → control the agent (Layer 3) → spread via messaging (Layer 7). Reading this was humbling. I’d addressed some of these by instinct during setup. Loopback binding, directory permissions, and pairing-based access control were all implemented. But “some” isn’t a security posture. SecureClaw: The Audit SecureClaw is an open-source security tool built specifically for OpenClaw by Adversa AI. It maps to MAESTRO, OWASP, MITRE ATLAS, and NIST AI 100-2. The install is a git clone and a bash script, no npm install, no network calls, and no surprises. git clone https://github.com/adversa-ai/secureclaw.git bash secureclaw/secureclaw/skill/scripts/install.sh Then you run the audit: bash ~/.openclaw/skills/secureclaw/scripts/quick-audit.sh My baseline score: 57 out of 100. Zero criticals. Three HIGHs. Three MEDIUMs. Eight checks passing. Here’s what passed without any work: • Gateway bound to loopback (127.0.0.1) not exposed to network • Gateway authentication present • Directory permissions set to 700 (owner only) • No browser relay exposed • DM policy set to pairing (not open) • Skills clean of malicious patterns And here’s what failed: 🟠 HIGH Plaintext key exposure: Keys in openclaw.json and 5 backup files 🟠 HIGH Sandbox mode: commands run directly on host 🟠 HIGH Exec approval mode: agent acts without human approval 🟡 MED No cognitive file baselines: can’t detect tampering 🟡 MED Default control tokens: vulnerable to spoofing 🟡 MED No failure mode: no graceful degradation The Hardening Step 1: Clean up credential leaks. OpenClaw creates .bak files every time you change config. Each backup contains your full config, including Slack tokens and API keys. I had five of them sitting in the OpenClaw directory. Deleted them all. Set the main config to 600 permissions. This is the kind of thing that’s easy to miss and catastrophic to ignore. A single ls -la ~/.openclaw/ would show them. But who runs ls -la on their config directory after every change? Step 2: Create integrity baselines. SecureClaw’s hardener generates SHA256 hashes of your “cognitive files” IDENTITY.md, AGENTS.md, and HEARTBEAT.md. These are the files that define who your agent is and what it does. If an attacker or a hallucinating agent modifies them, the nightly integrity check will catch it. bash ~/.openclaw/skills/secureclaw/scripts/quick-harden.sh Step 3: Exec approvals. This is the big one. MAESTRO recommends human-in-the-loop approval for all shell commands. But my agent runs morning briefings and heartbeat checks on cron—unattended. Setting approvals to “always” would break all automation. The solution: an allowlist with on-miss approval. I created ~/.openclaw/exec-approvals.json with 17 safe command patterns: imsg, calctl, apple-reminders, cairn, and basic file operations. Tars can run these freely. Anything else; curl, rm, pip install, or any command not on the list, requires human approval. { “defaults”: { “security”: “allowlist”, “ask”: “on-miss” }, “agents”: { “main”: { “allowlist”: [ { “pattern”: “imsg *”, “note”: “iMessage send/read” }, { “pattern”: “calctl *”, “note”: “Apple Calendar” }, { “pattern”: “cairn *”, “note”: “Task management” } ] } } } This is the trade-off MAESTRO doesn’t talk about: security versus automation. Maximum security means every action needs approval. Maximum automation means the agent acts freely. The allowlist is the middle ground. Routine operations are pre-approved, and novel or dangerous operations require a human. Step 4: Full plugin install. Beyond the bash scripts, SecureClaw has a full npm plugin with 56 runtime audit checks, background monitors for config drift, and real-time integrity verification. Installing it required building from source (TypeScript → JavaScript) and registering it with OpenClaw’s plugin system. openclaw plugins install -l /path/to/secureclaw openclaw config set plugins.allow ‘[”secureclaw”]’ That plugins.allow line is important. By default, OpenClaw will auto-load any discovered plugin. Explicit trust means only plugins you’ve approved get loaded. Step 5: Nightly audit cron. A macOS LaunchAgent runs the full audit suite every night at 2 AM which includes quick-audit, integrity check, and supply chain scan. Results go to secureclaw-audit.log. If something changes overnight, it shows up in the morning. The Final Score After hardening: 64 out of 100. Nine checks passing. Zero criticals. The three remaining HIGHs are documented, accepted trade-offs: Findings I accepted (with reasoning)—Sandbox mode (Docker sandboxing would break imsg, calctl, and Apple Reminders); Plaintext keys in config (inherent to the platform config format, file is locked to 600); Exec approval not “always” (using allowlist + on-miss; full “always” breaks unattended cron automation). The two MEDIUMs, control token customization and failure mode configuration, aren’t supported in OpenClaw v2026.3.2’s config schema yet. SecureClaw checks for them proactively. They’ll be fixable when OpenClaw adds the config options. What I Actually Learned Security isn’t a feature you enable. It’s a series of trade-offs you make with your eyes open. Sandbox mode is “more secure” but breaks the tools that make the agent useful. Approval mode “always” is “more secure” but kills the automation that makes the agent worthwhile. The right security posture isn’t maximum restriction; it’s documented, intentional decisions about what risks you accept and why. Automated scanning is essential but insufficient. SecureClaw’s audit caught things I would have missed, including the .bak files with credentials, the missing integrity baselines, and the open exec policy. But the HIGHs it flagged as failures are things I’ve consciously accepted. No scanner can evaluate your specific trade-offs. The biggest threat isn’t external. In my setup (loopback-bound, pairing-gated, allowlist-filtered), the most likely security failure isn’t a network attacker. It’s a malicious skill, a compromised npm package, or the agent itself hallucinating destructive actions. Layer 7 (ecosystem) and Layer 1 (model behavior) are the real attack surfaces for a local-first setup. The exec approval allowlist is my primary defense for both. Clean up after yourself. OpenClaw creates backup files containing credentials on every config change. There’s no auto-cleanup. If you’re running OpenClaw, go check your directory right now: ls ~/.openclaw/*.bak*. You might be surprised. Quick Reference Hardening actions and commands: install, run audit, apply hardening, check integrity, scan skills, check for credential leaks, set exec approvals, set plugin trust. Commands target ~/.openclaw/skills/secureclaw/scripts/. Full command details in the image. Update—June 2026: What I Actually Did When I Moved to ClaudeClaw I wrote this piece in March, when OpenClaw was still the thing running my Mac Studio. By the end of April, I’d shut it down. Disabled the cron jobs, quarantined the LaunchAgents, and rebuilt the whole stack on the Claude Agent SDK. Based off of ClaudeClaw from the Early AI-Dopters AI learning group. The full post-mortem on why: Why? The short version is this: I couldn’t see into OpenClaw. Whic
Jun 4 · Bonus

5 Questions to Ask Before You Build the AI Project Your CEO Just Pitched

5 Questions to Ask Before You Build the AI Project Your CEO Just Pitched You know the email. It shows up Tuesday morning, forwarded with a few lines of enthusiasm and a ChatGPT-drafted proposal attached. "Saw this and thought of us. Can we do this?" The PDF has a logo, bullet points, and exactly zero integration requirements. It also has a six-week timeline and a budget that assumes nothing goes wrong. You have somewhere between 24 and 72 hours before your CEO follows up asking what you think. If you say yes, you're on the hook for a project you didn't scope. If you say no, you're the person who kills ideas. Neither answer is actually available to you. What you need is a third path: a structured evaluation that produces a defensible, professional response in the time it takes to drink your morning coffee. That's what the Technical Reality Check is. Five questions. One page. Every answer points directly at a commitment your organization will have to honor if this project moves forward. Here it is in full. The Technical Reality Check: 5 Questions That Surface What the Proposal Left Out Question 1: What specific business outcome does this solve, and how will we measure success? AI tools generate confident-sounding proposals that describe solutions, not problems. A proposal for "an AI-powered IT ticketing system" describes a technology. It doesn't describe what's broken right now, how broken it is, or what "fixed" looks like in measurable terms. Before any conversation about implementation, you need an answer to: what does success look like in six months, and how will we know we hit it? Ticket resolution time down 30%? First-contact resolution rate up 20%? Those are real answers. "Things will be more efficient" is not. Unmeasurable projects never officially fail. Which means they never stop consuming resources. This question isn't about being difficult. It's about making sure the organization is buying an outcome, not a technology. The red flag: Any proposal where the only success metric is "we deployed it." Question 2: Who owns the ongoing maintenance, security patching, and vendor relationship? Vendor proposals describe launch day. They are almost entirely silent about year two. Every new system creates a permanent maintenance obligation: patching, credential rotation, user access reviews, API deprecations, contract renewals, and a support relationship with a vendor whose incentives are not aligned with yours. If that obligation doesn't have a named owner before the project starts, IT inherits it by default. Forever. Without headcount. This question forces the conversation about operational reality before anyone has signed a contract. The answer also tells you a lot about how seriously the proposal was thought through. If nobody has asked "who maintains this?", nobody has thought past the demo. The red flag: "The vendor handles everything." Vendors handle their system. You handle the integration, the credentials, the user provisioning, the data pipeline, and the 2 AM alert when something breaks between their system and yours. Question 3: What happens to our existing systems, data, and processes? New systems don't exist in a vacuum. They touch your directory, your ticketing system, your identity provider, your backup scope, your audit logs. Each of those integration points is a potential failure mode, a migration cost, or a compliance question. AI-generated proposals routinely skip integration complexity. This isn't because the AI is being deceptive. It's because the AI generating the proposal doesn't know your stack. The proposal was written in a context-free environment. Your environment is anything but. Before committing, you need to know: what does this touch, and what has to move or change for it to work? And who does that work? Data migration alone can turn a "simple" deployment into a multi-month project. Asking this question early is how you find out. The red flag: "It integrates easily with your existing tools." That's a sales phrase, not an engineering estimate. "Easy" is undefined until your systems engineer has looked at the API docs. As The Geek Learns is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber. Question 4: What's the realistic timeline and resource cost, not the optimistic one? Vendor timelines assume clean data, available staff, smooth approvals, and nothing else on the backlog. Your timeline accounts for your actual team, their current commitments, the security review cycle, the change management process, and the three things nobody predicted. The gap between those two numbers is usually where projects go sideways. Not because the technology failed, but because the plan never accounted for reality. This question also surfaces a common pattern: the timeline was set before IT was consulted. Any timeline that precedes a technical assessment is a guess dressed up as a schedule. You're the one who'll be explaining the delay when the guess turns out to be wrong. The red flag: A go-live date in the proposal. That's not a plan, it's a target somebody made up. Ask who set it and what it was based on. Question 5: What's the exit strategy if this doesn't work as expected? Every vendor says their product works. You need a plan for when it doesn't. When the pricing doubles at renewal. When the company gets acquired and support degrades. When a compliance requirement changes and the product doesn't keep up. Data portability, rollback procedures, and contractual exit terms are not pessimism. They're the difference between a manageable failure and a situation where you're paying for a system that doesn't work because migrating off it is too expensive to contemplate. This question also signals organizational maturity. IT teams that ask exit questions before they sign contracts don't get held hostage. IT teams that don't ask end up managing a five-year sunset project for a tool they stopped believing in three years ago. The red flag: "We can always just stop using it." Can you migrate your data? In what format? At what cost? How long does it take? If nobody has asked those questions, stopping isn't as simple as it sounds. The Checklist in Practice: Walking Through a Real Scenario Here's what a Technical Reality Check pass looks like when you actually run it. Your CEO forwards a ChatGPT-drafted proposal on a Monday morning. The subject line is "AI Agent for IT Ticket Triage." The proposal is two pages. It describes an AI system that reads incoming IT tickets, categorizes them by priority and type, routes them to the right team, and drafts first-response emails automatically. There's a mockup screenshot. There's a line about "easy integration with your existing ITSM." There's a timeline: six weeks to deployment. You open the Technical Reality Check. Q1: What specific business outcome does this solve? The proposal says "reduce response times and improve IT efficiency." No baseline. No metric. You check your current ITSM data: average first response is 4.2 hours, your SLA target is 2 hours, you're meeting it 71% of the time. Now you have a problem worth solving. You write it down: "We need first-response SLA compliance above 85%. Current state: 71%." That's the outcome. If the AI system can't demonstrate a path to that specific number, the conversation is premature. Q2: Who owns maintenance and the vendor relationship? Nobody is named in the proposal. You have a team of four. One of them is already carrying the ITSM admin role. You note: this needs a named owner and a rough estimate of ongoing hours before it can go to planning. You also flag the API integration dependency: your ITSM has a rate-limited API that's caused problems before. Someone needs to read the vendor's API docs before "easy integration" gets treated as a fact. Q3: What happens to existing systems and data? Your ticketing data includes ticket histories, customer records, and some attachments. The proposal doesn't mention data handling. You note two questions: where does ticket data go once the AI processes it, and what are the data residency requirements given that you handle some HIPAA-adjacent systems? That second question alone could be a blocker. You don't know yet, but you know to ask. Q4: What's the realistic timeline and resource cost? Six weeks assumes nothing else is happening. Your team is currently in the middle of a server migration that runs through the end of the month. Realistically, this project can't start until mid-next-month, and your most experienced engineer (the one who'd need to own the integration) is at 90% utilization. You write down: "Realistic start: six weeks out. Realistic deployment: 12-16 weeks from proposal receipt. Not 6." Q5: What's the exit strategy? The proposal doesn't mention it. You note: before any contract, you need to know the data export format, the contract term length, and what happens to stored ticket data at offboarding. That's it. You've just done a Technical Reality Check. Total time: 15 minutes. Now you can write a response. Not "no." Not "yes." Something like: "I've done a preliminary review. Before we can assess feasibility, I need answers to five specific questions. Here they are. Happy to set up 30 minutes to walk through them together." You've moved the conversation from enthusiasm to decision-ready. You've protected the organization without being obstructionist. And you have a written record of the questions you asked, which matters if the project later goes sideways without those answers ever being provided. That's the whole point of the Technical Reality Check. It's not a rejection letter. It's the question set that separates proposals worth pursuing from proposals worth deferring. What the Rest of the Toolkit Covers The Technical Reality Check is the first thing you run. It gets you to a defensible position in 15 minutes. But the full response (the one that protects your career, your team's credibility, and the
Jun 2 · Bonus

Anthropic Shipped an AI Security Scanner. Here's the Per-PR Cost Math.

The first time my manager asked, “Are we using AI to scan PRs for vulnerabilities yet?" I said I'd look into it. Then I spent four hours reading docs, pricing pages, and GitHub issues before I had a number I trusted enough to put in a Slack message. That should have taken twenty minutes. The number exists. The math is straightforward. Nobody had written it down in one place where a platform engineer could find it. Anthropic quietly shipped `anthropics/claude-code-security-review` as a first-party GitHub Action. You add a workflow file, point it at a secret, and it posts a findings comment on every pull request. The scanner reasons about code rather than matching signatures, which means it catches things like logic-level injection paths that a regex-based tool would miss. It also means the false-positive profile is different from what you're used to, and you need a triage process before you wire it to branch protection. This article gives you the cost math and the triage playbook in full. Both are things you'd need even if you built this yourself. Why "Just Run It" Isn't a Strategy Adding a CI step that calls an LLM API isn't free, and it isn't free to manage. There are two failure modes I see teams hit. The first is budget surprise. Someone adds the scanner, it runs for a month, the cloud bill shows up, and the conversation gets uncomfortable because nobody did the math upfront. The scanner doesn't cost a lot, but "not a lot" needs a number attached to it before you walk into a budget conversation. The second failure mode is alert fatigue. The scanner finds something on every PR. Engineers start skimming the findings comment the same way they skim Dependabot. One day there's a real SQL injection in a PR, it's buried in a list of five findings, and it merges. The triage process is what keeps findings meaningful instead of noise. Both problems are solvable. The math takes ten minutes. The triage rubric takes one meeting to agree on. Neither requires buying anything yet. As The Geek Learns is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber. What This Costs Per PR (The Real Numbers) Claude bills per token. One token is roughly four characters of text. A PR diff gets converted to tokens and sent to the model as input. The model's findings comment is output tokens. The formula is simple: Cost = (input_tokens × input_rate) + (output_tokens × output_rate) For Claude Sonnet 4.6, the rates are approximately $3 per million input tokens and $15 per million output tokens. (Verify current pricing at platform.anthropic.com before your next budget conversation. Rates change.) Scenario 1: A 200-Line PR Diff A focused bug fix or small feature. Maybe three files changed. Component Tokens Rate Cost ----------------------------------------------------------------------- System prompt + workflow context (input) 2,000 $3.00 / 1M $0.006 PR diff, ~200 lines (input) 1,300 $3.00 / 1M $0.004 Findings output, 1-2 findings (output) 600 $15.00 / 1M $0.009 ----------------------------------------------------------------------- Total per PR ~$0.019 Call it two cents. For a small PR, this is a rounding error. Scenario 2: A 2,000-Line PR Diff A refactor, a new feature, a dependency upgrade touching multiple services. Component Tokens Rate Cost ------------------------------------------------------------------------ System prompt + workflow context (input) 2,000 $3.00 / 1M $0.006 PR diff, ~2,000 lines (input) 13,000 $3.00 / 1M $0.039 Findings output, 2-4 findings (output) 1,500 $15.00 / 1M $0.023 ------------------------------------------------------------------------ Total per PR ~$0.068 Seven cents. Still noise for a single PR. Monthly Back-of-the-Envelope The question your manager will ask isn't “What does one PR cost?" It's “What does this cost per month?" If your team merges 80 PRs a month (about 4 per business day), with a mix of small and medium diffs averaging around $0.04 per scan: 80 PRs × $0.04 = $3.20/month Even if your average PR runs larger, say closer to the 2,000-line scenario at $0.07 each: 80 PRs × $0.07 = $5.60/month A busy multi-team repo at 400 PRs a month at $0.07 each is $28/month. That's less than one developer's Spotify subscription. The cost math isn't the obstacle here. The obstacle is having a triage process in place before you flip it on. One practical note: output token count varies with how many findings the scanner generates. Zero findings produces shorter output and costs less. Ten findings costs a bit more. The estimates above assume one to three findings per PR, which is realistic for an established codebase with existing security hygiene. The 3-Tier Triage Rubric Every finding the scanner posts needs to land in one of three buckets. Here's the decision framework. REAL: Block the merge. Fix it. A finding is REAL when it describes an exploitable path with proof. The scanner should show you the specific line, and explain how an attacker would reach it, and the explanation should hold up when you read the code yourself. SQL injection via string concatenation in a request handler is REAL. Hardcoded credentials that actually ship to production are REAL. The discriminator: "If an attacker had this codebase and five minutes, could they demonstrate this?" If yes, it's REAL. Block the PR and fix it before merge. PROBABLE: Human review required. A finding is PROBABLE when the pattern is plausible, but context matters. The scanner can see the diff, not the full runtime environment. A finding might flag a code path that looks injectable, but your framework wraps every database call with prepared statements at a layer the scanner can't see. Or the flagged code only runs in a context that requires prior authentication the scanner doesn't know about. The discriminator: "This could be real, but I need someone who knows this codebase to confirm." Don't block the PR automatically. Route it to the PR author or a senior engineer. Give it a two-hour resolution window before it escalates. DISCARD: Suppress it with a documented rule. A finding is DISCARD when it's structurally a false positive. The scanner flagged test code that never runs in production. It flagged a generated file you don't own. It flagged a template placeholder in an IaC file that gets substituted at deploy time. It flagged a public API URL as a hardcoded credential because the word "key" appeared in the variable name. The discriminator: "Would an attacker gain anything by knowing this?" If no, it's a DISCARD. The important part is that you document why. Suppressing without a comment is how you end up silently ignoring real findings six months later when the context is gone. A Worked Example: The SQLAlchemy False Positive Here's the kind of finding that will show up on your team in the first two weeks if you use any ORM. A PR adds a new search endpoint. Somewhere in the diff, there's code like this: def search_users(search_term: str): results = db.session.query(User).filter( User.name.ilike(f"%{search_term}%") ).all() return results The scanner flags it as a potential SQL injection vulnerability. The finding explains that `search_term` appears to be user-controlled input and is being interpolated into a query string. Severity: HIGH. A human reading this would notice a few things. The code uses SQLAlchemy's ORM layer. The `.ilike()` method is a SQLAlchemy query construct, not a raw SQL string. SQLAlchemy sends the query to the database as a parameterized statement with the value bound separately, which is exactly the defense against SQL injection. The `f"%{search_term}%"` is constructing the pattern string in Python, but that pattern gets passed as a bound parameter by the driver. This is a DISCARD. The scanner saw string interpolation near a database call and correctly identified that as a pattern worth flagging. It couldn't see that the ORM handles parameterization automatically. The suppression note you'd document reads something like: SQLAlchemy ORM calls via `.filter()`, `.ilike()`, `.like()`, and similar query methods use parameterized queries automatically. String interpolation to construct pattern values (e.g., for LIKE clauses) does not create injection risk when using these methods. Do not flag SQLAlchemy ORM filter calls as SQL injection. That note goes into a filter file your workflow references. The same class of finding stops appearing on every PR that touches a database query. Two things to notice about this example. First, the scanner wasn't wrong to flag it. Without ORM context, string interpolation near a SQL-like method call is exactly what a good scanner should notice. Second, the suppression is better than just dismissing it, because the documented rule now covers every future PR using the same pattern. You pay the triage cost once. What the Full Kit Covers The cost math and triage rubric are the foundation, but they don't tell you how to wire any of this into GitHub. The full guide covers the GitHub Actions workflow YAML itself (the one that calls `anthropics/claude-code-security-review` and handles the findings response), how to set up branch protection so that HIGH findings actually block merges instead of just posting a comment, and the in-workflow automation that runs the REAL/PROBABLE/DISCARD classification before the comment lands on the PR. There's also a head-to-head with GPT-4o as a second-opinion scanner. They're not equivalent tools. The Anthropic action is purpose-built for this job. The GPT-4o path is a chat completions API call with a security prompt, which costs about seven times less per PR but produces more variable results. The comparison matri
Jun 1

Stop Paying for Cloud APIs: Building a Local AI Stack on Mac Studio

Running LLMs locally usually feels like a compromise. You either get tiny, fast models that can't think or massive models that crawl at one word per minute. But with the right hardware, you can break that trade-off and replace your cloud billing entirely. The Setup The dilemma most developers face is a choice between two bad options. On one side, you have cloud APIs like OpenAI or Anthropic. They are easy to use and incredibly smart, but they come with a heavy "API tax" and privacy concerns. If you're processing proprietary code or sensitive customer data, sending that information to a third-party server is a massive risk. On the other side, you have traditional local setups. Usually, you're limited by the VRAM on your GPU. If you have a standard consumer card with 12 GB or 24 GB of VRAM, you're stuck with small models. You can't run the heavy-hitters that actually compete with GPT-5. This creates a wall where local AI is only good for "toy" problems, while production workloads stay in the cloud. The Hardware Math The real secret to breaking this wall is Apple Silicon's unified memory. On a Mac Studio with an M3 Ultra, the 256 GB of memory is shared between the CPU and the GPU. This eliminates the VRAM bottleneck that kills most local setups. You aren't limited by a tiny slice of video memory; you're limited by the total pool of system memory. When you look at the actual numbers, the math becomes very clear. Here is how I structure my model loading on this machine: If you load all of these concurrently, you're using roughly 107 GB of memory. That leaves about 149 GB for the macOS, your browser, your IDE, and everything else. This allows you to run a 32B model for writing, a 72B for research, and an 8B for quick checks all at the same time. The economics are just as compelling. A Mac Studio setup costs anywhere from $4,000 to $7,000 as a one-time purchase. If your production workflows are costing you $200 to $500 per month in cloud tokens, the hardware pays for itself in 12 to 18 months. After that, the "cost" of running a massive model is basically just the electricity it uses. Plus, you finally own your data. Temperature Is a Randomness Dial, Not a Quality Dial I see a lot of tutorials that suggest using a temperature of 0.7 for every single prompt. That is a mistake. Temperature doesn't make a model "smarter" or "better." It is simply a randomness dial. It controls how much the model is allowed to deviate from the most likely next word. If you use the same temperature for everything, your pipeline will fail. For tasks requiring high precision, a high temperature will introduce hallucinations. For creative tasks, a low temperature will make the output feel robotic and repetitive. In my production newsletter pipeline, I use a specific routing table to manage this: There are two key takeaways here. First, for fact-checking, you want the temperature at 0.1. This makes claim extraction repeatable and ensures your verdicts are consistent every time you run the script. Second, setting the temperature to 0.8 for "humanization" might seem counterintuitive, but it works. A higher temperature allows the model to make less predictable word choices, which actually produces more natural, less "AI-sounding" prose. The OpenAI Compatibility Trick As The Geek Learns is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber. The best part about using Ollama for this setup is that you don't have to rewrite your entire codebase. Ollama exposes an OpenAI-compatible API at `localhost:11434/v1`. This means any tool, library, or SDK that respects the `OPENAI_BASE_URL` environment variable can be redirected to your local machine with almost zero effort. You can point your existing Python scripts or LangChain agents to your local Mac by simply setting these variables in your terminal: export OPENAI_BASE_URL=http://localhost:11434/v1 export OPENAI_API_KEY=ollama # Any value works; Ollama doesn't check this If you are working within a configuration file, such as a JSON config for a custom agent, it looks like this: { "model": "openai/qwen3:32b-fast", "openai_base_url": "http://localhost:11434/v1", "openai_api_key": "ollama" } Every LangChain chain, every summarization script, and every SDK that follows the OpenAI protocol becomes a free local-model call. You can migrate an entire project from GPT-4 to your local M3 Ultra in about 30 seconds. Why This Pattern Matters This isn't just about saving money on API credits. It is about architectural sovereignty. When you move your core intelligence layer to local hardware, you remove the dependency on a single vendor's uptime, pricing changes, and content filtering policies. The pattern of using unified memory to host multiple specialized models at different temperatures allows you to build a "factory" of intelligence. You have a high-speed 8B model for sorting, a balanced 32B model for drafting, and a heavy 70B model for deep reasoning, all running in the same memory space. This is how you build a production-grade AI stack that is private, permanent, and incredibly cost-effective. ( This cost calculation was based on 6-month-ago pricing when I bought my Mac Studio. Since then the availability of Mac Studios with large amounts of unified memory has evaporated. This has driven up pricing. Hopefully this is temporary. ) Quick Reference Key Commands * Set local base URL: `export OPENAI_BASE_URL=http://localhost:11434/v1` * Check running models: `ollama ps` Temperature Cheat Sheet * 0.1 to 0.3: Extraction, coding, fact-checking, and structured data (JSON). * 0.7: General purpose, drafting, and summarization. * 0.8 to 1.0: Creative writing, brainstorming, and persona simulation. Found this useful? I share practical lessons from my systems engineering and AI journey at As The Geek Learns Get full access to As The Geek Learns at astgl.com/subscribe
May 25

I Built a Self-Improving AI Swarm. After 100 Runs It Was No Better Than Run One.

I spent twelve hours watching a leaderboard that refused to move. The setup was simple: six AI agents tasked with writing technical articles. They were designed to be a closed loop. The drafter would write, the grader would score, and the agents would then "evolve" their own configs to chase a higher score. I hit "go" on my Mac Studio, went to bed, and woke up to a flat line. After 100 iterations, the average score had crawled from 63.0 to 63.9. The all-time peak was 69.0 at iteration 79, but the system never stayed there. It was a C-minus. Indistinguishable from noise. I had fallen for the Autonomy Fallacy. I assumed that if I gave a swarm of LLMs the right knobs—temperature, max_tokens, and the ability to append "prompt additions" to their system prompts—they would naturally drift toward quality. I was wrong. As The Geek Learns is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber. When I opened config/agents/drafter.yaml to see what the agent had "learned," I found a disaster. The prompt_additions list had evolved into five overlapping phrases of pure SEO buzzword soup. It was telling itself to be "semantically rich," "data-dense," and to "enhance semantic alignment by including keyword-integrated background information." The drafter hadn't learned how to write a better article. It had learned how to trick the grader. The Smoking Gun The smoking gun was the model choice. I was using qwen3:8b as the grader to judge the output of qwen3:32b-fast. I had a smaller, weaker model acting as the quality gate for a larger, smarter one. The 8B model couldn't tell the difference between a nuanced technical insight and a paragraph full of "semantically rich context." To the grader, the buzzwords looked like "density." The agents converged on what the grader liked, not on what a human would actually publish. This wasn't self-improvement; it was reward hacking. To make it worse, the first twenty iterations were a total wash. I had a silent JSON parse failure in the config-evolution logic: Expecting value: line 1 column 1 (char 0). The agents were trying to mutate their configs and failing, but the loop kept running. By the time I pushed the fix in commit c28a611, the system had already drifted into a local maximum of corporate-speak. I realized that self-improvement requires an external pull. You cannot have a system where the performer and the judge are of the same pedigree, or worse, where the judge is the weaker link. The Rebuild I tore the architecture down and built v2. First, I moved the "brain" of the operation. The performance stayed local. I used gemma4:31b on the Mac Studio to generate the text, but I moved the judging to the cloud. I plugged in Sonnet 4.6. I decided the cheapest place to spend API tokens wasn't on generating 2,000-word drafts, but on grading them. Second, I killed the "single-shot mutation" approach. In v1, the agent changed its prompt, ran once, and if the score went up, the change stuck. That's too much noise. I replaced it with a tournament. Now, the system samples three different prompt templates from a versioned library. The performer generates three candidates. Sonnet ranks them using a structured rubric and a single API call. Then I implemented an Elo system for the templates. # src/prompt_library.py (excerpt) def record_tournament(self, ranking: list[str]) -> dict: for i in range(len(ranking) - 1): winner = self.templates[ranking[i]] loser = self.templates[ranking[i + 1]] expected_w = 1 / (1 + 10 ** ((loser.elo - winner.elo) / 400)) delta = ELO_K_FACTOR * (1 - expected_w) winner.elo += delta loser.elo -= delta self._maybe_retire_losers() # Templates below Elo 1300 are deleted The templates that consistently win the tournament climb the leaderboard; the ones that produce buzzword soup are automatically retired. What Happened Next The difference was immediate. On the very first run of v2, the drafter scored 81.45. That's twelve points higher than v1's all-time best. Over 25 pinned verification runs, the mean score was 82.67 with a standard deviation of 2.18. The worst draft in that run scored 75.4—still above v1's ceiling of 69.0. The most satisfying part was the judge's feedback. When the system tested the v1-baseline template, Sonnet didn't just give it a low score. It wrote: "The headline 'The Rust Revolution' is pure SEO-speak and the opening paragraph is a textbook AI tell... it's the kind of breathless corporate copy that kills trust immediately." That is exactly the failure mode the local 8B grader had been blind to for 100 iterations. The cost is roughly four cents per tournament. For the price of a coffee, I can run 125 iterations and actually trust that the line on the graph is moving upward. What I'd Tell Myself a Week Ago If you're building a self-improving loop, don't trust the autonomy. You need three things: * A judge stronger than the performer. If the judge is weaker, you aren't optimizing for quality; you're optimizing for the judge's biases. * Tournament selection. Single-shot mutation is just a random walk. You need multi-candidate comparisons to clear the noise floor. * A human-review gate. No automated judge is calibrated forever. Build in a pause where you manually pick the winner and anchor the next round. Stop trying to make the agents smarter. Just buy a better mirror. Improvement isn't about the engine—it’s about the feedback loop. Get full access to As The Geek Learns at astgl.com/subscribe
May 16

Managing Anthropic Agent SDK Costs: A Post-June 15 Billing Playbook

Your background agents are about to run out of money. Anthropic's new credit pool system means your automation could die in a single week. Here is how I re-engineered my stack to stay under budget without breaking my workflows. The Setup You've built a small fleet of agents. They sort your mail, watch your repos, file your daily briefings. My current setup before the June 15th cutover: Then May 13 lands, and Anthropic announces the change: on June 15, every programmatic Claude call moves into a metered monthly credit pool. $100 a month on Max 5x. No rollover. Run the math against your actual schedule. If you've got anything polling on the order of minutes (cron pipelines, hourly digests, watchdog sweeps), that pool drains in 7 to 10 days. And here's the kicker. Your interactive Claude Code keeps working. Your headless automation just stops. You wake up to a dead pipeline, a drained pool, and a subscription that still says active. What's Actually Going On This isn't just a random pricing tweak. There is a clear economic driver here. Throughout early 2026, many third-party tools used the Agent SDK at a $20 Pro subscription rate to run workloads that would cost hundreds at standard API rates. It was essentially compute arbitrage at scale. As The Geek Learns is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber. Anthropic started cracking down in April, but the May 13 announcement is the structural fix. They are moving to dedicated monthly credit pools to restore access under metered billing. The reality is that most agentic operating systems are built directly on the Agent SDK. Because these agents lack a human in the loop to throttle their usage, they are now metered by default. Interactive sessions stay on the flat-rate subscription because the human provides the natural brake. Programmatic agents do not. The Fix I implemented a two-phase mitigation to deploy before the June 15 deadline. Phase 1 was a hot patch designed to provide immediate protection. I added a BILLING_MODE environment variable with three states: unmetered, metered, and paused. The paused state blocks every programmatic call across all providers, while metered enforces a strict cap on the Anthropic route. I also added a file-backed JSON ledger at store/billing-ledger.json to track monthly costs. It uses a write-then-rename pattern to ensure crash safety during updates. To handle errors, I introduced a BillingCapExceeded error class. I used the same instanceof pattern as my KillSwitchRefusal logic so a typo in a message cannot accidentally trigger a retry loop. The logic lives in a single chokepoint: runAgent() in src/agent.ts. The pre-call gate checks the cap, and the post-call gate records result.totalCostUsd from the SDK, firing a Telegram alert if a threshold is crossed. As a final safety measure, I cut the cadence on my two highest-frequency tasks: the pipeline-advance cron moved from 15 minutes to hourly, and I paused the council-evening task entirely under metered mode. // src/config.ts — tri-state env that gates programmatic agent calls export const BILLING_MODE = optional('BILLING_MODE', 'unmetered'); export const BILLING_CAP_USD = number('BILLING_CAP_USD', 80); // src/agent.ts — pre-call gate in the dispatcher function assertBillingAllowed(provider: Provider): void { if (BILLING_MODE === 'paused') { throw new BillingCapExceeded( 'BILLING_MODE=paused — programmatic agent calls are disabled.', ); } if (provider === 'anthropic' && BILLING_MODE === 'metered') { const total = getMonthlyTotal(); if (total >= BILLING_CAP_USD) { throw new BillingCapExceeded( `Anthropic monthly credit cap reached: $${total.toFixed(2)} >= $${BILLING_CAP_USD.toFixed(2)}.`, ); } } } export async function runAgent(opts: AgentOptions): Promise { assertEnabled('AGENTS_ENABLED'); const provider: Provider = opts.provider ?? 'anthropic'; assertBillingAllowed(provider); if (provider === 'ollama') return runOllamaAgent(opts); if (provider === 'codex') return runCodexAgent(opts); return runAnthropicAgent(opts); } Phase 2 focuses on the long-term router infrastructure. I promoted runAgent() from a direct SDK caller to a dispatcher that can route across anthropic, ollama, and codex providers. I also extended the agent.yaml schema with provider: and local_model: fields. I shipped a single-turn Ollama runner that wraps the local-LLM client. It returns totalCostUsd: 0 and a model tag like ollama:llama4:scout. I deliberately avoided tool calls in this initial version to keep the scope small. # agents//agent.yaml — new fields, validated at load id: scout name: SCOUT model: claude-sonnet-4-6 provider: anthropic # default. flip to 'ollama' to route locally. # local_model: llama4:scout # used when provider: ollama To be honest, I did not actually flip any agents to Ollama in this specific PR. The agents I need to move, like STEWARD or WATCHMAN, execute Bash and SQLite queries. A local runner without tool-call support would break them silently. Building a proper tool-call shim takes a few more days, but the cadence reduction and the billing breaker alone are enough to keep my spend under $80 per month. Why This Matters Every person using an agent OS is in the same boat. Whether you use ClaudeClaw, Cline, Aider, or Roo Code, the underlying SDK is the same, and the June 15 cliff is approaching. The playbook I used generalizes: you need one chokepoint, one ledger, and one way to audit your cadence. We also need to be honest about workload requirements. Tasks like editorial review or complex code deliberation still justify the Sonnet price tag. However, simple tasks like classification, routing, or summarization run perfectly fine on a local model with zero metered cost. The router infrastructure makes this migration a simple config flip rather than a massive code refactor. Finally, this reflects where the industry is heading. OpenAI has used usage-based pricing for a long time, and GitHub Copilot is moving toward credit pools. In the next year, more vendors will split consumption between interactive flat-rate plans and programmatic metered usage. Building this abstraction now means you won't have to scramble the next time a vendor changes their terms. Quick Reference * Single Chokepoint: Ensure every agent call flows through one function. This turned a three-week refactor into a one-week job. * Cadence over Architecture: Reducing task frequency (e.g., 15m to 1h) cuts spend faster than migrating to local models. * Ship the Breaker First: Implement the cost ledger and the BillingCapExceeded error as insurance before you attempt the complex provider migration. # The cutover, June 14: flip the env, restart, reseed, smoke-test BILLING_MODE=metered BILLING_CAP_USD=80 # then launchctl kickstart -k gui/$(id -u)/com.claudeclaw.app npm run pipeline -- schedule-advance npm run schedule -- pause council-evening Found this useful? I share practical lessons from my systems engineering journey at As The Geek Learns Get full access to As The Geek Learns at astgl.com/subscribe
May 8

ChatGPT Just Invented an Entirely Fake Version of My MCP Server

I asked ChatGPT to tell me about my own MCP server. It returned about a thousand words of confident, beautifully formatted, completely fabricated nonsense. Tables. Comparisons. A made-up acronym. A "thinking substrate" that sits above data and below agents. None of it is real, and that's the part worth talking about. The Setup My project is called `mcp-astgl-knowledge`. It's an MCP server with 15 tools for searching my newsletter articles, backed by sqlite-vec and Ollama. The whole thing fits on a laptop. ASTGL stands for "As The Geek Learns," which is the name of this newsletter. I wrote it. I shipped it. There is a public GitHub repo and a public package.json. So when a friend asked me what the MCP server actually does, I figured I'd see how each big AI assistant explained it. ChatGPT was first up. I typed in "ASTGL MCP Knowledge" and hit enter. What I got back wasn't an answer. It was a hallucination wearing the suit of an answer. "ASTGL (Abstract Semantic Task Graph Layer) MCP Knowledge Server is an emerging MCP server focused on structured knowledge representation and reasoning... it turns knowledge into graph-based, machine-reasonable structures that agents can query and evolve." That paragraph alone has three fabrications: the acronym expansion (made up), the "graph-based, machine-reasonable structures" (the server stores text chunks with vector embeddings, no graph), and "evolve" (the index is static, refreshed every six hours by a cron job, agents do not edit it). Then it kept going. A four-row "MCP stack" table positioning ASTGL as "the thinking substrate" between data and agents. A comparison matrix against fictional products called "Totem" and "SwarmClaw" that don't exist. A capabilities list including "task decomposition" and "reasoning over structure." Use cases. "Real-world examples." A confident sign-off: "If AST-grep is about seeing code better, then ASTGL is about thinking better." Every word of it written with the calm, structured, lightly-emoji'd authority that makes ChatGPT sound right by default. As The Geek Learns is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber. What's Actually Going On When you ask an LLM about a topic it doesn't have indexed, it has two options: say "I don't know," or fill in the gap with something plausible. In practice, models default to the second one. They're trained to be helpful, and "I don't know" reads as unhelpful. So the gap gets filled. The result is what I'd call a fluency hallucination. The output has no factual grounding, but the writing is structured well enough that a casual reader can't tell. There are bullet points. There are tables. There's a "👉 In plain terms" callout. The rhetorical scaffolding looks like a real explainer because it's been pattern-matched to one. The contents underneath are pure fiction. This is a worse failure mode than search engines have. When Google doesn't know about you, you don't appear in results, and the user can see the gap. When an LLM doesn't know about you, the user gets a beautifully written description of someone the LLM made up, and your real work is still missing, but now there's a fake version sitting in front of it. For under-indexed creators (which, right now, is most of us), this is the default. Not the edge case. The Fix There's no quick patch for this on the engine side. The model isn't broken. It's doing what it was trained to do. The only handle I have is on my own side: make sure my real content reaches the retrieval surface, and measure whether it's working. So I built a citation tester. It's a small TypeScript script that hits Perplexity, Claude, and ChatGPT through their APIs, asks each one twenty target questions tied to articles I've already published, and parses the cited URLs from the response. If `astgl.ai` shows up, that's a hit. If it doesn't, that's the data. The point isn't that the floor is bad. I knew it would be. The point is that without a number, "improve our AEO" is a vibe, not a project. Every Monday at 9am the script runs again, writes a fresh row to a SQLite table, and tells me whether the floor moved. When it does move, I'll know which engine moved first, on which questions, and at what citation position. That's the actual feedback loop. Same root cause as the hallucination: my content isn't reaching the retrieval surface. Same fix: get it there. Different observability. Why This Matters If you write online and you care whether AI assistants represent you accurately, this is the thing to internalize: the alternative to being cited is not being silent. It's being replaced. Replaced by a confident summary of work you didn't do, opinions you don't hold, and product features you'd never ship. People who ask an LLM about your work and read its answer don't know they're reading fiction. They walk away with a model of you that you didn't write. The traditional AEO playbook talks about ranking, authority, and citation rate. All real, all worth measuring. But there's a tier underneath that, and it's the one most independent creators are stuck on right now: existence. Until your content is in the index, ranking doesn't apply. You aren't competing with anyone. You're competing with the LLM's imagination of you. Measurement is the cheapest part of fixing it, and it's the part most people skip. Quick Reference Four things that matter, in order: 1. Pick 20 questions your articles should answer. Tie each one to a specific URL on your site. 2. Hit each engine via API weekly. Perplexity returns a `citations[]` array. Claude returns search results in `web_search_tool_result` blocks. OpenAI returns `url_citation` annotations on `output_text` items. 3. Record the result to a small database, not a spreadsheet. You want trend data, not a snapshot. 4. Look at the floor first. Zero is a fine starting number as long as you're tracking it. The full script I'm using, including the gotcha where Node's `--env-file` silently dropped my Anthropic key on a fresh keypair, is in the repo. The article about the Anthropic key bug is coming separately. Found this useful? I share practical lessons from my systems engineering journey at As The Geek Learns Get full access to As The Geek Learns at astgl.com/subscribe

See All (18)

Creator

The Geek
Years Active

2025 - 2026
Episodes

18
Rating

Clean
Show Website

As The Geek Learns

Careers

Careers

Updated Weekly

As The Geek Learns

Your DNS Changed and Nobody Told You. Here's the Nightly-Diff Pattern That Catches It.

I Secured My AI Agent With a 7-Layer Threat Model

5 Questions to Ask Before You Build the AI Project Your CEO Just Pitched

Anthropic Shipped an AI Security Scanner. Here's the Per-PR Cost Math.

Stop Paying for Cloud APIs: Building a Local AI Stack on Mac Studio

I Built a Self-Improving AI Swarm. After 100 Runs It Was No Better Than Run One.

Managing Anthropic Agent SDK Costs: A Post-June 15 Billing Playbook

ChatGPT Just Invented an Entirely Fake Version of My MCP Server

About

Information

You Might Also Like

As The Geek Learns

Episodes

Your DNS Changed and Nobody Told You. Here's the Nightly-Diff Pattern That Catches It.

I Secured My AI Agent With a 7-Layer Threat Model

5 Questions to Ask Before You Build the AI Project Your CEO Just Pitched

Anthropic Shipped an AI Security Scanner. Here's the Per-PR Cost Math.

Stop Paying for Cloud APIs: Building a Local AI Stack on Mac Studio

I Built a Self-Improving AI Swarm. After 100 Runs It Was No Better Than Run One.

Managing Anthropic Agent SDK Costs: A Post-June 15 Billing Playbook

ChatGPT Just Invented an Entirely Fake Version of My MCP Server

About

Information

You Might Also Like