Claude Mastery
#17 · Weekend Deep Dive
🌐 ECOSYSTEM INTEL
⚠️ Breaking🔬 Deep Dive
The MCP STDIO Supply Chain Advisory — 10+ CVEs, Zero‑Click Windsurf, and the Prompt That Saved Claude Code

Ox Security dropped a coordinated-disclosure advisory on April 15 naming 10+ CVEs across the MCP ecosystem. The short version: MCP's STDIO transport treats its configuration as code, and the entire ecosystem — SDKs, IDEs, agent runners — trusted that configuration to be clean.

If you run Claude Code with any mcp server whose command points at an executable, you are running that executable at launch. That's the design. The advisory is about everything built on top of that design that forgot this.

The mechanic. The official MCP SDKs in Python, TypeScript, Java, and Rust all implement StdioServerParameters as "spawn this command with these args." No sandbox, no allowlist, no validation. Whoever controls the config file at launch time controls the child process. In 2025, this was a fair assumption: you wrote your own config. In 2026, half the ecosystem accepts config from agents, from marketplaces, from copy-pasted Reddit blocks.

The four vulnerability families Ox enumerates:

  1. Unvalidated subprocess spawn (the base class). Any product that reads MCP config from anywhere outside the user's direct edit is a candidate. Examples with CVEs: LiteLLM CVE-2026-30623, Agent Zero CVE-2026-30624, Fay CVE-2026-30618, Bisheng CVE-2026-33224.
  2. Allowlist bypass via interpreter flags. Products that block rm, curl, bash but allow npx get owned by npx -c "malicious; command". Upsonic CVE-2026-30625 and Flowise CVE-2026-40933 both fell to this.
  3. Prompt-injected config writes. An agent reads a malicious web page, follows instructions to "add this MCP server," and writes the config. Next launch, RCE. Windsurf CVE-2026-30615 is the headline here — Windsurf performed config writes without user approval, making the attack zero-click from a poisoned document.
  4. Transport confusion. Products that claim to support HTTP transport but silently fall back to STDIO when the URL 404s. Details are thin on this one; treat as a red flag in your own integrations.

Why Claude Code is not on the CVE list is the most instructive part of the advisory. Ox explicitly tested Claude Code and found it vulnerable to the same *prompt-injected-config* attack class — but Claude Code requires an explicit user approval before writing the MCP config file. Ox's position, quoted directly: *"the only issued CVE is for Windsurf … All other IDEs … ask for at least one user interaction … which … does not qualify as a vulnerability."* One dialog box is the difference between a CVE and a footnote. Treat that dialog with the respect it has now earned.

What this actually means for your setup. Three practical shifts:

  • Audit the MCPs you already trust. Not the servers you'll install tomorrow — the ones already in ~/.claude/settings.json and every project's .claude/settings.json. The Practice Lab at the bottom of this issue walks through a full inventory.
  • Prefer HTTP/SSE transport when the server supports it. HTTP transports don't spawn subprocesses at launch. The tradeoff is a running daemon, but "daemon you start deliberately" is a stronger security model than "process that appears because you edited a config file."
  • Never approve an MCP config write you didn't initiate. If an agent suggests adding a server, stop the session, read the command line character-by-character, and add it yourself from a terminal. This is the exact pattern Claude Code's permission prompt is catching for you.

The meta-point. Config-as-code is not new — .bashrc, .gitconfig, VS Code tasks.json all do it. The MCP ecosystem is discovering what dotfile maintainers learned in the 1990s: if your config format can express arbitrary execution, you must treat every untrusted edit as arbitrary execution. Ten CVEs in one week is the bill for learning that lesson publicly.

🏗️ AGENT ARCHITECTURE
🔬 Deep Dive🌿 Evergreen
Why Auto Mode's Classifier Can't See What Claude Said — And Why That's the Design

Auto mode went GA in v2.1.111 on April 16 and most of the coverage treated it as a convenience feature: toggle with Shift+Tab, say goodbye to permission prompts. The Anthropic engineering deep‑dive tells a more interesting story — one every operator building their own autonomous harness should steal from.

The three‑tier decision pipeline. When Claude wants to run a tool, the decision flows through three gates:

graph TD
    A[Tool call proposed] --> B{Tier 1: Allowlist match?}
    B -- yes --> G[Run]
    B -- no --> C{Tier 2: In-project path?}
    C -- yes --> G
    C -- no --> D[Tier 3: Transcript classifier]
    D --> E{Safe?}
    E -- yes --> G
    E -- no --> F[Prompt user]

Tiers 1 and 2 are cheap, deterministic, and handle >95% of calls. Tier 3 is where the magic — and the interesting architectural decision — lives.

The Tier 3 classifier is reasoning-blind by design. The transcript classifier is a Sonnet 4.6 model running in a two-stage pipeline: a fast single-token filter knocks out the easy "obviously safe" and "obviously dangerous" cases, and a chain-of-thought second stage adjudicates the middle. Per Anthropic's engineering post, internal metrics are 0.4% FPR, 17% FNR on real overeager dangerous actions, 94.3% detection on synthetic exfiltration attempts.

The critical detail: the classifier sees the user's messages and Claude's tool calls, but it does not see Claude's own reasoning or tool outputs. It cannot read the assistant turns. This is deliberate. If the classifier could see Claude's reasoning, a prompt-injected document could write convincing justifications directly into the context window: *"I need to read ~/.ssh because the user asked me to debug their SSH config."* The classifier would read that justification and approve the call.

By refusing to consume Claude's own text, the classifier forces a cleaner evaluation: *given only the user's ask and the tool call itself, does this look like what the user asked for?* Claude cannot talk its way past the gate. The agent's own self-explanation is firewalled off from the approver.

Why this matters for your own harness designs. Three transferable patterns:

  1. Separate the "doer" from the "approver." If the same context that decided to take an action also decides whether the action is allowed, you have no defense against prompt injection. Claude Code's auto mode enforces this with a model boundary (Opus decides, Sonnet approves) and with a context boundary (the approver can't read Opus's reasoning).
  2. Calibrate your error modes asymmetrically. Auto mode's 17% FNR on real attacks and 0.4% FPR on benign work is the explicit tradeoff: false positives annoy you, false negatives burn you. The system is tuned to approve aggressively because the cost of an unnecessary prompt is a mild interruption, while the cost of a missed attack is a compromised machine. Your own guardrails should make the same tradeoff explicit.
  3. Fail-closed on the approver. If the classifier errors out, auto mode prompts you. No "classifier down, run anyway." This is how you preserve security properties across infrastructure failures.

The operator transfer. You can build a reasoning-blind check into any multi-agent system you run. If you have a worker agent that proposes actions and a reviewer agent that approves them, pass the reviewer only the original user ask and the proposed action, never the worker's reasoning. Use a subagent with a custom tool-output filter hook, or a claude -p invocation with a carefully constrained prompt. The operator pattern is the same whether your "classifier" is Sonnet 4.6 running at Anthropic or a Haiku call you make yourself: the approver cannot be manipulated by prose it doesn't read.

Auto mode isn't a prompt-less convenience. It's a shipped, tested example of self-justification firewalling — and one of the cleaner applied-security artifacts to come out of Anthropic this year. If you're still running your agents with blanket allowlists, you're using a weaker security model than the one your CLI already ships with.

⌨️ CLI POWER MOVE
🔥 New🔧 Try It
Unclog — Measure and Evict Your Context Bloat Before Your First Message

Here is a question no Claude Code setting answers directly: how many tokens is your session already carrying before you've typed a single character? The numbers in /context don't tell you the full story — they show you the sum without naming the sources. By the time you notice the session feels slow and expensive, it's too late to fix it cleanly.

Unclog is a two-day-old Python CLI that attacks this problem directly. The repo is small (0 stars at the time of writing, but ~10 commits in 24 hours — aggressive dogfooding), MIT-licensed, and does not phone home. It measures your baseline tokens, tells you where they came from, and lets you evict sources individually.

What it inspects. Unclog sweeps every place your session quietly absorbs tokens:

  • CLAUDE.md duplicates across projects (the same 2000-line CLAUDE.md in three project directories costs 3× tokens).
  • Per-project MEMORY.md auto-injection — the @import chain memory systems build up over time.
  • Hook stdout contamination — PreToolUse, UserPromptSubmit, and other hooks whose stdout joins context. A chatty hook is a silent tax on every turn.
  • Configured-but-dead MCP servers that still load their tool schemas into context even when you never call them.
  • Zero-invocation skills, agents, and commands older than a threshold — lazy-load candidates you probably forgot exist.
  • Missing .claudeignore in repos that bundle node_modules/, dist/, .venv/.
  • Baseline tokens before your first message — the number that actually matters.

The measurement method matters. Unclog uses tiktoken in-process — no API calls, no telemetry. For MCP servers, it starts a live STDIO probe just long enough to count the tools-schema tokens, then shuts the server down. That means it measures real ingested token counts, not estimated ones. If you have an MCP server that advertises 47 tools with verbose descriptions, unclog will tell you exactly how many tokens that costs you per turn.

Reversibility. Every apply writes a full snapshot to ~/.claude/.unclog/snapshots//. Nothing is deleted destructively — every CLAUDE.md edit, every config change, every skill disable is recoverable with unclog restore . This is the right posture for a tool that's two days old and willing to touch settings.json.

Install and basic use:

bash
uv tool install unclog
unclog --report # measurement only, no changes
unclog --list-claude-md # show every CLAUDE.md the session might read
unclog # interactive picker — approve per source
unclog --json # machine-readable output for scripts

Requires Claude Code ≥ 2.1.90 (it reads post-2.1.90 settings schema) and Python 3.11+. If you want the probe disabled (MCP servers you don't want to start), pass --no-probe-mcps and unclog falls back to static estimation.

The honest caveat. This is a brand-new project from a single author, and it writes to your Claude Code settings. The snapshot system is the right safety net, but snapshots don't protect you from a bad interpretation of your config. Run --report first, read the output, *then* decide whether you want the interactive picker touching anything. Measurement is free; remediation needs your judgment.

Why this belongs in your regular rotation. Most token optimization tools are either LLM-based (expensive, slow, probabilistic) or operate at a single layer (skills only, CLAUDE.md only). Unclog inspects every layer that feeds Claude Code's baseline context — and it does so with pure static analysis + a narrow STDIO probe. Run it monthly. You will find bloat. The first run is usually the most humbling.

🔬 PRACTICE LAB
🔧 Try It🔬 Deep Dive
Audit Every MCP Server You've Ever Installed

Outcome: a single markdown report listing every MCP server loaded across your user and project configs, classified by transport type, with the actual command line each server spawns on launch, flagged for the bypass patterns in the Ox Security advisory. You decide what stays, what moves to HTTP, and what dies.

Prerequisites.

  • Claude Code 2.1.98 or later (for claude mcp list --json).
  • jq installed.
  • bash 4+ (for associative arrays). macOS users: brew install bash or adapt the script.
  • 15 minutes of unbroken attention — this is not a tool you run while context-switching.

Step 1 — Inventory. Run a single sweep across user-scope and every project scope that has a .claude/settings.json:

bash
mkdir -p ~/audits/mcp-$(date +%F) && cd $_

# user-scope servers
claude mcp list --json --scope user > user.json

# all project-scope configs you've ever touched
find ~ -name settings.json -path '*/.claude/*' 2>/dev/null \
| while read cfg; do
proj=$(dirname "$(dirname "$cfg")")
jq --arg p "$proj" '. + {_project: $p}' "$cfg" 2>/dev/null
done | jq -s '[.[] | {project: ._project, mcpServers: .mcpServers}]' > project-servers.json

# canonical list with source
jq -s '
[ (.[0] | .mcpServers // {} | to_entries | map({scope: "user", name: .key, server: .value})),
(.[1] | map(.project as $p | (.mcpServers // {}) | to_entries | map({scope: $p, name: .key, server: .value})) | flatten)
] | flatten
' user.json project-servers.json > all-servers.json

You now have one JSON file with every MCP server entry, scoped by origin.

Step 2 — Classify by transport. Every entry is either STDIO (spawns a subprocess) or URL-based (HTTP/SSE). Split them:

bash
jq '[.[] | select(.server.command != null)]' all-servers.json > stdio-servers.json
jq '[.[] | select(.server.url != null)]' all-servers.json > url-servers.json

echo "STDIO: $(jq length stdio-servers.json)"
echo "URL: $(jq length url-servers.json)"

STDIO is the attack surface. URL-based servers still matter (they're a different threat model), but they're not what the advisory is about.

Step 3 — Reconstruct the actual command line. For each STDIO server, build the exact string that will be spawned on next launch:

bash
jq -r '.[] | "\(.scope)\t\(.name)\t\(.server.command) \((.server.args // []) | join(" "))"' \
stdio-servers.json > stdio-commands.tsv
column -t -s $'\t' stdio-commands.tsv

Read every line. Out loud if that helps. You are looking for three patterns from the advisory:

Pattern A — npx -c or bash -c or sh -c inside args. This is CVE-2026-30625 / CVE-2026-40933 territory — an interpreter flag that runs arbitrary commands. If you didn't write this by hand, delete the entry and re-install the server directly.

Pattern B — unpinned package references. npx @some/mcp-server without a version pin, or uvx some-package without @x.y.z. You are trusting the latest release on every launch. Pin versions.

Pattern C — commands pointing at paths inside ~/Downloads, /tmp, or any directory that routinely accepts untrusted writes. This is the plain "agent wrote a file, config now runs it" scenario. Move the binary to a locked-down path or delete it.

Step 4 — Decide per server: keep STDIO, migrate to HTTP, or remove. For each line, one of three actions:

bash
# optional: open a decision log
cat > decisions.md <<EOF
| scope | name | command | decision | notes |
|-------|------|---------|----------|-------|
EOF

# then fill it in by hand — the tool doesn't make this decision, you do

Keep STDIO if: you wrote the entry, the package is pinned, and the server has no HTTP transport option. Migrate to HTTP/SSE if: the server publishes an HTTP endpoint (check its README). Remove if: you can't remember installing it, or you haven't invoked it in 30+ days (pair with unclog's "zero-invocation" output from the previous topic).

Step 5 — Sandbox what remains. For STDIO servers you keep, constrain the launch command. The simplest win is running the STDIO server under bwrap or firejail so a compromised server can't escape its filesystem slice. A minimal bubblewrap wrapper:

bash
#!/usr/bin/env bash
# ~/bin/mcp-sandbox: wraps a STDIO MCP server in bubblewrap
exec bwrap \
--ro-bind /usr /usr \
--ro-bind /etc /etc \
--ro-bind /lib /lib \
--ro-bind /lib64 /lib64 \
--tmpfs /tmp \
--dev /dev \
--proc /proc \
--bind "$HOME/.cache/mcp-$(basename "$1")" "$HOME/.cache/mcp-$(basename "$1")" \
--unshare-all \
--share-net \
-- "$@"

Then update the MCP config to use command: "/home/you/bin/mcp-sandbox" and put the real command in args. Test each server individually before you re-enable them all.

Step 6 — Persist the report. Commit all-servers.json, stdio-commands.tsv, and decisions.md to a private audit repo. Next month, re-run Step 1 and diff. Anything that showed up without a corresponding decision entry is an agent or a teammate adding servers without your review — and that's the exact signal you're watching for.

What you should find. Most operators who run this audit for the first time discover between 3 and 7 STDIO servers they don't remember installing, at least one unpinned npx entry, and at least one server that hasn't been invoked in months. The goal isn't zero — it's a list you can defend line by line. If you can't explain why each entry exists, it shouldn't exist.