Buried at the bottom of today's changelog is a fix a lot of you have been feeling without knowing it: Claude Code was computing /context percentages against a 200K window for Opus 4.7 sessions, even though Opus 4.7's native window is 1M. That made autocompaction fire ~5x too early. Upgrade and you get five times the runway on long sessions without touching a setting.
Other v2.1.117 highlights worth knowing about:
- Native `Glob` and `Grep` are gone on macOS/Linux native builds — replaced by embedded `bfs` and `ugrep` invoked through the Bash tool. One fewer tool round-trip per search. If you wrote hooks that match on the `Glob` or `Grep` tool name, they will stop firing on native builds. (Windows and npm-installed builds unchanged.)
- Default effort bumped to `high` for Pro/Max subscribers on Opus 4.6 and Sonnet 4.6 (was `medium`). Opus 4.7 already defaulted to adaptive thinking, so this only moves the older models. Expect higher per-turn cost and better reasoning — use `/effort` to override.
- `/model` selections now persist across restarts even when the project pins a different model; the startup header tells you when the active model is coming from a project or managed-settings pin.
- `/resume` now offers to summarize stale, large sessions before re-reading them — matches `--resume` behavior.
- `CLAUDE_CODE_FORK_SUBAGENT=1` enables forked subagents on external builds (was an internal flag).
- Agent frontmatter `mcpServers:` now loads for main-thread agent sessions via `--agent` — full MCP parity between main-thread and subagent sessions.
- Plugin install auto-resolves missing dependencies from configured marketplaces; `blockedMarketplaces` and `strictKnownMarketplaces` are now enforced on install, update, refresh, and autoupdate.
- OTel gets `command_name`/`command_source` on `user_prompt` events and an `effort` attribute on cost/token/api spans.
- Bug fixes worth calling out: `WebFetch` no longer hangs on very large HTML, plain-CLI OAuth tokens now refresh reactively on 401 mid-session, subagents running a different model no longer false-positive the malware file-read warning, and idle re-render loops no longer leak memory on Linux.
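If you have hooks keyed on those removed tool names, the fix is a matcher update in your `settings.json`. A hedged sketch: the `PreToolUse`/`matcher` shape below follows the standard Claude Code hooks schema, but the script path is hypothetical and you should adapt the filter to whatever your hook actually did.

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          { "type": "command", "command": "./scripts/audit-search.sh" }
        ]
      }
    ]
  }
}
```

On native builds, searches now arrive as Bash invocations of `bfs`/`ugrep`, so a `Glob|Grep` matcher never fires; matching `Bash` and filtering on the command text inside your hook script is one way to recover the old behavior without breaking npm-installed builds.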
No v2.1.115 shipped. v2.1.113 → v2.1.114 → v2.1.116 → v2.1.117 in 4 days.
For a few hours yesterday, the Anthropic pricing page swapped the checkmark next to "Claude Code" on the Pro tier for an X. The docs were edited to mention only Max. Then it flipped back. Anthropic's head of growth Amol Avasare confirmed on X that it's a live experiment: "~2% of new prosumer signups. Existing Pro and Max subscribers aren't affected… when we do land on something, if it affects existing subscribers you'll get plenty of notice."
What's actually happening vs. not happening:
- Existing Pro subscribers keep Claude Code. The $20/month plan still includes it today for anyone already on it.
- 2% of new signups are seeing Pro-without-Claude-Code as a price-sensitivity test. That's running whether or not the pricing page shows it.
- No grandfathering guarantee for future Pro renewals. Avasare explicitly said subscription plans were "designed before Claude Code's popularity surge and new features like Cowork and long-running agents."
- The HN thread is largely confused — commenters on items 47856073 and 47856900 couldn't even agree on whether the change shipped, because it was reverted before most people saw it.
The practical take isn't "panic" — it's "stop letting your workflow depend on a single billing SKU." If your CLAUDE.md, hooks, skills, and slash commands are written to Claude-Code-specific plumbing, a future repricing event (or a bad quarter on the Anthropic side) makes your whole stack brittle. Three cheap hedges:
- Keep your CLAUDE.md provider-agnostic. Reference tool *behaviors* ("use the ripgrep-equivalent tool"), not Claude-Code-specific tool names where you can. Your skills should run under any ACP-compatible harness.
- Know your `ANTHROPIC_BASE_URL` escape hatch. You can point Claude Code at any Anthropic-compatible proxy (LiteLLM, Ollama v0.14+, local LM Studio). That's not a daily workflow, but it's an insurance policy.
- Install at least one cross-vendor plugin so you've validated the pattern. The Gemini plugin below is one example; Codex has similar ACP-bridge plugins. Running a second model behind the same slash-command surface means you can swap primaries if pricing moves.
This story isn't about $20 vs $100. It's about whether Anthropic's product economics align with yours long-term. Don't build like they always will.
The pattern that Codex-for-Claude-Code plugins normalized — keep Claude as the primary, delegate specific work to a second CLI over ACP — now has a Gemini counterpart. m-ghalib/gemini-plugin-cc shipped v0.1.0 this week. Apache 2.0, 4 stars as of this write-up, early but architecturally clean.
What ships:
- Six slash commands: `/gemini:review`, `/gemini:adversarial-review`, `/gemini:rescue`, `/gemini:status`, `/gemini:result`, `/gemini:cancel`.
- A `gemini-rescue` subagent that Claude can proactively delegate substantial investigations to.
- An ACP broker that keeps one Gemini CLI process alive across multiple invocations — `--resume` works, so follow-ups don't re-pay context.
Install, assuming you already have the Gemini CLI set up (`@google/gemini-cli` 0.38.2+) and Node 18.18+:
    /plugin install gemini@gemini-plugin-cc
    /reload-plugins
    /gemini:setup
The plugin uses JSON-RPC 2.0 over stdio, the same wire protocol as the Codex bridge, so if you've audited one, the trust model is familiar: Gemini can read and write files inside your repo (scoped by its own sandbox), OAuth or API key lives in ~/.gemini/settings.json, and requests flow to Google's servers — not Anthropic's. Commit or stash before you run a rescue.
This is a cheap way to get adversarial-review as a second opinion on a plan or diff, without leaving your Claude Code session. Worth ten minutes tonight.
Screenata/open-chronicle is a macOS menubar app that captures your screen every ~10 seconds, OCRs it on-device with Apple Vision, groups captures into 1-minute summaries via a local (Ollama) or hosted LLM, and exposes the rollups as an MCP server. Claude Code can then ask "what was I looking at when this broke?" instead of getting re-briefed by you.
The concept is genuinely new for Claude Code memory — everything we've covered before (Engram, Collabmem, MemoryBank, Entroly) treats memory as text artifacts. Chronicle treats the operating system itself as the memory substrate. That's a different axis of context.
Caveats before you install (9 stars, v0.1.3 as of Apr 22 2026):
- Mac-only. Swift menubar app (84%) + Node MCP process (16%). No Linux/Windows plan.
- macOS 14+ required, plus Xcode Command Line Tools and Node 20+. Build from source with `swift run -c release`.
- LLM dependency. Needs an API key (Anthropic/OpenAI/Fireworks) OR a local Ollama install to generate the summaries. With Ollama, zero outbound calls.
- 30-minute retention by default — captures auto-expire. Password managers, mail, and messaging are excluded from capture out of the box. Everything else is fair game, so decide whether you trust the exclusion list on your shared machine.
- Privacy footprint is real. Even with on-device OCR, text *from* the screen is being fed to whatever LLM you point it at. A fabricated screen doesn't look different from a real one in the summary index.
Ignore it if you're on Linux. Watch it if you're on Mac — the architecture is the right shape for "continue where I left off" workflows, and v0.2.x will tell us whether the single-author sustains it.
What you'll do: Upgrade to v2.1.117, then prove the Opus 4.7 autocompaction fix landed by watching /context track against 1M instead of 200K on your next real session.
Prerequisites:
- Claude Code already installed (npm or native)
- An Opus 4.7 Max session you can run for a few minutes
- `jq` if you want to parse version metadata (optional)
Steps:
- Upgrade and confirm the version:

      claude --version
      # expect: 2.1.117 or later

- Start a fresh session and pin the model with `/model opus-4-7`.
- Paste in a deliberately large chunk — the output of `find ~ -type f -printf '%p\n' | head -20000` or a 50k-token document — so you cross the old 200K threshold.
- Run `/context` and record the percentage. Pre-v2.1.117, a 250K-token session would show ~125% and autocompact on the next turn. Post-fix, the same session should show ~25%.
- Confirm the startup header says the model is pinned by the session, not by project/managed settings, so you're actually hitting 4.7's 1M window and not a silently-overridden model.
Expected outcome: /context percentage is roughly 5x lower than it was before the upgrade for the same transcript size, and autocompaction does not fire inside the first 200K tokens of conversation.
Verify: Run the session up to ~300K tokens of context. Pre-fix, you would have already seen the "Compacting conversation…" banner. Post-fix, you should see nothing until you get close to the real 1M ceiling. If you do see autocompaction firing early, either your /model pin didn't hold (check the startup header) or your effort setting is forcing a non-adaptive thinking path — run /effort and inspect.
Bonus: if you run telemetry, the new `effort` attribute on OTel `token.usage` spans means you can now graph compaction events against effort level directly. Handy for spotting whether your autocompaction tuning needs rethinking at high effort.