Do I need all of this to get value from Claude Code?

No. A good CLAUDE.md and two or three useful skills cover most of the gain. Add subagents when your context gets noisy, hooks when you're tired of enforcing the same thing by hand, and MCP servers one at a time.

CLAUDE.md vs skills, how do I decide?

If it must be true on every turn, it's a CLAUDE.md rule. If it's a procedure you only run sometimes, it's a skill. The cost difference is the whole reason: CLAUDE.md loads every session, skills load only when triggered.

Why subagents instead of just asking in the main chat?

Context hygiene. A subagent reads the messy stuff in its own window and returns a clean answer, so your main thread doesn't get slower and dumber with every file it ingests.

Are git worktrees actually worth it?

If you ever want two tasks running at once, yes. Separate directories mean separate sessions with separate contexts and no branch-switching conflicts.

How do I keep MCP servers from bloating my context?

Be ruthless. Each server's tools cost tokens whether you use them or not. Keep only the ones that remove a recurring manual loop, and turn the rest off.

I Spent 6 Months Tuning Claude Code. Here's the Exact Setup That Finally Worked.

The exact Claude Code setup after six months: a lean CLAUDE.md, lazy-loaded skills, subagents, hooks, git worktrees, and the five MCP servers that earn their place.

CLAUDE.md, subagents, hooks, skills, worktrees, and the five MCP servers that earn their place.

For the first month, I used Claude Code the way most people do: a chat box in my terminal that happened to be able to edit files. It was fast and impressive and produced a steady stream of code I didn't fully trust. Six months later it runs a real chunk of my workflow across a Go crawler backend, a pile of Next.js frontends, and a fleet of FastAPI services, and it does it without me babysitting every diff.

The difference wasn't a better model. It was finally understanding that Claude Code is not a chatbot. It's an agent loop with a handful of extension points, and each one has a different context cost. Once that clicked, the whole thing got predictable.

This is the setup I landed on. Not a docs summary, the actual layout, the rules, the things I stopped doing.

The one idea that fixed everything: context cost

Every extension point plugs into a different part of the loop, and the only question that matters is when does this load into the context window.

CLAUDE.md is loaded every single session. It's always-on. Expensive by default.
Skills load lazily, Claude sees only a name and a one-line description until a task actually matches, then it pulls the body in.
Subagents run in their own separate context and hand back only the result.
Hooks don't touch the context at all. They're shell commands that fire on events.
MCP servers add tools, and tools have a token weight whether you call them or not.

Once you think in those terms, the placement decisions answer themselves. If a rule must be true on every turn, it goes in CLAUDE.md. If it's a procedure you need sometimes, it's a skill. If it's noisy research that would pollute your main thread, it's a subagent. If it's an automatic guardrail, it's a hook. Stop arguing with the model in chat, encode the decision once and move on.

That framing is straight from how the tool is built. Claude Code launched in early 2025 as a terminal tool that could edit files and run bash; the extension layers arrived over the following year, MCP, then subagents, hooks, skills, plugins, and most recently agent teams. S1 S2

A CONTEXT card loading into a CLAUDE.md file, surrounded by project folders

Context cost in one picture: everything funnels through the context window, so where a rule loads is really a decision about what it costs.

Layer 1, CLAUDE.md: the rules I'm tired of repeating

CLAUDE.md is project memory. It's the thing Claude reads before it does anything, so the temptation is to dump your whole engineering handbook in there. Don't. Every token in this file is paid on every turn. Mine is deliberately short, stable, non-negotiable rules only.

The actual core of mine, trimmed:

# Engineering rules (always apply)
 
## Frontend
 
- Tailwind only. No CSS modules, no styled-components, no inline style objects.
- pnpm. Never npm or yarn.
- Strict TypeScript. `any` is banned, if you reach for it, stop and ask.
 
## Backend (Python)
 
- Poetry for deps. No bare pip.
- FastAPI routes are async-only.
- Every request/response model is a Pydantic schema. No raw dicts crossing a boundary.
 
## Infra
 
- Every Dockerfile/compose service declares memory limits.
- Commits are atomic and conventional. One logical change per commit.
 
## Workflow
 
- When I describe a bug, feature, or task, draft a Jira ticket for it
  (title, description, acceptance criteria, priority, labels) before writing code.

That last rule is the one that quietly changed how I work. I describe what I'm about to do, and a structured ticket comes back before a single line is written. The work documents itself.

The discipline here is editing it like code. A recurring mistake or a review comment I keep leaving is a CLAUDE.md edit, not a one-off correction I'll have to repeat next week. If I find myself typing "no, use Tailwind" twice, that's a bug in this file.

Core changes (rules, events, reasoning, types) flowing into a tracked pull request

The mental model: each extension point loads into context at a different time, so where a rule lives is really a decision about what it costs.

Layer 2, Skills: the procedures I don't want to paste

Skills are the part that genuinely surprised me. A skill is a folder with a SKILL.md, YAML frontmatter (a name and a description) plus instructions, and optionally scripts and templates. S5 S8 At session start Claude only sees the name and description, roughly a hundred tokens. When a task matches, it reads the full body. If the body points at scripts, those run in bash and only their output comes back, the source never burns tokens. S8

That progressive-disclosure model means you can keep a big library of workflows around at almost no standing cost. I invoke them as slash commands or let Claude pull them in on its own.

The ones that earn their keep for me:

new-service, scaffolds a FastAPI service the way I actually structure them: async routes, Pydantic schemas, Poetry, Dockerfile with memory limits, the lot.
pr-review, runs my pre-PR checklist before I ever open the PR. Catches the any that slipped in, the missing test, the non-atomic commit.
webp-pipeline, my S3 image-optimization routine, because I have run the "convert these to WebP and re-upload" dance enough times to never want to type it again.

Rule of thumb: if I've pasted the same 150-word prompt twice, it becomes a skill.

A workflow that goes explore, then plan, then review, then build

A skill captures a repeatable procedure (explore, plan, review, build) and loads only when a task actually matches.

Layer 3, Subagents: keeping the main thread clean

A subagent is a separate Claude instance with its own context window that does one job and returns only the result. S4 You define it in .claude/agents/ as Markdown with frontmatter, name, description, the tools it's allowed, and a system prompt.

---
name: schema-auditor
description: Reviews DB queries and Pydantic models for N+1s, missing indexes, and unvalidated input
tools: Read, Grep, Glob
---
 
You are a database and schema reviewer. Inspect the changed files only.
Report: risky queries, missing indexes, any boundary taking a raw dict
instead of a validated schema. Be specific with file and line. Do not edit.

The reason subagents matter isn't delegation for its own sake, it's context hygiene. When I ask the main session to "go read these forty files and tell me where auth is enforced," all forty files end up in my main context and everything afterward is slower and dumber. Hand that to a subagent and I get back a two-paragraph answer while my main thread stays focused on the actual task.

One trap worth flagging, because it bit me: if you leave the tools field off a subagent, it implicitly inherits every tool, including your MCP tools. S8 Scope them deliberately. A reviewer that can only read should not be able to write or hit your database.

Layer 4, Hooks: automation that doesn't depend on me asking

Hooks are the layer I underused the longest. They're shell commands wired to lifecycle events, think git hooks, but for the agent. You configure them in settings.json (or interactively via /hooks), with an event, a matcher, and a command. S3 They don't ask permission and they don't cost context; they just fire.

My two workhorses:

{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Write|Edit",
        "hooks": [
          {
            "type": "command",
            "command": "pnpm prettier --write $CLAUDE_FILE 2>/dev/null; true"
          }
        ]
      }
    ],
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [{ "type": "command", "command": "./.claude/guard-bash.sh" }]
      }
    ]
  }
}

The PostToolUse formatter means I stopped caring whether Claude's output is formatted, it always is, deterministically, without a single token spent telling it to. The PreToolUse gate is a small script that blocks the commands I never want run unattended (anything touching rm -rf, force-pushes, prod env files). The model proposing something dangerous and a script refusing to run it are two different safety layers, and I want both.

If a rule can be enforced by a program instead of a prompt, enforce it with a program. Prompts are probabilistic. Hooks are not.

Layer 5, Worktrees: how I run three of these at once

This isn't a Claude Code feature, it's plain git, but it's the thing that took my throughput from "one task at a time" to genuinely parallel. git worktree lets you check out multiple branches into separate directories backed by the same repo. One Claude Code session per worktree, each on its own branch, each with its own clean context.

git worktree add ../helixcrawl-rate-control feature/rate-control
git worktree add ../helixcrawl-bugfix     fix/sitemap-parser
# now open a Claude Code session in each directory

While one session refactors the rate-control service, another chases a parser bug, and I review whichever finishes first. No stashing, no branch-switching mid-thought, no two agents fighting over the same working tree. For anything bigger than a quick edit, this is how I work now.

The five MCP servers that earn their place

MCP is the highest-setup, highest-temptation layer. Every server you connect adds tools, and tools cost context whether or not you call them. I've installed and removed a lot of them. These five stayed, because each one removes a real copy-paste loop from my day. (Edit the config directly, .mcp.json for project scope, rather than fighting the CLI wizard. S8)

Several machines and tools (a database, a browser, a cloud) wired together into one loop

MCP wires external tools (Jira, GitHub, Postgres, a browser, live docs) into the agent loop. Every server adds tools, and tools cost context.

Atlassian / Jira. Non-negotiable for me. My CLAUDE.md drafts tickets; this server actually files them, links them to branches, and updates status. The loop from "I'm starting this" to a tracked ticket is now one sentence.
GitHub. PRs, issues, and review comments without leaving the session. It reads the diff, drafts the PR body against my checklist, and pulls review threads back in when I'm addressing feedback.
Postgres. Schema-aware. Instead of guessing at column names or pasting \d+ output, Claude inspects the live schema before it writes a query or a migration. This alone killed a whole category of "that column doesn't exist" errors.
Playwright. Browser automation for E2E checks and for poking at the frontends I'm building. Given my crawler work, having the agent drive a real browser to verify a flow, instead of asserting it "should" work, is worth the tool weight.
Context7 (live library docs). The honest fix for hallucinated APIs. It pulls current documentation for whatever library is in play, so the agent writes against the version I'm actually on instead of whatever it half-remembers. For fast-moving packages this is the difference between working code and confident nonsense.

Everything else (the calendar servers, the chat servers, the dozen "cool demo" integrations) I turned off. If a server isn't removing a recurring loop, it's just tax on every prompt.

The whole thing, on disk

.claude/
├── CLAUDE.md            # always-on rules (short on purpose)
├── settings.json        # hooks: formatter + bash guard
├── .mcp.json            # the five servers
├── guard-bash.sh        # pre-tool safety gate
├── agents/
│   ├── schema-auditor.md
│   ├── pr-reviewer.md
│   └── docs-researcher.md
└── skills/
    ├── new-service/SKILL.md
    ├── pr-review/SKILL.md
    └── webp-pipeline/SKILL.md

A hand-drawn .claude folder tree: rules, agents, skills, settings, MCP

The whole setup on disk: a lean .claude folder, rules, agents, skills, settings, MCP. Nothing in it is long.

Nothing in here is long. CLAUDE.md is a few hundred tokens. Each subagent is maybe thirty lines. The hooks are one gate and one formatter. The power isn't in any single file, it's in not making the model re-derive my conventions every session.

What I stopped doing

Stuffing CLAUDE.md. My first version was a 2,000-token wall of preferences. Half of them were "sometimes" rules that belonged in skills. Cutting it made every response sharper.
Arguing in chat. Correcting the same thing twice is a config smell. The fix lives in a file, not in my patience.
Trusting "done." The model is much better than it was at flagging its own uncertainty, but I still let hooks and a review subagent be the thing that says done, not the chat message. A green test suite is a fact; "I've completed the task" is a claim.
Collecting MCP servers. Fun to connect, expensive to keep. Five is plenty.

The takeaway

Six months in, the setup that works isn't clever. It's boring and it's layered: stable rules that are always loaded, procedures that load on demand, research that happens off to the side, guardrails that run as code, and just enough external tools to close the loops I actually hit. The model got better over those six months too, but the thing that made it trustworthy was deciding, for every rule, where it should live and what it should cost.

Start with CLAUDE.md. Add the rest when a real trigger shows up. Don't try to wire up all six extension points on day one, you'll just build a slower agent.

FAQ

Do I need all of this to get value from Claude Code?
No. A good CLAUDE.md and two or three useful skills cover most of the gain. Add subagents when your context gets noisy, hooks when you're tired of enforcing the same thing by hand, and MCP servers one at a time.
CLAUDE.md vs skills, how do I decide?
If it must be true on every turn, it's a CLAUDE.md rule. If it's a procedure you only run sometimes, it's a skill. The cost difference is the whole reason: CLAUDE.md loads every session, skills load only when triggered.
Why subagents instead of just asking in the main chat?
Context hygiene. A subagent reads the messy stuff in its own window and returns a clean answer, so your main thread doesn't get slower and dumber with every file it ingests.
Are git worktrees actually worth it?
If you ever want two tasks running at once, yes. Separate directories mean separate sessions with separate contexts and no branch-switching conflicts.
How do I keep MCP servers from bloating my context?
Be ruthless. Each server's tools cost tokens whether you use them or not. Keep only the ones that remove a recurring manual loop, and turn the rest off.

Sources

S1Claude Code, "Extend Claude Code" (features overview), 2026, code.claude.com
S2Towards AI, "Claude Code Extensions Explained: Skills, MCP, Hooks, Subagents, Agent Teams & Plugins," April 2026, pub.towardsai.net
S3Claude Code documentation, Hooks configuration, docs.claude.com
S4Level Up Coding, "A Mental Model for Claude Code: Skills, Subagents, and Plugins," March 2026, levelup.gitconnected.com
S5Smith Horn Group, "Choosing between skills, subagents, and MCP servers in Claude Code," 2025, smithhorngroup.substack.com

Written by

Syed Moinuddin

Full Stack Engineer.

Notes on AI tooling, agentic systems, and building things that survive contact with production.