The Secret of Skills — When Natural Language Becomes Logic
/review is actually just a well-crafted prompt — 4 levers for designing agent behavior
What I expected: skills must have something special going on
Honestly, when I first looked at a Claude Code skill file, my reaction was “what even is this?” It’s a .md file. CLAUDE.md is .md, commands are .md, skills are .md. They all looked the same to me, and I couldn’t figure out what made them different.
Then I started using /review and got curious. I asked what the difference was between typing /review and just saying “please review my code” — and the answer that came back cut right to the point.
“Honestly? It’s just a well-crafted prompt.”
That wasn’t what I expected. I figured there was some special internal mechanism at work, but the essence was just a well-structured prompt template. The value of /review isn’t magic — it’s that you get to use that well-written prompt “in one word, consistently, in a version-controlled form.”
What actually happens: cramming everything into CLAUDE.md causes gaps
I’d been putting all my rules and conventions into CLAUDE.md. That seemed right at first. It’s always loaded, so it’ll always be followed — or so I thought.
Reality was different. The longer a conversation went, the more certain rules started getting ignored. The pattern was “it’s in there, but gets forgotten.” The reason was simple: CLAUDE.md is in context for every message, but the longer the file gets, the more thinly the model’s attention is spread across individual rules. They’re buried deep, competing with everything else.
That’s when I understood why skills exist. A skill’s body gets freshly injected right at the moment it’s triggered, right next to the task at hand. No competition — it surfaces exactly when it’s needed.
Why it works: “when doing this task, handle it this way” — that’s all there is
As the conversation went on, I tried to reason through the structure of a skill myself.
“When this kind of task comes in, handle it in this particular way?”
The response was striking. “Yeah, exactly. That’s the essence of a skill.”
A skill is made of exactly two parts:
- when (description) — “when a task like this comes in”
- how (body) — “handle it like this”
And here’s what I’d been missing: the description does more than half the work. If the trigger condition is vague, the skill either won’t fire when you actually need it, or it’ll butt in at the wrong moment.
With commands, I explicitly type /name so there’s no need for a when. But with skills, Claude has to decide “this situation qualifies” on its own. That means how you write the when determines almost all of a skill’s quality.
“Stock-related” (✗) vs “when writing or modifying P&L calculations or profit/loss logic” (✓) — that difference determines whether the skill actually fires.
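To make the two parts concrete, here is a minimal Python sketch that splits a skill file into its when and its how. It assumes the usual layout of a skill file: a frontmatter block between `---` lines carrying `name` and `description`, followed by a markdown body. The skill itself (`pnl-review` and its three steps) and the helper `split_skill` are hypothetical, invented for illustration, and the parser only handles flat `key: value` lines rather than full YAML.

```python
# Hypothetical skill file, invented for illustration.
SKILL_MD = """\
---
name: pnl-review
description: Use when writing or modifying P&L calculations or profit/loss logic.
---
1. Check the sign convention for losses.
2. Confirm rounding happens once, at the end.
3. Add a test with a known profit and a known loss.
"""


def split_skill(text: str) -> tuple[dict[str, str], str]:
    """Split a skill file into frontmatter fields and body.

    Handles only flat `key: value` lines, not full YAML.
    """
    # The file starts with "---\n", so the first piece is empty.
    _, frontmatter, body = text.split("---\n", 2)
    fields: dict[str, str] = {}
    for line in frontmatter.splitlines():
        key, _, value = line.partition(":")
        if key.strip():
            fields[key.strip()] = value.strip()
    return fields, body.strip()


fields, how = split_skill(SKILL_MD)
when = fields["description"]  # the trigger condition
print("when:", when)
print("how:")
print(how)
```

The description is the only part that decides whether the skill fires, so that single line carries the trigger condition; the body only matters once it has fired.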
How natural language becomes logic
Once I got this, one thing became clear. Traditional code expresses logic in a language machines read. Skills express logic in a language LLMs read — natural language. The execution engine is different, but both are describing “conditions + procedures.”
So writing a good skill is structurally the same as writing good code. If the condition (when) is clear and the procedure (how) is specific, that natural language spec becomes the logic.
4 levers, each on a different axis
Understanding skills alone isn’t enough. What this conversation crystallized was the full picture of the 4 levers.
- CLAUDE.md — ambient context that’s always loaded. Universal rules that apply to every task. Keep it short and universal, or dilution kicks in.
- Skills — auto-triggered when a task matches. Optional, but must fire reliably when needed.
- Commands — run when I explicitly type /name. For moments when I need to say “do this, now.”
- Hooks — shell commands that run deterministically, with no model judgment. For rules that absolutely cannot be skipped.
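Hooks are the one lever here that is configured as data rather than written as prose. As a rough sketch, the Python below builds the kind of `hooks` entry that goes in a Claude Code settings file and prints it as JSON. The event name `PostToolUse`, the `matcher` pattern, and the nested `type` and `command` fields reflect the hooks configuration format as I understand it, so check them against the current documentation before relying on this. The script path `./scripts/run-lint.sh` is a hypothetical placeholder.

```python
import json

# Assumed shape of a hooks entry in a Claude Code settings file.
# ./scripts/run-lint.sh is a hypothetical script, not a real tool.
settings = {
    "hooks": {
        "PostToolUse": [
            {
                "matcher": "Edit|Write",  # fire after file-editing tools
                "hooks": [
                    {"type": "command", "command": "./scripts/run-lint.sh"}
                ],
            }
        ]
    }
}

settings_json = json.dumps(settings, indent=2)
print(settings_json)
```

Nothing in this entry is a prompt. The command runs whenever the matcher applies, whether or not the model remembers the rule.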
Without this distinction, everything ends up in CLAUDE.md and you’re stuck wondering “why does it keep forgetting?” Once you understand it, choosing which lever to pull where is agent design.
That said, there’s a tradeoff worth knowing. Moving something to a skill changes the nature of failure. Instead of CLAUDE.md’s “applied inconsistently,” you get a skill’s “if it fails to trigger, nothing fires at all.” A weak description can actually be worse than just leaving things as a convention.
Going forward: writing specific whens is everything
Designing agent behavior ultimately comes down to this: figuring out whether a behavior needs to always be present, needs to auto-surface, or absolutely cannot be skipped — then pulling the right lever.
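That decision can be written down as a small function. This is only a rule-of-thumb sketch: the function `choose_lever`, its parameter names, and the order in which the questions are asked are my own assumptions, not an official algorithm.

```python
def choose_lever(
    always_relevant: bool,
    must_never_skip: bool,
    user_triggers: bool,
) -> str:
    """Pick a lever for one behavior (illustrative priority order)."""
    if must_never_skip:
        return "hook"       # deterministic, no model judgment involved
    if user_triggers:
        return "command"    # runs only when /name is typed
    if always_relevant:
        return "CLAUDE.md"  # ambient context, so keep it short
    return "skill"          # auto-surfaces when the task matches


# A rule that only matters for P&L work and tolerates model judgment:
print(choose_lever(always_relevant=False, must_never_skip=False, user_triggers=False))
# prints "skill"
```

The "cannot be skipped" check comes first because it is the only requirement a prompt-based lever cannot guarantee.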
For anyone who feels like their CLAUDE.md is overloaded, I have one suggestion. Pick the rule that gets missed most often and break it out into a skill. Write the when condition as specifically as you can — something like “P&L calculations.” The moment you watch it fire correctly, you’ll understand in your gut what it means to write logic in natural language.