Why Pre-Commit Hooks Beat Rules Files for AI Coding Tools

By Sandeep Roy · April 10, 2026 · 8 min read

There's a thread on anthropics/claude-code#33603 that I keep coming back to. It's the usual shape: a developer wrote a detailed CLAUDE.md, spent real time on it, and watched Claude Code cheerfully ignore one of the rules three prompts later. The comments pile up. A dozen people share the same story with different verbs. Then, halfway down, a maintainer drops the line that explains the entire situation:

"Hooks with exit 2 are the only mechanism that actually enforces anything. Everything above that layer is guidance."

That comment got quoted three more times in the thread. It was the quiet consensus of everyone who had actually debugged a drift incident. And it is the most honest thing anyone at Anthropic has said about rules files in a year. If you're wondering why your CLAUDE.md keeps losing to your prompts, this post is the explanation and the fix.

The Layered Model

AI coding tools like Claude Code, Cursor, and Copilot have a layered rules system whether you realize it or not. Each layer has a completely different power level. People confuse them constantly and then get mad when the weakest layer fails to stop a real violation.

Layer 1: Guidance

CLAUDE.md / .cursorrules / AGENTS.md. Plain text in a file. The model reads it as part of the system prompt and usually respects it — until the context window fills up, or the user's latest message contradicts it, or the rule is phrased as a positive ("ALWAYS") instead of a negative ("NEVER"). This layer is advisory. Nothing stops the model from violating it.

Layer 2: Permission

Tool permissions + allow/deny lists. Claude Code and Cursor let you gate which tools the agent can call. You can deny Bash(rm *), for example. This is stronger than guidance but brittle: the model will happily find a different verb, switch to a different tool, or ask for permission in a way that gets users to click "yes" on instinct.

Layer 3: Enforcement

Hooks that return exit code 2. Claude Code exposes a hooks system with events such as PreToolUse, PostToolUse, Stop, and Notification. When a PreToolUse hook exits with code 2, Claude Code treats the action as blocked: the tool call does not execute and the model is told why. This is the first layer that physically stops the agent, and the only one that actually enforces anything.

Once you see the layers, the pattern becomes obvious: rules files are documentation, permissions are a chain-link fence, and hooks are the wall. Documentation is fine for intent. Fences are fine for lazy actors. Only walls stop determined ones — and modern AI agents are more determined than you think, because they have nothing else to do but try to complete your request.

What a Hook Actually Looks Like

Claude Code hooks are configured in .claude/settings.json. Here's a minimal PreToolUse hook that blocks a short list of destructive Bash commands, such as rm -rf on an absolute path:

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "node .claude/hooks/block-destructive.js"
          }
        ]
      }
    ]
  }
}

And the script itself:

#!/usr/bin/env node
// .claude/hooks/block-destructive.js
const input = JSON.parse(require('fs').readFileSync(0, 'utf-8'));
const cmd = input.tool_input?.command || '';

const DESTRUCTIVE = [
  /rm\s+-rf\s+\//,
  /DROP\s+TABLE/i,
  /TRUNCATE\s+/i,
  /git\s+push\s+--force/,
  /git\s+reset\s+--hard/
];

if (DESTRUCTIVE.some(re => re.test(cmd))) {
  console.error(`BLOCKED: "${cmd}" matches a destructive pattern.`);
  process.exit(2);  // <—— this is the magic number
}
process.exit(0);

That's it. The hook reads the tool call as JSON on stdin, inspects the command, and exits with code 2 if something looks bad. Claude Code sees the exit code 2, refuses to run the tool, and feeds the stderr message back to the model so it knows what happened and can try a different approach. The specific code matters: other non-zero exit codes are reported as hook errors but do not block the tool call.

Compare that to the CLAUDE.md equivalent:

# In CLAUDE.md
Never run `rm -rf` or `DROP TABLE`. Never force-push to main.

Both of these communicate the same intent. Only one of them actually prevents the action. The markdown version is at the mercy of the next context window overflow. The hook will stop the action even if Claude forgets the rule, decides to interpret it creatively, or is running in a session where CLAUDE.md was never loaded in the first place.
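One way to gain confidence in a hook like the one above is to pull the decision out of the script and test it without starting a Claude Code session. The sketch below is a hypothetical refactor, not anything Claude Code ships: `decide` and `samplePayload` are names invented here, and the payload contains only the fields the hook above actually reads.

```javascript
// Hypothetical refactor of block-destructive.js: the decision is a pure
// function, so it can be unit-tested without piping JSON through stdin.
const DESTRUCTIVE = [
  /rm\s+-rf\s+\//,
  /git\s+push\s+--force/,
];

// Returns the exit code the hook should use, plus the stderr message.
// 0 = allow the tool call, 2 = block it and show `message` to the model.
function decide(payload) {
  const cmd = payload?.tool_input?.command ?? '';
  const hit = DESTRUCTIVE.find((re) => re.test(cmd));
  return hit
    ? { exitCode: 2, message: `BLOCKED: "${cmd}" matches ${hit}` }
    : { exitCode: 0, message: '' };
}

// Sample payload (assumed shape; only the fields this sketch reads).
const samplePayload = {
  tool_name: 'Bash',
  tool_input: { command: 'git push --force origin main' },
};

console.log(decide(samplePayload).exitCode); // 2
console.log(decide({ tool_input: { command: 'ls -la' } }).exitCode); // 0
```

In the real hook, a thin wrapper would parse stdin, call `decide`, write `message` to stderr, and pass `exitCode` to `process.exit`.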

Why People Resist Hooks (And Why They're Wrong)

Every time I bring this up, someone says "but writing hooks is a pain." They're not wrong — it is a pain, if you're writing them yourself. You have to learn the JSON schema, handle tool-specific input shapes, anticipate the verbs an AI might use, and keep the list of destructive patterns in sync with reality. The hook above catches rm -rf / but not find / -delete. Or :(){ :|:& };:. Or the twelve different ways an AI can destroy your database without touching the string "DROP".

This is where hand-rolled hooks fall over. They are string matchers. AI agents are semantic generators. String matchers lose to semantic generators. This is the entire reason the "Euphemism Cloaking" pattern exists in the wild: an agent asked to "clean up old data" never needs to type DROP TABLE (a DELETE FROM does the job), so your DROP TABLE regex never fires. The wall is still there. The agent just walked around it.

The real problem with hand-rolled hooks: they catch the commands they know about. AI agents generate commands the author didn't think of. A regex-based hook will block rm -rf and miss sqlite3 prod.db "DELETE FROM users". You can't write a regex for "every destructive verb in English and SQL." You need semantic understanding.
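The gap is easy to reproduce. The sketch below runs the pattern list from the hand-rolled hook against a few commands that are all destructive in intent. The command list is made up for illustration, and `isBlocked` is a helper name introduced here.

```javascript
// The same patterns as the hand-rolled hook shown earlier.
const DESTRUCTIVE = [
  /rm\s+-rf\s+\//,
  /DROP\s+TABLE/i,
  /TRUNCATE\s+/i,
  /git\s+push\s+--force/,
  /git\s+reset\s+--hard/,
];

const isBlocked = (cmd) => DESTRUCTIVE.some((re) => re.test(cmd));

// Illustrative commands, all destructive in intent.
const commands = [
  'rm -rf /var/www',
  'find / -delete',
  'sqlite3 prod.db "DELETE FROM users"',
  'rm -fr /var/www',          // same flags, different order
  'git push -f origin main',  // short form of --force
];

for (const cmd of commands) {
  console.log(`${isBlocked(cmd) ? 'BLOCKED' : 'missed '}  ${cmd}`);
}
```

Only the first command is blocked. The other four get through, including two that differ from a listed pattern only by flag spelling.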

The SpecLock Layer

SpecLock sits at Layer 3 — the enforcement layer — but it doesn't require you to hand-write matcher scripts. When you run npx speclock protect, it installs a pre-tool-use hook and a git pre-commit hook that route every proposed action through a semantic engine instead of a regex list.

Here's the actual hook SpecLock installs:

#!/usr/bin/env node
// .claude/hooks/speclock-check.js
const input = JSON.parse(require('fs').readFileSync(0, 'utf-8'));
const { check } = require('speclock/hook');

const result = check({
  tool: input.tool_name,
  toolInput: input.tool_input,
  userPrompt: input.user_prompt
});

if (result.verdict === 'BLOCK') {
  console.error(`SpecLock BLOCK: ${result.reason}`);
  console.error(`  Lock hit: "${result.lock}"`);
  console.error(`  Match type: ${result.matchType}`);
  console.error(`  Confidence: ${result.confidence}`);
  console.error(`  Override: run \`speclock override\` with justification`);
  process.exit(2);
}
process.exit(0);

The difference from the hand-rolled version isn't visible in the hook file itself. It's in what check() does. That function:

Reads your CLAUDE.md, .cursorrules, and AGENTS.md and compiles the rules into typed locks.
Tokenises the incoming action and runs it through a 65+ synonym group map covering destructive verbs, constructive verbs, security actions, framework swaps, and domain concepts.
Detects euphemism cloaking (clean up, sweep away, tidy, retire) and maps it back to the underlying intent.
Decomposes compound sentences so "update the UI and drop the users table while we're in there" is checked as two separate actions.
Raises severity on temporal evasion modifiers like "temporarily" or "just for now".
Returns a verdict with confidence, lock reference, and a human-readable reason — which gets written to stderr so the model sees it and can adapt.
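To make two of those steps concrete (synonym mapping and compound decomposition), here is a deliberately tiny toy. It is not SpecLock's implementation, data, or API: the synonym list, the `lock` shape, and the function names `splitActions`, `intentOf`, and `toyCheck` are all assumptions made up for this illustration.

```javascript
// Toy illustration only: NOT SpecLock's implementation or data.
// A tiny synonym map from phrases to a canonical intent.
const SYNONYMS = {
  DELETE: ['delete', 'drop', 'remove', 'clean up', 'sweep away', 'tidy', 'retire'],
};

// Hypothetical lock: forbids one intent on one subject.
const lock = {
  text: 'Never delete patient records under any circumstances',
  intent: 'DELETE',
  subject: 'patient records',
};

// Split a compound request into clauses on "and" / "then" / commas.
function splitActions(sentence) {
  return sentence
    .toLowerCase()
    .split(/\s+(?:and|then)\s+|[,;]\s*/)
    .map((s) => s.trim())
    .filter(Boolean);
}

// Return the canonical intent a clause maps to, or null if none matches.
function intentOf(clause) {
  for (const [intent, phrases] of Object.entries(SYNONYMS)) {
    if (phrases.some((p) => new RegExp(`\\b${p}\\b`).test(clause))) return intent;
  }
  return null;
}

// Check each clause separately against the lock.
function toyCheck(sentence) {
  for (const clause of splitActions(sentence)) {
    if (intentOf(clause) === lock.intent && clause.includes(lock.subject)) {
      return { verdict: 'BLOCK', clause, lock: lock.text };
    }
  }
  return { verdict: 'ALLOW' };
}

console.log(toyCheck('Update the UI and clean up old patient records').verdict); // BLOCK
console.log(toyCheck('Archive old patient records').verdict); // ALLOW
```

Even this toy catches "clean up" where a DROP TABLE regex would not, though a phrase list this small will obviously miss most real-world wording.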

When the model tries "clean up old patient records," SpecLock's hook fires, exits with code 2, and feeds this back to Claude:

SpecLock BLOCK: "clean up" maps to destructive verb group
  Lock hit: "Never delete patient records under any circumstances"
  Match type: EUPHEMISM → DELETE
  Confidence: HIGH (100%)
  Override: run `speclock override` with justification

Claude sees the block, understands which constraint was violated, and typically responds by proposing a safer action (archive, soft-delete, request confirmation). The rule went from "hope the model remembers it" to "physically impossible to bypass without leaving an audit trail." That is the gap between Layer 1 and Layer 3, and it is the gap every production team needs to close.

But You Still Need CLAUDE.md

This is not an argument for deleting your rules file. CLAUDE.md is still useful — it tells the model what you care about before the hook has to fire. A well-written CLAUDE.md reduces the number of hook-blocks you'll see, because the model rarely proposes the blocked action in the first place. The layers are complements, not substitutes.

Think of it in road-safety terms. CLAUDE.md is the "drive carefully" sign. Hooks are the airbag. You want both. The sign reduces the probability of the crash; the airbag makes sure you survive the ones that still happen. Removing the sign because the airbag exists is as dumb as removing the airbag because the sign exists.

The layered playbook: Write clear rules in CLAUDE.md so the model rarely tries to violate them. Install SpecLock's pre-tool-use + pre-commit hooks so the ones that slip through get blocked with a semantic verdict. Review overrides weekly to see what rules need tightening. That's the whole loop.

Try It

If you already have a CLAUDE.md file, SpecLock will read it and install hooks in under a minute. If you don't, start with one of the built-in rule packs:

npx speclock protect
npx speclock init --from nextjs   # or react, fastapi, rails, python, node
npx speclock mcp install claude-code

The first command installs the pre-commit hook and the Claude Code PreToolUse hook. The second scaffolds a framework-specific CLAUDE.md with 14-15 battle-tested rules. The third registers SpecLock as an MCP server so the full 51-tool API is available inside your Claude Code sessions. No signup, no API key for the heuristic engine, offline-capable.

CLAUDE.md is guidance. Hooks are enforcement. You need both.

SpecLock automates the Layer 3 wall with semantic matching — no regex list to maintain, no synonyms to guess, no false positives on legitimate work.

npx speclock protect
