We outgrew CLAUDE.md: building a knowledge layer that compounds
Earlier this year, Boris Cherny, the creator of Claude Code, published a thread on how he and his team use the CLI they built. A dozen tips covering everything from running parallel sessions to slash commands to subagents. The one I kept circling back to: the shared CLAUDE.md that their entire team feeds into (what he calls compound engineering, borrowing from Dan Shipper).
> Our team shares a single CLAUDE.md for the Claude Code repo. We check it into git, and the whole team contributes multiple times a week. Anytime we see Claude do something incorrectly we add it to the CLAUDE.md, so Claude knows not to do it next time.
I've been running with that idea at Wisetax, and ended up extending it into a broader knowledge layer. Here's why, what we built, and how it compounds.
README.md, CLAUDE.md, then what?
Every repo already has a README.md. That's for us humans to read: onboarding, setup, contributing.
Then there's CLAUDE.md. Essentially agent literature. It gets loaded into every Claude Code session: repo-wide conventions, common commands, guardrails, environment assumptions. Following Cherny's idea, the whole team updates it constantly. In practice, that means it grows. Fast.
In some of our repos, it hit 700+ lines. The catch: the official docs recommend keeping CLAUDE.md under ~200 lines. Anthropic's context engineering guide frames it this way: context is a "finite resource with diminishing marginal returns". The more low-signal tokens you load, the less reliable the agent becomes. Stuffing everything into one auto-loaded file works against that.
> Like humans, who have limited working memory capacity, LLMs have an "attention budget" that they draw on when parsing large volumes of context. Every new token introduced depletes this budget by some amount.
In our experience, domain-specific context, design rationale for particular subsystems: all of that is too detailed and too volatile for a single file. So we created `.claude/knowledge/`. Inside: one markdown file per topic, versioned in the repo. A place to capture tribal knowledge, the stuff that lives in developers' heads and gets lost between sessions.
Each file covers a slice of the system: a specific piece of the data pipeline, the intricacies of a retrieval layer, the motivation behind the design of the evaluation framework, etc.
In CLAUDE.md, we point to them so Claude knows when to read what:
```markdown
<!-- CLAUDE.md -->
...
## Knowledge

| File                                    | When to read                                               |
|-----------------------------------------|------------------------------------------------------------|
| `.claude/knowledge/architecture.md`     | Overall system design, endpoints, routing, streaming       |
| `.claude/knowledge/agent-framework.md`  | Building/modifying agents with WisebrainAgent              |
| `.claude/knowledge/autonomous-agent.md` | Working on the main chat agent, its tools, or prompts      |
| `.claude/knowledge/retrieval.md`        | Search system, semantic/keyword retrieval, corpus taxonomy |
| `.claude/knowledge/elasticsearch.md`    | ES indices, query builders, document structure             |
| `.claude/knowledge/plan-navigation.md`  | BOFIP/LEGI plan traversal, plan controllers                |
| `.claude/knowledge/testing.md`          | Writing or running tests, fixtures, evaluation scripts     |
| `.claude/knowledge/evaluations.md`      | Agent evaluation datasets, evaluators, LangSmith setup     |
...
```

This way, Claude only reads the files it needs. If the current PR is about evaluations, it pulls in the `evaluations.md` knowledge file and ignores the rest. This is what makes the approach scale: each file can go into specifics that genuinely help the agent, because only relevant files get loaded into context.
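To make the mechanics concrete, here's a rough sketch of the lookup the table implies: match a task against each "When to read" hint and load only the files that overlap. Purely illustrative; Claude makes this judgment call itself, and the helper below is invented for the example.

```python
# Hypothetical sketch: pick knowledge files by keyword overlap with the
# "When to read" column. Claude does this judgment call itself; this
# helper and its table excerpt exist only to illustrate the idea.
KNOWLEDGE_TABLE = {
    ".claude/knowledge/retrieval.md": "search system, semantic/keyword retrieval, corpus taxonomy",
    ".claude/knowledge/testing.md": "writing or running tests, fixtures, evaluation scripts",
    ".claude/knowledge/evaluations.md": "agent evaluation datasets, evaluators, LangSmith setup",
}

def relevant_files(task: str) -> list[str]:
    # Naive word overlap between the task and each hint.
    words = set(task.lower().split())
    return [
        path
        for path, hint in KNOWLEDGE_TABLE.items()
        if words & set(hint.lower().replace(",", " ").replace("/", " ").split())
    ]

print(relevant_files("fix the corpus taxonomy for retrieval"))
# ['.claude/knowledge/retrieval.md']
```

A real session doesn't run code like this, of course: the table and its hints are enough for the model to make the same call on its own.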
What about Claude's auto memory?
Claude Code has an auto memory feature: notes Claude writes for itself based on corrections and preferences. They are stored locally under `~/.claude/projects/<repo>/memory/`. It's per-machine and auto-managed: not versioned, not shared. If a collaborator starts a session, they don't get my memory.
Repo-scoped knowledge is the opposite. It's checked in. It goes through PRs. Every engineer on the team gets the same context. Every session starts from the same baseline regardless of who's running it. The name is deliberate: "knowledge", not "memory", to keep the two separate in practice.
What to store
Knowledge files hold two kinds of things: know-why and know-how.
Know-why is arguably the more obvious one. I learn something about the codebase I didn't know, or had forgotten. The moment I document it, that's the last time Claude figures it out from scratch. I might forget next time. Claude won't.
At Wisetax, we curate an enriched Elasticsearch index built from a large corpus of French legal texts. At retrieval, our search system splits queries across corpus groups to mitigate language-level differences in the embedding model. Without that context, Claude doesn't know why the retrieval code fans out into three separate queries instead of one. It would figure it out eventually, after burning time and tokens reading through the code:
```markdown
<!-- .claude/knowledge/retrieval.md -->
...
## Partitioned retrieval

Splits corpus into 3 groups to handle language style differences in embeddings:

- `law`: CGI, LPF, CIBS, CCOM, CSS, CTRAV, CMF, CCIV, CJA
- `guidelines`: BOI, BOSS, COMPTA
- `other`: EUR, INT, DOUANE, CADF, JADE, INCA, CASS, CAPP, CGIANX*, PLF, PLFSS, EXTERNAL, NOTICE

Runs each group as a parallel batch with retry (3 attempts).
...
```
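As an aside, the fan-out the note describes can be sketched in a few lines of Python. This is a toy illustration, not our retrieval code: the group contents come from the knowledge file, while `search_group` and the other function names are stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor

# Corpus groups from the knowledge file; split to handle language-style
# differences in the embedding model.
CORPUS_GROUPS = {
    "law": ["CGI", "LPF", "CIBS", "CCOM", "CSS", "CTRAV", "CMF", "CCIV", "CJA"],
    "guidelines": ["BOI", "BOSS", "COMPTA"],
    "other": ["EUR", "INT", "DOUANE", "CADF", "JADE", "INCA", "CASS",
              "CAPP", "CGIANX*", "PLF", "PLFSS", "EXTERNAL", "NOTICE"],
}

def search_group(query: str, corpora: list[str]) -> list[str]:
    # Stand-in for one scoped Elasticsearch query; returns fake hits.
    return [f"{corpus}:{query}" for corpus in corpora[:2]]

def search_with_retry(query: str, corpora: list[str], attempts: int = 3) -> list[str]:
    # Retry each group's query up to `attempts` times before giving up.
    for attempt in range(attempts):
        try:
            return search_group(query, corpora)
        except Exception:
            if attempt == attempts - 1:
                raise
    return []

def partitioned_search(query: str) -> dict[str, list[str]]:
    # One query per group, run in parallel; hits stay partitioned by group.
    with ThreadPoolExecutor() as pool:
        futures = {
            group: pool.submit(search_with_retry, query, corpora)
            for group, corpora in CORPUS_GROUPS.items()
        }
        return {group: future.result() for group, future in futures.items()}

print(partitioned_search("abattement")["law"])  # ['CGI:abattement', 'LPF:abattement']
```

With that context checked in, Claude doesn't have to reverse-engineer why the code issues three queries instead of one.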
Know-how is procedure: the sequence of jobs, the commands to run, the filenames that matter. At Wisetax, we ingest legal texts from multiple sources, each with its own pipeline. Take "BOSS" (for Bulletin officiel de la sécurité sociale), a French government publication that goes through seven distinct jobs before it's indexed and searchable:
```markdown
<!-- .claude/knowledge/boss-pipeline.md -->
...
## Pipeline order

1. **Crawler** (manual, docker compose in `scripts/boss_crawler/`) -> `boss-raw` ES staging index
2. **SOURCE BOSS** (pace5/3d) -> compares `boss-raw` latest timestamp vs last batch, creates new batch
3. **REGISTER BOSS** (pace4/24h) -> fetches HTML, parses versions, uploads to S3, inserts docs in PG
4. **INDEX** (pace0/1min, shared) -> indexes docs to `wisepipe-all-3` ES
5. **PLAN BOSS** (pace5/3d) -> builds static plan from ontology config, indexes to ES
6. **VERSION BOSS** (pace4/24h) -> sets VIGUEUR/MODIFIE on versioned docs
7. **SELECT CHUNKABLE / CHUNK / EMBED / INDEX CHUNK** (shared, frequent)
...
```

The full file covers about a hundred lines: section codes, S3 path conventions, Elasticsearch gotchas, differences from our other main pipeline. This wasn't born from a single session. It's the accumulated map of a dense subsystem, built up over time.
Before it existed, every BOSS PR started the same way: find the relevant Notion tickets, skim through past PRs for context, paste a summary into Claude to catch it up. Now the context is sitting in a knowledge file, ready for Claude to pull in the moment BOSS comes up.
What to leave out
- What files were changed (git knows this)
- Config values, function signatures (read the code)
- Session logs or step-by-step accounts
- Deprecated features (delete from knowledge files)
The heuristic we use: "Would this help Claude understand the system 3 months from now?" If not, cut it.
The update loop
None of this works if the knowledge goes stale. What makes it compound over time: at the end of each session, we call `/update-knowledge`, a custom skill that reviews the conversation, checks existing knowledge files, and decides whether the repo's instruction surface needs updating.
No friction, we just run it. Half of the time, nothing changes. When it does, it's almost always a targeted edit to a knowledge file. CLAUDE.md moves rarely, maybe once every few weeks when a repo-wide rule shifts. Either way, it goes through a PR like any other change.
Here's what the skill looks like:
```markdown
<!-- .claude/skills/update-knowledge.md -->
---
name: update-knowledge
description: Update project knowledge base after a session
---

Review the current conversation and update project knowledge.

## Steps

1. Read `CLAUDE.md` to see the knowledge table and existing file list
2. Read all files in `.claude/knowledge/`
3. Review the conversation and recent git diff
4. Update knowledge:
   - **Existing file needs update**: edit the relevant `.md`
   - **New feature/topic**: create a new `.md` and add a row
     to the Knowledge table in `CLAUDE.md`
   - **General pattern discovered**: add to `CLAUDE.md` directly
5. Cleanup pass: check all existing knowledge files for stale
   content. Trim or delete as needed.
...
```

Claude decides what's worth keeping and what to cut: stale content gets trimmed, not archived.
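Step 5's cleanup pass is a judgment call for Claude, but a mechanical first filter is easy to imagine. Here's a hypothetical helper (not part of the actual skill) that flags knowledge files no commit has touched in 90 days, using git's last-commit timestamp:

```python
import datetime
import pathlib
import subprocess

def stale_knowledge_files(root: str = ".claude/knowledge", days: int = 90) -> list[str]:
    # Flag files whose last git commit is older than `days` as review candidates.
    cutoff = datetime.datetime.now() - datetime.timedelta(days=days)
    stale = []
    for path in sorted(pathlib.Path(root).glob("*.md")):
        # %ct = committer date of the last commit touching this file, as a unix timestamp.
        out = subprocess.run(
            ["git", "log", "-1", "--format=%ct", "--", str(path)],
            capture_output=True, text=True,
        ).stdout.strip()
        if out and datetime.datetime.fromtimestamp(int(out)) < cutoff:
            stale.append(str(path))
    return stale
```

Age alone is a weak signal (stable subsystems have stable docs), which is why the skill leaves the final trim-or-keep call to Claude rather than to a script like this.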
After a few weeks of running this, the effect is hard to miss. Claude kickstarts sessions with context that used to require manual pasting. No need to tag a file or invoke a command. Claude reads the knowledge table, sees what's relevant, and loads it. The knowledge files fill in gradually, effortlessly.
Plus, the workflow changes how we think about sessions. We used to keep a single session alive as long as possible to preserve context. Now we work on a piece of the problem, run `/update-knowledge`, and start fresh. Shorter sessions mean a leaner context window. A leaner context means a more reliable Claude.
Every session leaves the next one a little better equipped.