Stop feeding raw docs to your LLM on every query. Build a three-layer vault in Obsidian — immutable sources, an LLM-maintained wiki, and generated outputs — and let Claude Code handle the bookkeeping. Knowledge compounds instead of getting reinvented each time.
Most people use LLMs like a search engine with amnesia. Upload docs, ask, get chunks, repeat. Tomorrow the same question pulls three different chunks and nothing sticks. Andrej Karpathy's knowledge base pattern fixes that by making the LLM maintain a persistent wiki instead of searching raw files every time.
The setup is three folders and one config file. The outcome is a knowledge base that gets richer with every source and every question you add to it.
Why RAG alone doesn't compound
RAG — retrieval-augmented generation — is how NotebookLM, ChatGPT file upload, and most document Q&A tools work. The LLM searches chunks per query and discards the result. That has real costs:
- No accumulation. Every question starts from zero.
- No synthesis. Information across sources doesn't get linked.
- No updating. New data doesn't correct old conclusions.
- No structure. Knowledge is scattered chunks with no cross-references.
Collect 50 articles and the LLM still pulls three of them per question. You never see the full picture.
The pattern: a persistent, compounding wiki
Instead of searching raw docs, the LLM incrementally builds and maintains a structured collection of markdown files. Cross-links are pre-computed. Contradictions are flagged. Synthesis reflects everything you've read so far.
The split of responsibilities is the whole trick:
- You curate sources, set direction, ask questions, think about meaning.
- The LLM summarizes, cross-references, updates on new data, keeps the bookkeeping consistent.
People abandon wikis because maintenance grows faster than value. An agent that never gets tired and can update fifteen files in one pass changes that math.
Three layers, one config file
The whole architecture fits on a napkin:
- raw/ — immutable source documents. You add, the LLM reads, nobody modifies.
- wiki/ — the LLM-generated wiki. The agent owns it fully.
- outputs/ — reports and answers. Valuable ones get promoted back into the wiki.
- CLAUDE.md — the schema that tells Claude how the system works.
Two files inside wiki/ carry most of the weight:
- wiki/index.md — a catalog of every page grouped by domain. Claude reads it first on every query.
- wiki/log.md — an append-only chronological log. Every ingest, query, and lint gets a line.
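If you'd rather seed these two files yourself than have Claude generate them, a minimal starting point might look like this (the section names and the log line are placeholders, not a required layout):

```shell
# Seed minimal index.md and log.md files; Claude restructures them
# as the wiki grows. Section names and the log entry are placeholders.
mkdir -p wiki
cat > wiki/index.md <<'EOF'
# Wiki index

## ai
(no pages yet)

## infrastructure
(no pages yet)
EOF

cat > wiki/log.md <<'EOF'
# Log

- 2026-04-06: vault created, no sources ingested yet.
EOF
```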
Setup in 15 minutes
You need Obsidian for the graph view and Claude Code as the agent. Open a terminal inside your vault:
```shell
mkdir -p raw/articles raw/papers raw/notes raw/telegram raw/images
mkdir -p wiki/_templates outputs
```
Or skip it and tell Claude Code "create the LLM Knowledge Base structure per Karpathy's pattern" — it will set everything up, including starter index.md and log.md files.
CLAUDE.md: the brain of the system
Without CLAUDE.md, Claude is a chatbot. With it, Claude is a disciplined wiki maintainer. The file lives at the vault root and should cover:
- The three-layer architecture (raw, wiki, outputs)
- Frontmatter schema for every wiki page
- Naming conventions (kebab-case for wiki, date-prefix for articles)
- The three operations: Ingest, Query, Lint
- Your domains of interest
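A compressed sketch of such a CLAUDE.md, written from the terminal. The domain names and exact wording here are illustrative, not a canonical template:

```shell
# Write a minimal CLAUDE.md at the vault root.
# Domain names and rule wording are examples only.
mkdir -p raw wiki outputs
cat > CLAUDE.md <<'EOF'
# Knowledge base rules

## Layers
- raw/: immutable sources. Read, never modify.
- wiki/: your wiki. Create, update, and cross-link pages freely.
- outputs/: generated reports and answers.

## Conventions
- Wiki pages: kebab-case names, frontmatter per the schema in this file.
- Articles in raw/: YYYY-MM-DD-title.md.

## Operations
- Ingest: read a new raw/ file fully, update wiki pages, index.md, log.md.
- Query: read wiki/index.md first, then only the relevant pages.
- Lint: find contradictions, orphans, and unsourced claims across wiki/.
EOF
```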
A minimal frontmatter schema to require on every wiki page:
```yaml
---
type: entity | concept | project | person | summary
domain: ai | communities | infrastructure | content
created: YYYY-MM-DD
updated: YYYY-MM-DD
sources: ["[[raw/articles/filename]]"]
tags: []
---
```
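Here is the schema filled in for a hypothetical page. Every value, including the page name, is illustrative:

```shell
# A hypothetical wiki page using the schema; all values are made up.
mkdir -p wiki
cat > wiki/ai-agents.md <<'EOF'
---
type: concept
domain: ai
created: 2026-04-06
updated: 2026-04-06
sources: ["[[raw/articles/2026-04-06-title]]"]
tags: [agents]
---

# AI agents

Summary written and maintained by the LLM, with [[wiki-links]]
to related pages.
EOF
```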
Start small. Karpathy's own advice: "super simple and flat." Expand CLAUDE.md only when you hit a real problem.
Three operations that run the whole system
Ingest
You drop a file into raw/ and say: "Process the new file in raw/articles/2026-04-06-title.md." Claude reads the source in full, discusses takeaways with you, creates or updates wiki pages, refreshes index.md and log.md, and wires up [[wiki-links]] between related pages. One source typically touches 5-15 wiki pages.
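From a terminal at the vault root, the same ingest can be kicked off non-interactively. This assumes the claude CLI is installed; the filename matches the example above, and the source content is a stand-in:

```shell
# Drop a source into raw/ and hand the ingest to Claude Code.
mkdir -p raw/articles
printf '# Example article\n\nBody text.\n' > raw/articles/2026-04-06-title.md

# Print mode (-p) runs a single prompt and exits; guard it so the
# script still completes on machines without the claude CLI.
if command -v claude >/dev/null 2>&1; then
  claude -p "Process the new file in raw/articles/2026-04-06-title.md: read it in full, update wiki pages, index.md, and log.md."
fi
```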
Query
Once you have ten or more pages, start asking real questions:
- "What are the three main trends in AI agents across everything in wiki/?"
- "Compare article X and article Y. Where do they diverge?"
- "Write a 500-word brief on topic X using only materials in the base."
Claude reads index.md, pulls the relevant pages, and synthesizes with citations. Save the valuable answers back into outputs/ or promote them to a new wiki/ page. That's the compounding step.
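The promotion step can be as mundane as a copy. Filenames and content here are hypothetical; Claude should still add frontmatter and register the new page afterwards:

```shell
# Save a query answer into outputs/, then promote it to a wiki page.
# Filenames and content are hypothetical.
mkdir -p outputs wiki
printf '# AI agent trends\n\nSynthesized answer with citations.\n' \
  > outputs/2026-04-06-agent-trends.md
cp outputs/2026-04-06-agent-trends.md wiki/agent-trends.md
# Afterwards, ask Claude to add frontmatter and register the new
# page in wiki/index.md.
```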
Lint
Once a month, ask: "Check the entire wiki/. Find contradictions between pages, orphan pages, claims without sources, topics mentioned but not developed. Suggest three articles to fill gaps."
Getting data in
Most of your sources will come through one of these:
- Obsidian Web Clipper — browser extension that drops .md straight into raw/articles/. Easiest path.
- Copy-paste — create a markdown file manually.
- Claude Code — "save this article to raw/" and it handles the fetch.
- yt-dlp — for YouTube transcripts:
```shell
yt-dlp --write-auto-sub --sub-lang en --skip-download -o "raw/notes/%(title)s" URL
```
After a new file lands, run Ingest. Don't let raw/ turn into a graveyard of unprocessed files — value lives in the wiki, not the pile of sources.
Using Obsidian as the viewer
Obsidian isn't strictly required, but two of its features earn their keep:
- Graph view (Ctrl/Cmd+G) — hubs are concepts, orphans need links or deletion, clusters show topic neighborhoods.
- Dataview plugin — SQL-ish queries over frontmatter. Useful for "show me every page updated this week by domain."
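As one illustration of the Dataview idea, here is a sketch of a note carrying exactly that "updated this week" query (written from the shell; the query body is Dataview DQL, and the note name is made up):

````shell
# A note whose body is a Dataview query: every wiki page updated
# in the last 7 days, sorted by domain. The note name is made up.
mkdir -p wiki
cat > wiki/recently-updated.md <<'EOF'
```dataview
TABLE domain, updated
FROM "wiki"
WHERE updated >= date(today) - dur(7 days)
SORT domain ASC
```
EOF
````

Open the note in Obsidian with the Dataview plugin enabled and the table renders in place.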
For vaults under ~100 pages, Obsidian's built-in search plus index.md is enough. Beyond that, add qmd for local BM25 + vector search with LLM re-ranking.
Do and don't
Do: load sources one at a time and participate in the discussion, save valuable answers back into the wiki, run Lint monthly, use [[wiki-links]] liberally, keep CLAUDE.md current, commit everything to git.
Don't: edit wiki pages by hand (ask the LLM), modify raw/ files, chase a plugin stack (47 Obsidian plugins is the Notion trap), or hoard raw/ without processing.
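The git habit can be one small ritual per ingest, so every diff shows exactly what the agent touched. The inline identity flags below are only there to keep the example self-contained:

```shell
# Snapshot the vault so each ingest is one commit and diffs show
# exactly what the agent changed in wiki/.
mkdir -p raw wiki outputs
printf '# Log\n' > wiki/log.md
git init -q
git add -A
# The -c identity flags just keep this example self-contained;
# normally your global git identity applies.
git -c user.name=kb -c user.email=kb@example.com commit -q -m "ingest: seed vault"
```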
The bottom line
Three folders, one schema file, one agent. Pick a topic you actually care about, drop in the notes and articles you've already collected, and let Claude do the bookkeeping. In a month you'll have a wiki that answers questions your previous self couldn't. For deeper prompting technique, see the prompt engineering guide. To stretch your token budget while you build it, read Save Claude Tokens by Integrating NotebookLM, or jump to the full Claude Code cheatsheet.