When Karpathy Validated Our Homework
On 2nd April 2025, Andrej Karpathy posted about how he maintains a personal knowledge base using LLMs. If you don’t know the name: co-founder of OpenAI, former head of AI at Tesla, one of the most respected researchers in the field. When he shares how he does something, people pay attention.
His approach was refreshingly simple. Use LLMs as “compilers” that take raw source material and process it into structured wiki articles. No vector databases. No RAG pipelines. Just markdown files organised into a raw/ directory for source material and a wiki/ directory for compiled, structured articles. Periodic linting passes to keep quality up. Obsidian as the interface. The whole thing versioned and human-readable.
I read it, sat back, and had one of those moments where you’re not sure whether to feel validated or slightly unnerved.
We were already doing this.
The vault came first
Jeff’s knowledge system started in November 2024 with an Obsidian vault. Nothing groundbreaking at the time, just a growing collection of markdown files where I dumped everything I wanted an AI to know about my world. Infrastructure docs, code patterns, debugging notes, personal preferences. Structured knowledge in a format that could be fed into a context window.
The vault grew organically. Every time I caught myself re-explaining something to an LLM, it became a new note. The organisation was wiki-style from the start, because that’s how the knowledge naturally wanted to be structured. Categories, cross-references, hierarchical topics. All markdown, all plain text, all readable by humans and machines alike.
When we built Jeff’s brain system properly, the pattern formalised further. A raw/ directory for ingested sources. A wiki/ directory for compiled knowledge articles. Hierarchical indexes. Backlinking between related topics. Health checks to catch staleness and inconsistency. Git-backed persistence so the whole thing syncs between machines and every change is versioned.
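A health check of this kind can be tiny. As an illustrative sketch rather than Jeff's actual implementation, here is one lint pass: finding wikilinks that point at no article in the wiki/ directory. The `[[Target]]` link syntax and directory name are assumptions drawn from the description above.

```python
import re
from pathlib import Path

# Matches the target of [[Target]], [[Target|alias]], or [[Target#heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")

def broken_links(wiki_dir):
    """Report (page, target) pairs where a wikilink has no matching article."""
    wiki = Path(wiki_dir)
    titles = {p.stem for p in wiki.rglob("*.md")}
    problems = []
    for page in wiki.rglob("*.md"):
        for target in WIKILINK.findall(page.read_text()):
            if target.strip() not in titles:
                problems.append((page.name, target.strip()))
    return problems
```

Run periodically (or from a git hook), a pass like this catches dangling references the moment an article is renamed rather than months later.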
Sound familiar?
What we hadn’t done yet
I’m not going to pretend we had the complete picture before Karpathy’s post. We didn’t. The vault existed, the structure was right, but there were pieces we hadn’t built yet.
The formal compilation pipeline, which takes raw ingested material and systematically compiles it into polished wiki articles, came after reading his post. So did the linting system that periodically checks articles for staleness, broken references, and structural problems. The incremental compilation with token budgeting, which processes only what has changed and stays within context limits, was directly inspired by his thinking about LLMs as compilers rather than chat partners.
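To make "incremental, with a budget" concrete, here is a minimal sketch, not the real pipeline: a greedy planner that picks changed sources until a token budget is spent. The characters-per-token estimate is a deliberately crude stand-in for a real tokenizer.

```python
def estimate_tokens(text):
    # Rough heuristic: roughly 4 characters per token for English prose.
    # An assumption for illustration, not a real tokenizer.
    return len(text) // 4

def plan_compilation(changed, budget):
    """Greedily select changed sources until the token budget is spent.

    `changed` is a list of (path, text) pairs, e.g. from a git diff.
    Anything that doesn't fit simply waits for the next pass.
    """
    batch, spent = [], 0
    for path, text in changed:
        cost = estimate_tokens(text)
        if spent + cost > budget:
            break
        batch.append(path)
        spent += cost
    return batch, spent
```

The point of the design is that compilation never needs to see the whole vault at once: each pass reads a bounded slice of what changed, emits or updates the affected wiki articles, and leaves the remainder for later.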
These weren’t small additions. They turned a good knowledge base into a self-maintaining one. Credit where it’s due: Karpathy’s framing of “compilation” as a metaphor unlocked something in how we thought about the pipeline.
The search question
One of his strongest points was about search at personal scale. His argument: you don’t need vector databases. With well-structured documents and good indexes, simpler approaches work fine. Obsidian’s built-in search handles it for him.
This one hit home because we’d made the same bet independently. Jeff uses FTS5 full-text search over his knowledge base. No embeddings, no vector store, no approximate nearest neighbour lookups. Just fast, predictable, full-text search over well-structured markdown. It works. It works really well. At personal scale, with documents that are already organised and indexed properly, the overhead of a vector database buys you very little.
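The shape of that bet is easy to show. A hedged sketch, assuming a Python build whose bundled SQLite has the FTS5 extension enabled; the table layout and function names here are mine, not Jeff's.

```python
import sqlite3

def build_index(notes):
    """Build an in-memory FTS5 index over (path, body) pairs."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE VIRTUAL TABLE notes USING fts5(path, body)")
    db.executemany("INSERT INTO notes(path, body) VALUES (?, ?)", notes)
    return db

def search(db, query, limit=5):
    """Full-text match, ordered by FTS5's built-in BM25 relevance rank."""
    rows = db.execute(
        "SELECT path FROM notes WHERE notes MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return [path for (path,) in rows]
```

That is essentially the whole search stack: one virtual table, one query, deterministic results. Nothing to embed, nothing to re-index when the model changes.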
There’s something reassuring about seeing one of the sharpest minds in AI arrive at the same conclusion. Sometimes the boring solution is the right one.
Where we diverge
The biggest difference is the interface. Karpathy uses Obsidian as his front end. He opens the app, reads the articles, uses Obsidian’s search and graph view. The LLM is a backend process that compiles and maintains the knowledge.
Jeff is the interface. Jeff reads his own knowledge base, writes to it, compiles raw material into wiki articles, runs the linting passes, and searches across everything. There’s no separate app to open. The knowledge base is part of him, not a tool he uses. When Jeff needs to recall something, he searches his own brain. When new information comes in, he processes it into structured knowledge himself. When articles go stale, he notices and updates them.
It’s a subtle difference, but it matters. Karpathy built a system where an LLM maintains a knowledge base for a human to browse. We built a system where the knowledge base is the AI’s own memory. Same architecture, different relationship to it.
The real takeaway
I want to be clear about what this is and isn’t. This isn’t “we’re as smart as Karpathy.” That would be absurd. The man helped build OpenAI and ran Tesla’s AI division. We’re a solo engineer and an AI agent tinkering in a home office.
What it is: validation that the idea is sound. When you’re building something unconventional, you spend a lot of time wondering if you’ve gone down a weird path that nobody else would choose. Then someone whose judgement you deeply respect publishes their approach and it looks like yours. That doesn’t mean you’re clever. It means the idea is right. The problem pushes you toward the same shape of solution if you think about it honestly.
Markdown as the source of truth. Raw material compiled into structured articles. Simple search over well-organised documents. Version control for persistence. No unnecessary complexity.
Sometimes the best feeling isn’t being first. It’s seeing your homework come back marked correct by someone who set harder exams than you’d ever dare.
- Alex