Andrej Karpathy's LLM Wiki: Why the Future of AI Memory Isn't RAG
How Karpathy's wiki pattern replaces RAG with compiled knowledge that compounds. Why synthesis beats retrieval for AI memory systems.

Most people use LLMs wrong.
You upload documents, the model retrieves chunks, generates an answer. This is RAG. It works. But the LLM is rediscovering knowledge from scratch on every question. There's no accumulation. Ask something that requires synthesizing five documents, and the model has to find and piece together fragments every time.
Nothing is built up. Nothing compounds.
In April 2026, Andrej Karpathy, a founding member of OpenAI and former Director of AI at Tesla, published a GitHub gist describing a fundamentally different approach. Instead of retrieving from raw documents at query time, the LLM incrementally builds and maintains a persistent wiki: a structured, interlinked collection of markdown files that sits between you and your sources.
This is the pattern every AI builder should understand.
Karpathy's insight is deceptively simple: treat knowledge management like code compilation.
With RAG, you're running an interpreter. Every query re-parses the source material, re-derives relationships, re-synthesizes conclusions. It's like running `python script.py` from scratch every time instead of compiling once.
The LLM Wiki pattern is different. When you add a new source, the LLM doesn't just index it for later retrieval. It reads the source, extracts key information, and integrates it into an existing wiki — updating entity pages, revising topic summaries, noting where new data contradicts old claims.
The wiki is a persistent, compounding artifact. Cross-references are already there. Contradictions have been flagged. Synthesis reflects everything you've read. The knowledge is compiled once and kept current.
You never write the wiki yourself. The LLM does all the grunt work — summarizing, cross-referencing, filing, bookkeeping. You curate sources, ask questions, direct the analysis.
The pattern has three layers:
1. Raw sources — Your curated collection of documents. Articles, papers, images, data files. These are immutable. The LLM reads from them but never modifies them. This is your source of truth.
2. The wiki — A directory of LLM-generated markdown files. Summaries, entity pages, concept pages, comparisons, synthesis. The LLM owns this layer entirely. It creates pages, updates them when new sources arrive, maintains cross-references.
3. The schema — A configuration document (like CLAUDE.md) that tells the LLM how the wiki is structured, what conventions to follow, what workflows to use. This is what makes the LLM a disciplined wiki maintainer rather than a generic chatbot.
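As a sketch, the schema layer might look something like the snippet below. The layout, conventions, and workflows here are illustrative assumptions, not Karpathy's actual configuration file:

```markdown
# CLAUDE.md — wiki maintainer instructions (illustrative)

## Layout
- sources/        — immutable inputs; read-only
- wiki/           — LLM-owned markdown pages
- wiki/index.md   — master table of contents

## Conventions
- One page per entity or concept; link pages with [[Page Name]].
- Every page opens with a one-paragraph summary.
- Cite evidence inline as (source: sources/<file>).

## Workflows
- Ingest: summarize the new source, update the index, revise affected pages.
- Lint: list broken links, contradictions, and stale claims.
```

The point of this layer is that the conventions live in one place the LLM reads on every pass, which is what keeps fifteen-file updates consistent.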
Karpathy uses Obsidian as the viewer: "Obsidian is the IDE; the LLM is the programmer; the wiki is the codebase."
Ingest. Add a source and the LLM processes it: writes a summary, updates the index, and touches 10-15 wiki pages with cross-references. Knowledge integrates immediately.
Query. Ask questions and get synthesized answers with citations. The key insight: good answers get filed back into the wiki as new pages, so explorations compound.
Lint. A periodic health check: find contradictions, stale claims, orphan pages, and missing links. The LLM suggests what to investigate next.
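The mechanical half of a lint pass is easy to make concrete. Here's a minimal sketch of a broken-link and orphan-page check over a wiki directory; the function name and the Obsidian-style `[[Page]]` link convention are assumptions for illustration:

```python
import re
from pathlib import Path

WIKI_LINK = re.compile(r"\[\[([^\]|#]+)")  # Obsidian-style [[Page]] links

def lint_wiki(wiki_dir):
    """Report broken links and orphan pages in a directory of markdown files."""
    pages = {p.stem: p.read_text(encoding="utf-8")
             for p in Path(wiki_dir).glob("*.md")}
    linked, broken = set(), []
    for name, text in pages.items():
        for target in (t.strip() for t in WIKI_LINK.findall(text)):
            linked.add(target)
            if target not in pages:
                broken.append((name, target))  # link points at a missing page
    # Orphans: pages nothing links to (the index is exempt by convention)
    orphans = sorted(n for n in pages if n not in linked and n != "index")
    return {"broken_links": broken, "orphans": orphans}
```

A real lint pass would also hand the page contents to the LLM to flag contradictions and stale claims; link hygiene is just the part you can do deterministically.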
Humans abandon wikis because maintenance burden grows faster than value. Cross-references, summaries, consistency — the bookkeeping kills you.
LLMs don't get bored. They can touch 15 files in one pass. Maintenance cost drops to near zero.
This is Vannevar Bush's Memex (1945) finally realized. The part Bush couldn't solve was who maintains it. The LLM handles that.
Research. Deep dives over months, with papers, articles, and reports synthesized into an evolving wiki. The synthesis already exists when you sit down to write.
Business/team. An internal wiki fed by Slack, meetings, and customer calls. The LLM maintains what no one wants to maintain.
Personal. Goals, health, self-improvement. Journal entries and notes build into a structured picture over time.
Can you skip the filesystem and use conversation-as-wiki? Build a knowledge graph directly in chat and evolve it by talking.
The limitation is context window degradation: you hit a ceiling the filesystem doesn't have.
It's a different tradeoff, favoring thinking-in-the-moment over durable accumulation. That works for exploration, not for compounding knowledge.
Karpathy's pattern validates a principle: Memory shouldn't be retrieval. Memory should be synthesis.
Current RAG pipelines are glorified search algorithms. Store everything, embed it, shift the problem to retrieval. But the next leap isn't better retrieval — it's focusing on what and how to store in the first place.
There are six critical problems any serious memory system must solve:
1. Relationship — knowledge is a graph of linked entities, not a pile of chunks.
2. Temporal tracking — claims change over time and need to be dated.
3. Consolidation — new information must be merged and reconciled with old.
4. Decay — stale knowledge should fade on principle, not linger forever.
5. Abstention — the system should know the boundaries of what it knows.
6. Verification — every claim needs provenance back to a source.
Karpathy's wiki handles relationship elegantly: interlinked markdown pages are themselves a graph structure. His Lint operation gestures at consolidation (finding contradictions and stale claims). But temporal tracking, principled decay, abstention, and verification remain unaddressed.
This is exactly why his demo matters: it proves relationship-first memory works, and shows where the gaps are.
Karpathy's wiki is markdown with cross-references: a plain graph, where every link joins exactly two pages. But relational knowledge at its richest is a hypergraph, in which a single relationship can connect any number of entities simultaneously, capturing knowledge structure natively.
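A toy illustration of that difference (the class and method names are hypothetical, not Hypabase's API): in a pairwise graph, a three-party fact shatters into three separate edges, while a hyperedge stores it as one unit.

```python
from collections import defaultdict

class Hypergraph:
    """Minimal hypergraph: one hyperedge relates any number of entities at once."""

    def __init__(self):
        self.edges = {}                     # edge id -> (label, entities)
        self.membership = defaultdict(set)  # entity -> ids of edges it appears in
        self._next_id = 0

    def relate(self, label, *entities):
        """Record one fact connecting all the given entities simultaneously."""
        eid = self._next_id
        self._next_id += 1
        self.edges[eid] = (label, frozenset(entities))
        for entity in entities:
            self.membership[entity].add(eid)
        return eid

    def relations_of(self, entity):
        """Return every (label, entities) fact this entity participates in."""
        return [self.edges[eid] for eid in sorted(self.membership[entity])]

g = Hypergraph()
# One three-way fact; as pairwise links it would need three edges to approximate
g.relate("acquired_for", "AcmeCo", "WidgetWorks", "$2B")
```

Querying any participant returns the whole fact intact, which is the property pairwise links lose.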
We're building Hypabase to address all six problems: agentic memory that handles relationships, tracks time, consolidates automatically, decays gracefully, knows its boundaries, and maintains provenance.
RAG retrieves. Wikis compile. The difference is whether knowledge compounds or gets re-derived.
Maintenance is the bottleneck. LLMs solve it by making maintenance cost near-zero.
Three-layer separation. Immutable sources, LLM-owned wiki, schema configuration. Clean boundaries.
Explorations become assets. Good queries get filed back into the wiki. Nothing disappears into chat history.
The pattern scales to teams. Same architecture, with humans reviewing LLM updates.
Karpathy's LLM Wiki isn't a product. It's a pattern. And it's the clearest articulation yet of what AI-native knowledge management should look like.
Try Hypabase Memory — agentic memory built on these principles.
Related: Garry Tan's GBrain: The Memex We Were Promised | Why You Need a Context Layer for Your AI Agent