Garry Tan's GBrain: The Memex We Were Promised
How Y Combinator's President built a production AI memory system that reads before every response and writes after learning. The pattern every agent builder should understand.


Your AI agent is smart, but it doesn't know anything about your life.
That's Garry Tan's diagnosis. And in April 2026, Y Combinator's President open-sourced his solution: GBrain — a personal knowledge system that turns meetings, emails, tweets, calendar events, and original ideas into a searchable brain that your AI reads before every response and writes to after every conversation.
The repo calls it "the Memex Vannevar Bush imagined, built for people who think for a living."
Within a week of use, Garry had 10,000+ markdown files, 3,000+ people pages with compiled dossiers, 13 years of calendar data, 280+ meeting transcripts, and 300+ captured original ideas. The agent runs while he sleeps, enriching pages and consolidating memory.
This isn't a demo. It's a production system. And it validates principles that every AI memory builder should understand.
Every page in GBrain follows one structure:
```markdown
---
type: person
title: Pedro Franceschi
---
Co-founder of Brex. Met at YC W17 Demo Day. Strong opinions on
infrastructure, frequently references first-principles thinking.
Currently focused on AI applications for fintech.
---
- 2017-03-15: Met at Demo Day, discussed payment infrastructure
- 2023-06-20: Dinner in SF, talked about founder mental health
- 2024-11-08: Called about potential investment, interested in AI infra
- 2025-02-14: Email thread on hiring senior engineers
```
Above the separator: compiled truth. Your current best understanding. Gets rewritten when new evidence changes the picture.
Below: timeline. Append-only evidence trail. Never edited, only added to.
The compiled truth is the answer. The timeline is the proof.
This separation is crucial. Most memory systems store facts without distinguishing "what I believe" from "why I believe it." When beliefs need updating, there's no trail to follow. GBrain forces the distinction.
GBrain isn't a static knowledge base. It's a compounding system:
```
Signal arrives (meeting, email, tweet)
→ Agent detects entities (people, companies, ideas)
→ READ: check the brain first
→ Respond with full context
→ WRITE: update brain pages with new information
→ Sync: index changes for next query
```
Every cycle adds knowledge. The agent enriches a person page after a meeting. Next time that person comes up, the agent already has context. You never start from zero.
An agent without this loop answers from stale context. An agent with it gets smarter every conversation. The difference compounds daily.
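The read-write cycle can be sketched as a single function. Here `search_brain`, `generate_reply`, and `update_pages` are placeholders for the retrieval layer, the LLM call, and the page writer, not GBrain's actual API:

```python
def handle_signal(signal, search_brain, generate_reply, update_pages):
    """One turn of the brain-agent loop: read, respond, write."""
    # READ: pull existing context for any entities the signal mentions
    context = search_brain(signal["entities"])
    # Respond with full context rather than from a cold start
    reply = generate_reply(signal["text"], context)
    # WRITE: fold what was just learned back into the brain pages
    update_pages(signal["entities"], signal["text"], reply)
    return reply
```

The ordering is the whole point: retrieval strictly before generation, persistence strictly after, so every turn both benefits from and feeds the brain.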
The brain repo (plain markdown, git-versioned) is the system of record. GBrain adds Postgres + pgvector as the retrieval layer. The agent reads and writes through both.
Human always wins — edit any markdown file directly and `gbrain sync` picks up the changes. Memory systems that lock you out of your own data create brittleness. Markdown stays readable, editable, portable.
The most interesting piece: GBrain runs while you sleep.
The dream cycle is a nightly job that sweeps entity pages for enrichment, fixes broken citations, and merges overlapping memories.
You wake up and the brain is smarter than when you went to sleep.
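The repo describes the dream cycle's passes as entity sweeps, citation fixes, and memory merging. The merging pass might look roughly like this (a sketch; the duplicate-detection key and the page shape are assumptions, not GBrain's implementation):

```python
def merge_duplicate_pages(pages):
    """Nightly consolidation sketch: fold pages that describe the same
    entity into one, unioning their append-only timelines."""
    merged = {}
    for page in pages:
        key = page["title"].lower()  # crude duplicate detection by title
        if key in merged:
            # union the timelines, keeping entries sorted and unique
            merged[key]["timeline"] = sorted(
                set(merged[key]["timeline"]) | set(page["timeline"])
            )
        else:
            merged[key] = {
                "title": page["title"],
                "timeline": sorted(page["timeline"]),
            }
    return list(merged.values())
```

A real pass would use an LLM or embedding similarity rather than a lowercased title, but the invariant is the same: consolidation may merge pages, never drop timeline evidence.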
A brain is only useful if data flows into it. GBrain ships integration recipes:
| Integration | What It Does |
|---|---|
| Meeting Sync | Circleback transcripts → brain pages with attendee links |
| Voice-to-Brain | Phone calls via Twilio → transcripts with entity detection |
| Email-to-Brain | Gmail → entity pages (people, companies mentioned) |
| Calendar-to-Brain | Google Calendar → searchable daily pages |
| X-to-Brain | Twitter timeline + mentions → brain pages |
Each recipe is a self-contained markdown installer — the agent reads it, asks for API keys, validates, and sets up the cron. Meeting transcripts from Circleback flow in with attendee cross-references automatically created.
GBrain's search is hybrid:
```
Query
→ Multi-query expansion (Claude Haiku)
→ Vector search (HNSW cosine) + Keyword search (tsvector)
→ RRF Fusion: score = sum(1/(60 + rank))
→ 4-layer dedup
→ Stale alerts
→ Results
```
Keyword search alone misses conceptual matches. "Ignore conventional wisdom" won't find an essay titled "The Bus Ticket Theory of Genius" even though it's exactly about that.
Vector search alone misses exact phrases when the embedding is diluted by surrounding text.
RRF fusion gets both right. Multi-query expansion catches phrasings you didn't think of.
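The RRF step in the pipeline is only a few lines. This sketch assumes each retrieval arm returns a ranked list of document ids, best first, and uses the same k=60 constant as the formula above:

```python
def rrf_fuse(vector_ranking, keyword_ranking, k=60):
    """Reciprocal Rank Fusion: score(doc) = sum over rankings of 1/(k + rank).

    Each ranking is a list of doc ids ordered best-first (rank starts at 1).
    Documents appearing high in either ranking float to the top; documents
    appearing in both get credit from each.
    """
    scores = {}
    for ranking in (vector_ranking, keyword_ranking):
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # highest fused score first
    return sorted(scores, key=scores.get, reverse=True)
```

Because RRF works on ranks rather than raw scores, it needs no calibration between cosine distances and tsvector rank values, which is exactly why it suits hybrid search.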
Specific design choices that matter:
- **Entity-centric structure.** People, companies, and concepts each get dedicated pages. Cross-references link them, and the `gbrain graph` command traverses relationships. This is relational thinking, not flat fact storage.
- **Temporal awareness.** Every timeline entry has a date. Stale alerts fire when the compiled truth is older than the latest timeline evidence. The system knows when knowledge might be outdated.
- **Source attribution.** The timeline IS the citation. Every belief traces back to when and where you learned it.
- **Operational discipline.** The docs include a verify runbook — sync ran ≠ sync worked. It checks embedding coverage and page counts to prove the loop is healthy. This is operational verification, not answer provenance, but it's more than most systems offer.
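The stale-alert check reduces to a date comparison. This sketch assumes each page records when its compiled truth was last rewritten (the function name and signature are illustrative):

```python
from datetime import date

def is_stale(truth_updated: date, timeline_dates: list) -> bool:
    """Fire a stale alert when the compiled truth predates the newest
    timeline evidence, i.e. beliefs haven't caught up to what's known."""
    return bool(timeline_dates) and max(timeline_dates) > truth_updated
```

Because the timeline is append-only and dated, this check is cheap to run over every page in the nightly cycle.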
GBrain validates a principle: Memory that compounds beats memory that just retrieves.
There are six critical problems any serious memory system must solve: relationship, temporal awareness, consolidation, decay, abstention, and verification.
GBrain demonstrates relationship modeling strongly — entity pages with cross-references and graph traversal. The temporal pattern (compiled truth + timeline with stale alerts) is production-tested. Consolidation via the dream cycle is the standout feature — nightly entity sweeps, citation fixes, memory merging.
But the gaps are real: decay is stale detection without principled forgetting (the brain only grows). Abstention is prompt-level ("say I don't know"), not architectural. Verification is operational health checks, not answer provenance.
The practical constraints matter too: frontier model required (Opus 4.6 / GPT-5.4), Supabase Pro at $25/mo, and 30-minute setup assuming you already have an agent running.
This is exactly why GBrain matters: it proves entity-centric memory with nightly consolidation works in production, and shows where the hard problems remain.
GBrain's markdown pages with typed links are a graph, but a constrained one. Relational data at its best is a hypergraph — relationships connecting multiple nodes simultaneously. We're building Hypabase to address all six problems: hypergraph-native storage, temporal tracking built in, principled decay, abstention by design, and full provenance.
- **Compiled truth + timeline.** Separate what you believe from why you believe it. The pattern that makes knowledge updateable.
- **Brain-agent loop.** Read before responding, write after learning. The cycle that makes knowledge compound.
- **Dream cycle.** Consolidate while idle. The feature that makes the brain smarter, not just bigger.
- **Entity-centric structure.** People, companies, concepts as first-class pages with cross-references. Not flat facts.
- **Integrations matter.** Meeting transcripts, emails, calendar — data has to flow in automatically or the brain starves.
GBrain isn't a toy. It's a production system from someone running YC who needed his AI to actually know his world. The patterns are validated. The gaps are clear.
Try Hypabase Memory — hypergraph-native memory that addresses all six problems.
Related: Karpathy's LLM Wiki: Why the Future of AI Memory Isn't RAG