Garry Tan's GBrain: The Memex We Were Promised
How Y Combinator's President built a production AI memory system that reads before every response and writes after learning. The pattern every agent builder should understand.


Your AI agent is smart, but it doesn't know anything about your life.
That's Garry Tan's diagnosis. And in April 2026, Y Combinator's President open-sourced his solution: GBrain — a personal knowledge system that turns meetings, emails, tweets, calendar events, and original ideas into a searchable brain that your AI reads before every response and writes to after every conversation.
The repo calls it "the Memex Vannevar Bush imagined, built for people who think for a living."
Within a week of use, Garry had 10,000+ markdown files, 3,000+ people pages with compiled dossiers, 13 years of calendar data, 280+ meeting transcripts, and 300+ captured original ideas. The agent runs while he sleeps, enriching pages and consolidating memory.
This isn't a demo. It's a production system. And it validates principles that every AI memory builder should understand.
Every page in GBrain follows one structure:
```markdown
---
type: person
title: Pedro Franceschi
---
Co-founder of Brex. Met at YC W17 Demo Day. Strong opinions on
infrastructure, frequently references first-principles thinking.
Currently focused on AI applications for fintech.
---
- 2017-03-15: Met at Demo Day, discussed payment infrastructure
- 2023-06-20: Dinner in SF, talked about founder mental health
- 2024-11-08: Called about potential investment, interested in AI infra
- 2025-02-14: Email thread on hiring senior engineers
```
Above the separator: compiled truth. Your current best understanding. Gets rewritten when new evidence changes the picture.
Below: timeline. Append-only evidence trail. Never edited, only added to.
The compiled truth is the answer. The timeline is the proof.
This separation is crucial. Most memory systems store facts without distinguishing "what I believe" from "why I believe it." When beliefs need updating, there's no trail to follow. GBrain forces the distinction.
GBrain isn't a static knowledge base. It's a compounding system:
```
Signal arrives (meeting, email, tweet)
→ Agent detects entities (people, companies, ideas)
→ READ: check the brain first
→ Respond with full context
→ WRITE: update brain pages with new information
→ Sync: index changes for next query
```
Every cycle adds knowledge. The agent enriches a person page after a meeting. Next time that person comes up, the agent already has context. You never start from zero.
An agent without this loop answers from stale context. An agent with it gets smarter every conversation. The difference compounds daily.
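The read-write cycle can be sketched as a single function. Here `search_brain`, `generate_reply`, and `update_pages` are placeholders for the retrieval layer, the LLM call, and the page writer, not GBrain's actual API:

```python
def handle_signal(signal, search_brain, generate_reply, update_pages):
    """One turn of the brain-agent loop: read, respond, write."""
    # READ: pull existing context for any entities the signal mentions
    context = search_brain(signal["entities"])
    # Respond with full context rather than from a cold start
    reply = generate_reply(signal["text"], context)
    # WRITE: fold what was just learned back into the brain pages
    update_pages(signal["entities"], signal["text"], reply)
    return reply
```

The ordering is the whole point: retrieval strictly before generation, persistence strictly after, so every turn both benefits from and feeds the brain.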
The brain repo (plain markdown, git-versioned) is the system of record. GBrain adds Postgres + pgvector as the retrieval layer. The agent reads and writes through both.
Human always wins — edit any markdown file directly and `gbrain sync` picks up the changes. Memory systems that lock you out of your own data create brittleness. Markdown stays readable, editable, portable.
The most interesting piece: GBrain runs while you sleep.
The dream cycle is a nightly job that sweeps entity pages for enrichment, fixes broken citations, and merges overlapping memories.
You wake up and the brain is smarter than when you went to sleep.
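The repo describes the dream cycle's passes as entity sweeps, citation fixes, and memory merging. The merging pass might look roughly like this (a sketch; the duplicate-detection key and the page shape are assumptions, not GBrain's implementation):

```python
def merge_duplicate_pages(pages):
    """Nightly consolidation sketch: fold pages that describe the same
    entity into one, unioning their append-only timelines."""
    merged = {}
    for page in pages:
        key = page["title"].lower()  # crude duplicate detection by title
        if key in merged:
            # union the timelines, keeping entries sorted and unique
            merged[key]["timeline"] = sorted(
                set(merged[key]["timeline"]) | set(page["timeline"])
            )
        else:
            merged[key] = {
                "title": page["title"],
                "timeline": sorted(page["timeline"]),
            }
    return list(merged.values())
```

A real pass would use an LLM or embedding similarity rather than a lowercased title, but the invariant is the same: consolidation may merge pages, never drop timeline evidence.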
A brain is only useful if data flows into it. GBrain ships integration recipes:
| Integration | What It Does |
|---|---|
| Meeting Sync | Circleback transcripts → brain pages with attendee links |
| Voice-to-Brain | Phone calls via Twilio → transcripts with entity detection |
| Email-to-Brain | Gmail → entity pages (people, companies mentioned) |
| Calendar-to-Brain | Google Calendar → searchable daily pages |
| X-to-Brain | Twitter timeline + mentions → brain pages |
Each recipe is a self-contained markdown installer — the agent reads it, asks for API keys, validates, and sets up the cron. Meeting transcripts from Circleback flow in with attendee cross-references automatically created.
GBrain's search is hybrid:
```
Query
→ Multi-query expansion (Claude Haiku)
→ Vector search (HNSW cosine) + Keyword search (tsvector)
→ RRF Fusion: score = sum(1/(60 + rank))
→ 4-layer dedup
→ Stale alerts
→ Results
```
Keyword search alone misses conceptual matches. "Ignore conventional wisdom" won't find an essay titled "The Bus Ticket Theory of Genius" even though it's exactly about that.
Vector search alone misses exact phrases when the embedding is diluted by surrounding text.
RRF fusion gets both right. Multi-query expansion catches phrasings you didn't think of.
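The RRF step in the pipeline is only a few lines. This sketch assumes each retrieval arm returns a ranked list of document ids, best first, and uses the same k=60 constant as the formula above:

```python
def rrf_fuse(vector_ranking, keyword_ranking, k=60):
    """Reciprocal Rank Fusion: score(doc) = sum over rankings of 1/(k + rank).

    Each ranking is a list of doc ids ordered best-first (rank starts at 1).
    Documents appearing high in either ranking float to the top; documents
    appearing in both get credit from each.
    """
    scores = {}
    for ranking in (vector_ranking, keyword_ranking):
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # highest fused score first
    return sorted(scores, key=scores.get, reverse=True)
```

Because RRF works on ranks rather than raw scores, it needs no calibration between cosine distances and tsvector rank values, which is exactly why it suits hybrid search.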
Specific design choices that matter:
- **Entity-centric structure.** People, companies, and concepts each get dedicated pages. Cross-references link them, and the `gbrain graph` command traverses relationships. This is relational thinking, not flat fact storage.
- **Temporal awareness.** Every timeline entry has a date. Stale alerts fire when the compiled truth is older than the latest timeline evidence. The system knows when knowledge might be outdated.
- **Source attribution.** The timeline IS the citation. Every belief traces back to when and where you learned it.
- **Operational discipline.** The docs include a verify runbook — sync ran ≠ sync worked. It checks embedding coverage and page counts to prove the loop is healthy. This is operational verification, not answer provenance, but it's more than most systems offer.
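The stale-alert check reduces to a date comparison. This sketch assumes each page records when its compiled truth was last rewritten (the function name and signature are illustrative):

```python
from datetime import date

def is_stale(truth_updated: date, timeline_dates: list) -> bool:
    """Fire a stale alert when the compiled truth predates the newest
    timeline evidence, i.e. beliefs haven't caught up to what's known."""
    return bool(timeline_dates) and max(timeline_dates) > truth_updated
```

Because the timeline is append-only and dated, this check is cheap to run over every page in the nightly cycle.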
GBrain validates a principle: Memory that compounds beats memory that just retrieves.
There are six critical problems any serious memory system must solve: relationship, temporal awareness, consolidation, decay, abstention, and verification.
GBrain demonstrates relationship modeling strongly — entity pages with cross-references and graph traversal. The temporal pattern (compiled truth + timeline with stale alerts) is production-tested. Consolidation via the dream cycle is the standout feature — nightly entity sweeps, citation fixes, memory merging.
But the gaps are real: decay is stale detection without principled forgetting (the brain only grows). Abstention is prompt-level ("say I don't know"), not architectural. Verification is operational health checks, not answer provenance.
The practical constraints matter too: frontier model required (Opus 4.6 / GPT-5.4), Supabase Pro at $25/mo, and 30-minute setup assuming you already have an agent running.
This is exactly why GBrain matters: it proves entity-centric memory with nightly consolidation works in production, and shows where the hard problems remain.
GBrain's markdown pages with typed links are a graph, but a constrained one. Relational data at its best is a hypergraph — relationships connecting multiple nodes simultaneously. We're building Hypabase to address all six problems: hypergraph-native storage, temporal tracking built in, principled decay, abstention by design, and full provenance.
- **Compiled truth + timeline.** Separate what you believe from why you believe it. The pattern that makes knowledge updateable.
- **Brain-agent loop.** Read before responding, write after learning. The cycle that makes knowledge compound.
- **Dream cycle.** Consolidate while idle. The feature that makes the brain smarter, not just bigger.
- **Entity-centric structure.** People, companies, concepts as first-class pages with cross-references. Not flat facts.
- **Integrations matter.** Meeting transcripts, emails, calendar — data has to flow in automatically or the brain starves.
GBrain isn't a toy. It's a production system from someone running YC who needed his AI to actually know his world. The patterns are validated. The gaps are clear.
Try Hypabase Memory — hypergraph-native memory that addresses all six problems.
Related: Karpathy's LLM Wiki: Why the Future of AI Memory Isn't RAG