Zep and LangMem take very different approaches to agent memory. Zep offers a standalone temporal knowledge graph (Graphiti) with enterprise features, while LangMem is LangChain's official memory toolkit designed specifically for the LangChain/LangGraph ecosystem.
This comparison covers their architectures, benchmark performance, pricing, and ideal use cases to help you decide.
Quick Comparison
| Factor | Zep | LangMem |
|---|---|---|
| Architecture | Temporal knowledge graph (Graphiti) | Modular memory API with LangGraph integration |
| LongMemEval* | 71.2% | Not published |
| Deployment | Cloud-first; self-host requires Graphiti + graph DB | Self-hosted with LangGraph |
| Pricing | Free / $25 / $475 / Enterprise | Open source |
| GitHub Stars | 4.4K (Zep) + 24.8K (Graphiti) | 1.4K |
| Funding | Not disclosed | Part of LangChain ($25M+) |
What is Zep?
Zep uses Graphiti, a temporal knowledge graph where time is a first-class dimension. Every fact has valid_from, valid_to, and invalid_at markers, allowing queries like "what was true in January?" or "when did this change?"
Zep positions itself around "context engineering" rather than just memory. Graphiti, the underlying engine, has 24.8K stars and supports multiple graph backends (Neo4j, FalkorDB, Kuzu, Neptune).
Key strengths:
- Best-in-class temporal reasoning
- Multi-hop graph queries
- <200ms retrieval latency
- Graphiti is open source (24.8K stars)
- Strong enterprise features (SOC2, HIPAA)
What is LangMem?
LangMem is LangChain's official long-term memory toolkit, designed to integrate seamlessly with the LangChain/LangGraph ecosystem. It provides both active memory tools for "hot path" operations during conversations and automated background handlers for memory distillation.
Backed by the LangChain team ($25M+ in funding), LangMem offers a modular architecture with pluggable storage backends and native LangGraph storage layer integration.
Key strengths:
- Native LangChain/LangGraph integration
- Backed by LangChain team
- Modular architecture with pluggable storage
- Active + background memory patterns
- Part of a well-funded ecosystem
Architecture Comparison
Zep's Approach
Zep's Graphiti engine stores facts as nodes in a knowledge graph with explicit temporal metadata. Each edge carries validity windows tracking when facts became true and when they were superseded.
This temporal awareness is native to the architecture—not a filter applied after retrieval. The tradeoff is infrastructure complexity: self-hosting requires running Graphiti plus a graph database.
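The validity-window mechanics can be sketched in plain Python (an illustration of the concept only, not Zep's or Graphiti's actual API; every name here is hypothetical):

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Fact:
    """An edge in the graph, carrying a temporal validity window."""
    subject: str
    predicate: str
    obj: str
    valid_from: datetime           # when the fact became true
    valid_to: Optional[datetime]   # when it was superseded (None = still valid)

def facts_valid_at(facts: list[Fact], when: datetime) -> list[Fact]:
    """Point-in-time query: 'what was true at `when`?'"""
    return [
        f for f in facts
        if f.valid_from <= when and (f.valid_to is None or when < f.valid_to)
    ]

history = [
    Fact("alice", "role", "engineer", datetime(2023, 1, 1), datetime(2024, 6, 1)),
    Fact("alice", "role", "manager", datetime(2024, 6, 1), None),
]

# "What was Alice's role in January 2024?"
print([f.obj for f in facts_valid_at(history, datetime(2024, 1, 15))])  # ['engineer']
```

Superseding a fact closes its window rather than deleting it, which is what makes "when did this change?" answerable.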
LangMem's Approach
LangMem provides four core capabilities: a modular memory API compatible with arbitrary storage backends, active memory tools for hot-path operations, an automated memory handler for background distillation and refresh, and native LangGraph storage layer integration.
The design philosophy is framework-native—LangMem assumes you're already in the LangChain/LangGraph ecosystem and builds memory as a natural extension of that workflow.
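The hot-path/background split can be sketched framework-free (a toy illustration, not LangMem's API; `MemoryStore`, `hot_path_recall`, `background_distill`, and the `FACT:` prefix standing in for LLM extraction are all hypothetical):

```python
class MemoryStore:
    """Stand-in for a pluggable storage backend."""
    def __init__(self) -> None:
        self.facts: list[str] = []

    def save(self, fact: str) -> None:
        self.facts.append(fact)

    def search(self, query: str) -> list[str]:
        return [f for f in self.facts if query.lower() in f.lower()]

def hot_path_recall(store: MemoryStore, user_message: str) -> list[str]:
    """Active memory: invoked inside the conversation turn to pull context."""
    return store.search(user_message.split()[-1])

def background_distill(store: MemoryStore, transcript: list[str]) -> None:
    """Background handler: runs after the turn, distilling the transcript
    into durable facts. A real handler would call an LLM here."""
    for line in transcript:
        if line.startswith("FACT:"):           # placeholder for LLM extraction
            store.save(line.removeprefix("FACT:").strip())

store = MemoryStore()
background_distill(store, ["hi there", "FACT: user prefers dark mode"])
print(hot_path_recall(store, "which display mode"))  # ['user prefers dark mode']
```

The split matters for latency: only the cheap recall runs on the hot path, while the expensive distillation happens off-turn.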
The Key Difference
Zep is a standalone memory system; LangMem is a framework extension.
Zep operates independently—you can use it with any agent framework or build custom integrations. Its temporal knowledge graph is self-contained and architecturally opinionated.
LangMem delegates storage and retrieval to LangGraph's infrastructure. It's lighter-weight and more modular, but it's not designed to function outside the LangChain ecosystem. If you're not using LangGraph, LangMem offers limited value.
This means the choice often comes down to your existing stack: if you're committed to LangChain/LangGraph, LangMem integrates seamlessly. If you need a standalone memory layer, Zep is the more complete solution.
| Benchmark | Zep | LangMem |
|---|---|---|
| LongMemEval* | 71.2% | Not published |
Zep self-reports 71.2% on LongMemEval, though an independent evaluation measured 63.8%. LangMem has not published any LongMemEval scores, making a direct accuracy comparison impossible.
Without published benchmarks, it's unclear how LangMem's modular approach performs on standardized retrieval tasks.
Both lack the accuracy of Hypabase (87.4%), which uses AMR-based extraction for higher retrieval precision.
Pricing Comparison
Zep
| Tier | Price | Limits |
|---|---|---|
| Free | $0 | 1K episodes/month |
| Flex | $25/month | 20K credits, 600 req/min |
| Flex Plus | $475/month | 300K credits, 1K req/min, webhooks |
| Enterprise | Custom | SOC2, HIPAA, dedicated support |
LangMem
| Tier | Price | Details |
|---|---|---|
| Open Source | Free | Self-hosted with LangGraph |
| LangGraph Platform | Varies | Managed LangGraph hosting with memory built in |
LangMem itself is free and open source. However, using it effectively requires LangGraph infrastructure, which has its own costs if using the managed platform. Zep offers a managed service with clear pricing tiers but charges for usage at scale.
When to Choose Zep
Choose Zep if you:
- Need temporal queries ("what was true last month?")
- Want a standalone memory system independent of agent framework
- Require enterprise compliance (SOC2, HIPAA)
Zep's temporal graph is useful for knowledge updates, though the architecture hasn't evolved much since launch.
When to Choose LangMem
Choose LangMem if you:
- Are fully committed to the LangChain/LangGraph ecosystem
- Want official LangChain support and integration
- Need pluggable storage backends within LangGraph
LangMem is tightly coupled to LangChain and hasn't published benchmark scores, making it hard to evaluate retrieval accuracy independently.
Consider Hypabase
Zep locks you into a graph database stack. LangMem locks you into the LangChain ecosystem. Neither publishes competitive benchmark numbers—Zep's self-reported 71.2% trails the field, and LangMem has no published scores at all. If you want high accuracy without framework lock-in, Hypabase takes a fundamentally different approach.
| Factor | Zep | LangMem | Hypabase |
|---|---|---|---|
| Extraction | LLM-based into triples | Framework-dependent | AMR (formal linguistic framework) |
| Representation | Temporal triples | Pluggable storage | N-ary hyperedges |
| LongMemEval* | 71.2% | Not published | 87.4% |
| Personalization | — | — | 100% |
Hypabase uses Abstract Meaning Representation (AMR)—a formal framework from computational linguistics—to extract facts into structured hyperedges. No framework dependency, no graph database. Facts are stored in PENMAN notation with karaka semantic roles:
"The database migration completed successfully at 2:47 AM"
Ad-hoc extraction (Zep):

```
(database_migration, status, completed)
(database_migration, time, 2:47 AM)
```

The "successfully" qualifier is lost.

LangMem extraction depends on your storage backend and prompts, so quality varies by configuration.

AMR extraction (Hypabase):

```
(completed :object migration :attribute database :attribute successful :locus "2:47 AM")
```
Hypabase captures the event, its result, and its precise timestamp in one hyperedge. Zep's triples lose the "successfully" qualifier. LangMem's output depends entirely on how you've configured your storage backend—there's no guaranteed extraction quality.
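The structural difference is easy to make concrete (a toy sketch of the two representations, not either product's actual storage format):

```python
# Binary triples: each relation links exactly two things, so an event-level
# qualifier like "successfully" has no natural slot and naive extraction drops it.
triples = [
    ("database_migration", "status", "completed"),
    ("database_migration", "time", "2:47 AM"),
]

# An n-ary hyperedge keeps predicate, arguments, and qualifiers in one record.
hyperedge = {
    "predicate": "completed",
    "object": "migration",
    "attributes": ["database", "successful"],
    "locus": "2:47 AM",
}

# The qualifier survives in the hyperedge but appears nowhere in the triples.
assert "successful" in hyperedge["attributes"]
assert not any("successful" in t for t in triples)
```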
Why This Matters for Operational Events
| Benefit | How AMR + Hyperedges Deliver It |
|---|---|
| Qualifiers preserved | :attribute roles capture success/failure status alongside the event |
| Precise timestamps | :locus binds the exact time to the event atomically |
| Framework-independent | Works with any agent framework—no LangChain/LangGraph required, no graph DB needed |
| Parseable output | PENMAN notation has defined grammar; malformed extractions caught at parse time |
This is why Hypabase achieves 100% on personalization tasks—operational events, status updates, and factual records are all extracted with their full context intact, queryable without framework overhead.
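The parse-time validation point can be illustrated with a toy validator for the s-expression notation above (not Hypabase's parser; real PENMAN parsing is considerably richer):

```python
import re

TOKEN = re.compile(r'\(|\)|"[^"]*"|[^\s()]+')

def parse_hyperedge(text: str) -> list:
    """Parse one s-expression hyperedge into a nested list, raising
    ValueError on malformed input so bad extractions fail loudly."""
    tokens = TOKEN.findall(text)

    def parse(pos: int):
        if pos >= len(tokens) or tokens[pos] != "(":
            raise ValueError("expected '('")
        pos += 1
        node = [tokens[pos]]  # predicate comes first
        pos += 1
        while pos < len(tokens) and tokens[pos] != ")":
            if tokens[pos] == "(":
                child, pos = parse(pos)
                node.append(child)
            else:
                node.append(tokens[pos])
                pos += 1
        if pos >= len(tokens):
            raise ValueError("unbalanced parentheses")
        return node, pos + 1

    tree, end = parse(0)
    if end != len(tokens):
        raise ValueError("trailing tokens after top-level expression")
    return tree

edge = parse_hyperedge(
    '(completed :object migration :attribute database '
    ':attribute successful :locus "2:47 AM")'
)
print(edge[0])  # completed
```

A truncated extraction such as `(completed :object` raises `ValueError` instead of landing silently in the store.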
Learn more about Hypabase →
FAQ
Is Zep better than LangMem?
They serve different purposes. Zep (71.2% self-reported) is a standalone memory system with temporal reasoning. LangMem is a LangChain ecosystem tool with no published benchmarks. For higher accuracy with structured extraction, consider Hypabase (87.4%).
Can I migrate from Zep to LangMem?
There's no direct migration path—they use different architectures and storage models. Migration requires re-ingesting conversation history through the new system. Note that switching to LangMem also means committing to the LangGraph ecosystem.
What's the main difference?
Zep optimizes for temporal accuracy with a standalone graph-based system. LangMem optimizes for LangChain/LangGraph ecosystem integration. Hypabase optimizes for extraction quality using AMR and structured hyperedge representation.
Which is better for self-hosting?
LangMem requires LangGraph infrastructure. Zep requires running Graphiti plus a graph database (Neo4j/FalkorDB/Kuzu). Both carry significant operational overhead. Hypabase runs entirely in a single SQLite file with no external database required—the simplest self-hosting option.
Conclusion
Zep adds temporal reasoning but self-reports 71.2% on LongMemEval (independent evaluation shows 63.8%). Useful for knowledge updates, though requires graph database infrastructure.
LangMem integrates natively with LangChain/LangGraph but has no published benchmark scores. Best suited for teams already committed to the LangChain ecosystem, though accuracy is unverified.
Hypabase achieves 87.4% through AMR-based extraction into hyperedges—structured knowledge representation that preserves the relationships ad-hoc extraction fragments. It scores 100% on personalization tasks.
All three offer quick-start integrations; only Hypabase pairs that with published benchmark accuracy and no framework lock-in:
Try Hypabase →
\* LongMemEval scores: Zep (71.2%) self-reported; independent evaluation shows 63.8% (arxiv:2512.13564). LangMem has not published LongMemEval scores. Hypabase (87.4%) from published benchmark harness.