Letta and LangMem are both open-source approaches to AI agent memory, but from different angles. Letta (formerly MemGPT) gives agents autonomous control over tiered memory through function calls, while LangMem is LangChain's official memory toolkit designed to integrate natively with the LangGraph ecosystem.
This comparison covers their architectures, benchmark performance, pricing, and ideal use cases to help you decide.
Quick Comparison
| Factor | Letta | LangMem |
|---|---|---|
| Architecture | Three-tier self-editing memory | Modular memory API with LangGraph integration |
| LongMemEval* | Not published | Not published |
| Deployment | Self-hosted (Docker/Python) or Letta Cloud | Self-hosted with LangGraph |
| Pricing | Open source / Cloud TBD | Open source |
| GitHub Stars | 22K | 1.4K |
| Funding | $10M seed (YC, Jeff Dean) | Part of LangChain ($25M+) |
What is Letta?
Letta (formerly MemGPT) pioneered the concept of agents that manage their own memory through function calls. Born from UC Berkeley research, the agent decides what's worth remembering and can edit its own memory across three tiers: core, recall, and archival.
With 22K GitHub stars and $10M in seed funding (including Jeff Dean as an investor), Letta has strong research credentials and a growing community.
Key strengths:
- Research-backed approach (UC Berkeley)
- Agent autonomy in memory management
- Active benchmark publishing (Letta Leaderboard)
- Strong coding agent focus (Letta Code)
- Well-funded with notable investors
What is LangMem?
LangMem is LangChain's official long-term memory toolkit, designed to plug directly into the LangChain/LangGraph ecosystem. It provides both active memory tools for "hot path" operations during conversations and automated background handlers for memory distillation and refresh.
Backed by LangChain's $25M+ in funding, LangMem leverages native LangGraph storage layer integration and supports arbitrary storage backends through its modular API.
Key strengths:
- Native LangChain/LangGraph integration
- Backed by the LangChain team
- Modular architecture with pluggable storage
- Active + background memory patterns
- Official support from a well-funded team
Architecture Comparison
Letta's Approach
Letta gives the agent control over memory management through function calls across three tiers:
- Core memory: Always in the context window
- Recall memory: Searchable conversation cache
- Archival memory: Long-term storage
The agent decides what moves between tiers. Memory operations are explicit function calls—core_memory_replace, archival_memory_insert, archival_memory_search—making the memory behavior inspectable and debuggable.
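To make the tier model concrete, here is a minimal sketch of the pattern, using hypothetical method names modeled on Letta's function-call style (this is illustrative, not Letta's actual API):

```python
# Illustrative sketch of Letta-style tiered memory. The agent edits
# memory through explicit calls, so every change is inspectable.
class TieredMemory:
    def __init__(self):
        self.core = {}        # always kept in the context window
        self.archival = []    # long-term storage, searched on demand

    def core_memory_replace(self, key, value):
        # Overwrite a fact the agent wants permanently in context.
        self.core[key] = value

    def archival_memory_insert(self, text):
        # Move a fact out of the context window into long-term storage.
        self.archival.append(text)

    def archival_memory_search(self, query):
        # Naive substring match stands in for real vector retrieval.
        return [t for t in self.archival if query.lower() in t.lower()]

memory = TieredMemory()
memory.core_memory_replace("user_name", "Ada")
memory.archival_memory_insert("User prefers dark mode in the editor")
print(memory.archival_memory_search("dark mode"))
# → ['User prefers dark mode in the editor']
```

Because each operation is an ordinary function call, you can log or replay the agent's memory decisions, which is the debuggability benefit described above.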
LangMem's Approach
LangMem offers four core capabilities:
- Modular memory API: Compatible with arbitrary storage backends
- Active memory tools: "Hot path" operations during conversations
- Automated memory handler: Background distillation and refresh
- Native LangGraph storage: Deep integration with LangGraph's persistence layer
LangMem is designed for developers already committed to the LangChain ecosystem, providing memory as a composable building block within LangGraph workflows.
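The two patterns above can be sketched as follows. All names here are hypothetical stand-ins, not LangMem's real API: an active "hot path" tool the agent invokes mid-conversation, and a background handler that distills the transcript after the turn ends.

```python
# Illustrative sketch of the active vs. background memory patterns.
store = {}  # stands in for a pluggable storage backend

def save_memory_tool(key, value):
    # Active "hot path": the agent writes a memory during the conversation.
    store[key] = value

def background_distill(transcript):
    # Background path: extract durable facts after the conversation ends.
    # A real handler would use an LLM; a prefix check stands in here.
    for line in transcript:
        if line.startswith("FACT:"):
            fact = line[len("FACT:"):].strip()
            store[f"fact_{len(store)}"] = fact

save_memory_tool("timezone", "UTC+2")
background_distill(["hello there", "FACT: user deploys on Fridays"])
```

In LangGraph, the store would be the graph's persistence layer and the background handler would run as a separate node or job, but the division of labor is the same.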
The Key Difference
Letta is a standalone memory system; LangMem is a framework-native component.
Letta works independently of any orchestration framework. You can use it with LangChain, CrewAI, or your own agent loop. Its three-tier architecture is self-contained and framework-agnostic.
LangMem is purpose-built for LangGraph. Its storage integration, background handlers, and memory tools assume you're running within the LangChain/LangGraph stack. Outside that ecosystem, LangMem loses most of its value.
If you're fully committed to LangGraph, LangMem offers tighter integration. If you want flexibility, Letta doesn't lock you into a specific framework.
| Benchmark | Letta | LangMem |
|---|---|---|
| LongMemEval* | Not published | Not published |
Neither Letta nor LangMem has published LongMemEval scores. Letta maintains its own Letta Leaderboard, while LangMem has no public benchmarks at all.
This makes it impossible to directly compare retrieval accuracy between the two. Both trail Hypabase (87.4%) and other solutions that have published benchmark results.
Pricing Comparison
Letta
| Tier | Price | Details |
|---|---|---|
| Open Source | Free | Self-hosted via Docker or Python |
| Letta Cloud | TBD | Managed hosting, pricing not finalized |
LangMem
| Tier | Price | Details |
|---|---|---|
| Open Source | Free | Self-hosted with LangGraph |
| LangGraph Platform | Varies | Managed hosting through LangChain |
Both tools are open source at their core. The real cost difference is in infrastructure: Letta requires its own server or Docker deployment, while LangMem requires a LangGraph deployment (which may mean LangGraph Platform costs if you don't self-host).
When to Choose Letta
Choose Letta if you:
- Want agents that autonomously manage their own memory
- Need framework-agnostic memory that works with any agent loop
- Are building coding agents (Letta Code)
Letta's standalone architecture means you're not locked into any orchestration framework. The tradeoff is no published retrieval benchmarks and a more complex mental model for memory management.
When to Choose LangMem
Choose LangMem if you:
- Are fully committed to LangChain/LangGraph
- Want official LangChain support and documentation
- Need memory that integrates with LangGraph's persistence layer
LangMem makes sense when LangGraph is already your orchestration layer. Outside that ecosystem, the tight coupling becomes a liability rather than an advantage.
Consider Hypabase
The core problem with both Letta and LangMem is that neither has published retrieval benchmarks—you're flying blind on accuracy. Letta's autonomous agent decides what to remember without measurable guarantees. LangMem locks you into the LangGraph ecosystem and still gives you no numbers to evaluate. Both use ad-hoc LLM prompts for extraction that scatter facts into disconnected triples.
Hypabase publishes its numbers and takes a structurally different approach: AMR-based extraction into hyperedges.
| Factor | Letta | LangMem | Hypabase |
|---|---|---|---|
| Extraction | Agent-controlled function calls | Framework-dependent | AMR (formal linguistic framework) |
| Representation | Three memory tiers | Pluggable storage | N-ary hyperedges |
| LongMemEval* | Not published | Not published | 87.4% |
| Personalization | Not published | Not published | 100% |
Hypabase uses Abstract Meaning Representation (AMR)—a formal framework from computational linguistics—to parse sentences into structured graphs. Facts are stored in PENMAN notation using karaka semantic roles (from Panini's Sanskrit grammar):
"The CI pipeline failed due to a flaky integration test"
Ad-hoc extraction (Letta, LangMem):
(CI_pipeline, failed, -)
(failure_cause, was, flaky_test)
(test_type, is, integration)
AMR extraction (Hypabase):
(failed :subject CI-pipeline :instrument flaky-integration-test :attribute integration)
The difference: one hyperedge captures the failure, its cause, and the test type atomically. When an agent later asks "Why did CI break?", the complete causal chain is retrieved in one operation. Triple-based systems must reassemble fragments—and if one triple was extracted with slightly different wording, the link is broken.
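A small sketch makes the retrieval difference concrete. The data structures below are illustrative only, not Hypabase's storage format:

```python
# Triples scatter the event into fragments with no shared key:
triples = [
    ("CI_pipeline", "failed", None),
    ("failure_cause", "was", "flaky_test"),   # wording drifted at extraction
    ("test_type", "is", "integration"),
]

# One n-ary hyperedge keeps predicate, subject, and cause together:
hyperedge = {
    "predicate": "failed",
    "subject": "CI-pipeline",
    "instrument": "flaky-integration-test",
}

def why_failed_triples(subject):
    # Must join fragments; nothing links "failure_cause" back to
    # "CI_pipeline", so the causal chain cannot be reassembled.
    return [o for s, p, o in triples if s == subject and p == "caused_by"]

def why_failed_hyperedge(edge, subject):
    # One atomic record: the cause is read directly off the same edge.
    if edge["subject"] == subject:
        return edge.get("instrument")

assert why_failed_triples("CI_pipeline") == []   # link broken
assert why_failed_hyperedge(hyperedge, "CI-pipeline") == "flaky-integration-test"
```

The triple query comes back empty because the extraction step named the cause node differently from the failure node, which is exactly the fragmentation failure mode described above.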
Why Engineering Teams Choose This
| Benefit | How It Works |
|---|---|
| Framework-agnostic | Works with any agent loop—no LangGraph lock-in, no framework dependency |
| Published, verifiable accuracy | 87.4% on LongMemEval with a reproducible benchmark harness, unlike the zero published scores from Letta and LangMem |
| Causal chains preserved | Multi-role hyperedges keep cause-and-effect relationships intact across retrieval |
| Zero infrastructure overhead | Single SQLite file—no Docker, no PostgreSQL, no LangGraph Platform costs |
Learn more about Hypabase →
FAQ
Is Letta better than LangMem?
They target different audiences. Letta is a standalone memory system for autonomous agents. LangMem is a framework-native toolkit for LangGraph users. Neither has published LongMemEval scores. For published benchmarks and structured extraction, consider Hypabase (87.4%).
Can I use LangMem without LangChain?
LangMem is designed for the LangChain/LangGraph ecosystem. While the modular API supports pluggable storage, you lose most of its value outside LangGraph. For framework-agnostic memory, Letta or Hypabase are better options.
What's the main difference?
Letta is a standalone system where agents manage their own memory tiers. LangMem is a composable toolkit within LangGraph. Hypabase is also framework-agnostic and uses AMR for structured extraction into hyperedges, with published benchmark scores.
Which has better retrieval accuracy?
Neither has published LongMemEval results, making accuracy comparison impossible. Hypabase publishes 87.4% on LongMemEval with 100% on personalization tasks, and runs entirely in a single SQLite file with no external database required.
Conclusion
Letta offers agent-controlled, framework-agnostic memory management with strong research backing, but no published retrieval benchmarks.
LangMem provides tight LangGraph integration with official LangChain support, but locks you into the LangChain ecosystem and also lacks published benchmarks.
Hypabase achieves 87.4% on LongMemEval through AMR-based extraction into hyperedges, a structured knowledge representation that preserves the relationships ad-hoc extraction fragments, and scores 100% on personalization tasks.
All three are straightforward to integrate:
Try Hypabase →
*LongMemEval scores: Neither Letta nor LangMem has published LongMemEval results. Hypabase's 87.4% comes from its published benchmark harness.