The Math Behind What Your Agent Remembers or Forgets
An ACT-R-inspired formula combining recency, frequency, salience, and confidence determines memory strength — governing what surfaces during retrieval and what fades into oblivion.

Part 5 of the Hypabase Memory Series
In 1885, Hermann Ebbinghaus sat alone in a room, memorizing lists of nonsense syllables. DAX. BUP. ZOL. He tested himself at intervals — minutes, hours, days — and recorded what he remembered.
What he discovered became the forgetting curve: memory decays exponentially with time. Strong at first, then fading, with most loss happening early.
140 years later, that curve is still the foundation of memory science. And it's exactly what we need for AI agents.
Ebbinghaus's finding was precise: retention follows an exponential decay.
R(t) = e^(-t/S)
Where R is retention (0 to 1), t is time elapsed, and S is "stability" — how resistant the memory is to forgetting.
A memory with S=1 day loses 63% of its strength in one day. A memory with S=30 days loses only 3% per day.
The curve is steep early and shallow late. You forget most of what happened yesterday, but what you still remember a week later will likely persist for months.
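These numbers fall straight out of the curve. A quick check:

```python
import math

def retention(t_days, stability_days):
    """Ebbinghaus forgetting curve: R(t) = e^(-t/S)."""
    return math.exp(-t_days / stability_days)

# A fragile memory (S = 1 day) vs a stable one (S = 30 days), after one day:
print(round(1 - retention(1, 1), 2))   # 0.63, 63% lost
print(round(1 - retention(1, 30), 2))  # 0.03, about 3% lost
```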
Ebbinghaus used nonsense syllables to isolate the effect of time. But real memories aren't isolated. They're accessed, reinforced, connected to other memories.
Three factors matter beyond raw age:
Frequency: Memories accessed often are strengthened. Each retrieval "refreshes" the memory. This is the spacing effect — distributed practice beats cramming.
Salience: Important memories get priority. A meeting with your CEO matters more than a meeting with a random vendor. Importance should influence persistence.
Confidence: Some memories are uncertain. "I think Alice mentioned Python?" vs "Alice explicitly said she prefers Python." Uncertain memories should be weaker.
John Anderson's ACT-R cognitive architecture formalizes this. The base-level activation of a memory chunk is:
B = ln(Σ t_j^(-d)) + β
Where t_j is the time since the j-th access, d is a decay parameter (~0.5), and β is a baseline.
Translation: each access contributes to strength, but older accesses contribute less (they're raised to a negative power). Recent accesses dominate. Frequent accesses accumulate.
ACT-R has been validated against thousands of psychological experiments. When your AI memory follows ACT-R, it's implementing a model that matches how humans actually remember.
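A small sketch of the base-level equation makes the recency dominance concrete (parameter names follow the formula above; this is an illustration, not ACT-R's full implementation):

```python
import math

def base_level_activation(access_ages_days, d=0.5, beta=0.0):
    """B = ln(sum over accesses of t_j^(-d)) + beta"""
    return math.log(sum(t ** -d for t in access_ages_days)) + beta

# One access 12 hours ago outweighs three accesses from ~11 days ago:
recent = base_level_activation([0.5])
stale = base_level_activation([10, 11, 12])
# recent > stale: recency dominates even against greater frequency
```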
Hypabase Memory combines these factors:
strength = exp(-decay × age_days) × (1 + log(1 + access_count)) × salience × confidence
Let's unpack each term:
exp(-decay × age_days) is the exponential decay term. The decay rate comes from the memory type; for an episodic memory, decay = 0.15 per day:
A one-day-old episodic memory has strength 0.86. A one-week-old episodic memory has strength 0.35. A one-month-old episodic memory has strength 0.01.
Semantic memories decay 7.5x slower (0.02 per day). Procedural memories decay 15x slower (0.01 per day).
(1 + log(1 + access_count)) is the logarithmic scaling of the access count. Why logarithmic?
Diminishing returns. The first few accesses matter a lot. Later accesses still help, but less. This prevents runaway strength from obsessively-accessed memories while still rewarding genuine usefulness.
The +1 inside the log prevents log(0). The +1 outside ensures the factor is always at least 1.
salience is the importance weight, from 0 to 1. It comes from the :importance modifier in PENMAN:
(met :subject user :object CEO :importance 0.9 :memory_type episodic)
(met :subject user :object "random vendor" :importance 0.3 :memory_type episodic)
When not specified, defaults to 0.5.
High-salience memories persist longer. They get retrieved preferentially. The agent remembers the CEO meeting better than the vendor meeting.
confidence is the provenance confidence, from 0 to 1. It comes from the edge's confidence score:
hb.edge(
    ["user", "Python"],
    type="prefers",
    confidence=0.95  # Explicitly stated
)
hb.edge(
    ["user", "Java"],
    type="dislikes",
    confidence=0.6  # Inferred, uncertain
)
Lower-confidence memories are weaker. They're less likely to be retrieved when they compete with high-confidence alternatives.
The factors are multiplied, not added. This is important.
If any factor is zero, strength is zero: a memory with zero confidence never surfaces, no matter how recent, frequent, or important it is.
If any factor is very low, it dominates the product: an importance of 0.9 cannot rescue a confidence of 0.05.
This prevents gaming. You can't make a memory immortal by marking it important if it's never accessed. You can't make an unreliable memory dominant through frequent access.
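A minimal sketch of the formula shows the multiplicative behavior directly (the function name and DECAY table mirror the article, not the actual Hypabase internals):

```python
import math

# Illustrative sketch of the strength formula; values mirror the article.
DECAY = {"episodic": 0.15, "semantic": 0.02, "procedural": 0.01}

def strength(age_days, access_count, salience, confidence, memory_type="episodic"):
    recency = math.exp(-DECAY.get(memory_type, 0.1) * age_days)
    frequency = 1 + math.log(1 + access_count)  # natural log, diminishing returns
    return recency * frequency * salience * confidence

# Multiplication means a zero factor zeroes the whole product:
print(strength(1, 100, salience=1.0, confidence=0.0))  # 0.0

# And a very low factor dominates: heavy access can't rescue low confidence.
print(round(strength(30, 100, salience=0.5, confidence=0.1, memory_type="semantic"), 2))
```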
Alice tells the agent: "I prefer Python for backend development. It's my go-to language."
The agent stores:
(prefers
  :subject Alice
  :object Python
  :locus "backend development"
  :memory_type semantic
  :importance 0.8)
Edge confidence: 0.95 (explicit statement). Memory type: semantic (decay = 0.02).
Day 0: strength = 1 × (1 + ln 1) × 0.8 × 0.95 = 0.76.
Day 7, accessed twice: strength = exp(-0.14) × (1 + ln 3) × 0.76 ≈ 1.39.
The memory got stronger! Accesses outweighed decay.
Day 30, accessed 5 times: strength = exp(-0.6) × (1 + ln 6) × 0.76 ≈ 1.16.
Still strong. The semantic decay rate is slow enough that reasonable access keeps the memory alive.
Compare to an episodic memory:
Day 30, accessed 5 times, episodic (decay = 0.15): strength = exp(-4.5) × (1 + ln 6) × 0.76 ≈ 0.024.
Nearly gone. Even with the same access pattern, the faster decay rate dominates. The event is forgotten; the fact persists.
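The day-30 comparison can be reproduced in a few lines (a sketch that assumes log in the formula is the natural logarithm, as in the ACT-R equation):

```python
import math

def strength(age_days, access_count, decay, salience, confidence):
    # log is taken as the natural logarithm, as in the ACT-R equation
    return (math.exp(-decay * age_days)
            * (1 + math.log(1 + access_count))
            * salience * confidence)

# Alice's preference: importance 0.8, confidence 0.95, 5 accesses by day 30
semantic = strength(30, 5, decay=0.02, salience=0.8, confidence=0.95)
episodic = strength(30, 5, decay=0.15, salience=0.8, confidence=0.95)
print(round(semantic, 2))  # 1.16, still strong
print(round(episodic, 3))  # 0.024, nearly gone
```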
Strength serves two purposes: ranking retrieval results and driving forgetting.
When multiple memories match a query, rank by strength (or score × strength):
results = recall(entity="Alice")
# Returns memories sorted by relevance_score × strength
Recent, frequently-accessed, important, confident memories surface first.
Set a minimum threshold:
results = recall(entity="Alice", min_strength=0.1)
# Excludes memories too weak to be useful
This prevents retrieval from surfacing forgotten memories that happen to match keywords.
Strength enables intelligent forgetting:
memory.forget(min_strength=0.05) # Expire everything below 0.05
memory.forget(older_than=days(90)) # Expire everything older than 90 days
The first is strength-based: weak memories go, strong memories stay. The second is age-based: old memories go regardless of strength.
Both use soft deletion — setting expired_at rather than removing data. Expired memories are excluded from retrieval but remain in the database for debugging and audit.
Hard deletion is irreversible. If you accidentally delete something important, it's gone.
Soft deletion via expired_at timestamps provides:
Recoverability: An expired memory can be un-expired if the deletion was wrong.
Audit trail: You can see what the agent knew at any point in time.
Debugging: If retrieval behaves unexpectedly, you can see what memories existed and when they were expired.
Contradiction resolution: When memories conflict, having the history helps determine which is correct.
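A hypothetical sketch of how soft deletion might look (field and function names here mirror the article, not necessarily the actual Hypabase schema):

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Hypothetical sketch of soft deletion via an expired_at timestamp.
@dataclass
class MemoryEdge:
    entities: list
    edge_type: str
    strength: float
    expired_at: Optional[datetime] = None

    @property
    def active(self) -> bool:
        return self.expired_at is None

def forget(edges, min_strength):
    """Expire, rather than delete, edges below the strength threshold."""
    now = datetime.now(timezone.utc)
    for edge in edges:
        if edge.active and edge.strength < min_strength:
            edge.expired_at = now  # reversible: set back to None to recover

edges = [MemoryEdge(["user", "Python"], "prefers", 0.9),
         MemoryEdge(["user", "vendor"], "met", 0.02)]
forget(edges, min_strength=0.05)
print([e.active for e in edges])  # [True, False]
```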
Ebbinghaus also discovered that spaced repetition beats massed practice. Accessing a memory once per day for five days creates stronger retention than accessing it five times in one day.
Our formula captures this partially — each access increments access_count regardless of timing. A more sophisticated implementation would track individual access timestamps and weight recent accesses more heavily.
This matters for systems like spaced repetition (Anki, SuperMemo) but matters less for agent memory, where access patterns emerge naturally from conversation rather than deliberate study schedules.
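A hypothetical refinement along those lines (the function name and weighting are illustrative, not part of Hypabase): replace the flat access count with an ACT-R-style sum over individual access ages, so timing matters.

```python
import math

def timed_access_boost(access_ages_days, d=0.5):
    """Hypothetical replacement for (1 + log(1 + access_count)):
    each access is weighted by its recency, ACT-R style."""
    return 1 + math.log(1 + sum(t ** -d for t in access_ages_days))

# Five accesses spread over the last five days vs five accesses ten days ago:
spaced = timed_access_boost([1, 2, 3, 4, 5])
stale = timed_access_boost([10, 10, 10, 10, 10])
# spaced > stale: recent, distributed accesses contribute more weight
```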
The default parameters are reasonable starting points:
MEMORY_DECAY_RATES = {
    "episodic": 0.15,
    "semantic": 0.02,
    "procedural": 0.01,
}
DEFAULT_DECAY_RATE = 0.1
But they're configurable. An agent for a fast-moving trading desk might use faster decay. An agent for long-term project management might use slower decay.
The formula shape stays the same. The parameters tune it to the domain.
Memory strength applies to hyperedges. Each edge has:
created_at: When the memory was stored
access_count: How often it's been retrieved (tracked in access_log)
confidence: From provenance
properties["importance"]: From extraction
The strength formula runs over these attributes. The hypergraph structure determines what is connected; memory strength determines how strongly we remember it.
A weak hyperedge still connects its entities. A user searching for "Alice" will find weak memories if nothing stronger matches. But strong memories surface first.
We can now store structured memories, label their participants, classify their types, and compute their strength. The final question: how do we get them back? That's where dual-arm retrieval comes in.
Hypabase Memory computes strength for every hyperedge using the ACT-R-inspired formula. Retrieval ranks by score × strength, surfacing strong memories first. The forget() API expires memories below a strength threshold — intelligent garbage collection that preserves what matters and clears what doesn't.
Previous: The Forgetting Problem: How Neuroscience Solves AI Memory
Next in series: The Retrieval Problem: What Neuroscience Knows About Finding Memories