C2 — Pool Efficiency

Accuracy is half the story.
Token cost is the other half.

All 20 facts are written to a single shared namespace, and all 40 queries must find the right fact among everything stored. The efficiency score compounds accuracy and injection size: a system that returns too much context loses points even if it finds the right answer.

System     Accuracy   Avg tok/query   Efficiency
Iranti     100%       20              5.00
Shodh      92%        66              1.39
Mem0       80%        18              4.44
Graphiti   60%        49              1.22

The pool test

C1 tests each fact in isolation — one fact per namespace, zero competition for retrieval. C2 is harder: all 20 facts are written to a single shared namespace, and each query must return the one relevant fact from the full pool.

This tests how well each system concentrates its injection on relevant content versus returning bulk context. In production, a memory pool grows over time — systems that return more context with each query impose higher token cost at inference time.

For Iranti, the shared namespace is a single project/user context. For Shodh, a single user ID. For Mem0, a single user ID with one Chroma collection. For Graphiti, a single group_id containing all episodes.
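The C2 setup can be sketched as a toy harness. The ToyMemory class and its write/recall interface are illustrative assumptions, not any benchmarked system's actual API:

```python
class ToyMemory:
    """Minimal stand-in for a memory backend (illustrative only)."""
    def __init__(self):
        self.store = {}  # namespace -> list of stored fact strings

    def write(self, ns, fact):
        self.store.setdefault(ns, []).append(fact)

    def recall(self, ns, query):
        # naive keyword overlap: return every fact sharing a word with the query
        words = set(query.lower().split())
        return " ".join(f for f in self.store.get(ns, [])
                        if words & set(f.lower().split()))


def run_pool_test(memory, facts, queries):
    """C2: write every fact to ONE shared namespace, then query the full pool."""
    ns = "shared"
    for fact in facts:
        memory.write(ns, fact)
    hits, tokens = 0, 0
    for query, expected in queries:
        context = memory.recall(ns, query)
        hits += expected in context
        tokens += len(context.split())  # whitespace-delimited token count
    return 100.0 * hits / len(queries), tokens / len(queries)
```

In C1 each fact would get its own namespace instead, which is why pool size only matters here.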

Efficiency formula

The efficiency score is a compound metric that penalizes token bloat:

efficiency = accuracy% ÷ avg_tok_per_query

Iranti:    100 ÷ 20 = 5.00
Shodh:      92 ÷ 66 = 1.39
Mem0:       80 ÷ 18 = 4.44
Graphiti:   60 ÷ 49 = 1.22

Tokens are counted as whitespace-delimited words in the returned context string. Per-query averages are computed across all 40 queries.
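The formula and counting rule reduce to a few lines. A minimal sketch that reproduces the scores above:

```python
def token_count(context: str) -> int:
    # the benchmark's counting rule: whitespace-delimited words
    return len(context.split())

def efficiency(accuracy_pct: float, avg_tok_per_query: float) -> float:
    return accuracy_pct / avg_tok_per_query

# (accuracy %, avg tok/query) from the C2 results
systems = {
    "Iranti":   (100, 20),
    "Shodh":    (92, 66),
    "Mem0":     (80, 18),
    "Graphiti": (60, 49),
}
scores = {name: round(efficiency(acc, tok), 2)
          for name, (acc, tok) in systems.items()}
# scores -> {'Iranti': 5.0, 'Shodh': 1.39, 'Mem0': 4.44, 'Graphiti': 1.22}
```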

Three-axis comparison

Accuracy: Iranti 100% · Shodh 92% · Mem0 80% · Graphiti 60%

Avg tok/query (lower is better): Iranti 20 · Shodh 66 · Mem0 18 · Graphiti 49

Efficiency score: Iranti 5.00 · Shodh 1.39 · Mem0 4.44 · Graphiti 1.22

Why Shodh's token cost collapses efficiency

In C1 (isolated namespaces), Shodh returned 20 tokens per query — identical to Iranti. This is because each namespace contained exactly one fact, so recall returned exactly that fact's text.

In C2 (shared pool of 20 facts), Shodh's token count jumps to 66 tokens per query. Shodh's recall implementation returns the full text of each matched memory without summarization or truncation. When the pool contains 20 facts and recall is tuned to return top-k results, the returned context includes multiple full fact texts.
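The effect is easy to reproduce with a toy recall. The pool contents and the top-k value here are illustrative, not Shodh's actual tuning:

```python
# Each stored fact is kept as its full text, roughly 9 words each here.
pool = [f"fact {i} with a few extra words of detail" for i in range(20)]

def recall_topk(ranked_pool, k):
    # returns the full text of the top-k matches, concatenated,
    # with no summarization or truncation
    return " ".join(ranked_pool[:k])

per_fact = len(recall_topk(pool, 1).split())   # tokens for a single fact
top3 = len(recall_topk(pool, 3).split())       # triples when k = 3
```

With one fact per namespace (C1), k is effectively 1 and the cost stays flat; in the shared pool, every extra match multiplies the injected tokens.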

At 66 tok/query across 40 queries, Shodh injects 2,640 tokens of memory context into a typical session — versus Iranti's 800. At scale, this differential compounds across every turn that involves memory retrieval.

Shodh accuracy in pool: 92%

Shodh still finds the right answer 92% of the time — the accuracy degradation from isolated to pool is small (100% → 92%). The problem is not retrieval quality; it is injection volume. The correct fact is in the context, but surrounded by other facts.

Isolated (C1) vs Pool (C2)
System     Isolated tok/q   Pool tok/q
Iranti     20               20
Shodh      20               66
Mem0       13               18
Graphiti   37               49

Iranti pool behavior

Iranti's attend-based injection returns only the entity facts relevant to the current query — 20 tokens whether the pool has 1 fact or 1,000. This is a consequence of structured entity+key addressing: the query maps to specific entity attributes, not a full-text search across all stored content.
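A sketch of the entity+key idea follows. The remember/attend names and the dict-backed store are assumptions for illustration, not Iranti's actual implementation:

```python
# (entity, key) -> fact text: the query resolves to a specific address,
# so the injected context is one fact regardless of pool size.
store = {}

def remember(entity, key, fact):
    store[(entity, key)] = fact

def attend(entity, key):
    # direct lookup, not a full-text search across all stored content
    return store.get((entity, key), "")

# Fill the pool with 1,000 unrelated facts; lookup cost does not change.
for i in range(1000):
    remember(f"user{i}", "drink", f"user{i} prefers drink {i}")
remember("alice", "drink", "alice prefers green tea")

ctx = attend("alice", "drink")  # one fact, a handful of tokens
```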

Key findings

1. Iranti's structured attend-based injection returns only the relevant entity fact — 20 tok/query regardless of pool size.

2. Shodh's recall returns full memory text from the pool: 66 tok/query despite 92% accuracy, collapsing efficiency to 1.39.

3. Mem0 is lean (18 tok/query) but only 80% accurate — efficiency score 4.44, second only to Iranti.

4. Graphiti returns ~49 tok/query of rephrased edge facts at 60% accuracy: the worst efficiency score, 1.22.