Memory
that doesn't
suck.
Rust-core memory layer for AI agents. Seven-stage write pipeline rejects noise before storage. Five-retriever hybrid search. Typed schema with provenance. Embedded, self-hosted, or cloud — same API, same guarantees.
Every memory store
ships with 98% junk.
We audited 10,134 production memory entries from the leading vendor. We found duplicates, hallucinated profiles, 4,000-word state dumps, and the same memory stored 668 times. This is why agents feel dumb.
Duplicates, hallucinations, feedback-loop noise in a production Mem0 deployment.
A single hallucinated memory stored 668 times because no pre-write filter caught it.
Users reported synchronous writes blocking responses. No async client existed for self-hosted deployments.
Temporal and relational features paywalled behind Pro tier. No self-host escape.
Seven stages
between input and storage.
Most memory systems are a single function call: extract → store. Recall's write path is a seven-stage pipeline. The first stage rejects 40% of candidate writes with zero LLM calls. The middle stages are the quality moat. By the time a memory reaches storage, we're confident it deserves to be there.
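A zero-LLM first stage can be very cheap. This sketch is illustrative only; the exact-hash dedup and length cutoff here are assumptions, not Recall's actual rules:

```python
import hashlib

def stage_one_filter(candidate: str, seen_hashes: set[str],
                     max_len: int = 2000) -> bool:
    """Cheap, model-free first stage: reject obvious noise before
    any LLM call. Heuristics are hypothetical."""
    text = candidate.strip()
    if not text:
        return False  # empty turn
    if len(text) > max_len:
        return False  # reject state dumps masquerading as memories
    digest = hashlib.sha256(text.lower().encode()).hexdigest()
    if digest in seen_hashes:
        return False  # exact duplicate (the 668x failure mode)
    seen_hashes.add(digest)
    return True
```

A real deployment would add near-duplicate detection via embeddings, but exact hashing alone already catches verbatim repeats at zero model cost.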
Flat text loses
temporal signal.
A memory isn't a string. It has a subject, a predicate, an object, a valid-from timestamp, a confidence score, and a trail back to the turn that produced it. Recall stores five types — each with its own fields and lifecycle.
Stable information. Slow to change. High confidence.
Revisable. Versioned. Supersession over deletion.
Something that happened at a specific moment. Immutable.
People, projects, companies — canonical references.
Edges between entities. Graph queries, no graph DB.
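The shape of such a typed record can be sketched in a few lines. The type names and field names below are assumptions for illustration; the document specifies subject, predicate, object, valid-from, confidence, and provenance, but not the exact identifiers:

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum

class MemoryType(Enum):
    # Assumed names for the five types described above
    FACT = "fact"                  # stable, slow to change
    PREFERENCE = "preference"      # revisable, versioned
    EVENT = "event"                # immutable point in time
    ENTITY = "entity"              # canonical reference
    RELATIONSHIP = "relationship"  # edge between entities

@dataclass
class Memory:
    subject: str
    predicate: str
    object: str
    type: MemoryType
    confidence: float
    valid_from: datetime
    source_turn_id: str  # provenance: trail back to the producing turn

m = Memory("priya", "prefers", "dark mode", MemoryType.PREFERENCE,
           0.95, datetime(2026, 4, 15, tzinfo=timezone.utc), "turn_0042")
```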
Five retrievers,
fused in parallel.
Pure vector similarity loses on temporal reasoning — LongMemEval measures 5-65% accuracy across existing systems. Recall runs five retrievers concurrently and fuses them with Reciprocal Rank Fusion, then optionally reranks.
type > event · confidence > 0.92
when > 2026-04-15T14:30Z
latency > 47ms (parallel)
Semantic
Dense embedding similarity via pgvector (HNSW) or sqlite-vec. Catches paraphrases and implicit meaning.
BM25
Proper nouns, exact terms, IDs. Vector search misses these — keyword search nails them.
Graph
When the query mentions a known entity, we join on entity_id and pull related memories directly.
Temporal
"Last week," "before I joined," "when we discussed this" — temporal cues route into a dedicated retriever.
Type filter
"What does Priya prefer?" is a preference query, not a fact query. Type hints narrow the pool.
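Fanning out the five retrievers concurrently is straightforward with async I/O. The stubs below stand in for real backends (pgvector, BM25, entity joins); the names and return shapes are assumptions:

```python
import asyncio

# Hypothetical retriever stubs returning (memory_id, score) pairs;
# real ones would query pgvector, a BM25 index, the entity graph, etc.
async def semantic(q): return [("m1", 0.91), ("m2", 0.74)]
async def bm25(q): return [("m3", 0.82), ("m1", 0.51)]
async def graph(q): return [("m4", 1.0)]
async def temporal(q): return [("m2", 0.88)]
async def type_filter(q): return [("m1", 1.0), ("m5", 1.0)]

async def retrieve_all(query: str) -> list[list[tuple[str, float]]]:
    # Dispatch all five retrievers at once; latency is the slowest
    # retriever, not the sum of all five
    return await asyncio.gather(
        semantic(query), bm25(query), graph(query),
        temporal(query), type_filter(query),
    )

results = asyncio.run(retrieve_all("what does Priya prefer?"))
```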
RRF + cross-encoder rerank
Score-weighted RRF: weight × √score / (k + rank). Candidates from multiple retrievers accumulate higher scores. Optional second pass through a cross-encoder reranker for quality-critical queries.
Skipped for entity-specific queries where the entity graph yields direct matches. Retries 429s with backoff, so quality holds through rate limits.
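The score-weighted RRF formula above can be implemented in a few lines. Scores are assumed pre-normalised to [0, 1], and k = 60 is the conventional RRF constant, not a documented Recall default:

```python
from math import sqrt
from collections import defaultdict

def fuse(ranked_lists, weights, k: float = 60.0):
    """Score-weighted Reciprocal Rank Fusion: each hit contributes
    weight * sqrt(score) / (k + rank), so candidates surfaced by
    multiple retrievers accumulate a higher fused score."""
    fused = defaultdict(float)
    for hits, weight in zip(ranked_lists, weights):
        for rank, (memory_id, score) in enumerate(hits, start=1):
            fused[memory_id] += weight * sqrt(score) / (k + rank)
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

# "m2" appears in both lists, so it accumulates score from each
# and outranks candidates found by only one retriever.
ranked = fuse(
    [[("m1", 0.9), ("m2", 0.6)], [("m2", 0.8), ("m3", 0.5)]],
    weights=[1.0, 0.7],
)
```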
Black box?
Not anymore.
The playground is a full IDE for memory. Paste a turn or ask a query — watch every pipeline stage execute with its LLM prompts, decisions, DB mutations, token count, and cost exposed. Behind the scenes, visible.
Observability built-in.
Console dashboard reflecting the exact state of your pipelines. Every write, read, and quality signal in one view.
The matrix, unfiltered.
Every row is a promise we're making. Every column is a competitor whose docs we read. We'd rather show the comparison than let you discover the gaps in production.
| Capability | Mem0 | Zep | Letta | AgentCore | Recall |
|---|---|---|---|---|---|
| Open source core | partial | partial | ✓ | ✕ | Apache 2.0 |
| Embedded mode (SQLite) | ✕ | ✕ | ✕ | ✕ | zero deps |
| Self-host, no extra DB | ✓ | needs Neo4j | ✓ | ✕ | Postgres only |
| Typed memory schema | flat text | graph only | blocks | structured | 5 types + lifecycle |
| Pre-write filter | ✕ | ✕ | ✕ | ✕ | 7-stage pipeline |
| 5-retriever hybrid | 1 | 2 | 1 | 2 | 5 parallel |
| Temporal queries | ✕ | Pro only | ✕ | ✕ | native |
| Transparent pricing | ✕ | opaque | ✓ | ✕ | pass-through LLM |
Architecture deep-dive.
Modular Rust core with multi-language bindings. Three layers, one codebase — from SDK surface to storage backend.
Run it your way.
Same API, same guarantees — whether it's a SQLite file on your laptop, Postgres in your VPC, or our managed cloud.
Embedded
SQLite + sqlite-vec. Single binary, single file. Your laptop, their laptop, CI/CD. Perfect for dev, prototyping, or privacy-sensitive workloads.
Self-hosted
Docker + Postgres + pgvector. No Neo4j, no Redis, no separate vector DB. Full power, no ops tax. Perfect for teams with existing Postgres infra.
Cloud
We run it for you. Auto-scaling, replicas, observability dashboard included. Pay for compute, not seats. Perfect for production at scale.