Recall
▲ NEW Recall v0.1 — now in public beta · Read the launch post →
RECALL v0.1 · OPEN CORE · APACHE 2.0

Memory
that doesn't
suck.

Rust-core memory layer for AI agents. Seven-stage write pipeline rejects noise before storage. Five-retriever hybrid search. Typed schema with provenance. Embedded, self-hosted, or cloud — same API, same guarantees.

Get Started → · See the pipeline · ★ GitHub
<10% junk rate
<200ms P99 read
85%+ LongMemEval
3 modes: embedded · self-hosted · cloud
~/inbox3 — recall v0.1.0 · zsh
$ npm install @arc-labs/recall
+ @arc-labs/recall@0.1.0 (15MB · rust core bundled)

$ node
> const recall = new Recall({ storage: './memory.db' })
> await recall.remember({
    messages: [{ role: 'user', content: 'I just moved to Lisbon' }],
    scope: { user_id: 'u_1', agent_id: 'inbox3' }
  })
{
  stored: 1, discarded: 4, merged: 0,
  memory_ids: ['mem_01HX7R...'],
  type: 'fact', confidence: 0.94,
  latency_ms: 187
}
>
01 Why Recall exists

Every memory store
ships with 98% junk.

We audited 10,134 production memory entries from the leading vendor. What we found: duplicates, hallucinated profiles, 4,000-word state dumps, and the same memory repeated 668 times. This is why agents feel dumb.

97.8%
Junk rate

Duplicates, hallucinations, feedback-loop noise in a production Mem0 deployment.

668×
Same row, repeated

A single hallucinated memory stored 668 times because no pre-write filter caught it.

20s
Write latency

Users reported synchronous writes blocking responses. A self-hosted async client didn't exist.

$249/mo
To get a graph

Temporal and relational features paywalled behind Pro tier. No self-host escape.

02 Write pipeline

Seven stages
between input and storage.

Most memory systems are a single function call: extract → store. Recall is a seven-stage pipeline. The first stage rejects roughly 40% of incoming turns with zero LLM calls. The middle stages are the quality moat. By the time a memory reaches storage, we're confident it deserves to be there.

01 / GATE

gate

Input validation. Rejects short, rate-limited, or pattern-matched turns before any API call. Zero LLM cost.

✕ REJECTED
Input: "I think maybe I like blue? hmm" → Rejected (low confidence, vague pattern)
40% rejected
0.3ms latency
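A minimal sketch of what a zero-LLM gate stage might look like: length and pattern heuristics that discard a turn before any extraction call. The `gate` function, `GateDecision` type, thresholds, and patterns here are all illustrative, not Recall's actual API.

```typescript
// Illustrative zero-LLM gate: cheap heuristics run before any API call.
// Names, thresholds, and patterns are hypothetical, not the real SDK.
type GateDecision = { accept: boolean; reason?: string };

const VAGUE_PATTERNS: RegExp[] = [
  /^(ok|lol|thanks|hmm+)\b/i,                 // filler openers
  /\b(maybe|i think|i guess)\b.*\?/i,         // hedged, low-confidence musings
];

function gate(turn: string): GateDecision {
  const text = turn.trim();
  if (text.length < 12) return { accept: false, reason: "too_short" };
  for (const pattern of VAGUE_PATTERNS) {
    if (pattern.test(text)) return { accept: false, reason: "vague_pattern" };
  }
  return { accept: true };
}
```

Because this stage is pure string matching, it runs in microseconds and costs nothing, which is what makes a sub-millisecond reject path possible.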
03 Typed schema

Flat text loses
temporal signal.

A memory isn't a string. It has a subject, a predicate, an object, a valid-from timestamp, a confidence score, and a trail back to the turn that produced it. Recall stores five types — each with its own fields and lifecycle.

F
Fact

Stable information. Slow to change. High confidence.

subject — entity
predicate — string
object — any
confidence — float
provenance — turn[]
P
Preference

Revisable. Versioned. Supersession over deletion.

subject — entity
predicate — string
version — int
superseded_by — id?
valid_from — ts
E
Event

Something that happened at a specific moment. Immutable.

subject — entity
predicate — string
object — entity
event_at — ts
tags — string[]
N
Entity

People, projects, companies — canonical references.

canonical_name — str
entity_type — enum
aliases — str[]
attributes — kv
mention_count — int
R
Relation

Edges between entities. Graph queries, no graph DB.

from_entity — id
to_entity — id
relation_type — str
valid_from/to — ts
evidence — mem[]
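The five type cards above can be sketched as a TypeScript discriminated union. Field names follow the cards; the `kind` discriminant, the `Memory` union name, and concrete field types are assumptions for illustration, and the real SDK types may differ.

```typescript
// The five memory types from the cards above, sketched as interfaces.
// The `kind` discriminant and exact field types are illustrative.
interface Fact {
  kind: "fact";
  subject: string;        // entity id
  predicate: string;
  object: unknown;
  confidence: number;     // 0..1
  provenance: string[];   // source turn ids
}

interface Preference {
  kind: "preference";
  subject: string;
  predicate: string;
  version: number;
  superseded_by?: string; // supersession over deletion
  valid_from: string;     // ISO timestamp
}

interface Event {
  kind: "event";
  subject: string;
  predicate: string;
  object: string;         // entity id
  event_at: string;       // immutable moment in time
  tags: string[];
}

interface Entity {
  kind: "entity";
  canonical_name: string;
  entity_type: "person" | "project" | "company" | "other";
  aliases: string[];
  attributes: Record<string, string>;
  mention_count: number;
}

interface Relation {
  kind: "relation";
  from_entity: string;
  to_entity: string;
  relation_type: string;
  valid_from?: string;
  valid_to?: string;
  evidence: string[];     // supporting memory ids
}

type Memory = Fact | Preference | Event | Entity | Relation;

// Example: the fact extracted in the demo at the top of the page.
const lisbon: Fact = {
  kind: "fact",
  subject: "u_1",
  predicate: "moved_to",
  object: "Lisbon",
  confidence: 0.94,
  provenance: ["turn_0"],
};
```

The union shape is what makes type-filtered retrieval (section 04) a simple column match rather than a semantic guess.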
04 Read pipeline · 4 STAGES · 5 RETRIEVERS + FUSION

Five retrievers,
fused in parallel.

Pure vector similarity loses on temporal reasoning — LongMemEval measures 5-65% accuracy across existing systems. Recall runs five retrievers concurrently and fuses them with Reciprocal Rank Fusion, then optionally reranks.

query> "what did Priya say about the deadline?"
semantic
88
bm25
64
graph
92
temporal
41
type filter
73
RRF FUSION · TOP RESULT
content > "Priya stated the deadline is Friday for Inbox3 MVP"
type > event · confidence > 0.92
when > 2026-04-15T14:30Z
latency > 47ms (parallel)
01

Semantic

Dense embedding similarity via pgvector (HNSW) or sqlite-vec. Catches paraphrases and implicit meaning.

pgvector · HNSW · cosine
02

BM25

Proper nouns, exact terms, IDs. Vector search misses these — keyword search nails them.

tsvector · pg_trgm · tantivy
03

Graph

When the query mentions a known entity, we join on entity_id and pull related memories directly.

SQL join · entity_id index
04

Temporal

"Last week," "before I joined," "when we discussed this" — temporal cues route into a dedicated retriever.

timestamp btree · event_at
05

Type filter

"What does Priya prefer?" is a preference query, not a fact query. Type hints narrow the pool.

type column · enum index
+

RRF + cross-encoder rerank

Score-weighted RRF: weight × √score / (k + rank). Candidates from multiple retrievers accumulate higher scores. Optional second pass through a cross-encoder reranker for quality-critical queries.

Skipped for entity-specific queries where entity graph provides direct matches. 429 retry with backoff — quality preserved through rate limits.

RRF · cross-encoder · optional
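The fusion step above can be sketched directly from the stated formula, weight × √score / (k + rank): each retriever returns a ranked list, and candidates surfaced by more than one list accumulate score. The `rrfFuse` function, the `Ranked` type, and the example inputs are illustrative assumptions, not the SDK API.

```typescript
// Score-weighted Reciprocal Rank Fusion: contribution for each hit is
// weight * sqrt(score) / (k + rank). Function shape is illustrative.
type Ranked = { id: string; score: number }; // score in 0..1

function rrfFuse(
  lists: Record<string, Ranked[]>,   // retriever name -> ranked hits
  weights: Record<string, number>,   // per-retriever weight
  k = 60,                            // standard RRF smoothing constant
): Ranked[] {
  const fused = new Map<string, number>();
  for (const [name, hits] of Object.entries(lists)) {
    const w = weights[name] ?? 1;
    hits.forEach((hit, i) => {
      const rank = i + 1; // 1-based rank within this retriever's list
      const contrib = (w * Math.sqrt(hit.score)) / (k + rank);
      fused.set(hit.id, (fused.get(hit.id) ?? 0) + contrib);
    });
  }
  return [...fused.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// A candidate found by both semantic and graph retrievers outranks
// single-retriever hits, even ones with a similar raw score:
const top = rrfFuse(
  {
    semantic: [{ id: "m1", score: 0.88 }, { id: "m2", score: 0.70 }],
    graph:    [{ id: "m1", score: 0.92 }],
    bm25:     [{ id: "m3", score: 0.64 }],
  },
  { semantic: 1, graph: 1, bm25: 1 },
);
```

In practice the five retrievers would run concurrently (e.g. `Promise.all`) and their lists feed this fusion step, which is why the demo above reports parallel latency.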
05 Playground

Black box?
Not anymore.

The full playground is an IDE for memory. Paste a turn or ask a query — watch every pipeline stage execute with its LLM prompts, decisions, DB mutations, token count, and cost exposed. Behind the scenes, visible.

WRITE MODE
7-stage pipeline with LLM prompt & output inspection
READ MODE
5 retrievers in parallel · RRF fusion · optional rerank
OBSERVABILITY
Latency, tokens, cost, DB mutations — all live
Open the playground →
full-screen · 5 write fixtures · 4 read fixtures · editable
Launch →
06 Observability

Observability built-in.

Console dashboard reflecting the exact state of your pipelines. Every write, read, and quality signal in one view.

Last 24h
Writes: 847
P99 write: 1.2s
Reads: 12,341
P99 read: 180ms
Cost: $3.42
$0.11/hour
Writes per minute (stacked by outcome): stored · merged · discarded
Quality
Junk rate: 8.4% (↓2.1% vs 7d avg) ✓
Duplicate rate: 12.1% (↑0.5% vs 7d avg) ▲
Contradiction rate: 0.3/1k (stable) ✓
Active alerts
▲ Duplicate rate rising — possible extractor regression
07 Why not them

The matrix, unfiltered.

Every row is a promise we're making. Every column is a competitor whose docs we read. We'd rather show the comparison than let you discover the gaps in production.

Capability | Mem0 | Zep | Letta | AgentCore | Recall
Open source core | partial | partial | ✓ | ✕ | Apache 2.0
Embedded mode (SQLite) | ✕ | ✕ | ✕ | ✕ | zero deps
Self-host, no extra DB | ✓ | needs Neo4j | ✓ | ✕ | Postgres only
Typed memory schema | flat text | graph only | blocks | structured | 5 types + lifecycle
5-retriever hybrid | 1 | 2 | 1 | 2 | 5 parallel
Pre-write filter | ✕ | ✕ | ✕ | ✕ | 7-stage pipeline
Temporal queries | ✕ | pro only | ✕ | ✕ | native
Transparent pricing | ✕ | opaque | ✓ | ✕ | pass-through LLM
08 Architecture

Architecture deep-dive.

Modular Rust core with multi-language bindings. Three layers, one codebase — from SDK surface to storage backend.

USER-FACING SURFACES
TS SDK
(napi-rs)
Python SDK
(async HTTP)
MCP Server
(binary)
REST API
(axum)
Recall Core (Rust)
Write pipeline
pre-filter→extract→classify→resolve refs→dedupe→conflict check→persist
Read pipeline
expand query→multi-retrieve→rerank→format
Background worker
consolidate→decay→prune→compact
Storage layer (pluggable)
Embedded: SQLite + sqlite-vec
Self-hosted: Postgres + pgvector + pg_trgm
Cloud: managed Postgres + pgvector
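The seven write stages in the diagram can be sketched as a short-circuiting async pipeline: each stage either transforms the draft memory or discards it, and a discard stops everything downstream. The real core is Rust; this TypeScript sketch, its `Draft`/`Stage` types, and the stage bodies are illustrative only.

```typescript
// Illustrative short-circuiting pipeline matching the stage order in the
// diagram: pre-filter → extract → classify → resolve refs → dedupe →
// conflict check → persist. All names and bodies are hypothetical.
type Draft = { text: string; [k: string]: unknown };
type Stage = (d: Draft) => Promise<Draft | null>; // null = discard the turn

async function runWritePipeline(stages: Stage[], input: Draft): Promise<Draft | null> {
  let current: Draft | null = input;
  for (const stage of stages) {
    if (current === null) return null; // an earlier stage discarded it
    current = await stage(current);
  }
  return current;
}

const stages: Stage[] = [
  async (d) => (d.text.trim().length < 12 ? null : d),  // pre-filter (gate)
  async (d) => ({ ...d, extracted: true }),             // extract
  async (d) => ({ ...d, type: "fact" }),                // classify
  async (d) => d,                                       // resolve refs
  async (d) => d,                                       // dedupe
  async (d) => d,                                       // conflict check
  async (d) => ({ ...d, persisted: true }),             // persist
];
```

The design choice to model stages as `Draft -> Draft | null` keeps each stage independently testable and makes the early-reject path (section 02) fall out for free.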
09 Deploy

Run it your way.

Same API, same guarantees — whether it's a SQLite file on your laptop, Postgres in your VPC, or our managed cloud.

memory.db

Embedded

Zero dependencies

SQLite + sqlite-vec. Single binary, single file. Your laptop, their laptop, CI/CD. Perfect for dev, prototyping, or privacy-sensitive workloads.

Runtime: WASM
Storage: SQLite + vec
Vectors: sqlite-vec
Size: ~15MB
your-vpc:5432

Self-hosted

Postgres only

Docker + Postgres + pgvector. No Neo4j, no Redis, no separate vector DB. Full power, no ops tax. Perfect for teams with existing Postgres infra.

Runtime: Docker
Storage: Postgres
Vectors: pgvector
Size: ~50MB
recall.arc-labs.ai

Cloud

Managed

We run it for you. Auto-scaling, replicas, observability dashboard included. Pay for compute, not seats. Perfect for production at scale.

Runtime: managed
Storage: managed
Vectors: managed
Scale: unlimited
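The "same API, three targets" idea above can be sketched as a single client config expressed as a discriminated union. Only the embedded `storage` option appears in the demo at the top of the page; the `connectionString` and `apiKey` fields, the `mode` tag, and the `describe` helper are hypothetical stand-ins.

```typescript
// Illustrative: one client type, three deployment targets. Only the
// embedded `storage` option is shown in the real demo; the rest is assumed.
type RecallConfig =
  | { mode: "embedded"; storage: string }              // SQLite file on disk
  | { mode: "self-hosted"; connectionString: string }  // Postgres + pgvector
  | { mode: "cloud"; apiKey: string };                 // managed service

function describe(cfg: RecallConfig): string {
  switch (cfg.mode) {
    case "embedded":    return `sqlite at ${cfg.storage}`;
    case "self-hosted": return `postgres at ${cfg.connectionString}`;
    case "cloud":       return "managed by recall.arc-labs.ai";
  }
}
```

Because the union is closed over `mode`, swapping targets is a config change, not a code change, which is the guarantee the section claims.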

Memory that deserves to be remembered.

Get Started → · See the pipeline · ★ GitHub
npm install @arc-labs/recall
Recall

Memory that agents can actually use. Rust-core, seven-stage write pipeline, five-retriever hybrid search. Open source Apache 2.0.

GitHub · Twitter · Discord
© 2026 Arc Labs · Bangalore
