Recall
▲ NEW Recall v0.1 — now in public beta · Read the launch post →
RECALL v0.1 · OPEN CORE · APACHE 2.0

Memory
that doesn't
suck.

Rust-core memory layer for AI agents. Seven-stage write pipeline rejects noise before storage. Five-retriever hybrid search. Typed schema with provenance. Embedded, self-hosted, or cloud — same API, same guarantees.

Get Started → · See the pipeline · ★ GitHub
<10% junk rate
<200ms P99 read
85%+ LongMemEval
3 modes: embedded · self-hosted · cloud
~/inbox3 — recall v0.1.0 · zsh
$ npm install @arc-labs/recall
+ @arc-labs/recall@0.1.0 (15MB · rust core bundled)

$ node
> const recall = new Recall({ storage: './memory.db' })
> await recall.remember({
    messages: [{ role: 'user', content: 'I just moved to Lisbon' }],
    scope: { user_id: 'u_1', agent_id: 'inbox3' }
  })
{
  stored: 1, discarded: 4, merged: 0,
  memory_ids: ['mem_01HX7R...'],
  type: 'fact', confidence: 0.94,
  latency_ms: 187
}
>
01 Why Recall exists

Every memory store
ships with 98% junk.

We audited 10,134 production memory entries from the leading vendor. What we found: duplicates, hallucinated profiles, 4,000-word state dumps, and the same memory repeated 668 times. This is why agents feel dumb.

97.8%
Junk rate

Duplicates, hallucinations, feedback-loop noise in a production Mem0 deployment.

668×
Same row, repeated

A single hallucinated memory stored 668 times because no pre-write filter caught it.

20s
Write latency

Users reported synchronous writes blocking responses. A self-hosted async client didn't exist.

$249/mo
To get a graph

Temporal and relational features paywalled behind Pro tier. No self-host escape.

02 Write pipeline

Seven stages
between input and storage.

Most memory systems are a single function call: extract → store. Recall is a seven-stage pipeline. The first stage rejects roughly 40% of incoming turns with zero LLM calls. The middle stages are the quality moat. By the time a memory reaches storage, we're confident it deserves to be there.

01 / GATE

gate

Input validation. Rejects short, rate-limited, or pattern-matched turns before any API call. Zero LLM cost.

✕ REJECTED
Input: "I think maybe I like blue? hmm" → Rejected (low confidence, vague pattern)
40% rejected
0.3ms latency
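A minimal sketch of what a zero-LLM gate stage might look like: length and pattern heuristics that discard a turn before any extraction call. The `gate` function, `GateDecision` type, thresholds, and patterns here are all illustrative, not Recall's actual API.

```typescript
// Illustrative zero-LLM gate: cheap heuristics run before any API call.
// Names, thresholds, and patterns are hypothetical, not the real SDK.
type GateDecision = { accept: boolean; reason?: string };

const VAGUE_PATTERNS: RegExp[] = [
  /^(ok|lol|thanks|hmm+)\b/i,                 // filler openers
  /\b(maybe|i think|i guess)\b.*\?/i,         // hedged, low-confidence musings
];

function gate(turn: string): GateDecision {
  const text = turn.trim();
  if (text.length < 12) return { accept: false, reason: "too_short" };
  for (const pattern of VAGUE_PATTERNS) {
    if (pattern.test(text)) return { accept: false, reason: "vague_pattern" };
  }
  return { accept: true };
}
```

Because this stage is pure string matching, it runs in microseconds and costs nothing, which is what makes a sub-millisecond reject path possible.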
03 Typed schema

Flat text loses
temporal signal.

A memory isn't a string. It has a subject, a predicate, an object, a valid-from timestamp, a confidence score, and a trail back to the turn that produced it. Recall stores five types — each with its own fields and lifecycle.

F
Fact

Stable information. Slow to change. High confidence.

subject — entity
predicate — string
object — any
confidence — float
provenance — turn[]
P
Preference

Revisable. Versioned. Supersession over deletion.

subject — entity
predicate — string
version — int
superseded_by — id?
valid_from — ts
E
Event

Something that happened at a specific moment. Immutable.

subject — entity
predicate — string
object — entity
event_at — ts
tags — string[]
N
Entity

People, projects, companies — canonical references.

canonical_name — str
entity_type — enum
aliases — str[]
attributes — kv
mention_count — int
R
Relation

Edges between entities. Graph queries, no graph DB.

from_entity — id
to_entity — id
relation_type — str
valid_from/to — ts
evidence — mem[]
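The five type cards above can be sketched as a TypeScript discriminated union. Field names follow the cards; the `kind` discriminant, the `Memory` union name, and concrete field types are assumptions for illustration, and the real SDK types may differ.

```typescript
// The five memory types from the cards above, sketched as interfaces.
// The `kind` discriminant and exact field types are illustrative.
interface Fact {
  kind: "fact";
  subject: string;        // entity id
  predicate: string;
  object: unknown;
  confidence: number;     // 0..1
  provenance: string[];   // source turn ids
}

interface Preference {
  kind: "preference";
  subject: string;
  predicate: string;
  version: number;
  superseded_by?: string; // supersession over deletion
  valid_from: string;     // ISO timestamp
}

interface Event {
  kind: "event";
  subject: string;
  predicate: string;
  object: string;         // entity id
  event_at: string;       // immutable moment in time
  tags: string[];
}

interface Entity {
  kind: "entity";
  canonical_name: string;
  entity_type: "person" | "project" | "company" | "other";
  aliases: string[];
  attributes: Record<string, string>;
  mention_count: number;
}

interface Relation {
  kind: "relation";
  from_entity: string;
  to_entity: string;
  relation_type: string;
  valid_from?: string;
  valid_to?: string;
  evidence: string[];     // supporting memory ids
}

type Memory = Fact | Preference | Event | Entity | Relation;

// Example: the fact extracted in the demo at the top of the page.
const lisbon: Fact = {
  kind: "fact",
  subject: "u_1",
  predicate: "moved_to",
  object: "Lisbon",
  confidence: 0.94,
  provenance: ["turn_0"],
};
```

The union shape is what makes type-filtered retrieval (section 04) a simple column match rather than a semantic guess.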
04 Read pipeline · 4 STAGES · 5 RETRIEVERS + FUSION

Five retrievers,
fused in parallel.

Pure vector similarity loses on temporal reasoning — LongMemEval measures 5-65% accuracy across existing systems. Recall runs five retrievers concurrently and fuses them with Reciprocal Rank Fusion, then optionally reranks.

query> "what did Priya say about the deadline?"
semantic
88
bm25
64
graph
92
temporal
41
type filter
73
RRF FUSION · TOP RESULT
content > "Priya stated the deadline is Friday for Inbox3 MVP"
type > event · confidence > 0.92
when > 2026-04-15T14:30Z
latency > 47ms (parallel)
01

Semantic

Dense embedding similarity via pgvector (HNSW) or sqlite-vec. Catches paraphrases and implicit meaning.

pgvector · HNSW · cosine
02

BM25

Proper nouns, exact terms, IDs. Vector search misses these — keyword search nails them.

tsvector · pg_trgm · tantivy
03

Graph

When the query mentions a known entity, we join on entity_id and pull related memories directly.

SQL join · entity_id index
04

Temporal

"Last week," "before I joined," "when we discussed this" — temporal cues route into a dedicated retriever.

timestamp btree · event_at
05

Type filter

"What does Priya prefer?" is a preference query, not a fact query. Type hints narrow the pool.

type column · enum index
+

RRF + cross-encoder rerank

Score-weighted RRF: weight × √score / (k + rank). Candidates from multiple retrievers accumulate higher scores. Optional second pass through a cross-encoder reranker for quality-critical queries.

Skipped for entity-specific queries where entity graph provides direct matches. 429 retry with backoff — quality preserved through rate limits.

RRF · cross-encoder · optional
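The fusion step above can be sketched directly from the stated formula, weight × √score / (k + rank): each retriever returns a ranked list, and candidates surfaced by more than one list accumulate score. The `rrfFuse` function, the `Ranked` type, and the example inputs are illustrative assumptions, not the SDK API.

```typescript
// Score-weighted Reciprocal Rank Fusion: contribution for each hit is
// weight * sqrt(score) / (k + rank). Function shape is illustrative.
type Ranked = { id: string; score: number }; // score in 0..1

function rrfFuse(
  lists: Record<string, Ranked[]>,   // retriever name -> ranked hits
  weights: Record<string, number>,   // per-retriever weight
  k = 60,                            // standard RRF smoothing constant
): Ranked[] {
  const fused = new Map<string, number>();
  for (const [name, hits] of Object.entries(lists)) {
    const w = weights[name] ?? 1;
    hits.forEach((hit, i) => {
      const rank = i + 1; // 1-based rank within this retriever's list
      const contrib = (w * Math.sqrt(hit.score)) / (k + rank);
      fused.set(hit.id, (fused.get(hit.id) ?? 0) + contrib);
    });
  }
  return [...fused.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// A candidate found by both semantic and graph retrievers outranks
// single-retriever hits, even ones with a similar raw score:
const top = rrfFuse(
  {
    semantic: [{ id: "m1", score: 0.88 }, { id: "m2", score: 0.70 }],
    graph:    [{ id: "m1", score: 0.92 }],
    bm25:     [{ id: "m3", score: 0.64 }],
  },
  { semantic: 1, graph: 1, bm25: 1 },
);
```

In practice the five retrievers would run concurrently (e.g. `Promise.all`) and their lists feed this fusion step, which is why the demo above reports parallel latency.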
05 Playground

Black box?
Not anymore.

The full playground is an IDE for memory. Paste a turn or ask a query — watch every pipeline stage execute with its LLM prompts, decisions, DB mutations, token count, and cost exposed. Behind the scenes, visible.

WRITE MODE
7-stage pipeline with LLM prompt & output inspection
READ MODE
5 retrievers in parallel · RRF fusion · optional rerank
OBSERVABILITY
Latency, tokens, cost, DB mutations — all live
Open the playground →
full-screen · 5 write fixtures · 4 read fixtures · editable
Launch →
06 Observability

Observability built-in.

Console dashboard reflecting the exact state of your pipelines. Every write, read, and quality signal in one view.

Last 24h
Writes: 847
P99 write: 1.2s
Reads: 12,341
P99 read: 180ms
Cost: $3.42
$0.11/hour
Writes per minute (stacked by outcome): stored · merged · discarded
Quality
Junk rate: 8.4% (↓2.1% vs 7d avg) ✓
Duplicate rate: 12.1% (↑0.5% vs 7d avg) ▲
Contradiction rate: 0.3/1k (stable) ✓
Active alerts
▲ Duplicate rate rising — possible extractor regression
07 Why not them

The matrix, unfiltered.

Every row is a promise we're making. Every column is a competitor whose docs we read. We'd rather show the comparison than let you discover the gaps in production.

Capability | Mem0 | Zep | Letta | AgentCore | Recall
Open source core | partial | partial | ✓ | ✕ | Apache 2.0
Embedded mode (SQLite) | ✕ | ✕ | ✕ | ✕ | zero deps
Self-host, no extra DB | ✓ | needs Neo4j | ✓ | ✕ | Postgres only
Typed memory schema | flat text | graph only | blocks | structured | 5 types + lifecycle
5-retriever hybrid | 1 | 2 | 1 | 2 | 5 parallel
Pre-write filter | ✕ | ✕ | ✕ | ✕ | 7-stage pipeline
Temporal queries | ✕ | pro only | ✕ | ✕ | native
Transparent pricing | ✕ | opaque | ✓ | ✕ | pass-through LLM
08 Architecture

Architecture deep-dive.

Modular Rust core with multi-language bindings. Three layers, one codebase — from SDK surface to storage backend.

USER-FACING SURFACES
TS SDK
(napi-rs)
Python SDK
(async HTTP)
MCP Server
(binary)
REST API
(axum)
Recall Core (Rust)
Write pipeline
pre-filter→extract→classify→resolve refs→dedupe→conflict check→persist
Read pipeline
expand query→multi-retrieve→rerank→format
Background worker
consolidate→decay→prune→compact
Storage layer (pluggable)
Embedded: SQLite + sqlite-vec
Self-hosted: Postgres + pgvector + pg_trgm
Cloud: managed Postgres + pgvector
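The seven write stages in the diagram can be sketched as a short-circuiting async pipeline: each stage either transforms the draft memory or discards it, and a discard stops everything downstream. The real core is Rust; this TypeScript sketch, its `Draft`/`Stage` types, and the stage bodies are illustrative only.

```typescript
// Illustrative short-circuiting pipeline matching the stage order in the
// diagram: pre-filter → extract → classify → resolve refs → dedupe →
// conflict check → persist. All names and bodies are hypothetical.
type Draft = { text: string; [k: string]: unknown };
type Stage = (d: Draft) => Promise<Draft | null>; // null = discard the turn

async function runWritePipeline(stages: Stage[], input: Draft): Promise<Draft | null> {
  let current: Draft | null = input;
  for (const stage of stages) {
    if (current === null) return null; // an earlier stage discarded it
    current = await stage(current);
  }
  return current;
}

const stages: Stage[] = [
  async (d) => (d.text.trim().length < 12 ? null : d),  // pre-filter (gate)
  async (d) => ({ ...d, extracted: true }),             // extract
  async (d) => ({ ...d, type: "fact" }),                // classify
  async (d) => d,                                       // resolve refs
  async (d) => d,                                       // dedupe
  async (d) => d,                                       // conflict check
  async (d) => ({ ...d, persisted: true }),             // persist
];
```

The design choice to model stages as `Draft -> Draft | null` keeps each stage independently testable and makes the early-reject path (section 02) fall out for free.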
09 Deploy

Run it your way.

Same API, same guarantees — whether it's a SQLite file on your laptop, Postgres in your VPC, or our managed cloud.

memory.db

Embedded

Zero dependencies

SQLite + sqlite-vec. Single binary, single file. Your laptop, their laptop, CI/CD. Perfect for dev, prototyping, or privacy-sensitive workloads.

Runtime: WASM
Storage: SQLite + vec
Vectors: sqlite-vec
Size: ~15MB
your-vpc:5432

Self-hosted

Postgres only

Docker + Postgres + pgvector. No Neo4j, no Redis, no separate vector DB. Full power, no ops tax. Perfect for teams with existing Postgres infra.

Runtime: Docker
Storage: Postgres
Vectors: pgvector
Size: ~50MB
recall.arc-labs.ai

Cloud

Managed

We run it for you. Auto-scaling, replicas, observability dashboard included. Pay for compute, not seats. Perfect for production at scale.

Runtime: managed
Storage: managed
Vectors: managed
Scale: unlimited
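The "same API, three targets" idea above can be sketched as a single client config expressed as a discriminated union. Only the embedded `storage` option appears in the demo at the top of the page; the `connectionString` and `apiKey` fields, the `mode` tag, and the `describe` helper are hypothetical stand-ins.

```typescript
// Illustrative: one client type, three deployment targets. Only the
// embedded `storage` option is shown in the real demo; the rest is assumed.
type RecallConfig =
  | { mode: "embedded"; storage: string }              // SQLite file on disk
  | { mode: "self-hosted"; connectionString: string }  // Postgres + pgvector
  | { mode: "cloud"; apiKey: string };                 // managed service

function describe(cfg: RecallConfig): string {
  switch (cfg.mode) {
    case "embedded":    return `sqlite at ${cfg.storage}`;
    case "self-hosted": return `postgres at ${cfg.connectionString}`;
    case "cloud":       return "managed by recall.arc-labs.ai";
  }
}
```

Because the union is closed over `mode`, swapping targets is a config change, not a code change, which is the guarantee the section claims.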

Memory that deserves to be remembered.

Get Started → · See the pipeline · ★ GitHub
npm install @arc-labs/recall
Recall

Memory that agents can actually use. Rust-core, seven-stage write pipeline, five-retriever hybrid search. Open source Apache 2.0.

GitHub · Twitter · Discord
© 2026 Arc Labs · Bangalore
