The Agentic Memory Landscape

Memory, Context & Graphs

How agent memory, context engineering, and knowledge graphs are converging in 2026, and where Living Memory and Alexandria fit.

The Problem

An army of amnesiacs.

Agents with skills and tool access can do remarkable work. But without shared memory, every session starts from zero. No lessons from last time. No precedent. No pattern recognition across sessions or across agents.

Without memory, agency does the same work repeatedly. An agent fleet without shared memory is an army of amnesiacs.

Memory is the multiplier; agency is the force. The question is: what kind of memory, and how does it compound?

The Evolution

From retrieval to learning.

2023

Naive RAG

Chunk documents, embed, retrieve by similarity. Read-only, one-shot. The agent gets "related text," not understanding.

→

2024

Advanced RAG

Hybrid search, reranking, HyDE, corrective loops. Better retrieval, but still stateless. The system doesn't learn from use.

→

2025

Agentic RAG

The agent decides what to retrieve, when, and how deep. Multi-hop reasoning. Adaptive routing. Still read-only at its core.

→

2026

AI Memory

Read-write. The system learns from every session. Shared across agents. Knowledge compounds, decays, self-maintains.

Each step addresses a limitation of the last. RAG gave agents knowledge. Agentic RAG gave them judgment about retrieval. AI Memory gives them the ability to learn, remember, and forget.

The 2026 Shift

Context engineering, not prompt engineering.

The industry consensus in 2026: the bottleneck isn't the model, it's the context the model receives. "Context engineering" is the discipline of assembling the right knowledge, at the right moment, at the right depth, for the right agent.

This is more than prompt design. It spans retrieval, filtering, scoping, progressive disclosure, token budgets, and deciding what to exclude. The model is only as good as what you put in the window.

Our entire architecture is a context engineering system. getContext, retrieval filters, workflow-scoped precedent, token budgets, progressive disclosure: these are all context engineering.

Prompt engineering

Craft the instruction. Static. One-size-fits-all.

RAG

Retrieve relevant chunks. Better, but blunt: similarity isn't understanding.

Context engineering

Assemble the full picture: lessons, precedent, entity context, scoped by workflow, bounded by token budget, with judgment about what to include and what to leave out.
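The assembly step can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the actual getContext implementation: the `ContextItem` fields, the priority convention, and the word-count token estimate are all invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextItem:
    kind: str        # e.g. "lesson", "precedent", "entity" (hypothetical labels)
    workflow: str    # workflow this item is scoped to
    text: str
    priority: int    # lower number = more important (assumed convention)


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def assemble_context(items, workflow, token_budget):
    """Select items scoped to `workflow`, most important first, within the budget."""
    scoped = [item for item in items if item.workflow == workflow]
    selected, used = [], 0
    for item in sorted(scoped, key=lambda i: i.priority):
        cost = estimate_tokens(item.text)
        if used + cost > token_budget:
            continue  # deliberately excluded: does not fit the remaining budget
        selected.append(item)
        used += cost
    return selected, used


items = [
    ContextItem("lesson", "deploy", "Always run migrations before restarting workers", 1),
    ContextItem("precedent", "deploy", "Last rollback was caused by a stale cache key", 2),
    ContextItem("entity", "billing", "Invoice service owns the ledger table", 1),
    ContextItem("precedent", "deploy", "A long postmortem " + "detail " * 50, 3),
]

selected, used = assemble_context(items, workflow="deploy", token_budget=20)
```

Here the billing item is dropped by workflow scoping and the long postmortem is dropped by the budget. Both exclusions are decisions the assembler makes, not accidents of similarity ranking.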

Why Graphs

Search finds what's near your query.
Graphs find what's near your understanding.

Every sophisticated RAG pattern either adds graph structure (Graph RAG, Hybrid RAG) or adds agentic reasoning to compensate for search limitations (Adaptive RAG, Agentic RAG). Each escalation is a workaround for something curated graph traversal handles natively.

Graphs provide two things embeddings can't: typed relationships (decisions link to assumptions, assumptions link to evidence) and structural discovery (the third hop takes you to territory you didn't aim for). Search can't escape the query neighborhood. Graphs can.

Aspect	Vector search	Knowledge graph
Finds	Text similar to query	Entities connected to understanding
Discovery	Locked to query neighborhood	Multi-hop traversal across topics
Relationships	Implicit (cosine distance)	Explicit, typed, directional
Maintenance	Re-embed on change	Update claims, prune, challenge
Gives the agent	Related text	Reasoning structure
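A small sketch of what typed relationships and multi-hop traversal look like in practice. The edge list, relation names, and node ids are made up for illustration; a real graph store would replace the in-memory list.

```python
from collections import deque

# Typed, directional edges: (source, relation, target). All names are hypothetical.
EDGES = [
    ("decision:use-sqlite", "rests_on", "assumption:single-writer"),
    ("assumption:single-writer", "supported_by", "evidence:load-test-q1"),
    ("evidence:load-test-q1", "mentions", "entity:job-queue"),
]


def neighbors(node):
    return [(rel, dst) for src, rel, dst in EDGES if src == node]


def traverse(start, max_hops):
    """Breadth-first walk that records the typed path to every node reached."""
    paths = {start: []}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue
        for rel, dst in neighbors(node):
            if dst not in paths:
                paths[dst] = paths[node] + [(node, rel, dst)]
                queue.append((dst, hops + 1))
    return paths


paths = traverse("decision:use-sqlite", max_hops=3)
```

The third hop reaches `entity:job-queue`, a node that shares no wording with the starting decision. The returned path also carries the relation types, so the agent sees why the two are connected, not only that they are.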

Context Graphs

The enterprise opportunity.

Foundation Capital calls context graphs "AI's trillion-dollar opportunity." The argument: systems of record capture what happened. Context graphs capture why, who, when, and what precedent. Decision traces, not just state changes.

Salesforce knows the opportunity stage moved. It doesn't know who approved the deviation, what precedent was referenced, or what the state was when the decision was made.

The gap isn't missing data. It's missing decision traces. Agents run into the same ambiguity humans resolve with judgment and organizational memory. But the inputs to those judgments aren't stored as durable artifacts. Context graphs make them durable, queryable, and shared.
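One way to picture a decision trace as a durable artifact is a record that stores the why, who, when, and precedent next to the state change itself. The field names and the in-memory `TraceStore` below are assumptions for illustration, not a schema taken from any product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class DecisionTrace:
    entity: str                 # what the decision was about
    change: str                 # the state change a system of record would store
    decided_by: str             # who approved it
    rationale: str              # why
    decided_at: datetime        # when
    precedents: tuple = ()      # ids of earlier traces that were referenced
    state_snapshot: dict = field(default_factory=dict)  # state at decision time


class TraceStore:
    """Toy in-memory store; stands in for a queryable, shared backend."""

    def __init__(self):
        self._traces = []

    def record(self, trace):
        self._traces.append(trace)

    def for_entity(self, entity):
        return [t for t in self._traces if t.entity == entity]


store = TraceStore()
store.record(DecisionTrace(
    entity="opportunity:acme",
    change="stage: negotiation -> closed-won",
    decided_by="vp-sales",
    rationale="Discount above policy approved to match a competing offer",
    decided_at=datetime(2026, 1, 15, tzinfo=timezone.utc),
    precedents=("trace:2025-0412",),
    state_snapshot={"discount_pct": 22, "policy_max_pct": 15},
))
traces = store.for_entity("opportunity:acme")
```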

Architecture

Three axes, not one ladder.

◆

Storage Axis

Living Memory (user/workspace) — Alexandria (cross-session) — Episodic Continuity (scratchpads + frames). Tells you where a fact lives.

◆

Interpretation Axis

Substrate (universal state) — Lens (DB-resident, RBAC, consumer-keyed) — Frame (DB-resident, transient, task-keyed). Tells you whose view.

◆

Action Axis

Typed state-changes — outcome → procedure feedback — intentional non-action. Tells you how it evolves.

Five-level ladders force a single dimension. Three-axis classification is what an architecture this rich actually needs. Every memory feature gets a coordinate on each axis — storage tells you where it lives, interpretation tells you whose view, action tells you how it evolves.
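The coordinate idea can be made concrete with three enums and a record that holds one value per axis. The enum members mirror the axis descriptions above; the example features and their placements are hypothetical and only show the shape of the classification.

```python
from dataclasses import dataclass
from enum import Enum


class Storage(Enum):
    LIVING_MEMORY = "living memory"
    ALEXANDRIA = "alexandria"
    EPISODIC = "episodic continuity"


class Interpretation(Enum):
    SUBSTRATE = "substrate"
    LENS = "lens"
    FRAME = "frame"


class Action(Enum):
    STATE_CHANGE = "typed state-change"
    OUTCOME_FEEDBACK = "outcome to procedure feedback"
    NON_ACTION = "intentional non-action"


@dataclass(frozen=True)
class Coordinate:
    storage: Storage
    interpretation: Interpretation
    action: Action


# Illustrative placements only; the feature names are invented for this sketch.
FEATURES = {
    "suppression note": Coordinate(Storage.EPISODIC, Interpretation.FRAME, Action.NON_ACTION),
    "session lesson": Coordinate(Storage.ALEXANDRIA, Interpretation.SUBSTRATE, Action.OUTCOME_FEEDBACK),
    "sales watch rule": Coordinate(Storage.LIVING_MEMORY, Interpretation.LENS, Action.STATE_CHANGE),
}


def features_on(axis, value):
    """Names of features whose coordinate on `axis` equals `value`."""
    return sorted(name for name, c in FEATURES.items() if getattr(c, axis) is value)


frame_features = features_on("interpretation", Interpretation.FRAME)
```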

2026 Vocabulary

Substrate. Lens. Frame.

view = render(substrate, lens, frame)

One substrate, many lenses, transient frames. Sentra named substrate and lens. Frame is the missing transient layer — without it, per-task plans, intent, and suppression state have nowhere to live.

Frame

Transient. Task-keyed. Holds intent, next-steps, footprint, suppression.

DB

Lens

Persistent. Consumer-keyed. RBAC capabilities + defaults + watched state-changes.

DB

Substrate

Universal. Entities + facts + state-changes + decision traces + interactions + bi-temporal validity.

Storage
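The formula view = render(substrate, lens, frame) can be read as two successive filters over shared state. The sketch below is one possible reading, assuming a lens reduces to a set of readable fact kinds and a frame reduces to a set of suppressed fact ids; the real layers hold more than this.

```python
from dataclasses import dataclass, field


@dataclass
class Lens:
    consumer: str
    allowed_kinds: set          # RBAC stand-in: fact kinds this consumer may read


@dataclass
class Frame:
    task: str
    intent: str
    suppressed: set = field(default_factory=set)  # fact ids already tried and rejected


def render(substrate, lens, frame):
    """Filter universal facts by the lens's permissions, then by the frame's suppression."""
    visible = [f for f in substrate if f["kind"] in lens.allowed_kinds]
    return [f for f in visible if f["id"] not in frame.suppressed]


substrate = [
    {"id": "f1", "kind": "deal", "text": "Acme renewal is due in March"},
    {"id": "f2", "kind": "incident", "text": "Checkout latency spiked on Tuesday"},
    {"id": "f3", "kind": "deal", "text": "Acme asked for a multi-year discount"},
]
sales = Lens(consumer="sales", allowed_kinds={"deal"})
on_call = Lens(consumer="on-call", allowed_kinds={"incident"})
frame = Frame(task="prepare-renewal-call", intent="draft talking points", suppressed={"f3"})

view = render(substrate, sales, frame)
```

Swapping `sales` for `on_call` changes the view without touching `substrate`, which is the sense in which one substrate serves many lenses.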

Why three, not two

Sentra ships substrate and lens. So does a well-built enterprise data platform. The transient layer is what's missing in every system we've evaluated: where does the current task's intent live? Where does "I tried this and rejected it" persist across sessions? Per-tool memory (Notion remembers Notion-stuff; Linear remembers Linear-stuff) accumulates substrate without ever building the frame. The frame is the durable position.

Lens packs are the productizable surface

The substrate generalizes. Lenses don't. A Sales lens, Success lens, Support lens, On-call lens, Exec lens — each is a packaged set of entity defaults, watched state-changes, and RBAC capabilities. Lens packs are what you sell to enterprise; substrate is what you preserve as the platform.

What's Different

Not storage. Metabolism.

Capability	Typical RAG / vector store	Living Memory + Alexandria
Retrieval	Embedding similarity, one-shot	Workflow-scoped, entity-anchored, multi-pass (berrypicking), token-budgeted
Learning	None. Re-embed on manual update.	Session-end reflection, digestive pipeline, lessons shared across agents
Forgetting	None. Accumulate forever.	Decay tiers, condensation, supersession, stale detection. Forgetting is architecture.
Connections	Implicit (cosine distance)	Typed, weighted, entity-anchored. Decisions link to assumptions link to evidence.
Quality	Whatever was embedded	Trust tiers, immune system (write-time validation), quality-gated claims
Presentation	Chunks returned by query	Claims installed as capabilities. [[wiki-links]] carry reasoning inline. Progressive disclosure.
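To show how forgetting can be a rule instead of a cleanup job, here is a sketch of decay tiers combined with supersession. The tier names, half-life values, exponential decay curve, and threshold are all assumptions chosen for the example, not the system's actual parameters.

```python
from dataclasses import dataclass
from typing import Optional

# Half-life in days per tier. Tier names and values are invented for this sketch.
HALF_LIFE_DAYS = {"core": 365.0, "working": 30.0, "ephemeral": 3.0}


@dataclass
class Claim:
    id: str
    text: str
    tier: str
    age_days: float
    superseded_by: Optional[str] = None


def strength(claim):
    """Exponential decay: strength halves once per tier half-life."""
    return 0.5 ** (claim.age_days / HALF_LIFE_DAYS[claim.tier])


def live_claims(claims, threshold=0.25):
    """Drop claims that were superseded or have decayed below the threshold."""
    return [
        c for c in claims
        if c.superseded_by is None and strength(c) >= threshold
    ]


claims = [
    Claim("c1", "Deploys go out on Fridays", "working", age_days=10, superseded_by="c2"),
    Claim("c2", "Deploys go out on Tuesdays", "working", age_days=5),
    Claim("c3", "Temporary flag enabled for the demo", "ephemeral", age_days=9),
    Claim("c4", "Primary database is PostgreSQL", "core", age_days=400),
]
kept = [c.id for c in live_claims(claims)]
```

The superseded claim and the decayed ephemeral claim both disappear from retrieval, while the older core claim survives because its tier decays slowly.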

Pressure-Tested

Validated against 30+ external sources.

We pressure-tested the architecture against LangChain, Engram, ALMA, MemGPT/Letta, Cognee, MAPLE, Anthropic's memory tool, a major 2026 academic survey, 8 RAG architecture patterns, the NLAH paper on harness engineering, Thoughtworks and OpenAI harness patterns, and the thought leadership of Karpathy, Cherny, and PAL. Two 2026 empirical papers also quantify Karpathy's file-system principle. Cao et al. (arXiv 2603.20432) show agents with filesystem-organized context outperforming SoTA by 17.3%. Lee et al. / Stanford (arXiv 2603.28052) show a gain of 7.7 points with 4× fewer tokens. What we found:

✓

Validated strengths

Hierarchical token-level memory (academic gold standard). Memory evolution (decay, condensation, supersession) ahead of the field. Propositional framing matches the "retrieve-then-generate" paradigm the industry is moving toward. Forgetting design is the most complete in any source evaluated. File-system-as-state: independently confirmed by two 2026 empirical papers (Cao et al. +17.3%; Lee et al. / Stanford +7.7pt) — the exact pattern this architecture already implements in .agent-data/, scratchpads, and decision traces.

◆

Gaps we're closing

Episodic memory (raw case records beneath lessons). Context pressure handling for long-running agents. Shared memory access control for multi-agent writes. Memory explainability and retrieval attribution. Event-triggered learning for critical failures.

◆

Borrowed from the field

Decay tiers (Engram). Memory strengthening (Cognee). Offline consolidation cycle (CLS theory). Context pressure events (Anthropic). Reflect-generate-verify (ALMA). M/L/P naming (MAPLE). Software 2.0 / LLM OS (Karpathy). Context engineering as discipline (Cherny). Code-as-reasoning (PAL). NLAH six-component harness decomposition (Pan et al.). Research→Plan→Implement workflow (HumanLayer).

Core Differentiators

Where we lead.

Substrate generalizes; lenses don't

One substrate, many lenses. Adding a Sales lens or On-call lens doesn't fork the underlying state. Per-tool memory in everyone else's stack does.

Frame is first-class

Transient task-binding with intent, next-steps, footprint, suppression. The layer Sentra and Glean don't ship.

State-change as a typed primitive

Every fact mutation emits before/after with consumer and frame annotation. Bi-temporal validity columns answer “what did the system believe on date X.”
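A sketch of a typed state-change with two time columns, and the "what did the system believe on date X" query built on them. The column names `valid_from` and `recorded_at` and the lookup rule are assumptions for illustration; the source only states that bi-temporal validity columns exist.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class StateChange:
    entity: str
    attribute: str
    before: Optional[str]
    after: str
    consumer: str        # who or what made the change
    frame: str           # task the change was made under
    valid_from: date     # when the new value became true in the world
    recorded_at: date    # when the system learned about it


def believed_on(changes, entity, attribute, as_of):
    """Value the system held on `as_of`, using only changes it had recorded by then."""
    known = [
        c for c in changes
        if c.entity == entity and c.attribute == attribute
        and c.recorded_at <= as_of and c.valid_from <= as_of
    ]
    if not known:
        return None
    return max(known, key=lambda c: (c.valid_from, c.recorded_at)).after


changes = [
    StateChange("opportunity:acme", "stage", None, "negotiation",
                "sales-agent", "qualify-lead", date(2026, 1, 5), date(2026, 1, 5)),
    StateChange("opportunity:acme", "stage", "negotiation", "closed-won",
                "sales-agent", "close-deal", date(2026, 1, 20), date(2026, 2, 1)),
]

belief_jan = believed_on(changes, "opportunity:acme", "stage", date(2026, 1, 25))
belief_feb = believed_on(changes, "opportunity:acme", "stage", date(2026, 2, 2))
```

The deal closed on January 20 but was only recorded on February 1, so a query as of January 25 still returns the earlier stage. That gap between world time and record time is what a single timestamp cannot express.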

Memory fragmentation is the moat

Notion remembers Notion-stuff. Linear remembers Linear-stuff. Cursor remembers Cursor-stuff. Glean searches across them but doesn't preserve state-change. The substrate that remembers across systems is the durable position.

Forgetting + intentional non-action

Decay, condensation, supersession at write time; trigger-layer act/wait/escalate/no-op decisions at action time. Restraint is structural, not silence.
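The action-time half of this can be sketched as a function that always returns an explicit decision with a reason, including when the decision is to do nothing. The inputs, thresholds, and rule ordering below are hypothetical; only the four outcomes come from the text above.

```python
from enum import Enum


class Decision(Enum):
    ACT = "act"
    WAIT = "wait"
    ESCALATE = "escalate"
    NO_OP = "no-op"


def decide(confidence, reversible, already_suppressed,
           act_threshold=0.8, wait_threshold=0.5):
    """Return (decision, reason). Thresholds are illustrative defaults."""
    if already_suppressed:
        return Decision.NO_OP, "frame marks this action as previously rejected"
    if confidence >= act_threshold:
        if reversible:
            return Decision.ACT, "high confidence and reversible"
        return Decision.ESCALATE, "high confidence but irreversible, needs a human"
    if confidence >= wait_threshold:
        return Decision.WAIT, "not enough signal yet"
    return Decision.NO_OP, "confidence too low to justify action"


decision, reason = decide(confidence=0.9, reversible=False, already_suppressed=False)
```

Because a no-op comes back with a reason attached, it can be logged and audited like any other decision, which is one way to read "restraint is structural, not silence."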

Personal-scale dogfood, enterprise-shaped

Solo dogfooding beats enterprise vaporware. Years of substrate accumulation under load are the credibility artifact. Lens-pack productization remains option value.

The Landscape

The industry is converging on what we're building.

RAG added graphs. Graphs added agents. Agent memory added learning loops and forgetting. Sentra named substrate and lens. We added frame. The 2026 architecture is one substrate, many lenses, transient frames — view = render(substrate, lens, frame). Validated across two platforms: Forge (an agentic project-orchestration platform on TypeScript / SQLite) and Strata (an autonomous job-search OS on Python / PostgreSQL + pgvector).

Memory fragmentation is the moat. The substrate that remembers across systems is the durable position.