How agent memory, context engineering, and knowledge graphs are converging in 2026, and where Living Memory and Alexandria fit.
Agents with skills and tool access can do remarkable work. But without shared memory, every session starts from zero. No lessons from last time. No precedent. No pattern recognition across sessions or across agents.
Memory is the multiplier; agency is the force. The question is: what kind of memory, and how does it compound?
Each step addresses a limitation of the last. RAG gave agents knowledge. Agentic RAG gave them judgment about retrieval. AI Memory gives them the ability to learn, remember, and forget.
The industry consensus in 2026: the bottleneck isn't the model, it's the context the model receives. "Context engineering" is the discipline of assembling the right knowledge, at the right moment, at the right depth, for the right agent.
This is more than prompt design. It spans retrieval, filtering, scoping, progressive disclosure, token budgets, and deciding what to exclude. The model is only as good as what you put in the window.
Our entire architecture is a context engineering system. getContext, retrieval filters, workflow-scoped precedent, token budgets, progressive disclosure: these are all context engineering.
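To make those pieces concrete, here is a minimal sketch of token-budgeted context assembly with a workflow filter and progressive disclosure. It is not the actual `getContext` implementation: the `Claim` shape, the `get_context` signature, the `relevance` score, and the word-count token estimate are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """A retrievable unit of memory (hypothetical shape)."""
    title: str        # short form, cheap to include
    body: str         # full form, included only if the budget allows
    workflow: str     # scope used by the retrieval filter
    relevance: float  # assumed to come from an upstream ranker


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def get_context(claims: list[Claim], workflow: str, budget: int) -> list[str]:
    """Assemble context for one workflow without exceeding the token budget."""
    # Scoping: drop everything outside the current workflow.
    scoped = [c for c in claims if c.workflow == workflow]
    scoped.sort(key=lambda c: c.relevance, reverse=True)

    selected: list[str] = []
    used = 0
    for claim in scoped:
        full = f"{claim.title}: {claim.body}"
        # Progressive disclosure: prefer the full claim, fall back to the title,
        # and exclude the claim entirely if neither fits.
        for candidate in (full, claim.title):
            cost = estimate_tokens(candidate)
            if used + cost <= budget:
                selected.append(candidate)
                used += cost
                break
    return selected
```

With a budget of 10 "tokens", the highest-ranked claim is included in full and the next one is reduced to its title, while claims from other workflows never enter the window. Deciding what to exclude is as much a part of the function as deciding what to include.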
Every sophisticated RAG pattern either adds graph structure (Graph RAG, Hybrid RAG) or adds agentic reasoning to compensate for search limitations (Adaptive RAG, Agentic RAG). Each escalation is a workaround for something curated graph traversal handles natively.
Graphs provide two things embeddings can't: typed relationships (decisions link to assumptions, assumptions link to evidence) and structural discovery (the third hop takes you to territory you didn't aim for). Search can't escape the query neighborhood. Graphs can.
| | Vector search | Knowledge graph |
|---|---|---|
| Finds | Text similar to query | Entities connected to understanding |
| Discovery | Locked to query neighborhood | Multi-hop traversal across topics |
| Relationships | Implicit (cosine distance) | Explicit, typed, directional |
| Maintenance | Re-embed on change | Update claims, prune, challenge |
| Gives the agent | Related text | Reasoning structure |
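
The contrast in the table can be illustrated with a small traversal over typed, directional edges. This is a sketch under stated assumptions: the node names, relation names, and the `traverse` helper are hypothetical and do not come from Alexandria's schema.

```python
from collections import deque

# Hypothetical typed, directional edges: (source, relation, target).
EDGES = [
    ("decision:use-sqlite", "rests_on", "assumption:single-writer"),
    ("assumption:single-writer", "supported_by", "evidence:load-test-q3"),
    ("evidence:load-test-q3", "produced_by", "workflow:capacity-planning"),
]


def traverse(start: str, max_hops: int) -> list[tuple[int, str, str]]:
    """Breadth-first walk recording (hop, relation, node) for each node reached."""
    adjacency: dict[str, list[tuple[str, str]]] = {}
    for source, relation, target in EDGES:
        adjacency.setdefault(source, []).append((relation, target))

    seen = {start}
    queue = deque([(start, 0)])
    reached: list[tuple[int, str, str]] = []
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for relation, target in adjacency.get(node, []):
            if target not in seen:
                seen.add(target)
                reached.append((hops + 1, relation, target))
                queue.append((target, hops + 1))
    return reached
```

Starting from the decision, the third hop lands on a capacity-planning workflow that shares no vocabulary with the original query. Each step also carries the relation that justified it, which is the "reasoning structure" the last table row refers to.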
Foundation Capital calls context graphs "AI's trillion-dollar opportunity." The argument: systems of record capture what happened. Context graphs capture why, who, when, and what precedent. Decision traces, not just state changes.
The gap isn't missing data. It's missing decision traces. Agents run into the same ambiguity humans resolve with judgment and organizational memory. But the inputs to those judgments aren't stored as durable artifacts. Context graphs make them durable, queryable, and shared.
Storage axis: Living Memory (user/workspace), Alexandria (cross-session), Episodic Continuity (scratchpads + frames). Tells you where a fact lives.
Interpretation axis: Substrate (universal state), Lens (DB-resident, RBAC, consumer-keyed), Frame (DB-resident, transient, task-keyed). Tells you whose view.
Action axis: typed state-changes, outcome → procedure feedback, intentional non-action. Tells you how it evolves.
Five-level ladders force a single dimension. Three-axis classification is what an architecture this rich actually needs. Every memory feature gets a coordinate on each axis — storage tells you where it lives, interpretation tells you whose view, action tells you how it evolves.
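One way to picture a three-axis coordinate is as a small record with one enumerated value per axis. The sketch below uses the axis values named above; the class names, the `describe` helper, and the placement of the example feature are assumptions for illustration only.

```python
from dataclasses import dataclass
from enum import Enum


class Storage(Enum):
    """Where a fact lives."""
    LIVING_MEMORY = "user/workspace"
    ALEXANDRIA = "cross-session"
    EPISODIC_CONTINUITY = "scratchpads + frames"


class Interpretation(Enum):
    """Whose view it is."""
    SUBSTRATE = "universal state"
    LENS = "consumer-keyed"
    FRAME = "task-keyed, transient"


class Action(Enum):
    """How it evolves."""
    STATE_CHANGE = "typed state-change"
    OUTCOME_FEEDBACK = "outcome -> procedure feedback"
    NON_ACTION = "intentional non-action"


@dataclass(frozen=True)
class MemoryCoordinate:
    storage: Storage
    interpretation: Interpretation
    action: Action

    def describe(self) -> str:
        return (
            f"lives in {self.storage.name}, "
            f"viewed as {self.interpretation.name}, "
            f"evolves by {self.action.name}"
        )


# Hypothetical placement of one feature; the assignment is illustrative only.
scratchpad_note = MemoryCoordinate(
    Storage.EPISODIC_CONTINUITY, Interpretation.FRAME, Action.NON_ACTION
)
```

Three values on each of three axes give 27 distinct coordinates, where a five-level ladder can express only five positions.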
view = render(substrate, lens, frame)
One substrate, many lenses, transient frames. Sentra named substrate and lens. Frame is the missing transient layer — without it, per-task plans, intent, and suppression state have nowhere to live.
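A minimal sketch of `render` as a pure function follows, assuming a lens reduces to a set of visible fact kinds and a frame carries the task's intent plus a suppression set. The field names are hypothetical; the real lens and frame are described as DB-resident, which this in-memory example does not model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    entity: str
    kind: str
    text: str


@dataclass(frozen=True)
class Lens:
    """Consumer-keyed view definition (hypothetical fields)."""
    consumer: str
    visible_kinds: frozenset[str]


@dataclass(frozen=True)
class Frame:
    """Task-keyed transient state (hypothetical fields)."""
    task: str
    intent: str
    suppressed: frozenset[str] = frozenset()  # entities already tried and rejected


def render(substrate: list[Fact], lens: Lens, frame: Frame) -> list[Fact]:
    """The same substrate yields a different view per lens and per frame."""
    return [
        fact
        for fact in substrate
        if fact.kind in lens.visible_kinds       # the lens decides whose view
        and fact.entity not in frame.suppressed  # the frame carries "tried and rejected"
    ]
```

The substrate is never mutated: swapping the lens changes which kinds of facts are visible, and swapping the frame changes what the current task has already ruled out.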
Sentra ships substrate and lens. So does a well-built enterprise data platform. The transient layer is what's missing in every system we've evaluated: where does the current task's intent live? Where does "I tried this and rejected it" persist across sessions? Per-tool memory (Notion remembers Notion-stuff; Linear remembers Linear-stuff) accumulates substrate without ever building the frame. The frame is the durable position.
The substrate generalizes. Lenses don't. A Sales lens, Success lens, Support lens, On-call lens, Exec lens — each is a packaged set of entity defaults, watched state-changes, and RBAC capabilities. Lens packs are what you sell to enterprise; substrate is what you preserve as the platform.
| Capability | Typical RAG / vector store | Living Memory + Alexandria |
|---|---|---|
| Retrieval | Embedding similarity, one-shot | Workflow-scoped, entity-anchored, multi-pass (berrypicking), token-budgeted |
| Learning | None. Re-embed on manual update. | Session-end reflection, digestive pipeline, lessons shared across agents |
| Forgetting | None. Accumulate forever. | Decay tiers, condensation, supersession, stale detection. Forgetting is architecture. |
| Connections | Implicit (cosine distance) | Typed, weighted, entity-anchored. Decisions link to assumptions link to evidence. |
| Quality | Whatever was embedded | Trust tiers, immune system (write-time validation), quality-gated claims |
| Presentation | Chunks returned by query | Claims installed as capabilities. [[wiki-links]] carry reasoning inline. Progressive disclosure. |
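
The "Forgetting" row can be sketched as a retention score that combines decay tiers with supersession. The tier names, half-life values, exponential decay formula, and threshold below are assumptions chosen for illustration; the source does not specify how decay is computed.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical half-lives per decay tier, in days.
HALF_LIFE_DAYS = {"volatile": 7.0, "standard": 90.0, "durable": 365.0}


@dataclass(frozen=True)
class StoredClaim:
    claim_id: str
    tier: str
    age_days: float
    superseded_by: Optional[str] = None  # id of the claim that replaced this one


def retention(claim: StoredClaim) -> float:
    """Score in [0, 1]; superseded claims are forgotten regardless of age."""
    if claim.superseded_by is not None:
        return 0.0
    # Assumed model: the score halves once per tier half-life.
    return 0.5 ** (claim.age_days / HALF_LIFE_DAYS[claim.tier])


def sweep(claims: list[StoredClaim], threshold: float = 0.25):
    """Split claims into those to keep and those flagged as stale."""
    keep: list[StoredClaim] = []
    stale: list[StoredClaim] = []
    for claim in claims:
        (keep if retention(claim) >= threshold else stale).append(claim)
    return keep, stale
```

In this sketch stale claims are only flagged, not deleted, so a later condensation step could still summarize them before removal.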
We pressure-tested the architecture against LangChain, Engram, ALMA, MemGPT/Letta, Cognee, MAPLE, Anthropic's memory tool, a major 2026 academic survey, 8 RAG architecture patterns, the NLAH paper on harness engineering, Thoughtworks and OpenAI harness patterns, and the thought leadership of Karpathy, Cherny, and PAL. Two 2026 empirical papers additionally quantify Karpathy's file-system principle: Cao et al. (arXiv 2603.20432) show agents with filesystem-organized context outperforming SoTA by +17.3%; Lee et al. / Stanford (arXiv 2603.28052) show +7.7 points with 4× fewer tokens. What we found:
Hierarchical token-level memory (academic gold standard). Memory evolution (decay, condensation, supersession) ahead of the field. Propositional framing matches the "retrieve-then-generate" paradigm the industry is moving toward. Forgetting design is the most complete in any source evaluated. File-system-as-state: independently confirmed by two 2026 empirical papers (Cao et al., +17.3%; Lee et al. / Stanford, +7.7 points), and already implemented in this architecture through .agent-data/, scratchpads, and decision traces.
Episodic memory (raw case records beneath lessons). Context pressure handling for long-running agents. Shared memory access control for multi-agent writes. Memory explainability and retrieval attribution. Event-triggered learning for critical failures.
Decay tiers (Engram). Memory strengthening (Cognee). Offline consolidation cycle (CLS theory). Context pressure events (Anthropic). Reflect-generate-verify (ALMA). M/L/P naming (MAPLE). Software 2.0 / LLM OS (Karpathy). Context engineering as discipline (Cherny). Code-as-reasoning (PAL). NLAH six-component harness decomposition (Pan et al.). Research→Plan→Implement workflow (HumanLayer).
RAG added graphs. Graphs added agents. Agent memory added learning loops and forgetting. Sentra named substrate and lens. We added frame. The 2026 architecture is one substrate, many lenses, transient frames — view = render(substrate, lens, frame). Validated across two platforms: Forge (an agentic project-orchestration platform on TypeScript / SQLite) and Strata (an autonomous job-search OS on Python / PostgreSQL + pgvector).
Memory fragmentation is the moat. The substrate that remembers across systems is the durable position.