Episodic Memory Retrieval & Injection

Problem

Stateless request handling causes agents to repeatedly rediscover decisions, constraints, and prior failures. Over multi-session workflows this leads to redundant work, inconsistent behavior, and shallow planning because each turn lacks durable historical context.

Solution

Add a vector-backed episodic memory store:

After every episode, write a short "memory blob" (event, outcome, rationale) to the DB.
On new tasks, embed the prompt, retrieve top-k similar memories, and inject as hints in the context.
Apply TTL or decay scoring to prune stale memories.

Design memory writes as structured records (decision, evidence, outcome, confidence) rather than raw transcripts. Structured memory reduces repetitive outputs and improves reasoning (ParamMem 2026). At retrieval time, filter by task scope and recency so injected memories improve reasoning quality instead of introducing retrieval noise. Episodic memory with self-reflection achieved 91% pass@1 on HumanEval vs 80% baseline (Reflexion, NeurIPS 2023).

How to use it

Use this in multi-session coding agents, support copilots, and long-running research workflows.
Start with a small top-k and strict metadata filters (task, repo, owner, timestamp).
Add memory quality review jobs to remove low-value or contradictory memories.
Track whether retrieved memories improved outcomes versus baseline.

Trade-offs

Pros: richer continuity, fewer repeated mistakes.
Cons: retrieval noise if memories aren't curated; storage cost.

References

Reflexion (Shinn et al., NeurIPS 2023): https://arxiv.org/abs/2303.11366
ParamMem (Yao et al., 2026): https://arxiv.org/abs/2602.23320v1
MemGPT (Packer et al., UC Berkeley 2023): https://arxiv.org/abs/2310.08560
Cursor "10x-MCP" persistent memory layer
Windsurf Memories docs
Primary source: https://forum.cursor.com/t/agentic-memory-management-for-cursor/78021