syndarix/backend/docs/MEMORY_SYSTEM.md
Felipe Cardoso e3fe0439fd docs(memory): add comprehensive memory system documentation (#101)
Add complete documentation for the Agent Memory System including:
- Architecture overview with ASCII diagram
- Memory type descriptions (working, episodic, semantic, procedural)
- Usage examples for all memory operations
- Memory scoping hierarchy explanation
- Consolidation flow documentation
- MCP tools reference
- Reflection capabilities
- Configuration reference table
- Integration with Context Engine
- Metrics reference
- Performance targets
- Troubleshooting guide
- Directory structure

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-05 11:03:57 +01:00


Agent Memory System

Comprehensive multi-tier cognitive memory for AI agents, enabling state persistence, experiential learning, and context continuity across sessions.

Overview

The Agent Memory System implements a cognitive architecture inspired by human memory:

+------------------------------------------------------------------+
|                      Agent Memory System                          |
+------------------------------------------------------------------+
|                                                                   |
|  +------------------+                    +------------------+     |
|  | Working Memory   |----consolidate---->| Episodic Memory  |     |
|  | (Redis/In-Mem)   |                    | (PostgreSQL)     |     |
|  |                  |                    |                  |     |
|  | - Current task   |                    | - Past sessions  |     |
|  | - Variables      |                    | - Experiences    |     |
|  | - Scratchpad     |                    | - Outcomes       |     |
|  +------------------+                    +--------+---------+     |
|                                                   |               |
|                                          extract  |               |
|                                                   v               |
|  +------------------+                    +------------------+     |
|  |Procedural Memory |<-----learn from----| Semantic Memory  |     |
|  | (PostgreSQL)     |                    | (PostgreSQL +    |     |
|  |                  |                    |    pgvector)     |     |
|  | - Procedures     |                    |                  |     |
|  | - Skills         |                    | - Facts          |     |
|  | - Patterns       |                    | - Entities       |     |
|  +------------------+                    | - Relationships  |     |
|                                          +------------------+     |
+------------------------------------------------------------------+

Memory Types

Working Memory

Short-term, session-scoped memory for current task state.

Features:

  • Key-value storage with TTL
  • Task state tracking
  • Scratchpad for reasoning
  • Checkpoint/restore support
  • Redis primary with in-memory fallback

Usage:

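from app.services.memory.working import WorkingMemory
from app.services.memory.types import TaskState  # import path assumed; types.py holds the core types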

memory = WorkingMemory(scope_context)
await memory.set("key", {"data": "value"}, ttl_seconds=3600)
value = await memory.get("key")

# Task state
await memory.set_task_state(TaskState(task_id="t1", status="running"))
state = await memory.get_task_state()

# Checkpoints
checkpoint_id = await memory.create_checkpoint()
await memory.restore_checkpoint(checkpoint_id)
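
The TTL behavior listed above can be illustrated with a small in-memory store, roughly what an in-memory fallback might do when Redis is unavailable. This is a minimal sketch under assumptions: `InMemoryTTLStore` and its injectable clock are hypothetical names for illustration, not the project's storage class.

```python
import time
from typing import Any, Callable, Optional


class InMemoryTTLStore:
    """Hypothetical in-memory key-value store with per-key expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        # The clock is injectable so expiry can be tested without sleeping.
        self._clock = clock
        self._data: dict[str, tuple[Any, Optional[float]]] = {}

    def set(self, key: str, value: Any, ttl_seconds: Optional[float] = None) -> None:
        # A missing TTL means the entry never expires.
        expires_at = None if ttl_seconds is None else self._clock() + ttl_seconds
        self._data[key] = (value, expires_at)

    def get(self, key: str, default: Any = None) -> Any:
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            # Expired entries are removed lazily, on read.
            del self._data[key]
            return default
        return value
```

Lazy expiry keeps the sketch short; a long-running process would also need a periodic sweep so that keys which are never read again do not accumulate.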

Episodic Memory

Experiential records of past agent actions and outcomes.

Features:

  • Records task completions and failures
  • Semantic similarity search (pgvector)
  • Temporal and outcome-based retrieval
  • Importance scoring
  • Episode summarization

Usage:

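from app.services.memory.episodic import EpisodicMemory
from app.services.memory.types import EpisodeCreate, Outcome  # import path assumed; types.py holds the core types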

memory = EpisodicMemory(session, embedder)

# Record an episode
episode = await memory.record_episode(
    project_id=project_id,
    episode=EpisodeCreate(
        task_type="code_review",
        task_description="Review PR #42",
        outcome=Outcome.SUCCESS,
        actions=[{"type": "analyze", "target": "src/"}],
    )
)

# Search similar experiences
similar = await memory.search_similar(
    project_id=project_id,
    query="debugging memory leak",
    limit=5
)

# Get recent episodes
recent = await memory.get_recent(project_id, limit=10)
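
To make the similarity search concrete, the sketch below ranks candidate embeddings by cosine similarity in plain Python and drops anything under a minimum score. In the real system this work is presumably done inside PostgreSQL by pgvector; the function names and the 0.5 default threshold (mirroring `MEM_RETRIEVAL_MIN_SIMILARITY`) are assumptions for illustration.

```python
import math
from typing import Mapping, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float],
    candidates: Mapping[str, Sequence[float]],
    limit: int = 5,
    min_similarity: float = 0.5,
) -> list[tuple[str, float]]:
    """Return up to `limit` (id, score) pairs, best first, above the threshold."""
    scored = [(item_id, cosine_similarity(query, vec)) for item_id, vec in candidates.items()]
    kept = [pair for pair in scored if pair[1] >= min_similarity]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:limit]
```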

Semantic Memory

Learned facts and knowledge with confidence scoring.

Features:

  • Triple format (subject, predicate, object)
  • Confidence scoring with decay
  • Fact extraction from episodes
  • Conflict resolution
  • Entity-based retrieval

Usage:

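from app.services.memory.semantic import SemanticMemory
from app.services.memory.types import FactCreate  # import path assumed; types.py holds the core types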

memory = SemanticMemory(session, embedder)

# Store a fact
fact = await memory.store_fact(
    project_id=project_id,
    fact=FactCreate(
        subject="UserService",
        predicate="handles",
        object="authentication",
        confidence=0.9,
    )
)

# Search facts
facts = await memory.search_facts(project_id, "authentication flow")

# Reinforce on repeated learning
await memory.reinforce_fact(fact.id)
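
Confidence decay can be read as exponential decay with a half-life, which is how the configuration table describes `MEM_SEMANTIC_CONFIDENCE_DECAY_DAYS`. The sketch below works through that arithmetic. The exact formula and the reinforcement rule are assumptions for illustration; the source does not specify either.

```python
def decayed_confidence(confidence: float, age_days: float, half_life_days: float = 90.0) -> float:
    """Halve the confidence for every `half_life_days` that have elapsed."""
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    return confidence * 0.5 ** (age_days / half_life_days)


def reinforce(confidence: float, boost: float = 0.1) -> float:
    """Hypothetical reinforcement: close a fixed fraction of the gap to 1.0."""
    return confidence + (1.0 - confidence) * boost
```

With the default 90-day half-life, a fact stored at confidence 0.9 falls to 0.45 after 90 days and to 0.225 after 180 days if it is never reinforced.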

Procedural Memory

Learned skills and procedures from successful patterns.

Features:

  • Procedure recording from task patterns
  • Trigger-based matching
  • Success rate tracking
  • Procedure suggestions
  • Step-by-step storage

Usage:

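from app.services.memory.procedural import ProceduralMemory
from app.services.memory.types import ProcedureCreate, Step  # import path assumed; types.py holds the core types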

memory = ProceduralMemory(session, embedder)

# Record a procedure
procedure = await memory.record_procedure(
    project_id=project_id,
    procedure=ProcedureCreate(
        name="PR Review Process",
        trigger_pattern="code review requested",
        steps=[
            Step(action="fetch_diff"),
            Step(action="analyze_changes"),
            Step(action="check_tests"),
        ]
    )
)

# Find matching procedures
matches = await memory.find_matching(project_id, "need to review code")

# Record outcomes
await memory.record_outcome(procedure.id, success=True)
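
Success rate tracking amounts to keeping two counters per procedure. The sketch below shows one way to do that; `ProcedureStats` is a hypothetical name, and treating a procedure with no recorded outcomes as having a rate of 0.0 is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ProcedureStats:
    """Hypothetical per-procedure outcome counters."""

    success_count: int = 0
    failure_count: int = 0

    def record_outcome(self, success: bool) -> None:
        if success:
            self.success_count += 1
        else:
            self.failure_count += 1

    @property
    def success_rate(self) -> float:
        total = self.success_count + self.failure_count
        # Avoid dividing by zero for procedures that have never run.
        return self.success_count / total if total else 0.0
```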

Memory Scoping

Memory is organized in a hierarchical scope structure:

Global Memory (shared by all)
└── Project Memory (per project)
    └── Agent Type Memory (per agent type)
        └── Agent Instance Memory (per instance)
            └── Session Memory (ephemeral)

Usage:

from app.services.memory.scoping import ScopeContext, ScopeManager  # ScopeContext export assumed

manager = ScopeManager(session)

# Get scoped memories with inheritance
memories = await manager.get_scoped_memories(
    context=ScopeContext(
        project_id=project_id,
        agent_type_id=agent_type_id,
        agent_instance_id=agent_instance_id,
        session_id=session_id,
    ),
    include_inherited=True,  # Include parent scopes
)
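
Inheritance can be pictured as walking the hierarchy from the most specific scope up to global. The sketch below builds that lookup order from a context. `ScopeSketch` and `scope_chain` are hypothetical names, and the rule of skipping levels whose identifier is missing is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScopeSketch:
    """Hypothetical stand-in for a scope context."""

    project_id: Optional[str] = None
    agent_type_id: Optional[str] = None
    agent_instance_id: Optional[str] = None
    session_id: Optional[str] = None


def scope_chain(ctx: ScopeSketch, include_inherited: bool = True) -> list[tuple[str, Optional[str]]]:
    """List (level, id) pairs from the most specific scope to global."""
    levels = [
        ("session", ctx.session_id),
        ("agent_instance", ctx.agent_instance_id),
        ("agent_type", ctx.agent_type_id),
        ("project", ctx.project_id),
        ("global", None),
    ]
    # Keep levels the context identifies; global applies to everyone.
    present = [(name, ident) for name, ident in levels if ident is not None or name == "global"]
    # Without inheritance, only the most specific scope is consulted.
    return present if include_inherited else present[:1]
```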

Memory Consolidation

Automatic background processes transfer and extract knowledge:

Working Memory ──> Episodic Memory ──> Semantic Memory
                                   └──> Procedural Memory

Consolidation Types:

  • working_to_episodic: Transfer session state to episodes (on session end)
  • episodic_to_semantic: Extract facts from experiences
  • episodic_to_procedural: Learn procedures from patterns
  • prune: Remove low-value memories

Celery Tasks:

from app.tasks.memory_consolidation import (
    consolidate_session,
    run_nightly_consolidation,
    prune_old_memories,
)

# Manual consolidation
consolidate_session.delay(session_id)

# Normally triggered by the nightly schedule (3 AM by default);
# calling .delay() enqueues a run immediately
run_nightly_consolidation.delay()
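
The prune step can be sketched as a two-stage selection: drop episodes past the retention period, then trim the least important ones if the project is still over its limit. Using the importance score to define "low-value" is an assumption, and `EpisodeRecord` and `select_for_pruning` are hypothetical names; the defaults mirror `MEM_EPISODIC_RETENTION_DAYS` and `MEM_EPISODIC_MAX_EPISODES_PER_PROJECT`.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class EpisodeRecord:
    """Hypothetical minimal view of a stored episode."""

    episode_id: str
    age_days: float
    importance: float


def select_for_pruning(
    episodes: Iterable[EpisodeRecord],
    retention_days: float = 365,
    max_episodes: int = 10000,
) -> list[str]:
    """Return the ids of episodes to delete, sorted for stable output."""
    episodes = list(episodes)
    expired = [e for e in episodes if e.age_days > retention_days]
    kept = [e for e in episodes if e.age_days <= retention_days]
    # If still over capacity, drop the least important episodes first.
    kept.sort(key=lambda e: e.importance, reverse=True)
    overflow = kept[max_episodes:]
    return sorted(e.episode_id for e in expired + overflow)
```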

Memory Retrieval

Hybrid Retrieval

Combine multiple retrieval strategies:

from app.services.memory.indexing import RetrievalEngine

engine = RetrievalEngine(session, embedder)

# Hybrid search across memory types
results = await engine.retrieve_hybrid(
    project_id=project_id,
    query="authentication error handling",
    memory_types=["episodic", "semantic", "procedural"],
    filters={"outcome": "success"},
    limit=10,
)
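
The source does not say how results from the different strategies are merged. One common option is reciprocal rank fusion (RRF), sketched below as a stand-in rather than as the engine's actual method: each item scores `1 / (k + rank)` in every ranking it appears in, and the scores are summed. The function name and `k = 60` are assumptions.

```python
from collections import defaultdict
from typing import Mapping, Sequence


def reciprocal_rank_fusion(
    rankings: Mapping[str, Sequence[str]],
    k: int = 60,
    limit: int = 10,
) -> list[tuple[str, float]]:
    """Fuse several ranked id lists into one list of (id, score), best first."""
    scores: dict[str, float] = defaultdict(float)
    for ranked_ids in rankings.values():
        for rank, item_id in enumerate(ranked_ids, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by id for determinism.
    ordered = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return ordered[:limit]
```

RRF only needs rank positions, so it sidesteps the problem that a vector similarity score and a recency score are not on comparable scales.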

Index Types

  • Vector Index: Semantic similarity (HNSW/pgvector)
  • Temporal Index: Time-based retrieval
  • Entity Index: Entity mention lookup
  • Outcome Index: Success/failure filtering

MCP Tools

The memory system exposes MCP tools for agent use:

remember

Store information in memory.

{
  "memory_type": "working",
  "content": {"key": "value"},
  "importance": 0.8,
  "ttl_seconds": 3600
}

recall

Retrieve from memory.

{
  "query": "authentication patterns",
  "memory_types": ["episodic", "semantic"],
  "limit": 10,
  "filters": {"outcome": "success"}
}

forget

Remove from memory.

{
  "memory_type": "working",
  "key": "temp_data"
}

reflect

Analyze memory patterns.

{
  "analysis_type": "success_factors",
  "task_type": "code_review",
  "time_range_days": 30
}

get_memory_stats

Get memory usage statistics.

record_outcome

Record task success/failure for learning.
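
A tool handler would typically validate arguments before touching storage. The sketch below checks a `remember` payload shaped like the example above. The allowed ranges (importance in [0, 1], positive integer TTL) and the function name are assumptions, not the tool's documented schema.

```python
from typing import Any, Mapping

# The four memory types described in this document.
MEMORY_TYPES = {"working", "episodic", "semantic", "procedural"}


def validate_remember_args(args: Mapping[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the payload looks valid."""
    errors: list[str] = []
    if args.get("memory_type") not in MEMORY_TYPES:
        errors.append("memory_type must be one of: " + ", ".join(sorted(MEMORY_TYPES)))
    if "content" not in args:
        errors.append("content is required")
    importance = args.get("importance")
    if importance is not None:
        is_number = isinstance(importance, (int, float)) and not isinstance(importance, bool)
        if not is_number or not 0.0 <= importance <= 1.0:
            errors.append("importance must be a number between 0 and 1")
    ttl = args.get("ttl_seconds")
    if ttl is not None:
        is_int = isinstance(ttl, int) and not isinstance(ttl, bool)
        if not is_int or ttl <= 0:
            errors.append("ttl_seconds must be a positive integer")
    return errors
```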

Memory Reflection

Analyze patterns and generate insights from memory:

from app.services.memory.reflection import MemoryReflection, TimeRange

reflection = MemoryReflection(session)

# Detect patterns
patterns = await reflection.analyze_patterns(
    project_id=project_id,
    time_range=TimeRange.last_days(30),
)

# Identify success factors
factors = await reflection.identify_success_factors(
    project_id=project_id,
    task_type="code_review",
)

# Detect anomalies
anomalies = await reflection.detect_anomalies(
    project_id=project_id,
    baseline_days=30,
)

# Generate insights
insights = await reflection.generate_insights(project_id)

# Comprehensive reflection
result = await reflection.reflect(project_id)
print(result.summary)

Configuration

All settings use the MEM_ environment variable prefix:

| Variable | Default | Description |
| --- | --- | --- |
| MEM_WORKING_MEMORY_BACKEND | redis | Backend: redis or memory |
| MEM_WORKING_MEMORY_DEFAULT_TTL_SECONDS | 3600 | Default TTL (1 hour) |
| MEM_REDIS_URL | redis://localhost:6379/0 | Redis connection URL |
| MEM_EPISODIC_MAX_EPISODES_PER_PROJECT | 10000 | Max episodes per project |
| MEM_EPISODIC_RETENTION_DAYS | 365 | Episode retention period |
| MEM_SEMANTIC_MAX_FACTS_PER_PROJECT | 50000 | Max facts per project |
| MEM_SEMANTIC_CONFIDENCE_DECAY_DAYS | 90 | Confidence half-life |
| MEM_EMBEDDING_MODEL | text-embedding-3-small | Embedding model |
| MEM_EMBEDDING_DIMENSIONS | 1536 | Vector dimensions |
| MEM_RETRIEVAL_MIN_SIMILARITY | 0.5 | Minimum similarity score |
| MEM_CONSOLIDATION_ENABLED | true | Enable auto-consolidation |
| MEM_CONSOLIDATION_SCHEDULE_CRON | 0 3 * * * | Nightly schedule |
| MEM_CACHE_ENABLED | true | Enable retrieval caching |
| MEM_CACHE_TTL_SECONDS | 300 | Cache TTL (5 minutes) |

See app/services/memory/config.py for complete configuration options.
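
The prefix convention can be illustrated with a small loader that reads a few of the variables above and applies the documented defaults. This is a standard-library sketch only: `MemoryEnvConfig` and `load_config` are hypothetical names, and the project's own `MemorySettings` class may be implemented quite differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class MemoryEnvConfig:
    """Hypothetical subset of the memory settings."""

    working_memory_backend: str
    working_memory_default_ttl_seconds: int
    redis_url: str
    retrieval_min_similarity: float
    cache_enabled: bool


def _as_bool(raw: str) -> bool:
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_config(env: Mapping[str, str] = os.environ, prefix: str = "MEM_") -> MemoryEnvConfig:
    """Read prefixed variables from `env`, falling back to documented defaults."""

    def get(name: str, default: str) -> str:
        return env.get(prefix + name, default)

    backend = get("WORKING_MEMORY_BACKEND", "redis")
    if backend not in {"redis", "memory"}:
        raise ValueError(f"unsupported working memory backend: {backend!r}")
    return MemoryEnvConfig(
        working_memory_backend=backend,
        working_memory_default_ttl_seconds=int(get("WORKING_MEMORY_DEFAULT_TTL_SECONDS", "3600")),
        redis_url=get("REDIS_URL", "redis://localhost:6379/0"),
        retrieval_min_similarity=float(get("RETRIEVAL_MIN_SIMILARITY", "0.5")),
        cache_enabled=_as_bool(get("CACHE_ENABLED", "true")),
    )
```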

Integration with Context Engine

Memory integrates with the Context Engine as a context source:

from app.services.memory.integration import MemoryContextSource

# Register as context source
source = MemoryContextSource(memory_manager)
context_engine.register_source(source)

# Memory is automatically included in context assembly
context = await context_engine.assemble_context(
    project_id=project_id,
    session_id=session_id,
    current_task="Review authentication code",
)

Caching

Multi-layer caching for performance:

  • Hot Cache: Frequently accessed memories (LRU)
  • Retrieval Cache: Query result caching
  • Embedding Cache: Pre-computed embeddings

Usage:

from app.services.memory.cache import CacheManager

cache = CacheManager(settings)
await cache.warm_hot_cache(project_id)  # Pre-warm common memories

Metrics

Prometheus-compatible metrics:

| Metric | Type | Labels |
| --- | --- | --- |
| memory_operations_total | Counter | operation, memory_type, scope, success |
| memory_retrievals_total | Counter | memory_type, strategy |
| memory_cache_hits_total | Counter | cache_type |
| memory_retrieval_latency_seconds | Histogram | - |
| memory_consolidation_duration_seconds | Histogram | - |
| memory_items_count | Gauge | memory_type, scope |

Usage:

from app.services.memory.metrics import get_memory_metrics

metrics = await get_memory_metrics()
summary = await metrics.get_summary()
prometheus_output = await metrics.get_prometheus_format()

Performance Targets

| Operation | Target (P95) |
| --- | --- |
| Working memory get/set | < 5ms |
| Episodic memory retrieval | < 100ms |
| Semantic memory search | < 100ms |
| Procedural memory matching | < 50ms |
| Consolidation batch (1000 items) | < 30s |
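
A P95 target means that 95% of observed operations should finish under the stated time. The sketch below computes a nearest-rank P95 from raw latency samples and compares it with the targets above. The dictionary keys and function names are assumptions for illustration; the metrics system may well derive percentiles from histogram buckets instead.

```python
import math
from typing import Sequence

# Targets from the table above, in seconds (keys are illustrative names).
TARGETS_P95_SECONDS = {
    "working_get_set": 0.005,
    "episodic_retrieval": 0.100,
    "semantic_search": 0.100,
    "procedural_matching": 0.050,
    "consolidation_batch_1000": 30.0,
}


def percentile_nearest_rank(samples: Sequence[float], pct: float) -> float:
    """Smallest sample such that at least `pct` percent of samples are <= it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def meets_target(operation: str, latencies_seconds: Sequence[float]) -> bool:
    """True when the observed P95 is strictly below the target."""
    return percentile_nearest_rank(latencies_seconds, 95) < TARGETS_P95_SECONDS[operation]
```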

Troubleshooting

Redis Connection Issues

# Check Redis connectivity
redis-cli ping

# Verify the configured Redis URL (default: redis://localhost:6379/0)
echo "$MEM_REDIS_URL"

Slow Retrieval

  1. Check if caching is enabled: MEM_CACHE_ENABLED=true
  2. Verify HNSW indexes exist on vector columns
  3. Monitor memory_retrieval_latency_seconds metric

High Memory Usage

  1. Review MEM_EPISODIC_MAX_EPISODES_PER_PROJECT limit
  2. Ensure pruning is enabled: MEM_PRUNING_ENABLED=true
  3. Check consolidation is running (cron schedule)

Embedding Errors

  1. Verify LLM Gateway is accessible
  2. Check embedding model is valid
  3. Review batch size if hitting rate limits

Directory Structure

app/services/memory/
├── __init__.py           # Public exports
├── config.py             # MemorySettings
├── exceptions.py         # Memory-specific errors
├── manager.py            # MemoryManager facade
├── types.py              # Core types
├── working/              # Working memory
│   ├── memory.py
│   └── storage.py
├── episodic/             # Episodic memory
│   ├── memory.py
│   ├── recorder.py
│   └── retrieval.py
├── semantic/             # Semantic memory
│   ├── memory.py
│   ├── extraction.py
│   └── verification.py
├── procedural/           # Procedural memory
│   ├── memory.py
│   └── matching.py
├── scoping/              # Memory scoping
│   ├── scope.py
│   └── resolver.py
├── indexing/             # Indexing & retrieval
│   ├── index.py
│   └── retrieval.py
├── consolidation/        # Memory consolidation
│   └── service.py
├── reflection/           # Memory reflection
│   ├── service.py
│   └── types.py
├── integration/          # External integrations
│   ├── context_source.py
│   └── lifecycle.py
├── cache/                # Caching layer
│   ├── cache_manager.py
│   ├── hot_cache.py
│   └── embedding_cache.py
├── mcp/                  # MCP tools
│   ├── service.py
│   └── tools.py
└── metrics/              # Observability
    └── collector.py