forked from cardosofelipe/fast-next-template

Felipe Cardoso e3fe0439fd docs(memory): add comprehensive memory system documentation (#101)

Add complete documentation for the Agent Memory System including:
- Architecture overview with ASCII diagram
- Memory type descriptions (working, episodic, semantic, procedural)
- Usage examples for all memory operations
- Memory scoping hierarchy explanation
- Consolidation flow documentation
- MCP tools reference
- Reflection capabilities
- Configuration reference table
- Integration with Context Engine
- Metrics reference
- Performance targets
- Troubleshooting guide
- Directory structure

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-01-05 11:03:57 +01:00

Agent Memory System

Comprehensive multi-tier cognitive memory for AI agents, enabling state persistence, experiential learning, and context continuity across sessions.

Overview

The Agent Memory System implements a cognitive architecture inspired by human memory:

+------------------------------------------------------------------+
|                      Agent Memory System                          |
+------------------------------------------------------------------+
|                                                                   |
|  +------------------+                    +------------------+     |
|  | Working Memory   |----consolidate---->| Episodic Memory  |     |
|  | (Redis/In-Mem)   |                    | (PostgreSQL)     |     |
|  |                  |                    |                  |     |
|  | - Current task   |                    | - Past sessions  |     |
|  | - Variables      |                    | - Experiences    |     |
|  | - Scratchpad     |                    | - Outcomes       |     |
|  +------------------+                    +--------+---------+     |
|                                                   |               |
|                                          extract  |               |
|                                                   v               |
|  +------------------+                    +------------------+     |
|  |Procedural Memory |<-----learn from----| Semantic Memory  |     |
|  | (PostgreSQL)     |                    | (PostgreSQL +    |     |
|  |                  |                    |    pgvector)     |     |
|  | - Procedures     |                    |                  |     |
|  | - Skills         |                    | - Facts          |     |
|  | - Patterns       |                    | - Entities       |     |
|  +------------------+                    | - Relationships  |     |
|                                          +------------------+     |
+------------------------------------------------------------------+

Memory Types

Working Memory

Short-term, session-scoped memory for current task state.

Features:

- Key-value storage with TTL
- Task state tracking
- Scratchpad for reasoning
- Checkpoint/restore support
- Redis primary with in-memory fallback

Usage:

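# TaskState import path is an assumption (core types live in types.py, see Directory Structure)
from app.services.memory.types import TaskState
from app.services.memory.working import WorkingMemory

# scope_context identifies the current project/agent/session (see Memory Scoping)
memory = WorkingMemory(scope_context)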
await memory.set("key", {"data": "value"}, ttl_seconds=3600)
value = await memory.get("key")

# Task state
await memory.set_task_state(TaskState(task_id="t1", status="running"))
state = await memory.get_task_state()

# Checkpoints
checkpoint_id = await memory.create_checkpoint()
await memory.restore_checkpoint(checkpoint_id)
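
To make the TTL and checkpoint semantics listed above concrete, here is a minimal in-memory sketch. It is not the project's `WorkingMemory` implementation: `InMemoryStore` and its injectable `clock` parameter are hypothetical names used for illustration, and the real backend (Redis with an in-memory fallback) may behave differently, for example in how it treats TTLs on restore.

```python
import copy
import time
import uuid
from typing import Any, Callable, Dict, Optional, Tuple

Entry = Tuple[Any, Optional[float]]  # (value, absolute expiry time or None)


class InMemoryStore:
    """Hypothetical stand-in for an in-memory working-memory fallback."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Entry] = {}
        self._checkpoints: Dict[str, Dict[str, Entry]] = {}

    def set(self, key: str, value: Any, ttl_seconds: Optional[float] = None) -> None:
        # Store an absolute expiry so reads only need one clock comparison.
        expires_at = None if ttl_seconds is None else self._clock() + ttl_seconds
        self._data[key] = (value, expires_at)

    def get(self, key: str, default: Any = None) -> Any:
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # lazy expiry on read
            return default
        return value

    def create_checkpoint(self) -> str:
        # Deep-copy so later mutations do not leak into the snapshot.
        checkpoint_id = uuid.uuid4().hex
        self._checkpoints[checkpoint_id] = copy.deepcopy(self._data)
        return checkpoint_id

    def restore_checkpoint(self, checkpoint_id: str) -> None:
        self._data = copy.deepcopy(self._checkpoints[checkpoint_id])
```

Passing a fake clock (for example `clock=lambda: now[0]`) makes expiry testable without sleeping.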

Episodic Memory

Experiential records of past agent actions and outcomes.

Features:

- Records task completions and failures
- Semantic similarity search (pgvector)
- Temporal and outcome-based retrieval
- Importance scoring
- Episode summarization

Usage:

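from app.services.memory.episodic import EpisodicMemory
# EpisodeCreate/Outcome import path is an assumption (core types live in types.py)
from app.services.memory.types import EpisodeCreate, Outcome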

memory = EpisodicMemory(session, embedder)

# Record an episode
episode = await memory.record_episode(
    project_id=project_id,
    episode=EpisodeCreate(
        task_type="code_review",
        task_description="Review PR #42",
        outcome=Outcome.SUCCESS,
        actions=[{"type": "analyze", "target": "src/"}],
    )
)

# Search similar experiences
similar = await memory.search_similar(
    project_id=project_id,
    query="debugging memory leak",
    limit=5
)

# Get recent episodes
recent = await memory.get_recent(project_id, limit=10)
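
The similarity search above can be pictured as ranking stored embeddings against a query embedding. The sketch below does this in plain Python with cosine similarity. That the project uses cosine distance is an assumption (pgvector also supports L2 and inner-product distance), and `cosine_similarity` and `top_k_similar` are hypothetical helper names. The default threshold mirrors `MEM_RETRIEVAL_MIN_SIMILARITY`.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k_similar(
    query: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    limit: int = 5,
    min_similarity: float = 0.5,
) -> List[Tuple[str, float]]:
    """Return up to `limit` (id, score) pairs above the threshold, best first."""
    scored = [(cid, cosine_similarity(query, vec)) for cid, vec in candidates.items()]
    kept = [pair for pair in scored if pair[1] >= min_similarity]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)[:limit]
```

In production the ranking would normally be pushed down to the database index rather than computed in application code.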

Semantic Memory

Learned facts and knowledge with confidence scoring.

Features:

- Triple format (subject, predicate, object)
- Confidence scoring with decay
- Fact extraction from episodes
- Conflict resolution
- Entity-based retrieval

Usage:

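from app.services.memory.semantic import SemanticMemory
# FactCreate import path is an assumption (core types live in types.py)
from app.services.memory.types import FactCreate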

memory = SemanticMemory(session, embedder)

# Store a fact
fact = await memory.store_fact(
    project_id=project_id,
    fact=FactCreate(
        subject="UserService",
        predicate="handles",
        object="authentication",
        confidence=0.9,
    )
)

# Search facts
facts = await memory.search_facts(project_id, "authentication flow")

# Reinforce on repeated learning
await memory.reinforce_fact(fact.id)
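
"Confidence scoring with decay" can be read as exponential decay with a half-life, which matches the "Confidence half-life" description of `MEM_SEMANTIC_CONFIDENCE_DECAY_DAYS` (default 90). The sketch below assumes that reading. The reinforcement rule, which moves confidence a fixed fraction toward 1.0, is purely illustrative, and both function names are hypothetical.

```python
def decayed_confidence(
    confidence: float, age_days: float, half_life_days: float = 90.0
) -> float:
    """Halve the confidence every `half_life_days` (assumed decay model)."""
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    return confidence * 0.5 ** (age_days / half_life_days)


def reinforce(confidence: float, boost: float = 0.1) -> float:
    """Move confidence a fraction `boost` of the way toward 1.0 (hypothetical rule)."""
    if not 0.0 <= boost <= 1.0:
        raise ValueError("boost must be in [0, 1]")
    return confidence + (1.0 - confidence) * boost
```

Under this model a fact stored at confidence 0.9 and never reinforced would sit at 0.45 after 90 days and 0.225 after 180 days.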

Procedural Memory

Learned skills and procedures from successful patterns.

Features:

- Procedure recording from task patterns
- Trigger-based matching
- Success rate tracking
- Procedure suggestions
- Step-by-step storage

Usage:

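from app.services.memory.procedural import ProceduralMemory
# ProcedureCreate/Step import path is an assumption (core types live in types.py)
from app.services.memory.types import ProcedureCreate, Step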

memory = ProceduralMemory(session, embedder)

# Record a procedure
procedure = await memory.record_procedure(
    project_id=project_id,
    procedure=ProcedureCreate(
        name="PR Review Process",
        trigger_pattern="code review requested",
        steps=[
            Step(action="fetch_diff"),
            Step(action="analyze_changes"),
            Step(action="check_tests"),
        ]
    )
)

# Find matching procedures
matches = await memory.find_matching(project_id, "need to review code")

# Record outcomes
await memory.record_outcome(procedure.id, success=True)
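
Success rate tracking and trigger matching can be sketched independently of the database. The example below keeps success/failure counts and scores a request against a trigger pattern by word overlap (Jaccard index). The lexical matcher is a stand-in only: since `ProceduralMemory` takes an embedder, the actual matching is probably embedding-based. `ProcedureStats` and `trigger_overlap` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ProcedureStats:
    """Running success/failure counts for one procedure."""

    successes: int = 0
    failures: int = 0

    def record_outcome(self, success: bool) -> None:
        if success:
            self.successes += 1
        else:
            self.failures += 1

    @property
    def success_rate(self) -> float:
        total = self.successes + self.failures
        return self.successes / total if total else 0.0


def trigger_overlap(trigger_pattern: str, request: str) -> float:
    """Jaccard index over lowercase word sets (simple lexical stand-in)."""
    a = set(trigger_pattern.lower().split())
    b = set(request.lower().split())
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)
```

With these definitions, "code review requested" and "need to review code" share two of five distinct words, giving an overlap of 0.4.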

Memory Scoping

Memory is organized in a hierarchical scope structure:

Global Memory (shared by all)
└── Project Memory (per project)
    └── Agent Type Memory (per agent type)
        └── Agent Instance Memory (per instance)
            └── Session Memory (ephemeral)

Usage:

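from app.services.memory.scoping import ScopeManager, ScopeLevel
# ScopeContext import path is an assumption (core types live in types.py)
from app.services.memory.types import ScopeContext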

manager = ScopeManager(session)

# Get scoped memories with inheritance
memories = await manager.get_scoped_memories(
    context=ScopeContext(
        project_id=project_id,
        agent_type_id=agent_type_id,
        agent_instance_id=agent_instance_id,
        session_id=session_id,
    ),
    include_inherited=True,  # Include parent scopes
)
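
The inheritance behind `include_inherited=True` can be thought of as walking the hierarchy from the most specific scope up to global. The sketch below builds that chain and resolves a key with "most specific scope wins" precedence. That precedence rule is an assumption, and `Scope`, `scope_chain`, and `resolve` are hypothetical names, not the `ScopeManager` API.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class Scope:
    level: str
    scope_id: Optional[str] = None


def scope_chain(
    project_id: Optional[str] = None,
    agent_type_id: Optional[str] = None,
    agent_instance_id: Optional[str] = None,
    session_id: Optional[str] = None,
    include_inherited: bool = True,
) -> List[Scope]:
    """Return scopes ordered from most specific to global."""
    candidates = [
        ("session", session_id),
        ("agent_instance", agent_instance_id),
        ("agent_type", agent_type_id),
        ("project", project_id),
    ]
    chain = [Scope(level, sid) for level, sid in candidates if sid is not None]
    chain.append(Scope("global"))
    # Without inheritance, only the most specific scope is consulted.
    return chain if include_inherited else chain[:1]


def resolve(key: str, memories: Dict[Scope, Dict[str, Any]], chain: List[Scope]) -> Any:
    """Return the value from the first scope in the chain that defines `key`."""
    for scope in chain:
        values = memories.get(scope, {})
        if key in values:
            return values[key]
    return None
```

For example, a project-level entry would shadow a global entry with the same key, while keys defined only globally remain visible to every session.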

Memory Consolidation

Automatic background processes transfer and extract knowledge:

Working Memory ──> Episodic Memory ──> Semantic Memory
                                   └──> Procedural Memory

Consolidation Types:

- working_to_episodic: Transfer session state to episodes (on session end)
- episodic_to_semantic: Extract facts from experiences
- episodic_to_procedural: Learn procedures from patterns
- prune: Remove low-value memories

Celery Tasks:

from app.tasks.memory_consolidation import (
    consolidate_session,
    run_nightly_consolidation,
    prune_old_memories,
)

# Manual consolidation
consolidate_session.delay(session_id)

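# Normally triggered by the nightly schedule (03:00 by default,
# see MEM_CONSOLIDATION_SCHEDULE_CRON); calling .delay() enqueues a run immediately
run_nightly_consolidation.delay()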

Memory Retrieval

Hybrid Retrieval

Combine multiple retrieval strategies:

from app.services.memory.indexing import RetrievalEngine

engine = RetrievalEngine(session, embedder)

# Hybrid search across memory types
results = await engine.retrieve_hybrid(
    project_id=project_id,
    query="authentication error handling",
    memory_types=["episodic", "semantic", "procedural"],
    filters={"outcome": "success"},
    limit=10,
)
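
One common way to combine ranked results from several strategies (vector, temporal, entity) is reciprocal rank fusion (RRF). The document does not say how `retrieve_hybrid` merges its results, so the sketch below is a swapped-in, generic technique rather than the project's method. `reciprocal_rank_fusion` is a hypothetical helper, and `k=60` is the conventional RRF constant.

```python
from collections import defaultdict
from typing import DefaultDict, List, Sequence, Tuple


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[str]], k: int = 60, limit: int = 10
) -> List[Tuple[str, float]]:
    """Fuse ranked id lists: each list contributes 1 / (k + rank) per item."""
    scores: DefaultDict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by id for determinism.
    ordered = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return ordered[:limit]


# Example: three strategies returning overlapping memory ids.
fused = reciprocal_rank_fusion([["a", "b", "c"], ["b", "a"], ["b", "d"]])
fused_ids = [item_id for item_id, _ in fused]
```

Items that appear near the top of several lists accumulate the highest fused score, so agreement between strategies is rewarded without needing comparable raw scores.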

Index Types

- Vector Index: Semantic similarity (HNSW/pgvector)
- Temporal Index: Time-based retrieval
- Entity Index: Entity mention lookup
- Outcome Index: Success/failure filtering

MCP Tools

The memory system exposes MCP tools for agent use:

`remember`

Store information in memory.

{
  "memory_type": "working",
  "content": {"key": "value"},
  "importance": 0.8,
  "ttl_seconds": 3600
}

`recall`

Retrieve from memory.

{
  "query": "authentication patterns",
  "memory_types": ["episodic", "semantic"],
  "limit": 10,
  "filters": {"outcome": "success"}
}

`forget`

Remove from memory.

{
  "memory_type": "working",
  "key": "temp_data"
}

`reflect`

Analyze memory patterns.

{
  "analysis_type": "success_factors",
  "task_type": "code_review",
  "time_range_days": 30
}

`get_memory_stats`

Get memory usage statistics.

`record_outcome`

Record task success/failure for learning.

Memory Reflection

Analyze patterns and generate insights from memory:

from app.services.memory.reflection import MemoryReflection, TimeRange

reflection = MemoryReflection(session)

# Detect patterns
patterns = await reflection.analyze_patterns(
    project_id=project_id,
    time_range=TimeRange.last_days(30),
)

# Identify success factors
factors = await reflection.identify_success_factors(
    project_id=project_id,
    task_type="code_review",
)

# Detect anomalies
anomalies = await reflection.detect_anomalies(
    project_id=project_id,
    baseline_days=30,
)

# Generate insights
insights = await reflection.generate_insights(project_id)

# Comprehensive reflection
result = await reflection.reflect(project_id)
print(result.summary)

Configuration

All settings use the MEM_ environment variable prefix:

| Variable | Default | Description |
|----------|---------|-------------|
| `MEM_WORKING_MEMORY_BACKEND` | `redis` | Backend: `redis` or `memory` |
| `MEM_WORKING_MEMORY_DEFAULT_TTL_SECONDS` | `3600` | Default TTL (1 hour) |
| `MEM_REDIS_URL` | `redis://localhost:6379/0` | Redis connection URL |
| `MEM_EPISODIC_MAX_EPISODES_PER_PROJECT` | `10000` | Max episodes per project |
| `MEM_EPISODIC_RETENTION_DAYS` | `365` | Episode retention period (days) |
| `MEM_SEMANTIC_MAX_FACTS_PER_PROJECT` | `50000` | Max facts per project |
| `MEM_SEMANTIC_CONFIDENCE_DECAY_DAYS` | `90` | Confidence half-life (days) |
| `MEM_EMBEDDING_MODEL` | `text-embedding-3-small` | Embedding model |
| `MEM_EMBEDDING_DIMENSIONS` | `1536` | Vector dimensions |
| `MEM_RETRIEVAL_MIN_SIMILARITY` | `0.5` | Minimum similarity score |
| `MEM_CONSOLIDATION_ENABLED` | `true` | Enable auto-consolidation |
| `MEM_CONSOLIDATION_SCHEDULE_CRON` | `0 3 * * *` | Nightly schedule (cron expression) |
| `MEM_CACHE_ENABLED` | `true` | Enable retrieval caching |
| `MEM_CACHE_TTL_SECONDS` | `300` | Cache TTL (5 minutes) |

See app/services/memory/config.py for complete configuration options.
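
To show how the `MEM_` prefix maps environment variables onto typed settings, here is a standard-library sketch covering a handful of the variables above. The authoritative definition is `MemorySettings` in `config.py`. `MemorySettingsSketch` and `load_settings` are hypothetical names, and the accepted boolean spellings are an assumption.

```python
import os
from dataclasses import dataclass, fields
from typing import Any, Dict, Mapping


@dataclass(frozen=True)
class MemorySettingsSketch:
    """Subset of the documented settings with their documented defaults."""

    working_memory_backend: str = "redis"
    working_memory_default_ttl_seconds: int = 3600
    redis_url: str = "redis://localhost:6379/0"
    retrieval_min_similarity: float = 0.5
    cache_enabled: bool = True


def load_settings(
    env: Mapping[str, str] = os.environ, prefix: str = "MEM_"
) -> MemorySettingsSketch:
    """Read PREFIX + FIELD_NAME from `env`, coercing to each default's type."""
    overrides: Dict[str, Any] = {}
    for field in fields(MemorySettingsSketch):
        raw = env.get(prefix + field.name.upper())
        if raw is None:
            continue
        default = field.default
        if isinstance(default, bool):  # check bool first: bool is a subclass of int
            overrides[field.name] = raw.strip().lower() in {"1", "true", "yes", "on"}
        else:
            overrides[field.name] = type(default)(raw)
    settings = MemorySettingsSketch(**overrides)
    if settings.working_memory_backend not in {"redis", "memory"}:
        raise ValueError("MEM_WORKING_MEMORY_BACKEND must be 'redis' or 'memory'")
    return settings
```

Passing a plain dict instead of `os.environ` keeps the loader easy to test, for example `load_settings({"MEM_WORKING_MEMORY_BACKEND": "memory"})`.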

Integration with Context Engine

Memory integrates with the Context Engine as a context source:

from app.services.memory.integration import MemoryContextSource

# Register as context source
source = MemoryContextSource(memory_manager)
context_engine.register_source(source)

# Memory is automatically included in context assembly
context = await context_engine.assemble_context(
    project_id=project_id,
    session_id=session_id,
    current_task="Review authentication code",
)

Caching

Multi-layer caching for performance:

- Hot Cache: Frequently accessed memories (LRU)
- Retrieval Cache: Query result caching
- Embedding Cache: Pre-computed embeddings

from app.services.memory.cache import CacheManager

cache = CacheManager(settings)
await cache.warm_hot_cache(project_id)  # Pre-warm common memories
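
The LRU policy named for the hot cache can be sketched with `collections.OrderedDict`: reads move an entry to the "most recent" end, and inserts beyond capacity evict from the "least recent" end. This is a generic illustration, not the contents of `hot_cache.py`. `LRUCache` and its hit/miss counters are hypothetical.

```python
from collections import OrderedDict
from typing import Any, Hashable


class LRUCache:
    """Fixed-size cache that evicts the least recently used entry."""

    def __init__(self, max_size: int = 128) -> None:
        if max_size <= 0:
            raise ValueError("max_size must be positive")
        self._max_size = max_size
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key: Hashable) -> Any:
        if key not in self._items:
            self.misses += 1
            return None
        self._items.move_to_end(key)  # mark as most recently used
        self.hits += 1
        return self._items[key]

    def put(self, key: Hashable, value: Any) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self._max_size:
            self._items.popitem(last=False)  # evict least recently used
```

Hit and miss counters of this kind are what a metric such as `memory_cache_hits_total` would typically be fed from.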

Metrics

Prometheus-compatible metrics:

| Metric | Type | Labels |
|--------|------|--------|
| `memory_operations_total` | Counter | operation, memory_type, scope, success |
| `memory_retrievals_total` | Counter | memory_type, strategy |
| `memory_cache_hits_total` | Counter | cache_type |
| `memory_retrieval_latency_seconds` | Histogram | - |
| `memory_consolidation_duration_seconds` | Histogram | - |
| `memory_items_count` | Gauge | memory_type, scope |

from app.services.memory.metrics import get_memory_metrics

metrics = await get_memory_metrics()
summary = await metrics.get_summary()
prometheus_output = await metrics.get_prometheus_format()

Performance Targets

| Operation | Target P95 |
|-----------|------------|
| Working memory get/set | < 5ms |
| Episodic memory retrieval | < 100ms |
| Semantic memory search | < 100ms |
| Procedural memory matching | < 50ms |
| Consolidation batch (1000 items) | < 30s |
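
To check measured latencies against these targets, a P95 can be computed from raw samples. The sketch below uses the nearest-rank definition (other percentile definitions interpolate and give slightly different values). `p95`, `meets_target`, and the `TARGETS_MS` keys are hypothetical names. The thresholds are taken from the table above.

```python
from typing import Dict, Sequence

# P95 targets in milliseconds, from the Performance Targets table.
TARGETS_MS: Dict[str, float] = {
    "working_get_set": 5.0,
    "episodic_retrieval": 100.0,
    "semantic_search": 100.0,
    "procedural_matching": 50.0,
}


def p95(samples_ms: Sequence[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty sample."""
    if not samples_ms:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples_ms)
    # ceil(0.95 * n) computed with integers to avoid float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_target(operation: str, samples_ms: Sequence[float]) -> bool:
    """True if the sample's P95 is strictly below the documented target."""
    return p95(samples_ms) < TARGETS_MS[operation]
```

A single slow outlier among 100 samples does not break a P95 target, but a 10% slow tail does.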

Troubleshooting

Redis Connection Issues

# Check Redis connectivity
redis-cli ping

# Verify the configured Redis URL (default: redis://localhost:6379/0)
echo "$MEM_REDIS_URL"

Slow Retrieval

- Check that caching is enabled: `MEM_CACHE_ENABLED=true`
- Verify that HNSW indexes exist on the vector columns
- Monitor the `memory_retrieval_latency_seconds` metric

High Memory Usage

- Review the `MEM_EPISODIC_MAX_EPISODES_PER_PROJECT` limit
- Ensure pruning is enabled: `MEM_PRUNING_ENABLED=true`
- Check that consolidation is running (cron schedule)

Embedding Errors

- Verify that the LLM Gateway is accessible
- Check that the embedding model is valid
- Review the batch size if hitting rate limits

Directory Structure

app/services/memory/
├── __init__.py           # Public exports
├── config.py             # MemorySettings
├── exceptions.py         # Memory-specific errors
├── manager.py            # MemoryManager facade
├── types.py              # Core types
├── working/              # Working memory
│   ├── memory.py
│   └── storage.py
├── episodic/             # Episodic memory
│   ├── memory.py
│   ├── recorder.py
│   └── retrieval.py
├── semantic/             # Semantic memory
│   ├── memory.py
│   ├── extraction.py
│   └── verification.py
├── procedural/           # Procedural memory
│   ├── memory.py
│   └── matching.py
├── scoping/              # Memory scoping
│   ├── scope.py
│   └── resolver.py
├── indexing/             # Indexing & retrieval
│   ├── index.py
│   └── retrieval.py
├── consolidation/        # Memory consolidation
│   └── service.py
├── reflection/           # Memory reflection
│   ├── service.py
│   └── types.py
├── integration/          # External integrations
│   ├── context_source.py
│   └── lifecycle.py
├── cache/                # Caching layer
│   ├── cache_manager.py
│   ├── hot_cache.py
│   └── embedding_cache.py
├── mcp/                  # MCP tools
│   ├── service.py
│   └── tools.py
└── metrics/              # Observability
    └── collector.py
