forked from cardosofelipe/fast-next-template

Felipe Cardoso 085a748929 feat(memory): #87 project setup & core architecture

Implements Sub-Issue #87 of Issue #62 (Agent Memory System).

Core infrastructure:
- memory/types.py: Type definitions for all memory types (Working, Episodic,
  Semantic, Procedural) with enums for MemoryType, ScopeLevel, Outcome
- memory/config.py: MemorySettings with MEM_ env prefix, thread-safe singleton
- memory/exceptions.py: Comprehensive exception hierarchy for memory operations
- memory/manager.py: MemoryManager facade with placeholder methods

Directory structure:
- working/: Working memory (Redis/in-memory) - to be implemented in #89
- episodic/: Episodic memory (experiences) - to be implemented in #90
- semantic/: Semantic memory (facts) - to be implemented in #91
- procedural/: Procedural memory (skills) - to be implemented in #92
- scoping/: Scope management - to be implemented in #93
- indexing/: Vector indexing - to be implemented in #94
- consolidation/: Memory consolidation - to be implemented in #95

Tests: 71 unit tests for config, types, and exceptions
Docs: Comprehensive implementation plan at docs/architecture/memory-system-plan.md

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-01-05 01:27:36 +01:00

Agent Memory System - Implementation Plan

Issue #62 - Part of Epic #60 (Phase 2: MCP Integration)

Branch: feature/62-agent-memory-system
Parent Epic: #60 [EPIC] Phase 2: MCP Integration
Dependencies: #56 (LLM Gateway), #57 (Knowledge Base), #61 (Context Management Engine)

Executive Summary

The Agent Memory System provides multi-tier cognitive memory for AI agents, enabling them to:

Maintain state across sessions (Working Memory)
Learn from past experiences (Episodic Memory)
Store and retrieve facts (Semantic Memory)
Develop and reuse procedures (Procedural Memory)

Architecture Overview

┌─────────────────────────────────────────────────────────────────────────────┐
│                           Agent Memory System                                │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ┌─────────────────┐                      ┌─────────────────┐               │
│  │ Working Memory  │───────────────────▶  │ Episodic Memory │               │
│  │ (Redis/In-Mem)  │    consolidate       │  (PostgreSQL)   │               │
│  │                 │                      │                 │               │
│  │ • Current task  │                      │ • Past sessions │               │
│  │ • Variables     │                      │ • Experiences   │               │
│  │ • Scratchpad    │                      │ • Outcomes      │               │
│  └─────────────────┘                      └────────┬────────┘               │
│                                                    │                         │
│                                           extract  │                         │
│                                                    ▼                         │
│  ┌─────────────────┐                      ┌─────────────────┐               │
│  │Procedural Memory│◀─────────────────────│ Semantic Memory │               │
│  │  (PostgreSQL)   │      learn from      │  (PostgreSQL +  │               │
│  │                 │                      │    pgvector)    │               │
│  │ • Procedures    │                      │                 │               │
│  │ • Skills        │                      │ • Facts         │               │
│  │ • Patterns      │                      │ • Entities      │               │
│  └─────────────────┘                      │ • Relationships │               │
│                                           └─────────────────┘               │
└─────────────────────────────────────────────────────────────────────────────┘

Memory Scoping Hierarchy

Global Memory (shared by all)
└── Project Memory (per project)
    └── Agent Type Memory (per agent type)
        └── Agent Instance Memory (per instance)
            └── Session Memory (ephemeral)

Sub-Issue Breakdown

Phase 1: Foundation (Critical Path)

Sub-Issue #62-1: Project Setup & Core Architecture

Priority: P0 - Must complete first
Estimated Complexity: Medium

Tasks:

Create backend/app/services/memory/ directory structure
Create __init__.py with public API exports
Create config.py with MemorySettings (Pydantic)
Define base interfaces in types.py:
- MemoryItem - Base class for all memory items
- MemoryScope - Enum for scoping levels
- MemoryStore - Abstract base for storage backends
Create manager.py with MemoryManager class (facade)
Create exceptions.py with memory-specific errors
Write ADR-010 documenting memory architecture decisions
Create dependency injection setup
Unit tests for configuration and types

Deliverables:

Directory structure matching existing patterns (like context/, safety/)
Configuration with MEM_ env prefix
Type definitions for all memory concepts
Comprehensive unit tests
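As a rough illustration of the configuration deliverable, the sketch below reads MEM_-prefixed environment variables into a settings object and exposes it through a lock-guarded singleton. It uses a stdlib dataclass as a stand-in for the Pydantic-based MemorySettings named in the tasks; the field names, defaults, and the `get_memory_settings` helper are assumptions made for the example, not the project's actual settings.

```python
import os
import threading
from dataclasses import dataclass, fields

ENV_PREFIX = "MEM_"


@dataclass(frozen=True)
class MemorySettings:
    # Hypothetical fields; the real settings live in memory/config.py.
    working_default_ttl_seconds: int = 3600
    working_max_items: int = 1000
    redis_url: str = "redis://localhost:6379/0"

    @classmethod
    def from_env(cls, environ=None):
        """Build settings from MEM_* variables, falling back to defaults."""
        env = os.environ if environ is None else environ
        overrides = {}
        for field in fields(cls):
            raw = env.get(ENV_PREFIX + field.name.upper())
            if raw is not None:
                # Coerce the string to the type of the field's default value.
                overrides[field.name] = type(field.default)(raw)
        return cls(**overrides)


_settings = None
_settings_lock = threading.Lock()


def get_memory_settings():
    """Return one shared instance, created at most once across threads."""
    global _settings
    if _settings is None:
        with _settings_lock:
            if _settings is None:  # re-check after acquiring the lock
                _settings = MemorySettings.from_env()
    return _settings
```

With this shape, setting `MEM_WORKING_MAX_ITEMS=50` in the environment would override only that one field.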

Sub-Issue #62-2: Database Schema & Storage Layer

Priority: P0 - Required for all memory types
Estimated Complexity: High

Database Tables:

working_memory - Ephemeral key-value storage
  - id (UUID, PK)
  - scope_type (ENUM: global/project/agent_type/agent_instance/session)
  - scope_id (VARCHAR - the ID for the scope level)
  - key (VARCHAR)
  - value (JSONB)
  - expires_at (TIMESTAMP WITH TZ)
  - created_at, updated_at

episodes - Experiential memories
  - id (UUID, PK)
  - project_id (UUID, FK)
  - agent_instance_id (UUID, FK, nullable)
  - agent_type_id (UUID, FK, nullable)
  - session_id (VARCHAR)
  - task_type (VARCHAR)
  - task_description (TEXT)
  - actions (JSONB)
  - context_summary (TEXT)
  - outcome (ENUM: success/failure/partial)
  - outcome_details (TEXT)
  - duration_seconds (FLOAT)
  - tokens_used (BIGINT)
  - lessons_learned (JSONB - list of strings)
  - importance_score (FLOAT, 0-1)
  - embedding (VECTOR(1536))
  - occurred_at (TIMESTAMP WITH TZ)
  - created_at, updated_at

facts - Semantic knowledge
  - id (UUID, PK)
  - project_id (UUID, FK, nullable - null for global)
  - subject (VARCHAR)
  - predicate (VARCHAR)
  - object (TEXT)
  - confidence (FLOAT, 0-1)
  - source_episode_ids (UUID[])
  - first_learned (TIMESTAMP WITH TZ)
  - last_reinforced (TIMESTAMP WITH TZ)
  - reinforcement_count (INT)
  - embedding (VECTOR(1536))
  - created_at, updated_at

procedures - Learned skills
  - id (UUID, PK)
  - project_id (UUID, FK, nullable)
  - agent_type_id (UUID, FK, nullable)
  - name (VARCHAR)
  - trigger_pattern (TEXT)
  - steps (JSONB)
  - success_count (INT)
  - failure_count (INT)
  - last_used (TIMESTAMP WITH TZ)
  - embedding (VECTOR(1536))
  - created_at, updated_at

memory_consolidation_log - Consolidation tracking
  - id (UUID, PK)
  - consolidation_type (ENUM)
  - source_count (INT)
  - result_count (INT)
  - started_at, completed_at
  - status (ENUM: pending/running/completed/failed)
  - error (TEXT, nullable)

Tasks:

Create SQLAlchemy models in backend/app/models/memory/
Create Alembic migration with all tables
Add pgvector indexes (HNSW for episodes, facts, procedures)
Create repository classes in backend/app/crud/memory/
Add composite indexes for common query patterns
Unit tests for all repositories

Sub-Issue #62-3: Working Memory Implementation

Priority: P0 - Core functionality
Estimated Complexity: Medium

Components:

backend/app/services/memory/working/memory.py - WorkingMemory class
backend/app/services/memory/working/storage.py - Redis + in-memory backend

Features:

Session-scoped containers with automatic cleanup
Variable storage (get/set/delete)
Task state tracking (current step, status, progress)
Scratchpad for reasoning steps
Configurable capacity limits
TTL-based expiration
Checkpoint/snapshot support for recovery
Redis primary storage with in-memory fallback

API:

from __future__ import annotations  # TaskState is assumed to be defined elsewhere in the memory package

from typing import Any


class WorkingMemory:
    async def set(self, key: str, value: Any, ttl_seconds: int | None = None) -> None: ...
    async def get(self, key: str, default: Any = None) -> Any: ...
    async def delete(self, key: str) -> bool: ...
    async def exists(self, key: str) -> bool: ...
    async def list_keys(self, pattern: str = "*") -> list[str]: ...
    async def get_all(self) -> dict[str, Any]: ...
    async def clear(self) -> int: ...
    async def set_task_state(self, state: TaskState) -> None: ...
    async def get_task_state(self) -> TaskState | None: ...
    async def append_scratchpad(self, content: str) -> None: ...
    async def get_scratchpad(self) -> list[str]: ...
    async def create_checkpoint(self) -> str: ...  # Returns checkpoint ID
    async def restore_checkpoint(self, checkpoint_id: str) -> None: ...
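
To make the TTL-based expiration and the in-memory fallback more concrete, here is a minimal sketch of a dict-backed store covering four of the methods above. The class name `InMemoryWorkingMemory`, the injectable `clock` parameter, and the lazy-expiration strategy are assumptions for illustration; the planned Redis backend would rely on Redis key expiry instead.

```python
import asyncio
import time
from typing import Any


class InMemoryWorkingMemory:
    """Dict-backed stand-in for the in-memory fallback (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so tests can control time
        self._items: dict = {}  # key -> (value, expires_at or None)

    def _is_live(self, key: str) -> bool:
        entry = self._items.get(key)
        if entry is None:
            return False
        _, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._items[key]  # lazy expiration on access
            return False
        return True

    async def set(self, key: str, value: Any, ttl_seconds=None) -> None:
        expires_at = None if ttl_seconds is None else self._clock() + ttl_seconds
        self._items[key] = (value, expires_at)

    async def get(self, key: str, default: Any = None) -> Any:
        return self._items[key][0] if self._is_live(key) else default

    async def delete(self, key: str) -> bool:
        live = self._is_live(key)
        self._items.pop(key, None)
        return live  # True only if a non-expired key was removed

    async def exists(self, key: str) -> bool:
        return self._is_live(key)
```

Expired entries are dropped only when touched, so a real implementation would likely also need a periodic sweep to enforce the capacity limits listed under Features.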

Phase 2: Memory Types

Sub-Issue #62-4: Episodic Memory Implementation

Priority: P1
Estimated Complexity: High

Components:

backend/app/services/memory/episodic/memory.py - EpisodicMemory class
backend/app/services/memory/episodic/recorder.py - Episode recording
backend/app/services/memory/episodic/retrieval.py - Retrieval strategies

Features:

Episode recording during agent execution
Store task completions with context
Store failures with error context
Retrieval by semantic similarity (vector search)
Retrieval by recency
Retrieval by outcome (success/failure)
Importance scoring based on outcome significance
Episode summarization for long-term storage

API:

from __future__ import annotations  # Episode, EpisodeCreate, Outcome are assumed to be defined elsewhere

from datetime import datetime
from uuid import UUID


class EpisodicMemory:
    async def record_episode(self, episode: EpisodeCreate) -> Episode: ...
    async def search_similar(self, query: str, limit: int = 10) -> list[Episode]: ...
    async def get_recent(self, limit: int = 10, since: datetime | None = None) -> list[Episode]: ...
    async def get_by_outcome(self, outcome: Outcome, limit: int = 10) -> list[Episode]: ...
    async def get_by_task_type(self, task_type: str, limit: int = 10) -> list[Episode]: ...
    async def update_importance(self, episode_id: UUID, score: float) -> None: ...
    async def summarize_episodes(self, episode_ids: list[UUID]) -> str: ...
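
The recency and outcome retrieval paths can be sketched with a plain in-memory list. This is a simplified, synchronous illustration: the `InMemoryEpisodes` class is hypothetical, `Episode` carries only a few of the columns from the `episodes` table, and the planned implementation would run these as PostgreSQL queries.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class Outcome(Enum):
    SUCCESS = "success"
    FAILURE = "failure"
    PARTIAL = "partial"


@dataclass
class Episode:
    task_type: str
    outcome: Outcome
    occurred_at: datetime
    importance_score: float = 0.5  # assumed default


class InMemoryEpisodes:
    """Hypothetical list-backed store used only to show the query shapes."""

    def __init__(self):
        self._episodes = []

    def record(self, episode: Episode) -> Episode:
        self._episodes.append(episode)
        return episode

    def _newest_first(self, rows, limit):
        return sorted(rows, key=lambda e: e.occurred_at, reverse=True)[:limit]

    def get_recent(self, limit: int = 10, since=None):
        rows = [e for e in self._episodes if since is None or e.occurred_at >= since]
        return self._newest_first(rows, limit)

    def get_by_outcome(self, outcome: Outcome, limit: int = 10):
        rows = [e for e in self._episodes if e.outcome is outcome]
        return self._newest_first(rows, limit)
```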

Sub-Issue #62-5: Semantic Memory Implementation

Priority: P1
Estimated Complexity: High

Components:

backend/app/services/memory/semantic/memory.py - SemanticMemory class
backend/app/services/memory/semantic/extraction.py - Fact extraction from episodes
backend/app/services/memory/semantic/verification.py - Fact verification

Features:

Fact storage with triple format (subject, predicate, object)
Confidence scoring and decay
Fact extraction from episodic memory
Conflict resolution for contradictory facts
Retrieval by query (semantic search)
Retrieval by entity (subject or object)
Source tracking (which episodes contributed)
Reinforcement on repeated learning

API:

from __future__ import annotations  # Fact, FactCreate, Episode are assumed to be defined elsewhere

from uuid import UUID


class SemanticMemory:
    async def store_fact(self, fact: FactCreate) -> Fact: ...
    async def search_facts(self, query: str, limit: int = 10) -> list[Fact]: ...
    async def get_by_entity(self, entity: str, limit: int = 20) -> list[Fact]: ...
    async def reinforce_fact(self, fact_id: UUID) -> Fact: ...
    async def deprecate_fact(self, fact_id: UUID, reason: str) -> None: ...
    async def extract_facts_from_episode(self, episode: Episode) -> list[Fact]: ...
    async def resolve_conflict(self, fact_ids: list[UUID]) -> Fact: ...
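
The plan does not specify how confidence scoring, decay, and reinforcement are computed. One possible approach, shown purely as an assumption, is exponential decay with a half-life plus a reinforcement step that closes a fixed fraction of the gap to 1.0. The `rate` and `half_life_days` parameters and both function names are hypothetical.

```python
from dataclasses import dataclass, replace
from datetime import datetime


@dataclass(frozen=True)
class Fact:
    subject: str
    predicate: str
    object_: str  # trailing underscore avoids shadowing the builtin
    confidence: float
    last_reinforced: datetime
    reinforcement_count: int = 1


def reinforce(fact: Fact, now: datetime, rate: float = 0.2) -> Fact:
    """Move confidence a fraction `rate` of the way toward 1.0."""
    boosted = fact.confidence + (1.0 - fact.confidence) * rate
    return replace(
        fact,
        confidence=boosted,
        last_reinforced=now,
        reinforcement_count=fact.reinforcement_count + 1,
    )


def decayed_confidence(fact: Fact, now: datetime, half_life_days: float = 30.0) -> float:
    """Halve confidence for every `half_life_days` since the last reinforcement."""
    age_days = max((now - fact.last_reinforced).total_seconds() / 86400, 0.0)
    return fact.confidence * 0.5 ** (age_days / half_life_days)
```

Computing decay at read time, as here, avoids rewriting every row on a schedule; whether the project stores or derives the decayed value is an open design choice.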

Sub-Issue #62-6: Procedural Memory Implementation

Priority: P2
Estimated Complexity: Medium

Components:

backend/app/services/memory/procedural/memory.py - ProceduralMemory class
backend/app/services/memory/procedural/matching.py - Procedure matching

Features:

Procedure recording from successful task patterns
Trigger pattern matching
Step-by-step procedure storage
Success/failure rate tracking
Procedure suggestion based on context
Procedure versioning

API:

from __future__ import annotations  # Procedure, ProcedureCreate, Step are assumed to be defined elsewhere

from uuid import UUID


class ProceduralMemory:
    async def record_procedure(self, procedure: ProcedureCreate) -> Procedure: ...
    async def find_matching(self, context: str, limit: int = 5) -> list[Procedure]: ...
    async def record_outcome(self, procedure_id: UUID, success: bool) -> None: ...
    async def get_best_procedure(self, task_type: str) -> Procedure | None: ...
    async def update_steps(self, procedure_id: UUID, steps: list[Step]) -> Procedure: ...
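
Success/failure rate tracking and best-procedure selection might look like the sketch below. It assumes a smoothed success rate (add-one smoothing) so that a procedure with a single success does not outrank one with a long track record; that scoring rule, the `task_type` field on `Procedure`, and the free functions are assumptions, since the plan only names `success_count` and `failure_count`.

```python
from dataclasses import dataclass


@dataclass
class Procedure:
    name: str
    task_type: str  # hypothetical field; the schema stores trigger_pattern
    success_count: int = 0
    failure_count: int = 0

    @property
    def success_rate(self) -> float:
        # Add-one smoothing: an unused procedure scores 0.5 rather than 0 or 1.
        total = self.success_count + self.failure_count
        return (self.success_count + 1) / (total + 2)


def record_outcome(procedure: Procedure, success: bool) -> None:
    """Increment the matching counter for one execution."""
    if success:
        procedure.success_count += 1
    else:
        procedure.failure_count += 1


def get_best_procedure(procedures, task_type: str):
    """Return the highest-scoring procedure for a task type, or None."""
    candidates = [p for p in procedures if p.task_type == task_type]
    return max(candidates, key=lambda p: p.success_rate, default=None)
```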

Phase 3: Advanced Features

Sub-Issue #62-7: Memory Scoping

Priority: P1
Estimated Complexity: Medium

Components:

backend/app/services/memory/scoping/scope.py - Scope management
backend/app/services/memory/scoping/resolver.py - Scope resolution

Features:

Global scope (shared across all)
Project scope (per project)
Agent type scope (per agent type)
Agent instance scope (per instance)
Session scope (ephemeral)
Scope inheritance (child sees parent memories)
Access control policies
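
Scope inheritance (child sees parent memories) can be illustrated as a lookup that walks from the most specific scope toward the global one and returns the first hit. The numeric ordering of `ScopeLevel`, the `(level, scope_id)` tuple keys, and the `resolve` helper are assumptions for this sketch; the plan leaves the resolver design to `scoping/resolver.py`.

```python
from enum import IntEnum


class ScopeLevel(IntEnum):
    # Ordered from broadest to most specific, mirroring the scoping hierarchy.
    GLOBAL = 0
    PROJECT = 1
    AGENT_TYPE = 2
    AGENT_INSTANCE = 3
    SESSION = 4


def resolve(store, chain, key, default=None):
    """Look up `key` in the most specific scope first, then fall back to parents.

    store: dict mapping (ScopeLevel, scope_id) -> dict of key/value memories
    chain: the (ScopeLevel, scope_id) pairs that apply to the caller
    """
    for scope in sorted(chain, key=lambda s: s[0], reverse=True):
        values = store.get(scope, {})
        if key in values:
            return values[key]
    return default
```

In this shape a project-level value shadows a global one with the same key, while keys defined only globally remain visible to every child scope.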

Sub-Issue #62-8: Memory Indexing & Retrieval

Priority: P1
Estimated Complexity: High

Components:

backend/app/services/memory/indexing/index.py - Memory indexer
backend/app/services/memory/indexing/retrieval.py - Retrieval engine

Features:

Vector embeddings for all memory types
Temporal index (by time)
Entity index (by entities mentioned)
Outcome index (by success/failure)
Hybrid retrieval (vector + filters)
Relevance scoring
Retrieval caching
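
Hybrid retrieval (vector + filters) can be sketched as "filter on metadata first, then rank the survivors by cosine similarity". The example below does this in pure Python over dictionaries; the `hybrid_search` name and item layout are hypothetical, and in the planned system the ranking would be delegated to pgvector rather than computed in application code.

```python
import math


def cosine(a, b) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(items, query_embedding, filters=None, limit=10):
    """Apply exact-match filters, then rank remaining items by similarity.

    items: list of dicts, each with an "embedding" key plus metadata fields
    Returns a list of (score, item) pairs, best match first.
    """
    filters = filters or {}
    matches = [
        item for item in items
        if all(item.get(field) == value for field, value in filters.items())
    ]
    scored = [(cosine(item["embedding"], query_embedding), item) for item in matches]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]
```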

Sub-Issue #62-9: Memory Consolidation

Priority: P2
Estimated Complexity: High

Components:

backend/app/services/memory/consolidation/service.py - Consolidation service
backend/app/tasks/memory_consolidation.py - Celery tasks

Features:

Working → Episodic transfer (session end)
Episodic → Semantic extraction (learn facts)
Episodic → Procedural extraction (learn procedures)
Nightly consolidation Celery tasks
Memory pruning (remove low-value)
Importance-based retention
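
Memory pruning with importance-based retention could be expressed as a selection rule that only removes episodes that are both old and low-value. The thresholds below (90 days, importance under 0.3) and the `select_for_pruning` name are placeholders chosen for the example, not values from the plan.

```python
from datetime import datetime, timedelta


def select_for_pruning(
    episodes,
    now: datetime,
    min_importance: float = 0.3,          # assumed threshold
    max_age: timedelta = timedelta(days=90),  # assumed retention window
):
    """Return episodes that are older than `max_age` AND below `min_importance`.

    episodes: list of dicts with "occurred_at" (datetime) and "importance_score"
    High-importance episodes are retained regardless of age.
    """
    return [
        episode for episode in episodes
        if now - episode["occurred_at"] > max_age
        and episode["importance_score"] < min_importance
    ]
```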

Phase 4: Integration

Sub-Issue #62-10: MCP Tools Definition

Priority: P0 - Required for agent usage
Estimated Complexity: Medium

MCP Tools:

remember - Store in memory

{
  "memory_type": "working|episodic|semantic|procedural",
  "content": "...",
  "importance": 0.8,
  "ttl_seconds": 3600
}

recall - Retrieve from memory

{
  "query": "...",
  "memory_types": ["episodic", "semantic"],
  "limit": 10,
  "filters": {"outcome": "success"}
}

forget - Remove from memory

{
  "memory_type": "working",
  "key": "temp_calculation"
}

reflect - Analyze patterns

{
  "analysis_type": "recent_patterns|success_factors|failure_patterns"
}

get_memory_stats - Usage statistics
search_procedures - Find relevant procedures
record_outcome - Record task success/failure
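
Before a tool call reaches the memory layer, its payload needs validating. The sketch below checks a `remember` payload against the fields shown above; the default importance of 0.5, the treatment of `ttl_seconds` as optional, and the `parse_remember` helper are assumptions rather than part of the tool definition.

```python
from dataclasses import dataclass
from typing import Optional

MEMORY_TYPES = {"working", "episodic", "semantic", "procedural"}


@dataclass(frozen=True)
class RememberRequest:
    memory_type: str
    content: str
    importance: float = 0.5          # assumed default
    ttl_seconds: Optional[int] = None  # assumed optional


def parse_remember(payload: dict) -> RememberRequest:
    """Validate a `remember` payload, raising ValueError on bad input."""
    try:
        request = RememberRequest(**payload)
    except TypeError as exc:  # missing or unexpected keys
        raise ValueError(f"invalid remember payload: {exc}") from exc
    if request.memory_type not in MEMORY_TYPES:
        raise ValueError(f"unknown memory_type: {request.memory_type!r}")
    if not 0.0 <= request.importance <= 1.0:
        raise ValueError("importance must be between 0 and 1")
    if request.ttl_seconds is not None and request.ttl_seconds <= 0:
        raise ValueError("ttl_seconds must be positive")
    return request
```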

Sub-Issue #62-11: Component Integration

Priority: P1
Estimated Complexity: Medium

Integrations:

Context Engine (#61) - Include relevant memories in context assembly
Knowledge Base (#57) - Coordinate with KB to avoid duplication
LLM Gateway (#56) - Use for embedding generation
Agent lifecycle hooks (spawn, pause, resume, terminate)

Sub-Issue #62-12: Caching Layer

Priority: P2
Estimated Complexity: Medium

Features:

Hot memory caching (frequently accessed)
Retrieval result caching
Embedding caching
Cache invalidation strategies
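
Embedding caching is the item most directly tied to the embedding-cost risk, so a small sketch may help. This one keys entries by a SHA-256 hash of the text, evicts the least recently used entry when full, and supports explicit invalidation. The `EmbeddingCache` class, its `embed` callable, and the `max_entries` default are hypothetical; a production version would more likely sit in Redis.

```python
import hashlib
from collections import OrderedDict


class EmbeddingCache:
    """LRU cache in front of an embedding function (illustrative only)."""

    def __init__(self, embed, max_entries: int = 1024):
        self._embed = embed            # callable: str -> list[float]
        self._max_entries = max_entries
        self._entries = OrderedDict()  # content hash -> vector, oldest first
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(text: str) -> str:
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def get(self, text: str):
        key = self._key(text)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        vector = self._embed(text)
        self._entries[key] = vector
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict least recently used
        return vector

    def invalidate(self, text: str) -> None:
        self._entries.pop(self._key(text), None)
```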

Phase 5: Intelligence & Quality

Sub-Issue #62-13: Memory Reflection

Priority: P3
Estimated Complexity: High

Features:

Pattern detection in episodic memory
Success/failure factor analysis
Anomaly detection
Insights generation

Sub-Issue #62-14: Metrics & Observability

Priority: P2
Estimated Complexity: Low

Metrics:

memory_size_bytes by type and scope
memory_operations_total counter
memory_retrieval_latency_seconds histogram
memory_consolidation_duration_seconds histogram
procedure_success_rate gauge

Sub-Issue #62-15: Documentation & Final Testing

Priority: P0
Estimated Complexity: Medium

Deliverables:

README with architecture overview
API documentation with examples
Integration guide
E2E tests for full memory lifecycle
Achieve >90% code coverage
Performance benchmarks

Implementation Order

Phase 1 (Foundation) - Sequential
  #62-1 → #62-2 → #62-3

Phase 2 (Memory Types) - Can parallelize after Phase 1
  #62-4, #62-5, #62-6 (parallel after #62-3)

Phase 3 (Advanced) - Sequential within phase
  #62-7 → #62-8 → #62-9

Phase 4 (Integration) - After Phase 2
  #62-10 → #62-11 → #62-12

Phase 5 (Quality) - Final
  #62-13, #62-14, #62-15

Performance Targets

Metric	Target	Notes
Working memory get/set	<5ms	P95
Episodic memory retrieval	<100ms	P95, as per epic
Semantic memory search	<100ms	P95
Procedural memory matching	<50ms	P95
Consolidation batch	<30s	Per 1000 episodes
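
Since most targets are stated as P95 latencies, a benchmark needs a consistent way to compute that percentile and compare it with the table. A minimal sketch using the standard library follows; the operation keys in `TARGETS_MS` and the choice of the "inclusive" quantile method are assumptions made for the example.

```python
import statistics

# P95 targets in milliseconds, taken from the table above.
TARGETS_MS = {
    "working_get_set": 5,
    "episodic_retrieval": 100,
    "semantic_search": 100,
    "procedural_matching": 50,
}


def p95(samples_ms) -> float:
    """95th percentile of latency samples (needs at least two samples)."""
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    return statistics.quantiles(samples_ms, n=100, method="inclusive")[94]


def meets_target(operation: str, samples_ms) -> bool:
    """True if the measured P95 is strictly below the target for `operation`."""
    return p95(samples_ms) < TARGETS_MS[operation]
```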

Risk Mitigation

Embedding costs - Use caching aggressively, batch embeddings
Storage growth - Implement TTL, pruning, and archival policies
Query performance - HNSW indexes, pagination, query optimization
Scope complexity - Start simple (instance scope only), add hierarchy later

Review Checkpoints

After each sub-issue:

Run make validate-all
Multi-agent code review
Verify E2E stack still works
Commit with granular message
