feat(memory): #87 project setup & core architecture
Implements Sub-Issue #87 of Issue #62 (Agent Memory System).

Core infrastructure:
- memory/types.py: Type definitions for all memory types (Working, Episodic,
  Semantic, Procedural) with enums for MemoryType, ScopeLevel, Outcome
- memory/config.py: MemorySettings with MEM_ env prefix, thread-safe singleton
- memory/exceptions.py: Comprehensive exception hierarchy for memory operations
- memory/manager.py: MemoryManager facade with placeholder methods

Directory structure:
- working/: Working memory (Redis/in-memory) - to be implemented in #89
- episodic/: Episodic memory (experiences) - to be implemented in #90
- semantic/: Semantic memory (facts) - to be implemented in #91
- procedural/: Procedural memory (skills) - to be implemented in #92
- scoping/: Scope management - to be implemented in #93
- indexing/: Vector indexing - to be implemented in #94
- consolidation/: Memory consolidation - to be implemented in #95

Tests: 71 unit tests for config, types, and exceptions
Docs: Comprehensive implementation plan at docs/architecture/memory-system-plan.md

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
526
docs/architecture/memory-system-plan.md
Normal file
@@ -0,0 +1,526 @@
# Agent Memory System - Implementation Plan

## Issue #62 - Part of Epic #60 (Phase 2: MCP Integration)

**Branch:** `feature/62-agent-memory-system`
**Parent Epic:** #60 [EPIC] Phase 2: MCP Integration
**Dependencies:** #56 (LLM Gateway), #57 (Knowledge Base), #61 (Context Management Engine)

---

## Executive Summary

The Agent Memory System provides multi-tier cognitive memory for AI agents, enabling them to:
- Maintain state across sessions (Working Memory)
- Learn from past experiences (Episodic Memory)
- Store and retrieve facts (Semantic Memory)
- Develop and reuse procedures (Procedural Memory)

### Architecture Overview

```
┌─────────────────────────────────────────────────────────────────┐
│                       Agent Memory System                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   ┌─────────────────┐                     ┌─────────────────┐   │
│   │ Working Memory  │───────────────────▶ │ Episodic Memory │   │
│   │ (Redis/In-Mem)  │     consolidate     │  (PostgreSQL)   │   │
│   │                 │                     │                 │   │
│   │ • Current task  │                     │ • Past sessions │   │
│   │ • Variables     │                     │ • Experiences   │   │
│   │ • Scratchpad    │                     │ • Outcomes      │   │
│   └─────────────────┘                     └────────┬────────┘   │
│                                                    │            │
│                                            extract │            │
│                                                    ▼            │
│   ┌─────────────────┐                     ┌─────────────────┐   │
│   │Procedural Memory│◀────────────────────│ Semantic Memory │   │
│   │  (PostgreSQL)   │     learn from      │  (PostgreSQL +  │   │
│   │                 │                     │    pgvector)    │   │
│   │ • Procedures    │                     │                 │   │
│   │ • Skills        │                     │ • Facts         │   │
│   │ • Patterns      │                     │ • Entities      │   │
│   └─────────────────┘                     │ • Relationships │   │
│                                           └─────────────────┘   │
└─────────────────────────────────────────────────────────────────┘
```

### Memory Scoping Hierarchy

```
Global Memory (shared by all)
└── Project Memory (per project)
    └── Agent Type Memory (per agent type)
        └── Agent Instance Memory (per instance)
            └── Session Memory (ephemeral)
```

---

## Sub-Issue Breakdown

### Phase 1: Foundation (Critical Path)

#### Sub-Issue #62-1: Project Setup & Core Architecture
**Priority:** P0 - Must complete first
**Estimated Complexity:** Medium

**Tasks:**
- [ ] Create `backend/app/services/memory/` directory structure
- [ ] Create `__init__.py` with public API exports
- [ ] Create `config.py` with `MemorySettings` (Pydantic)
- [ ] Define base interfaces in `types.py`:
  - `MemoryItem` - Base class for all memory items
  - `MemoryScope` - Enum for scoping levels
  - `MemoryStore` - Abstract base for storage backends
- [ ] Create `manager.py` with `MemoryManager` class (facade)
- [ ] Create `exceptions.py` with memory-specific errors
- [ ] Write ADR-010 documenting memory architecture decisions
- [ ] Create dependency injection setup
- [ ] Unit tests for configuration and types

**Deliverables:**
- Directory structure matching existing patterns (like `context/`, `safety/`)
- Configuration with `MEM_` env prefix
- Type definitions for all memory concepts
- Comprehensive unit tests
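
As a rough illustration of the `MEM_` prefix deliverable, the sketch below loads settings from prefixed environment variables and exposes them through a lock-guarded singleton (the commit message mentions a thread-safe singleton). The plan calls for a Pydantic-based `MemorySettings`; this version uses only the standard library so it runs without dependencies. The field names, defaults, and the `get_settings` helper are assumptions for illustration, apart from the 1536 embedding size, which matches the `VECTOR(1536)` columns in the schema.

```python
import os
import threading
from dataclasses import dataclass, fields

ENV_PREFIX = "MEM_"  # prefix named in the deliverables


@dataclass(frozen=True)
class MemorySettings:
    # Hypothetical fields; the real settings class may differ.
    working_ttl_seconds: int = 3600
    embedding_dimensions: int = 1536
    redis_url: str = "redis://localhost:6379/0"

    @classmethod
    def from_env(cls, environ=None) -> "MemorySettings":
        """Build settings from MEM_-prefixed variables, falling back to defaults."""
        env = os.environ if environ is None else environ
        overrides = {}
        for f in fields(cls):
            raw = env.get(ENV_PREFIX + f.name.upper())
            if raw is not None:
                # Coerce using the type of the default value (int or str here).
                overrides[f.name] = type(f.default)(raw)
        return cls(**overrides)


_settings = None
_lock = threading.Lock()


def get_settings() -> MemorySettings:
    """Return a process-wide instance, created once under a lock."""
    global _settings
    if _settings is None:
        with _lock:
            if _settings is None:  # re-check so only one thread builds it
                _settings = MemorySettings.from_env()
    return _settings
```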

---

#### Sub-Issue #62-2: Database Schema & Storage Layer
**Priority:** P0 - Required for all memory types
**Estimated Complexity:** High

**Database Tables:**

1. **`working_memory`** - Ephemeral key-value storage
   - `id` (UUID, PK)
   - `scope_type` (ENUM: global/project/agent_type/agent_instance/session)
   - `scope_id` (VARCHAR - the ID for the scope level)
   - `key` (VARCHAR)
   - `value` (JSONB)
   - `expires_at` (TIMESTAMP WITH TZ)
   - `created_at`, `updated_at`

2. **`episodes`** - Experiential memories
   - `id` (UUID, PK)
   - `project_id` (UUID, FK)
   - `agent_instance_id` (UUID, FK, nullable)
   - `agent_type_id` (UUID, FK, nullable)
   - `session_id` (VARCHAR)
   - `task_type` (VARCHAR)
   - `task_description` (TEXT)
   - `actions` (JSONB)
   - `context_summary` (TEXT)
   - `outcome` (ENUM: success/failure/partial)
   - `outcome_details` (TEXT)
   - `duration_seconds` (FLOAT)
   - `tokens_used` (BIGINT)
   - `lessons_learned` (JSONB - list of strings)
   - `importance_score` (FLOAT, 0-1)
   - `embedding` (VECTOR(1536))
   - `occurred_at` (TIMESTAMP WITH TZ)
   - `created_at`, `updated_at`

3. **`facts`** - Semantic knowledge
   - `id` (UUID, PK)
   - `project_id` (UUID, FK, nullable - null for global)
   - `subject` (VARCHAR)
   - `predicate` (VARCHAR)
   - `object` (TEXT)
   - `confidence` (FLOAT, 0-1)
   - `source_episode_ids` (UUID[])
   - `first_learned` (TIMESTAMP WITH TZ)
   - `last_reinforced` (TIMESTAMP WITH TZ)
   - `reinforcement_count` (INT)
   - `embedding` (VECTOR(1536))
   - `created_at`, `updated_at`

4. **`procedures`** - Learned skills
   - `id` (UUID, PK)
   - `project_id` (UUID, FK, nullable)
   - `agent_type_id` (UUID, FK, nullable)
   - `name` (VARCHAR)
   - `trigger_pattern` (TEXT)
   - `steps` (JSONB)
   - `success_count` (INT)
   - `failure_count` (INT)
   - `last_used` (TIMESTAMP WITH TZ)
   - `embedding` (VECTOR(1536))
   - `created_at`, `updated_at`

5. **`memory_consolidation_log`** - Consolidation tracking
   - `id` (UUID, PK)
   - `consolidation_type` (ENUM)
   - `source_count` (INT)
   - `result_count` (INT)
   - `started_at`, `completed_at`
   - `status` (ENUM: pending/running/completed/failed)
   - `error` (TEXT, nullable)

**Tasks:**
- [ ] Create SQLAlchemy models in `backend/app/models/memory/`
- [ ] Create Alembic migration with all tables
- [ ] Add pgvector indexes (HNSW for episodes, facts, procedures)
- [ ] Create repository classes in `backend/app/crud/memory/`
- [ ] Add composite indexes for common query patterns
- [ ] Unit tests for all repositories

---

#### Sub-Issue #62-3: Working Memory Implementation
**Priority:** P0 - Core functionality
**Estimated Complexity:** Medium

**Components:**
- `backend/app/services/memory/working/memory.py` - WorkingMemory class
- `backend/app/services/memory/working/storage.py` - Redis + in-memory backend

**Features:**
- [ ] Session-scoped containers with automatic cleanup
- [ ] Variable storage (get/set/delete)
- [ ] Task state tracking (current step, status, progress)
- [ ] Scratchpad for reasoning steps
- [ ] Configurable capacity limits
- [ ] TTL-based expiration
- [ ] Checkpoint/snapshot support for recovery
- [ ] Redis primary storage with in-memory fallback

**API:**
```python
class WorkingMemory:
    async def set(self, key: str, value: Any, ttl_seconds: int | None = None) -> None: ...
    async def get(self, key: str, default: Any = None) -> Any: ...
    async def delete(self, key: str) -> bool: ...
    async def exists(self, key: str) -> bool: ...
    async def list_keys(self, pattern: str = "*") -> list[str]: ...
    async def get_all(self) -> dict[str, Any]: ...
    async def clear(self) -> int: ...
    async def set_task_state(self, state: TaskState) -> None: ...
    async def get_task_state(self) -> TaskState | None: ...
    async def append_scratchpad(self, content: str) -> None: ...
    async def get_scratchpad(self) -> list[str]: ...
    async def create_checkpoint(self) -> str: ...  # Returns checkpoint ID
    async def restore_checkpoint(self, checkpoint_id: str) -> None: ...
```

---

### Phase 2: Memory Types

#### Sub-Issue #62-4: Episodic Memory Implementation
**Priority:** P1
**Estimated Complexity:** High

**Components:**
- `backend/app/services/memory/episodic/memory.py` - EpisodicMemory class
- `backend/app/services/memory/episodic/recorder.py` - Episode recording
- `backend/app/services/memory/episodic/retrieval.py` - Retrieval strategies

**Features:**
- [ ] Episode recording during agent execution
- [ ] Store task completions with context
- [ ] Store failures with error context
- [ ] Retrieval by semantic similarity (vector search)
- [ ] Retrieval by recency
- [ ] Retrieval by outcome (success/failure)
- [ ] Importance scoring based on outcome significance
- [ ] Episode summarization for long-term storage

**API:**
```python
class EpisodicMemory:
    async def record_episode(self, episode: EpisodeCreate) -> Episode: ...
    async def search_similar(self, query: str, limit: int = 10) -> list[Episode]: ...
    async def get_recent(self, limit: int = 10, since: datetime | None = None) -> list[Episode]: ...
    async def get_by_outcome(self, outcome: Outcome, limit: int = 10) -> list[Episode]: ...
    async def get_by_task_type(self, task_type: str, limit: int = 10) -> list[Episode]: ...
    async def update_importance(self, episode_id: UUID, score: float) -> None: ...
    async def summarize_episodes(self, episode_ids: list[UUID]) -> str: ...
```
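
Retrieval by semantic similarity boils down to ranking stored embeddings against a query embedding. The sketch below shows that ranking as an exact brute-force cosine search, assuming the query text has already been turned into a vector. In the planned system this work would presumably be delegated to pgvector with an HNSW index, which returns approximate results; the `StoredEpisode` type and the function signature here are illustrative assumptions and differ from the `search_similar` method above, which takes a string.

```python
import math
from dataclasses import dataclass


@dataclass
class StoredEpisode:
    # Minimal stand-in for an `episodes` row; only the fields used here.
    task_description: str
    embedding: list[float]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_similar(query_embedding: list[float],
                   episodes: list[StoredEpisode],
                   limit: int = 10) -> list[StoredEpisode]:
    """Rank episodes by cosine similarity to the query, best first."""
    ranked = sorted(
        episodes,
        key=lambda e: cosine_similarity(query_embedding, e.embedding),
        reverse=True,
    )
    return ranked[:limit]
```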

---

#### Sub-Issue #62-5: Semantic Memory Implementation
**Priority:** P1
**Estimated Complexity:** High

**Components:**
- `backend/app/services/memory/semantic/memory.py` - SemanticMemory class
- `backend/app/services/memory/semantic/extraction.py` - Fact extraction from episodes
- `backend/app/services/memory/semantic/verification.py` - Fact verification

**Features:**
- [ ] Fact storage with triple format (subject, predicate, object)
- [ ] Confidence scoring and decay
- [ ] Fact extraction from episodic memory
- [ ] Conflict resolution for contradictory facts
- [ ] Retrieval by query (semantic search)
- [ ] Retrieval by entity (subject or object)
- [ ] Source tracking (which episodes contributed)
- [ ] Reinforcement on repeated learning

**API:**
```python
class SemanticMemory:
    async def store_fact(self, fact: FactCreate) -> Fact: ...
    async def search_facts(self, query: str, limit: int = 10) -> list[Fact]: ...
    async def get_by_entity(self, entity: str, limit: int = 20) -> list[Fact]: ...
    async def reinforce_fact(self, fact_id: UUID) -> Fact: ...
    async def deprecate_fact(self, fact_id: UUID, reason: str) -> None: ...
    async def extract_facts_from_episode(self, episode: Episode) -> list[Fact]: ...
    async def resolve_conflict(self, fact_ids: list[UUID]) -> Fact: ...
```
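
The plan lists confidence decay and reinforcement without fixing a formula. One plausible reading, sketched below, is exponential decay driven by the time since `last_reinforced`, plus a reinforcement step that closes a fixed fraction of the gap to 1.0. Both formulas, the 90-day half-life, and the 0.2 gain are assumptions chosen for illustration rather than values taken from the plan.

```python
def decayed_confidence(confidence: float,
                       days_since_reinforced: float,
                       half_life_days: float = 90.0) -> float:
    """Exponential decay: confidence halves every `half_life_days` (assumed value)."""
    return confidence * 0.5 ** (days_since_reinforced / half_life_days)


def reinforced_confidence(confidence: float, gain: float = 0.2) -> float:
    """Move confidence a fixed fraction (`gain`, assumed) of the way toward 1.0."""
    return confidence + gain * (1.0 - confidence)
```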

---

#### Sub-Issue #62-6: Procedural Memory Implementation
**Priority:** P2
**Estimated Complexity:** Medium

**Components:**
- `backend/app/services/memory/procedural/memory.py` - ProceduralMemory class
- `backend/app/services/memory/procedural/matching.py` - Procedure matching

**Features:**
- [ ] Procedure recording from successful task patterns
- [ ] Trigger pattern matching
- [ ] Step-by-step procedure storage
- [ ] Success/failure rate tracking
- [ ] Procedure suggestion based on context
- [ ] Procedure versioning

**API:**
```python
class ProceduralMemory:
    async def record_procedure(self, procedure: ProcedureCreate) -> Procedure: ...
    async def find_matching(self, context: str, limit: int = 5) -> list[Procedure]: ...
    async def record_outcome(self, procedure_id: UUID, success: bool) -> None: ...
    async def get_best_procedure(self, task_type: str) -> Procedure | None: ...
    async def update_steps(self, procedure_id: UUID, steps: list[Step]) -> Procedure: ...
```

---

### Phase 3: Advanced Features

#### Sub-Issue #62-7: Memory Scoping
**Priority:** P1
**Estimated Complexity:** Medium

**Components:**
- `backend/app/services/memory/scoping/scope.py` - Scope management
- `backend/app/services/memory/scoping/resolver.py` - Scope resolution

**Features:**
- [ ] Global scope (shared across all)
- [ ] Project scope (per project)
- [ ] Agent type scope (per agent type)
- [ ] Agent instance scope (per instance)
- [ ] Session scope (ephemeral)
- [ ] Scope inheritance (child sees parent memories)
- [ ] Access control policies
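
Scope inheritance ("child sees parent memories") can be pictured as a lookup that walks from the most specific scope up to global and returns the first match. The sketch below shows that walk under simplifying assumptions: it keys stores by level only and ignores scope IDs (which project, which agent) and access control, both of which a real resolver would need. `ScopeLevel` is named in the commit message, but its integer values and the two helper functions are assumptions.

```python
from enum import IntEnum


class ScopeLevel(IntEnum):
    # Ordered from broadest to narrowest, following the scoping hierarchy.
    GLOBAL = 0
    PROJECT = 1
    AGENT_TYPE = 2
    AGENT_INSTANCE = 3
    SESSION = 4


def visible_scopes(level: ScopeLevel) -> list[ScopeLevel]:
    """Scopes a reader at `level` can see, most specific first."""
    return [s for s in sorted(ScopeLevel, reverse=True) if s <= level]


def resolve(key: str, level: ScopeLevel, stores: dict) -> object:
    """Return the value from the most specific visible scope that defines `key`."""
    for scope in visible_scopes(level):
        if key in stores.get(scope, {}):
            return stores[scope][key]
    return None
```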

---

#### Sub-Issue #62-8: Memory Indexing & Retrieval
**Priority:** P1
**Estimated Complexity:** High

**Components:**
- `backend/app/services/memory/indexing/index.py` - Memory indexer
- `backend/app/services/memory/indexing/retrieval.py` - Retrieval engine

**Features:**
- [ ] Vector embeddings for all memory types
- [ ] Temporal index (by time)
- [ ] Entity index (by entities mentioned)
- [ ] Outcome index (by success/failure)
- [ ] Hybrid retrieval (vector + filters)
- [ ] Relevance scoring
- [ ] Retrieval caching

---

#### Sub-Issue #62-9: Memory Consolidation
**Priority:** P2
**Estimated Complexity:** High

**Components:**
- `backend/app/services/memory/consolidation/service.py` - Consolidation service
- `backend/app/tasks/memory_consolidation.py` - Celery tasks

**Features:**
- [ ] Working → Episodic transfer (session end)
- [ ] Episodic → Semantic extraction (learn facts)
- [ ] Episodic → Procedural extraction (learn procedures)
- [ ] Nightly consolidation Celery tasks
- [ ] Memory pruning (remove low-value)
- [ ] Importance-based retention
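
Pruning and importance-based retention could be combined into a single selection rule: an episode becomes a pruning candidate only when it is both older than a retention window and below an importance threshold. The sketch below expresses that rule. The 30-day window, the 0.3 threshold, and the `EpisodeMeta` type are assumptions for illustration; the plan does not specify a policy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class EpisodeMeta:
    # Subset of `episodes` columns needed for the retention decision.
    id: str
    importance_score: float  # 0-1, as in the schema
    occurred_at: datetime


def select_for_pruning(episodes: list[EpisodeMeta],
                       now: datetime,
                       min_importance: float = 0.3,
                       max_age: timedelta = timedelta(days=30)) -> list[str]:
    """Return ids of episodes that are both old and low-importance."""
    cutoff = now - max_age
    return [
        e.id
        for e in episodes
        if e.occurred_at < cutoff and e.importance_score < min_importance
    ]
```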

---

### Phase 4: Integration

#### Sub-Issue #62-10: MCP Tools Definition
**Priority:** P0 - Required for agent usage
**Estimated Complexity:** Medium

**MCP Tools:**

1. **`remember`** - Store in memory
   ```json
   {
     "memory_type": "working|episodic|semantic|procedural",
     "content": "...",
     "importance": 0.8,
     "ttl_seconds": 3600
   }
   ```

2. **`recall`** - Retrieve from memory
   ```json
   {
     "query": "...",
     "memory_types": ["episodic", "semantic"],
     "limit": 10,
     "filters": {"outcome": "success"}
   }
   ```

3. **`forget`** - Remove from memory
   ```json
   {
     "memory_type": "working",
     "key": "temp_calculation"
   }
   ```

4. **`reflect`** - Analyze patterns
   ```json
   {
     "analysis_type": "recent_patterns|success_factors|failure_patterns"
   }
   ```

5. **`get_memory_stats`** - Usage statistics
6. **`search_procedures`** - Find relevant procedures
7. **`record_outcome`** - Record task success/failure

---

#### Sub-Issue #62-11: Component Integration
**Priority:** P1
**Estimated Complexity:** Medium

**Integrations:**
- [ ] Context Engine (#61) - Include relevant memories in context assembly
- [ ] Knowledge Base (#57) - Coordinate with KB to avoid duplication
- [ ] LLM Gateway (#56) - Use for embedding generation
- [ ] Agent lifecycle hooks (spawn, pause, resume, terminate)

---

#### Sub-Issue #62-12: Caching Layer
**Priority:** P2
**Estimated Complexity:** Medium

**Features:**
- [ ] Hot memory caching (frequently accessed)
- [ ] Retrieval result caching
- [ ] Embedding caching
- [ ] Cache invalidation strategies

---

### Phase 5: Intelligence & Quality

#### Sub-Issue #62-13: Memory Reflection
**Priority:** P3
**Estimated Complexity:** High

**Features:**
- [ ] Pattern detection in episodic memory
- [ ] Success/failure factor analysis
- [ ] Anomaly detection
- [ ] Insights generation

---

#### Sub-Issue #62-14: Metrics & Observability
**Priority:** P2
**Estimated Complexity:** Low

**Metrics:**
- `memory_size_bytes` by type and scope
- `memory_operations_total` counter
- `memory_retrieval_latency_seconds` histogram
- `memory_consolidation_duration_seconds` histogram
- `procedure_success_rate` gauge

---

#### Sub-Issue #62-15: Documentation & Final Testing
**Priority:** P0
**Estimated Complexity:** Medium

**Deliverables:**
- [ ] README with architecture overview
- [ ] API documentation with examples
- [ ] Integration guide
- [ ] E2E tests for full memory lifecycle
- [ ] Achieve >90% code coverage
- [ ] Performance benchmarks

---

## Implementation Order

```
Phase 1 (Foundation) - Sequential
  #62-1 → #62-2 → #62-3

Phase 2 (Memory Types) - Can parallelize after Phase 1
  #62-4, #62-5, #62-6 (parallel after #62-3)

Phase 3 (Advanced) - Sequential within phase
  #62-7 → #62-8 → #62-9

Phase 4 (Integration) - After Phase 2
  #62-10 → #62-11 → #62-12

Phase 5 (Quality) - Final
  #62-13, #62-14, #62-15
```

---

## Performance Targets

| Metric | Target | Notes |
|--------|--------|-------|
| Working memory get/set | <5ms | P95 |
| Episodic memory retrieval | <100ms | P95, as per epic |
| Semantic memory search | <100ms | P95 |
| Procedural memory matching | <50ms | P95 |
| Consolidation batch | <30s | Per 1000 episodes |
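
Most of these targets are P95 latencies, so a benchmark has to turn raw timings into a 95th percentile before comparing against the table. The sketch below uses the nearest-rank definition of a percentile and checks a sample set against the four latency targets. The metric keys and helper names are assumptions, and other percentile definitions (for example, interpolated ones) would give slightly different values on small samples.

```python
# Latency targets in milliseconds, taken from the table; key names are hypothetical.
TARGETS_MS = {
    "working_get_set": 5,
    "episodic_retrieval": 100,
    "semantic_search": 100,
    "procedural_matching": 50,
}


def p95(samples_ms: list[float]) -> float:
    """Nearest-rank 95th percentile: smallest sample with >= 95% of values at or below it."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def meets_target(metric: str, samples_ms: list[float]) -> bool:
    """True when the P95 of the samples is strictly below the target for `metric`."""
    return p95(samples_ms) < TARGETS_MS[metric]
```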

---

## Risk Mitigation

1. **Embedding costs** - Use caching aggressively, batch embeddings
2. **Storage growth** - Implement TTL, pruning, and archival policies
3. **Query performance** - HNSW indexes, pagination, query optimization
4. **Scope complexity** - Start simple (instance scope only), add hierarchy later

---

## Review Checkpoints

After each sub-issue:
1. Run `make validate-all`
2. Multi-agent code review
3. Verify E2E stack still works
4. Commit with granular message