- Added `memory_consolidation` to the task list and updated `__all__` in test files.
- Updated `source_episode_ids` in `Fact` model to use JSON for cross-database compatibility.
- Revised related database migrations to use JSONB instead of ARRAY.
- Adjusted test concurrency in Makefile for improved test performance.
Auto-fixed linting errors and formatting issues:
- Removed unused imports (F401): pytest, Any, AnalysisType, MemoryType, OutcomeType
- Removed unused variable (F841): hooks variable in test
- Applied consistent formatting across memory service and test files
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add comprehensive caching layer for the Agent Memory System:
- HotMemoryCache: LRU cache for frequently accessed memories
- Python 3.12 type parameter syntax
- Thread-safe operations with RLock
- TTL-based expiration
- Access count tracking for hot memory identification
- Scoped invalidation by type, scope, or pattern
- EmbeddingCache: Cache embeddings by content hash
- Content-hash based deduplication
- Optional Redis backing for persistence
- LRU eviction with configurable max size
- CachedEmbeddingGenerator wrapper for transparent caching
- CacheManager: Unified cache management
- Coordinates hot cache, embedding cache, and retrieval cache
- Centralized invalidation across all caches
- Aggregated statistics and hit rate tracking
- Automatic cleanup scheduling
- Cache warmup support
Performance targets:
- Cache hit rate > 80% for hot memories
- Cache operations < 1ms (memory), < 5ms (Redis)
83 new tests with comprehensive coverage.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add MCP-compatible tools that expose memory operations to agents:
Tools implemented:
- remember: Store data in working, episodic, semantic, or procedural memory
- recall: Retrieve memories by query across multiple memory types
- forget: Delete specific keys or bulk delete by pattern
- reflect: Analyze patterns in recent episodes (success/failure factors)
- get_memory_stats: Return usage statistics and breakdowns
- search_procedures: Find procedures matching trigger patterns
- record_outcome: Record task outcomes and update procedure success rates
Key components:
- tools.py: Pydantic schemas for tool argument validation with comprehensive
field constraints (importance 0-1, TTL limits, limit ranges)
- service.py: MemoryToolService coordinating memory type operations with
proper scoping via ToolContext (project_id, agent_instance_id, session_id)
- Lazy initialization of memory services (WorkingMemory, EpisodicMemory,
SemanticMemory, ProceduralMemory)
Test coverage:
- 60 tests covering tool definitions, argument validation, and service
execution paths
- Mock-based tests for all memory type interactions
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add MemoryConsolidationService with Working→Episodic→Semantic/Procedural transfer
- Add Celery tasks for session and nightly consolidation
- Implement memory pruning with importance-based retention
- Add comprehensive test suite (32 tests)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add comprehensive indexing and retrieval system for memory search:
- VectorIndex for semantic similarity search using cosine similarity
- TemporalIndex for time-based queries with range and recency support
- EntityIndex for entity-based lookups with multi-entity intersection
- OutcomeIndex for success/failure filtering on episodes
- MemoryIndexer as unified interface for all index types
- RetrievalEngine with hybrid search combining all indices
- RelevanceScorer for multi-signal relevance scoring
- RetrievalCache for LRU caching of search results
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add scope management system for hierarchical memory access:
- ScopeManager with hierarchy: Global → Project → Agent Type → Agent Instance → Session
- ScopePolicy for access control (read, write, inherit permissions)
- ScopeResolver for resolving queries across scope hierarchies with inheritance
- ScopeFilter for filtering scopes by type, project, or agent
- Access control enforcement with parent scope visibility
- Deduplication support during resolution across scopes
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements procedural memory for learned skills and procedures:
Core functionality:
- ProceduralMemory class for procedure storage/retrieval
- record_procedure with duplicate detection and step merging
- find_matching for context-based procedure search
- record_outcome for success/failure tracking
- get_best_procedure for finding highest success rate
- update_steps for procedure refinement
Supporting modules:
- ProcedureMatcher: Keyword-based procedure matching
- MatchResult/MatchContext: Matching result types
- Success rate weighting in match scoring
Test coverage:
- 43 unit tests covering all modules
- matching.py: 97% coverage
- memory.py: 86% coverage
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements the episodic memory service for storing and retrieving
agent task execution experiences. This enables learning from past
successes and failures.
Components:
- EpisodicMemory: Main service class combining recording and retrieval
- EpisodeRecorder: Handles episode creation, importance scoring
- EpisodeRetriever: Multiple retrieval strategies (recency, semantic,
outcome, importance, task type)
Key features:
- Records task completions with context, actions, outcomes
- Calculates importance scores based on outcome, duration, lessons
- Semantic search with fallback to recency when embeddings unavailable
- Full CRUD operations with statistics and summarization
- Comprehensive unit tests (50 tests, all passing)
Closes#90🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements session-scoped ephemeral memory with:
Storage Backends:
- InMemoryStorage: Thread-safe fallback with TTL support and capacity limits
- RedisStorage: Primary storage with connection pooling and JSON serialization
- Auto-fallback from Redis to in-memory when unavailable
WorkingMemory Class:
- Key-value storage with TTL and reserved key protection
- Task state tracking with progress updates
- Scratchpad for reasoning steps with timestamps
- Checkpoint/snapshot support for recovery
- Factory methods for auto-configured storage
Tests:
- 55 unit tests covering all functionality
- Tests for basic ops, TTL, capacity, concurrency
- Tests for task state, scratchpad, checkpoints
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements Sub-Issue #87 of Issue #62 (Agent Memory System).
Core infrastructure:
- memory/types.py: Type definitions for all memory types (Working, Episodic,
Semantic, Procedural) with enums for MemoryType, ScopeLevel, Outcome
- memory/config.py: MemorySettings with MEM_ env prefix, thread-safe singleton
- memory/exceptions.py: Comprehensive exception hierarchy for memory operations
- memory/manager.py: MemoryManager facade with placeholder methods
Directory structure:
- working/: Working memory (Redis/in-memory) - to be implemented in #89
- episodic/: Episodic memory (experiences) - to be implemented in #90
- semantic/: Semantic memory (facts) - to be implemented in #91
- procedural/: Procedural memory (skills) - to be implemented in #92
- scoping/: Scope management - to be implemented in #93
- indexing/: Vector indexing - to be implemented in #94
- consolidation/: Memory consolidation - to be implemented in #95
Tests: 71 unit tests for config, types, and exceptions
Docs: Comprehensive implementation plan at docs/architecture/memory-system-plan.md
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Introduced a new `context` module and its endpoints for Context Management.
- Added `/context` route to the API router for assembling LLM context, token counting, budget management, and cache invalidation.
- Implemented health checks, context assembly, token counting, and caching operations in the Context Management Engine.
- Included schemas for request/response models and tightened error handling for context-related operations.
- Added stricter budget validation in ContextRanker with explicit error handling for invalid configurations.
- Introduced `_get_valid_token_count()` helper to validate and safeguard token counts.
- Enhanced XML escaping in Claude adapter to prevent injection risks from scores and unhandled content.
- Added timeout enforcement for token counting, scoring, and compression with detailed error handling.
- Introduced tenant isolation in context caching using project and agent identifiers.
- Enhanced budget management with stricter checks for critical context overspending and buffer limitations.
- Optimized per-context locking with cleanup to prevent memory leaks in concurrent environments.
- Updated default assembly timeout settings for improved performance and reliability.
- Improved XML escaping in Claude adapter for safety against injection attacks.
- Standardized token estimation using model-specific ratios.
- Cleaned up unnecessary comments in `__all__` definitions for better readability.
- Adjusted indentation and formatting across modules for improved clarity (e.g., long lines, logical grouping).
- Simplified conditional expressions and inline comments for context scoring and ranking.
- Replaced some hard-coded values with type-safe annotations (e.g., `ClassVar`).
- Removed unused imports and ensured consistent usage across test files.
- Updated `test_score_not_cached_on_context` to clarify caching behavior.
- Improved truncation strategy logic and marker handling.
- Replace hard-coded limits with configurable settings (e.g., cache memory size, truncation strategy, relevance settings).
- Optimize parallel execution in token counting, scoring, and reranking for source diversity.
- Improve caching logic:
- Add per-context locks for safe parallel scoring.
- Reuse precomputed fingerprints for cache efficiency.
- Make truncation, scoring, and ranker behaviors fully configurable via settings.
- Add support for middle truncation, context hash-based hashing, and dynamic token limiting.
- Refactor methods for scalability and better error handling.
Tests: Updated all affected components with additional test cases.
Phase 7 of Context Management Engine - Main Engine:
- Add ContextEngine as main orchestration class
- Integrate all components: calculator, scorer, ranker, compressor, cache
- Add high-level assemble_context() API with:
- System prompt support
- Task description support
- Knowledge Base integration via MCP
- Conversation history conversion
- Tool results conversion
- Custom contexts support
- Add helper methods:
- get_budget_for_model()
- count_tokens() with caching
- invalidate_cache()
- get_stats()
- Add create_context_engine() factory function
Tests: 26 new tests, 311 total context tests passing
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Phase 5 of Context Management Engine - Model Adapters:
- Add ModelAdapter abstract base class with model matching
- Add DefaultAdapter for unknown models (plain text)
- Add ClaudeAdapter with XML-based formatting:
- <system_instructions> for system context
- <reference_documents>/<document> for knowledge
- <conversation_history>/<message> for chat
- <tool_results>/<tool_result> for tool outputs
- XML escaping for special characters
- Add OpenAIAdapter with markdown formatting:
- ## headers for sections
- ### Source headers for documents
- **ROLE** bold labels for conversation
- Code blocks for tool outputs
- Add get_adapter() factory function for model selection
Tests: 33 new tests, 256 total context tests passing
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add TokenCalculator with LLM Gateway integration for accurate token
counting with in-memory caching and fallback character-based estimation.
Implement TokenBudget for tracking allocations per context type with
budget enforcement, and BudgetAllocator for creating budgets based on
model context window sizes.
- TokenCalculator: MCP integration, caching, model-specific ratios
- TokenBudget: allocation tracking, can_fit/allocate/deallocate/reset
- BudgetAllocator: model context sizes, budget creation and adjustment
- 35 comprehensive tests covering all budget functionality
Part of #61 - Context Management Engine
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Added `record_action` in `RateLimiter` for precise tracking of slot consumption post-validation.
- Introduced deduplication mechanism for warning alerts in `CostController` to prevent spamming.
- Refactored `CostController`'s session and daily budget alert handling for improved clarity.
- Implemented test suites for `CostController` and `SafetyGuardian` to validate changes.
- Expanded integration testing to cover deduplication, validation, and loop detection edge cases.
Improved code readability and uniformity by standardizing line breaks, indentation, and inline conditions across safety-related services, models, and tests, including content filters, validation rules, and emergency controls.
The ContentFilter was appending references to DEFAULT_PATTERNS objects,
so when tests modified patterns (e.g., disabling them), those changes
persisted across test runs. Use dataclass replace() to create copies.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add MCPSafetyWrapper for safe MCP tool execution
- Add MCPToolCall/MCPToolResult models for MCP interactions
- Add SafeToolExecutor context manager
- Add SafetyMetrics collector with Prometheus export support
- Track validations, approvals, rate limits, budgets, and more
- Support for counters, gauges, and histograms
Issue #63🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add rollback manager with file checkpointing and transaction context
- Add HITL manager with approval queues and notification handlers
- Add content filter with PII, secrets, and injection detection
- Add emergency controls with stop/pause/resume capabilities
- Update SafetyConfig with checkpoint_dir setting
Issue #63🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Core MCP client implementation with comprehensive tooling:
**Services:**
- MCPClientManager: Main facade for all MCP operations
- MCPServerRegistry: Thread-safe singleton for server configs
- ConnectionPool: Connection pooling with auto-reconnection
- ToolRouter: Automatic tool routing with circuit breaker
- AsyncCircuitBreaker: Custom async-compatible circuit breaker
**Configuration:**
- YAML-based config with Pydantic models
- Environment variable expansion support
- Transport types: HTTP, SSE, STDIO
**API Endpoints:**
- GET /mcp/servers - List all MCP servers
- GET /mcp/servers/{name}/tools - List server tools
- GET /mcp/tools - List all tools from all servers
- GET /mcp/health - Health check all servers
- POST /mcp/call - Execute tool (admin only)
- GET /mcp/circuit-breakers - Circuit breaker status
- POST /mcp/circuit-breakers/{name}/reset - Reset circuit breaker
- POST /mcp/servers/{name}/reconnect - Force reconnection
**Testing:**
- 156 unit tests with comprehensive coverage
- Tests for all services, routes, and error handling
- Proper mocking and async test support
**Documentation:**
- MCP_CLIENT.md with usage examples
- Phase 2+ workflow documentation
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Reformatted multiline function calls, object definitions, and queries for improved code readability and consistency. Adjusted imports and constraints where necessary.
- Removed explicit ENUM creation statements; rely on `sa.Enum` to auto-generate ENUM types during table creation.
- Cleaned up redundant `create_type=False` arguments to streamline definitions.
## Changes
### agent_instance.py - Task Completion Counter Race Condition
- Changed `record_task_completion()` from read-modify-write pattern to
atomic SQL UPDATE
- Previously: Read instance → increment in Python memory → write back
- Now: Uses `UPDATE ... SET tasks_completed = tasks_completed + 1`
- Prevents lost updates when multiple concurrent task completions occur
### sprint.py - Row-Level Locking for Sprint Operations
- Added `with_for_update()` to `complete_sprint()` to prevent race
conditions during velocity calculation
- Added `with_for_update()` to `cancel_sprint()` for consistency
- Ensures atomic check-and-update for sprint status changes
## Impact
These fixes prevent:
- Counter metrics being lost under concurrent load
- Data corruption during sprint completion
- Race conditions with concurrent sprint status changes
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Bug Fixes:
- bulk_terminate_by_project now unassigns issues before terminating agents
to prevent orphaned issue assignments
- PATCH /issues/{id} now validates sprint status - cannot assign issues
to COMPLETED or CANCELLED sprints
- archive_project now performs cascading cleanup:
- Terminates all active agent instances
- Cancels all planned/active sprints
- Unassigns issues from terminated agents
Added edge case tests for all fixed bugs (19 new tests total):
- TestBulkTerminateEdgeCases
- TestSprintStatusValidation
- TestArchiveProjectCleanup
- TestDataIntegrityEdgeCases (IDOR protection)
Coverage: 93% (1836 tests passing)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit fixes 4 production bugs found via edge case testing:
1. BUG: System allowed assigning issues to terminated agents
- Added validation in issue creation endpoint
- Added validation in issue update endpoint
- Added validation in issue assign endpoint
2. BUG: Issues remained orphaned when agent was terminated
- Agent termination now auto-unassigns all issues from that agent
These bugs could lead to issues being assigned to non-functional agents
that would never work on them, causing work to stall silently.
Tests added in tests/api/routes/syndarix/test_edge_cases.py to verify:
- Cannot assign issue to terminated agent (3 variations)
- Issues are auto-unassigned when agent is terminated
- Various other edge cases (sprints, projects, IDOR protection)
Coverage: 88% → 93% (1830 tests passing)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
FastAPI processes routes in order, so /agents/metrics must be defined
before /agents/{agent_id} to prevent "metrics" from being parsed as a UUID.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Move stats endpoint before {issue_id} routes to prevent UUID parsing errors
- Use remove() instead of soft_delete() since Issue model lacks deleted_at column
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
FastAPI processes routes in order, so /velocity must be defined
before /{sprint_id} to prevent "velocity" from being parsed as a UUID.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Completely rewrote migration 0004 to match current model definitions:
- Added issue_type ENUM (epic, story, task, bug)
- Fixed sprint_status ENUM to include in_review
- Fixed all table columns to match models exactly
- Fixed all indexes and constraints
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add UniqueConstraint to Sprint model to ensure sprint numbers
are unique within a project, matching the migration specification.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>