syndarix

Author	SHA1	Message	Date
Felipe Cardoso	4ad3d20cf2	chore(agents): update `sort_order` values for agent types to improve logical grouping	2026-01-06 18:43:29 +01:00
Felipe Cardoso	8e16e2645e	test(forms): add unit tests for FormTextarea and FormSelect components - Add comprehensive test coverage for FormTextarea and FormSelect components to validate rendering, accessibility, props forwarding, error handling, and behavior. - Introduced function-scoped fixtures in e2e tests to ensure test isolation and address event loop issues with pytest-asyncio and SQLAlchemy.	2026-01-06 17:54:49 +01:00
Felipe Cardoso	3f23bc3db3	refactor(migrations): replace hardcoded database URL with configurable environment variable and update command syntax to use consistent quoting style	2026-01-06 17:19:28 +01:00
Felipe Cardoso	a0ec5fa2cc	test(agents): add validation tests for category and display fields Added comprehensive unit and API tests to validate AgentType category and display fields: - Category validation for valid, null, and invalid values - Icon, color, and sort_order field constraints - Typical tasks and collaboration hints handling (stripping, removing empty strings, normalization) - New API tests for field creation, filtering, updating, and grouping	2026-01-06 17:19:21 +01:00
Felipe Cardoso	9339ea30a1	feat(agents): add category and display fields to AgentType model Add 6 new fields to AgentType for better organization and UI display: - category: enum for grouping (development, design, quality, etc.) - icon: Lucide icon identifier for UI - color: hex color code for visual distinction - sort_order: display ordering within categories - typical_tasks: list of tasks the agent excels at - collaboration_hints: agent slugs that work well together Backend changes: - Add AgentTypeCategory enum to enums.py - Update AgentType model with 6 new columns and indexes - Update schemas with validators for new fields - Add category filter and /grouped endpoint to routes - Update CRUD with get_grouped_by_category method - Update seed data with categories for all 27 agents - Add migration 0007 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-06 16:11:22 +01:00
Felipe Cardoso	79cb6bfd7b	feat(agents): comprehensive agent types with rich personalities Major revamp of agent types based on SOTA personality design research: - Expanded from 6 to 27 specialized agent types - Rich personality prompts following Anthropic and CrewAI best practices - Each agent has structured prompt with Core Identity, Expertise, Principles, and Scenario Handling sections Agent Categories: - Core Development (8): Product Owner, PM, BA, Architect, Full Stack, Backend, Frontend, Mobile Engineers - Design (2): UI/UX Designer, UX Researcher - Quality & Operations (3): QA, DevOps, Security Engineers - AI/ML (5): AI/ML Engineer, Researcher, CV, NLP, MLOps Engineers - Data (2): Data Scientist, Data Engineer - Leadership (2): Technical Lead, Scrum Master - Domain Specialists (5): Financial, Healthcare, Scientific, Behavioral Psychology Experts, Technical Writer Research applied: - Anthropic Claude persona design guidelines - CrewAI role/backstory/goal patterns - Role prompting research on detailed vs generic personas - Temperature tuning per agent type (0.2-0.7 based on role) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-06 14:25:13 +01:00
Felipe Cardoso	600657adc4	fix(agents): properly initialize form with API data defaults Root cause: The demo data's model_params was missing `top_p`, but the Zod schema required all three fields (temperature, max_tokens, top_p). This caused silent validation failures when editing agent types. Fixes: 1. Add getInitialValues() that ensures all required fields have defaults 2. Handle nested validation errors in handleFormError (e.g., model_params.top_p) 3. Add useEffect to reset form when agentType changes 4. Add console.error logging for debugging validation failures 5. Update demo data to include top_p in all agent types The form now properly initializes with safe defaults for any missing fields from the API response, preventing silent validation failures. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-06 11:54:45 +01:00
Felipe Cardoso	5a4d93df26	feat(dashboard): use real API data and add 3 more demo projects Dashboard changes: - Update useDashboard hook to fetch real projects from API - Calculate stats (active projects, agents, issues) from real data - Keep pending approvals as mock (no backend endpoint yet) Demo data additions: - API Gateway Modernization project (active, complex) - Customer Analytics Dashboard project (completed) - DevOps Pipeline Automation project (active, complex) - Added sprints, agent instances, and issues for each new project Total demo data: 6 projects, 14 agents, 22 issues 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-06 03:10:10 +01:00
Felipe Cardoso	7ef217be39	feat(demo): tie all demo projects to admin user - Update demo_data.json to use "__admin__" as owner_email for all projects - Add admin user lookup in load_demo_data() with special "__admin__" key - Remove notification_email from project settings (not a valid field) This ensures demo projects are visible to the admin user when logged in. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-06 03:00:07 +01:00
Felipe Cardoso	f9a72fcb34	fix(models): use enum values instead of names for PostgreSQL Add values_callable to all enum columns so SQLAlchemy serializes using the enum's .value (lowercase) instead of .name (uppercase). PostgreSQL enum types defined in migrations use lowercase values. Fixes: invalid input value for enum autonomy_level: "MILESTONE" 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-06 02:53:45 +01:00
Felipe Cardoso	fcb0a5f86a	fix(models): add explicit enum names to match migration types SQLAlchemy's Enum() auto-generates type names from Python class names (e.g., AutonomyLevel -> autonomylevel), but migrations defined them with underscores (e.g., autonomy_level). This mismatch caused: "type 'autonomylevel' does not exist" Added explicit name parameters to all enum columns to match the migration-defined type names: - autonomy_level, project_status, project_complexity, client_mode - agent_status, sprint_status - issue_type, issue_status, issue_priority, sync_status 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-06 02:48:10 +01:00
Felipe Cardoso	92782bcb05	refactor(init_db): remove demo data file and implement structured seeding - Delete `demo_data.json` replaced by structured logic for better modularity. - Add support for seeding default agent types and new demo data structure. - Ensure demo mode only executes when explicitly enabled (settings.DEMO_MODE). - Enhance logging for improved debugging during DB initialization.	2026-01-06 02:34:34 +01:00
Felipe Cardoso	1dcf99ee38	fix(memory): use deque for metrics histograms to ensure bounded memory usage - Replace default empty list with `deque` for `memory_retrieval_latency_seconds` - Prevents unbounded memory growth by leveraging bounded circular buffer behavior	2026-01-06 02:34:28 +01:00
Felipe Cardoso	192237e69b	fix(memory): unify Outcome enum and add ABANDONED support - Add ABANDONED value to core Outcome enum in types.py - Replace duplicate OutcomeType class in mcp/tools.py with alias to Outcome - Simplify mcp/service.py to use outcome directly (no more silent mapping) - Add migration 0006 to extend PostgreSQL episode_outcome enum - Add missing constraints to migration 0005 (ix_facts_unique_triple_global) This fixes the semantic issue where ABANDONED outcomes were silently converted to FAILURE, losing information about task abandonment. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-06 01:46:48 +01:00
Felipe Cardoso	3edce9cd26	fix(memory): address critical bugs from multi-agent review Bug Fixes: - Remove singleton pattern from consolidation/reflection services to prevent stale database session bugs (session is now passed per-request) - Add LRU eviction to MemoryToolService._working dict (max 1000 sessions) to prevent unbounded memory growth - Replace O(n) list.remove() with O(1) OrderedDict.move_to_end() in RetrievalCache for better performance under load - Use deque with maxlen for metrics histograms to prevent unbounded memory growth (circular buffer with 10k max samples) - Use full UUID for checkpoint IDs instead of 8-char prefix to avoid collision risk at scale (birthday paradox at ~50k checkpoints) Test Updates: - Update checkpoint test to expect 36-char UUID - Update reflection singleton tests to expect new factory behavior - Add reset_memory_reflection() no-op for backwards compatibility 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 18:55:32 +01:00
Felipe Cardoso	35aea2d73a	perf(mcp): optimize test performance with parallel connections and reduced retries - Connect to MCP servers concurrently instead of sequentially - Reduce retry settings in test mode (IS_TEST=True): - 1 attempt instead of 3 - 100ms retry delay instead of 1s - 2s timeout instead of 30-120s Reduces MCP E2E test time from ~16s to under 1s. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 18:33:38 +01:00
Felipe Cardoso	d0f32d04f7	fix(tests): reduce TTL durations to improve test reliability - Adjusted TTL durations and sleep intervals across memory and cache tests for consistent expiration behavior. - Prevented test flakiness caused by timing discrepancies in token expiration and cache cleanup.	2026-01-05 18:29:02 +01:00
Felipe Cardoso	da85a8aba8	fix(memory): prevent entry metadata mutation in vector search - Create shallow copy of VectorIndexEntry when adding similarity score - Prevents mutation of cached entries that could corrupt shared state 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 17:39:54 +01:00
Felipe Cardoso	f8bd1011e9	security(memory): escape SQL ILIKE patterns to prevent injection - Add _escape_like_pattern() helper to escape SQL wildcards (%, _, \) - Apply escaping in SemanticMemory.search_facts and get_by_entity - Apply escaping in ProceduralMemory.search and find_best_for_task Prevents attackers from injecting SQL wildcard patterns through user-controlled search terms. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 17:39:47 +01:00
Felipe Cardoso	f057c2f0b6	fix(memory): add thread-safe singleton initialization - Add threading.Lock with double-check locking to ScopeManager - Add asyncio.Lock with double-check locking to MemoryReflection - Make reset_memory_metrics async with proper locking - Update test fixtures to handle async reset functions 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 17:39:39 +01:00
Felipe Cardoso	33ec889fc4	fix(memory): add data integrity constraints to Fact model - Change source_episode_ids from JSON to JSONB for PostgreSQL consistency - Add unique constraint for global facts (project_id IS NULL) - Add CHECK constraint ensuring reinforcement_count >= 1 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 17:39:30 +01:00
Felipe Cardoso	74b8c65741	fix(tests): move memory model tests to avoid import conflicts Moved tests/unit/models/memory/ to tests/models/memory/ to avoid Python import path conflicts when pytest collects all tests. The conflict was caused by tests/models/ and tests/unit/models/ both having __init__.py files, causing Python to confuse app.models.memory imports. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 15:45:30 +01:00
Felipe Cardoso	b232298c61	feat(memory): add memory consolidation task and switch `source_episode_ids` to JSON - Added `memory_consolidation` to the task list and updated `__all__` in test files. - Updated `source_episode_ids` in `Fact` model to use JSON for cross-database compatibility. - Revised related database migrations to use JSONB instead of ARRAY. - Adjusted test concurrency in Makefile for improved test performance.	2026-01-05 15:38:52 +01:00
Felipe Cardoso	cf6291ac8e	style(memory): apply ruff formatting and linting fixes Auto-fixed linting errors and formatting issues: - Removed unused imports (F401): pytest, Any, AnalysisType, MemoryType, OutcomeType - Removed unused variable (F841): hooks variable in test - Applied consistent formatting across memory service and test files 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 14:07:48 +01:00
Felipe Cardoso	e3fe0439fd	docs(memory): add comprehensive memory system documentation (#101) Add complete documentation for the Agent Memory System including: - Architecture overview with ASCII diagram - Memory type descriptions (working, episodic, semantic, procedural) - Usage examples for all memory operations - Memory scoping hierarchy explanation - Consolidation flow documentation - MCP tools reference - Reflection capabilities - Configuration reference table - Integration with Context Engine - Metrics reference - Performance targets - Troubleshooting guide - Directory structure 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 11:03:57 +01:00
Felipe Cardoso	57680c3772	feat(memory): implement metrics and observability (#100) Add comprehensive metrics collector for memory system with: - Counter metrics: operations, retrievals, cache hits/misses, consolidations, episodes recorded, patterns/anomalies/insights detected - Gauge metrics: item counts, memory size, cache size, procedure success rates, active sessions, pending consolidations - Histogram metrics: working memory latency, retrieval latency, consolidation duration, embedding latency - Prometheus format export - Summary and cache stats helpers 31 tests covering all metric types, singleton pattern, and edge cases. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 11:00:53 +01:00
Felipe Cardoso	997cfaa03a	feat(memory): implement memory reflection service (#99) Add reflection layer for memory system with pattern detection, success/failure factor analysis, anomaly detection, and insights generation. Enables agents to learn from past experiences and identify optimization opportunities. Key components: - Pattern detection: recurring success/failure, action sequences, temporal, efficiency - Factor analysis: action, context, timing, resource, preceding state factors - Anomaly detection: unusual duration, token usage, failure rates, action patterns - Insight generation: optimization, warning, learning, recommendation, trend insights Also fixes pre-existing timezone issues in test_types.py (datetime.now() -> datetime.now(UTC)). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 04:22:23 +01:00
Felipe Cardoso	6954774e36	feat(memory): implement caching layer for memory operations (#98) Add comprehensive caching layer for the Agent Memory System: - HotMemoryCache: LRU cache for frequently accessed memories - Python 3.12 type parameter syntax - Thread-safe operations with RLock - TTL-based expiration - Access count tracking for hot memory identification - Scoped invalidation by type, scope, or pattern - EmbeddingCache: Cache embeddings by content hash - Content-hash based deduplication - Optional Redis backing for persistence - LRU eviction with configurable max size - CachedEmbeddingGenerator wrapper for transparent caching - CacheManager: Unified cache management - Coordinates hot cache, embedding cache, and retrieval cache - Centralized invalidation across all caches - Aggregated statistics and hit rate tracking - Automatic cleanup scheduling - Cache warmup support Performance targets: - Cache hit rate > 80% for hot memories - Cache operations < 1ms (memory), < 5ms (Redis) 83 new tests with comprehensive coverage. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 04:04:13 +01:00
Felipe Cardoso	30e5c68304	feat(memory): integrate memory system with context engine (#97) ## Changes ### New Context Type - Add MEMORY to ContextType enum for agent memory context - Create MemoryContext class with subtypes (working, episodic, semantic, procedural) - Factory methods: from_working_memory, from_episodic_memory, from_semantic_memory, from_procedural_memory ### Memory Context Source - MemoryContextSource service fetches relevant memories for context assembly - Configurable fetch limits per memory type - Parallel fetching from all memory types ### Agent Lifecycle Hooks - AgentLifecycleManager handles spawn, pause, resume, terminate events - spawn: Initialize working memory with optional initial state - pause: Create checkpoint of working memory - resume: Restore from checkpoint - terminate: Consolidate working memory to episodic memory - LifecycleHooks for custom extension points ### Context Engine Integration - Add memory_query parameter to assemble_context() - Add session_id and agent_type_id for memory scoping - Memory budget allocation (15% by default) - set_memory_source() for runtime configuration ### Tests - 48 new tests for MemoryContext, MemoryContextSource, and lifecycle hooks - All 108 memory-related tests passing - mypy and ruff checks passing 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 03:49:22 +01:00
Felipe Cardoso	0b24d4c6cc	feat(memory): implement MCP tools for agent memory operations (#96) Add MCP-compatible tools that expose memory operations to agents: Tools implemented: - remember: Store data in working, episodic, semantic, or procedural memory - recall: Retrieve memories by query across multiple memory types - forget: Delete specific keys or bulk delete by pattern - reflect: Analyze patterns in recent episodes (success/failure factors) - get_memory_stats: Return usage statistics and breakdowns - search_procedures: Find procedures matching trigger patterns - record_outcome: Record task outcomes and update procedure success rates Key components: - tools.py: Pydantic schemas for tool argument validation with comprehensive field constraints (importance 0-1, TTL limits, limit ranges) - service.py: MemoryToolService coordinating memory type operations with proper scoping via ToolContext (project_id, agent_instance_id, session_id) - Lazy initialization of memory services (WorkingMemory, EpisodicMemory, SemanticMemory, ProceduralMemory) Test coverage: - 60 tests covering tool definitions, argument validation, and service execution paths - Mock-based tests for all memory type interactions 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 03:32:10 +01:00
Felipe Cardoso	1670e05e0d	feat(memory): implement memory consolidation service and tasks (#95) - Add MemoryConsolidationService with Working→Episodic→Semantic/Procedural transfer - Add Celery tasks for session and nightly consolidation - Implement memory pruning with importance-based retention - Add comprehensive test suite (32 tests) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 03:04:28 +01:00
Felipe Cardoso	999b7ac03f	feat(memory): implement memory indexing and retrieval engine (#94) Add comprehensive indexing and retrieval system for memory search: - VectorIndex for semantic similarity search using cosine similarity - TemporalIndex for time-based queries with range and recency support - EntityIndex for entity-based lookups with multi-entity intersection - OutcomeIndex for success/failure filtering on episodes - MemoryIndexer as unified interface for all index types - RetrievalEngine with hybrid search combining all indices - RelevanceScorer for multi-signal relevance scoring - RetrievalCache for LRU caching of search results 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 02:50:13 +01:00
Felipe Cardoso	48ecb40f18	feat(memory): implement memory scoping with hierarchy and access control (#93) Add scope management system for hierarchical memory access: - ScopeManager with hierarchy: Global → Project → Agent Type → Agent Instance → Session - ScopePolicy for access control (read, write, inherit permissions) - ScopeResolver for resolving queries across scope hierarchies with inheritance - ScopeFilter for filtering scopes by type, project, or agent - Access control enforcement with parent scope visibility - Deduplication support during resolution across scopes 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 02:39:22 +01:00
Felipe Cardoso	b818f17418	feat(memory): add procedural memory implementation (Issue #92) Implements procedural memory for learned skills and procedures: Core functionality: - ProceduralMemory class for procedure storage/retrieval - record_procedure with duplicate detection and step merging - find_matching for context-based procedure search - record_outcome for success/failure tracking - get_best_procedure for finding highest success rate - update_steps for procedure refinement Supporting modules: - ProcedureMatcher: Keyword-based procedure matching - MatchResult/MatchContext: Matching result types - Success rate weighting in match scoring Test coverage: - 43 unit tests covering all modules - matching.py: 97% coverage - memory.py: 86% coverage 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 02:31:32 +01:00
Felipe Cardoso	e946787a61	feat(memory): add semantic memory implementation (Issue #91) Implements semantic memory with fact storage, retrieval, and verification: Core functionality: - SemanticMemory class for fact storage/retrieval - Fact storage as subject-predicate-object triples - Duplicate detection with reinforcement - Semantic search with text-based fallback - Entity-based retrieval - Confidence scoring and decay - Conflict resolution Supporting modules: - FactExtractor: Pattern-based fact extraction from episodes - FactVerifier: Contradiction detection and reliability scoring Test coverage: - 47 unit tests covering all modules - extraction.py: 99% coverage - verification.py: 95% coverage - memory.py: 78% coverage 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 02:23:06 +01:00
Felipe Cardoso	3554efe66a	feat(memory): add episodic memory implementation (Issue #90) Implements the episodic memory service for storing and retrieving agent task execution experiences. This enables learning from past successes and failures. Components: - EpisodicMemory: Main service class combining recording and retrieval - EpisodeRecorder: Handles episode creation, importance scoring - EpisodeRetriever: Multiple retrieval strategies (recency, semantic, outcome, importance, task type) Key features: - Records task completions with context, actions, outcomes - Calculates importance scores based on outcome, duration, lessons - Semantic search with fallback to recency when embeddings unavailable - Full CRUD operations with statistics and summarization - Comprehensive unit tests (50 tests, all passing) Closes #90 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 02:08:16 +01:00
Felipe Cardoso	bd988f76b0	fix(memory): address review findings from Issue #88 Fixes based on multi-agent review: Model Improvements: - Remove duplicate index ix_procedures_agent_type (already indexed via Column) - Fix postgresql_where to use text() instead of string literal in Fact model - Add thread-safety to Procedure.success_rate property (snapshot values) Data Integrity Constraints: - Add CheckConstraint for Episode: importance_score 0-1, duration >= 0, tokens >= 0 - Add CheckConstraint for Fact: confidence 0-1 - Add CheckConstraint for Procedure: success_count >= 0, failure_count >= 0 Migration Updates: - Add check constraints creation in upgrade() - Add check constraints removal in downgrade() Note: SQLAlchemy Column default=list is correct (callable factory pattern) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 01:54:51 +01:00
Felipe Cardoso	4974233169	feat(memory): add working memory implementation (Issue #89) Implements session-scoped ephemeral memory with: Storage Backends: - InMemoryStorage: Thread-safe fallback with TTL support and capacity limits - RedisStorage: Primary storage with connection pooling and JSON serialization - Auto-fallback from Redis to in-memory when unavailable WorkingMemory Class: - Key-value storage with TTL and reserved key protection - Task state tracking with progress updates - Scratchpad for reasoning steps with timestamps - Checkpoint/snapshot support for recovery - Factory methods for auto-configured storage Tests: - 55 unit tests covering all functionality - Tests for basic ops, TTL, capacity, concurrency - Tests for task state, scratchpad, checkpoints 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 01:51:03 +01:00
Felipe Cardoso	c9d8c0835c	feat(memory): add database schema and storage layer (Issue #88) Add SQLAlchemy models for the Agent Memory System: - WorkingMemory: Key-value storage with TTL for active sessions - Episode: Experiential memories from task executions - Fact: Semantic knowledge triples with confidence scores - Procedure: Learned skills and procedures with success tracking - MemoryConsolidationLog: Tracks consolidation jobs between memory tiers Create enums for memory system: - ScopeType: global, project, agent_type, agent_instance, session - EpisodeOutcome: success, failure, partial - ConsolidationType: working_to_episodic, episodic_to_semantic, etc. - ConsolidationStatus: pending, running, completed, failed Add Alembic migration (0005) for all memory tables with: - Foreign key relationships to projects, agent_instances, agent_types - Comprehensive indexes for query patterns - Unique constraints for key lookups and triple uniqueness - Vector embedding column placeholders (Text fallback until pgvector enabled) Fix timezone-naive datetime.now() in types.py TaskState (review feedback) Includes 30 unit tests for models and enums. Closes #88 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 01:37:58 +01:00
Felipe Cardoso	085a748929	feat(memory): #87 project setup & core architecture Implements Sub-Issue #87 of Issue #62 (Agent Memory System). Core infrastructure: - memory/types.py: Type definitions for all memory types (Working, Episodic, Semantic, Procedural) with enums for MemoryType, ScopeLevel, Outcome - memory/config.py: MemorySettings with MEM_ env prefix, thread-safe singleton - memory/exceptions.py: Comprehensive exception hierarchy for memory operations - memory/manager.py: MemoryManager facade with placeholder methods Directory structure: - working/: Working memory (Redis/in-memory) - to be implemented in #89 - episodic/: Episodic memory (experiences) - to be implemented in #90 - semantic/: Semantic memory (facts) - to be implemented in #91 - procedural/: Procedural memory (skills) - to be implemented in #92 - scoping/: Scope management - to be implemented in #93 - indexing/: Vector indexing - to be implemented in #94 - consolidation/: Memory consolidation - to be implemented in #95 Tests: 71 unit tests for config, types, and exceptions Docs: Comprehensive implementation plan at docs/architecture/memory-system-plan.md 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 01:27:36 +01:00
Felipe Cardoso	4b149b8a52	feat(tests): add unit tests for Context Management API routes - Added detailed unit tests for `/context` endpoints, covering health checks, context assembly, token counting, budget retrieval, and cache invalidation. - Included edge cases, error handling, and input validation for context-related operations. - Improved test coverage for the Context Management module with mocked dependencies and integration scenarios.	2026-01-05 01:02:49 +01:00
Felipe Cardoso	ad0c06851d	feat(tests): add comprehensive E2E tests for MCP and Agent workflows - Introduced end-to-end tests for MCP workflows, including server discovery, authentication, context engine operations, error handling, and input validation. - Added full lifecycle tests for agent workflows, covering type management, instance spawning, status transitions, and admin-only operations. - Enhanced test coverage for real-world MCP and Agent scenarios across PostgreSQL and async environments.	2026-01-05 01:02:41 +01:00
Felipe Cardoso	49359b1416	feat(api): add Context Management API and routes - Introduced a new `context` module and its endpoints for Context Management. - Added `/context` route to the API router for assembling LLM context, token counting, budget management, and cache invalidation. - Implemented health checks, context assembly, token counting, and caching operations in the Context Management Engine. - Included schemas for request/response models and tightened error handling for context-related operations.	2026-01-05 01:02:33 +01:00
Felipe Cardoso	911d950c15	feat(tests): add comprehensive integration tests for MCP stack - Introduced integration tests covering backend, LLM Gateway, Knowledge Base, and Context Engine. - Includes health checks, tool listing, token counting, and end-to-end MCP flows. - Added `RUN_INTEGRATION_TESTS` environment flag to enable selective test execution. - Includes a quick health check script to verify service availability before running tests.	2026-01-05 01:02:22 +01:00
Felipe Cardoso	b2a3ac60e0	feat: add integration testing target to Makefile - Introduced `test-integration` command for MCP integration tests. - Expanded help section with details about running integration tests. - Improved Makefile's testing capabilities for enhanced developer workflows.	2026-01-05 01:02:16 +01:00
Felipe Cardoso	60ebeaa582	test(safety): add comprehensive tests for safety framework modules Add tests to improve backend coverage from 85% to 93%: - test_audit.py: 60 tests for AuditLogger (20% -> 99%) - Hash chain integrity, sanitization, retention, handlers - Fixed bug: hash chain modification after event creation - Fixed bug: verification not using correct prev_hash - test_hitl.py: Tests for HITL manager (0% -> 100%) - test_permissions.py: Tests for permissions manager (0% -> 99%) - test_rollback.py: Tests for rollback manager (0% -> 100%) - test_metrics.py: Tests for metrics collector (0% -> 100%) - test_mcp_integration.py: Tests for MCP safety wrapper (0% -> 100%) - test_validation.py: Additional cache and edge case tests (76% -> 100%) - test_scoring.py: Lock cleanup and edge case tests (78% -> 91%) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-04 19:41:54 +01:00
Felipe Cardoso	758052dcff	feat(context): improve budget validation and XML safety in ranking and Claude adapter - Added stricter budget validation in ContextRanker with explicit error handling for invalid configurations. - Introduced `_get_valid_token_count()` helper to validate and safeguard token counts. - Enhanced XML escaping in Claude adapter to prevent injection risks from scores and unhandled content.	2026-01-04 16:02:18 +01:00
Felipe Cardoso	1628eacf2b	feat(context): enhance timeout handling, tenant isolation, and budget management - Added timeout enforcement for token counting, scoring, and compression with detailed error handling. - Introduced tenant isolation in context caching using project and agent identifiers. - Enhanced budget management with stricter checks for critical context overspending and buffer limitations. - Optimized per-context locking with cleanup to prevent memory leaks in concurrent environments. - Updated default assembly timeout settings for improved performance and reliability. - Improved XML escaping in Claude adapter for safety against injection attacks. - Standardized token estimation using model-specific ratios.	2026-01-04 15:52:50 +01:00
Felipe Cardoso	2bea057fb1	chore(context): refactor for consistency, optimize formatting, and simplify logic - Cleaned up unnecessary comments in `__all__` definitions for better readability. - Adjusted indentation and formatting across modules for improved clarity (e.g., long lines, logical grouping). - Simplified conditional expressions and inline comments for context scoring and ranking. - Replaced some hard-coded values with type-safe annotations (e.g., `ClassVar`). - Removed unused imports and ensured consistent usage across test files. - Updated `test_score_not_cached_on_context` to clarify caching behavior. - Improved truncation strategy logic and marker handling.	2026-01-04 15:23:14 +01:00
Felipe Cardoso	9e54f16e56	test(context): add edge case tests for truncation and scoring concurrency - Add tests for truncation edge cases, including zero tokens, short content, and marker handling. - Add concurrency tests for scoring to verify per-context locking and handling of multiple contexts.	2026-01-04 12:38:04 +01:00
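
Commit f8bd1011e9 describes escaping the SQL wildcards %, _ and \ before user input reaches an ILIKE pattern. The repository's `_escape_like_pattern()` is not shown in this log, so the following is only a minimal sketch of the idea. The function body, the `search_names` helper, the `items` table, and the use of SQLite's case-insensitive `LIKE ... ESCAPE` in place of PostgreSQL's `ILIKE` are all assumptions made for illustration.

```python
import sqlite3


def escape_like_pattern(term: str, escape_char: str = "\\") -> str:
    """Escape LIKE wildcards so the term is matched literally (hypothetical helper)."""
    # Escape the escape character first; otherwise the backslashes added
    # for % and _ below would themselves be doubled.
    escaped = term.replace(escape_char, escape_char * 2)
    escaped = escaped.replace("%", escape_char + "%")
    return escaped.replace("_", escape_char + "_")


def search_names(conn: sqlite3.Connection, term: str) -> list[str]:
    """Substring search that treats wildcard characters in `term` as plain text."""
    pattern = f"%{escape_like_pattern(term)}%"
    rows = conn.execute(
        # The ESCAPE clause tells the database which character marks a literal.
        "SELECT name FROM items WHERE name LIKE ? ESCAPE '\\' ORDER BY name",
        (pattern,),
    ).fetchall()
    return [row[0] for row in rows]


# Illustrative data: without escaping, the search term "100%" would match both rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?)",
    [("100% done",), ("1000 done",)],
)
print(search_names(conn, "100%"))
```

The search term is still passed as a bound parameter. Escaping only stops the wildcard characters from widening the match; it does not replace parameter binding.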

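Commits 3edce9cd26 and 1dcf99ee38 mention switching the metrics histograms to a `deque` with `maxlen` so that sample storage stays bounded (10k samples in the commit message). The actual metrics collector is not shown here. The `BoundedHistogram` class below is a hypothetical stand-in that only illustrates the circular-buffer behavior of `collections.deque`.

```python
from collections import deque


class BoundedHistogram:
    """Keeps only the most recent `max_samples` observations (illustrative class)."""

    def __init__(self, max_samples: int = 10_000) -> None:
        # With maxlen set, appending to a full deque silently drops the oldest item.
        self._samples: deque[float] = deque(maxlen=max_samples)

    def observe(self, value: float) -> None:
        self._samples.append(value)

    def count(self) -> int:
        return len(self._samples)

    def mean(self) -> float:
        # Return 0.0 for an empty histogram rather than raising ZeroDivisionError.
        return sum(self._samples) / len(self._samples) if self._samples else 0.0


hist = BoundedHistogram(max_samples=3)
for latency in [1.0, 2.0, 3.0, 4.0, 5.0]:
    hist.observe(latency)

# Only the last three observations (3.0, 4.0, 5.0) are retained.
print(hist.count(), hist.mean())
```

The trade-off is that statistics reflect only the retained window, not the full history.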

202 Commits