syndarix

Author	SHA1	Message	Date
Felipe Cardoso	70009676a3	fix(dashboard): disable SSE in demo mode and remove unused hooks - Skip SSE connection in demo mode (MSW doesn't support SSE). - Remove unused `useProjectEvents` and related real-time hooks from `Dashboard`. - Temporarily disable activity feed SSE until a global endpoint is available.	2026-01-06 02:29:00 +01:00
Felipe Cardoso	192237e69b	fix(memory): unify Outcome enum and add ABANDONED support - Add ABANDONED value to core Outcome enum in types.py - Replace duplicate OutcomeType class in mcp/tools.py with alias to Outcome - Simplify mcp/service.py to use outcome directly (no more silent mapping) - Add migration 0006 to extend PostgreSQL episode_outcome enum - Add missing constraints to migration 0005 (ix_facts_unique_triple_global) This fixes the semantic issue where ABANDONED outcomes were silently converted to FAILURE, losing information about task abandonment. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-06 01:46:48 +01:00
Felipe Cardoso	3edce9cd26	fix(memory): address critical bugs from multi-agent review Bug Fixes: - Remove singleton pattern from consolidation/reflection services to prevent stale database session bugs (session is now passed per-request) - Add LRU eviction to MemoryToolService._working dict (max 1000 sessions) to prevent unbounded memory growth - Replace O(n) list.remove() with O(1) OrderedDict.move_to_end() in RetrievalCache for better performance under load - Use deque with maxlen for metrics histograms to prevent unbounded memory growth (circular buffer with 10k max samples) - Use full UUID for checkpoint IDs instead of 8-char prefix to avoid collision risk at scale (birthday paradox at ~50k checkpoints) Test Updates: - Update checkpoint test to expect 36-char UUID - Update reflection singleton tests to expect new factory behavior - Add reset_memory_reflection() no-op for backwards compatibility 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 18:55:32 +01:00
Felipe Cardoso	35aea2d73a	perf(mcp): optimize test performance with parallel connections and reduced retries - Connect to MCP servers concurrently instead of sequentially - Reduce retry settings in test mode (IS_TEST=True): - 1 attempt instead of 3 - 100ms retry delay instead of 1s - 2s timeout instead of 30-120s Reduces MCP E2E test time from ~16s to under 1s. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 18:33:38 +01:00
Felipe Cardoso	d0f32d04f7	fix(tests): reduce TTL durations to improve test reliability - Adjusted TTL durations and sleep intervals across memory and cache tests for consistent expiration behavior. - Prevented test flakiness caused by timing discrepancies in token expiration and cache cleanup.	2026-01-05 18:29:02 +01:00
Felipe Cardoso	da85a8aba8	fix(memory): prevent entry metadata mutation in vector search - Create shallow copy of VectorIndexEntry when adding similarity score - Prevents mutation of cached entries that could corrupt shared state 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 17:39:54 +01:00
Felipe Cardoso	f8bd1011e9	security(memory): escape SQL ILIKE patterns to prevent injection - Add _escape_like_pattern() helper to escape SQL wildcards (%, _, \) - Apply escaping in SemanticMemory.search_facts and get_by_entity - Apply escaping in ProceduralMemory.search and find_best_for_task Prevents attackers from injecting SQL wildcard patterns through user-controlled search terms. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 17:39:47 +01:00
Felipe Cardoso	f057c2f0b6	fix(memory): add thread-safe singleton initialization - Add threading.Lock with double-check locking to ScopeManager - Add asyncio.Lock with double-check locking to MemoryReflection - Make reset_memory_metrics async with proper locking - Update test fixtures to handle async reset functions 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 17:39:39 +01:00
Felipe Cardoso	33ec889fc4	fix(memory): add data integrity constraints to Fact model - Change source_episode_ids from JSON to JSONB for PostgreSQL consistency - Add unique constraint for global facts (project_id IS NULL) - Add CHECK constraint ensuring reinforcement_count >= 1 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 17:39:30 +01:00
Felipe Cardoso	74b8c65741	fix(tests): move memory model tests to avoid import conflicts Moved tests/unit/models/memory/ to tests/models/memory/ to avoid Python import path conflicts when pytest collects all tests. The conflict was caused by tests/models/ and tests/unit/models/ both having __init__.py files, causing Python to confuse app.models.memory imports. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 15:45:30 +01:00
Felipe Cardoso	b232298c61	feat(memory): add memory consolidation task and switch `source_episode_ids` to JSON - Added `memory_consolidation` to the task list and updated `__all__` in test files. - Updated `source_episode_ids` in `Fact` model to use JSON for cross-database compatibility. - Revised related database migrations to use JSONB instead of ARRAY. - Adjusted test concurrency in Makefile for improved test performance.	2026-01-05 15:38:52 +01:00
Felipe Cardoso	cf6291ac8e	style(memory): apply ruff formatting and linting fixes Auto-fixed linting errors and formatting issues: - Removed unused imports (F401): pytest, Any, AnalysisType, MemoryType, OutcomeType - Removed unused variable (F841): hooks variable in test - Applied consistent formatting across memory service and test files 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 14:07:48 +01:00
Felipe Cardoso	e3fe0439fd	docs(memory): add comprehensive memory system documentation (#101 ) Add complete documentation for the Agent Memory System including: - Architecture overview with ASCII diagram - Memory type descriptions (working, episodic, semantic, procedural) - Usage examples for all memory operations - Memory scoping hierarchy explanation - Consolidation flow documentation - MCP tools reference - Reflection capabilities - Configuration reference table - Integration with Context Engine - Metrics reference - Performance targets - Troubleshooting guide - Directory structure 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 11:03:57 +01:00
Felipe Cardoso	57680c3772	feat(memory): implement metrics and observability (#100 ) Add comprehensive metrics collector for memory system with: - Counter metrics: operations, retrievals, cache hits/misses, consolidations, episodes recorded, patterns/anomalies/insights detected - Gauge metrics: item counts, memory size, cache size, procedure success rates, active sessions, pending consolidations - Histogram metrics: working memory latency, retrieval latency, consolidation duration, embedding latency - Prometheus format export - Summary and cache stats helpers 31 tests covering all metric types, singleton pattern, and edge cases. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 11:00:53 +01:00
Felipe Cardoso	997cfaa03a	feat(memory): implement memory reflection service (#99 ) Add reflection layer for memory system with pattern detection, success/failure factor analysis, anomaly detection, and insights generation. Enables agents to learn from past experiences and identify optimization opportunities. Key components: - Pattern detection: recurring success/failure, action sequences, temporal, efficiency - Factor analysis: action, context, timing, resource, preceding state factors - Anomaly detection: unusual duration, token usage, failure rates, action patterns - Insight generation: optimization, warning, learning, recommendation, trend insights Also fixes pre-existing timezone issues in test_types.py (datetime.now() -> datetime.now(UTC)). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 04:22:23 +01:00
Felipe Cardoso	6954774e36	feat(memory): implement caching layer for memory operations (#98 ) Add comprehensive caching layer for the Agent Memory System: - HotMemoryCache: LRU cache for frequently accessed memories - Python 3.12 type parameter syntax - Thread-safe operations with RLock - TTL-based expiration - Access count tracking for hot memory identification - Scoped invalidation by type, scope, or pattern - EmbeddingCache: Cache embeddings by content hash - Content-hash based deduplication - Optional Redis backing for persistence - LRU eviction with configurable max size - CachedEmbeddingGenerator wrapper for transparent caching - CacheManager: Unified cache management - Coordinates hot cache, embedding cache, and retrieval cache - Centralized invalidation across all caches - Aggregated statistics and hit rate tracking - Automatic cleanup scheduling - Cache warmup support Performance targets: - Cache hit rate > 80% for hot memories - Cache operations < 1ms (memory), < 5ms (Redis) 83 new tests with comprehensive coverage. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 04:04:13 +01:00
Felipe Cardoso	30e5c68304	feat(memory): integrate memory system with context engine (#97 ) ## Changes ### New Context Type - Add MEMORY to ContextType enum for agent memory context - Create MemoryContext class with subtypes (working, episodic, semantic, procedural) - Factory methods: from_working_memory, from_episodic_memory, from_semantic_memory, from_procedural_memory ### Memory Context Source - MemoryContextSource service fetches relevant memories for context assembly - Configurable fetch limits per memory type - Parallel fetching from all memory types ### Agent Lifecycle Hooks - AgentLifecycleManager handles spawn, pause, resume, terminate events - spawn: Initialize working memory with optional initial state - pause: Create checkpoint of working memory - resume: Restore from checkpoint - terminate: Consolidate working memory to episodic memory - LifecycleHooks for custom extension points ### Context Engine Integration - Add memory_query parameter to assemble_context() - Add session_id and agent_type_id for memory scoping - Memory budget allocation (15% by default) - set_memory_source() for runtime configuration ### Tests - 48 new tests for MemoryContext, MemoryContextSource, and lifecycle hooks - All 108 memory-related tests passing - mypy and ruff checks passing 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 03:49:22 +01:00
Felipe Cardoso	0b24d4c6cc	feat(memory): implement MCP tools for agent memory operations (#96 ) Add MCP-compatible tools that expose memory operations to agents: Tools implemented: - remember: Store data in working, episodic, semantic, or procedural memory - recall: Retrieve memories by query across multiple memory types - forget: Delete specific keys or bulk delete by pattern - reflect: Analyze patterns in recent episodes (success/failure factors) - get_memory_stats: Return usage statistics and breakdowns - search_procedures: Find procedures matching trigger patterns - record_outcome: Record task outcomes and update procedure success rates Key components: - tools.py: Pydantic schemas for tool argument validation with comprehensive field constraints (importance 0-1, TTL limits, limit ranges) - service.py: MemoryToolService coordinating memory type operations with proper scoping via ToolContext (project_id, agent_instance_id, session_id) - Lazy initialization of memory services (WorkingMemory, EpisodicMemory, SemanticMemory, ProceduralMemory) Test coverage: - 60 tests covering tool definitions, argument validation, and service execution paths - Mock-based tests for all memory type interactions 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 03:32:10 +01:00
Felipe Cardoso	1670e05e0d	feat(memory): implement memory consolidation service and tasks (#95 ) - Add MemoryConsolidationService with Working→Episodic→Semantic/Procedural transfer - Add Celery tasks for session and nightly consolidation - Implement memory pruning with importance-based retention - Add comprehensive test suite (32 tests) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 03:04:28 +01:00
Felipe Cardoso	999b7ac03f	feat(memory): implement memory indexing and retrieval engine (#94 ) Add comprehensive indexing and retrieval system for memory search: - VectorIndex for semantic similarity search using cosine similarity - TemporalIndex for time-based queries with range and recency support - EntityIndex for entity-based lookups with multi-entity intersection - OutcomeIndex for success/failure filtering on episodes - MemoryIndexer as unified interface for all index types - RetrievalEngine with hybrid search combining all indices - RelevanceScorer for multi-signal relevance scoring - RetrievalCache for LRU caching of search results 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 02:50:13 +01:00
Felipe Cardoso	48ecb40f18	feat(memory): implement memory scoping with hierarchy and access control (#93 ) Add scope management system for hierarchical memory access: - ScopeManager with hierarchy: Global → Project → Agent Type → Agent Instance → Session - ScopePolicy for access control (read, write, inherit permissions) - ScopeResolver for resolving queries across scope hierarchies with inheritance - ScopeFilter for filtering scopes by type, project, or agent - Access control enforcement with parent scope visibility - Deduplication support during resolution across scopes 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 02:39:22 +01:00
Felipe Cardoso	b818f17418	feat(memory): add procedural memory implementation (Issue #92 ) Implements procedural memory for learned skills and procedures: Core functionality: - ProceduralMemory class for procedure storage/retrieval - record_procedure with duplicate detection and step merging - find_matching for context-based procedure search - record_outcome for success/failure tracking - get_best_procedure for finding highest success rate - update_steps for procedure refinement Supporting modules: - ProcedureMatcher: Keyword-based procedure matching - MatchResult/MatchContext: Matching result types - Success rate weighting in match scoring Test coverage: - 43 unit tests covering all modules - matching.py: 97% coverage - memory.py: 86% coverage 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 02:31:32 +01:00
Felipe Cardoso	e946787a61	feat(memory): add semantic memory implementation (Issue #91 ) Implements semantic memory with fact storage, retrieval, and verification: Core functionality: - SemanticMemory class for fact storage/retrieval - Fact storage as subject-predicate-object triples - Duplicate detection with reinforcement - Semantic search with text-based fallback - Entity-based retrieval - Confidence scoring and decay - Conflict resolution Supporting modules: - FactExtractor: Pattern-based fact extraction from episodes - FactVerifier: Contradiction detection and reliability scoring Test coverage: - 47 unit tests covering all modules - extraction.py: 99% coverage - verification.py: 95% coverage - memory.py: 78% coverage 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 02:23:06 +01:00
Felipe Cardoso	3554efe66a	feat(memory): add episodic memory implementation (Issue #90 ) Implements the episodic memory service for storing and retrieving agent task execution experiences. This enables learning from past successes and failures. Components: - EpisodicMemory: Main service class combining recording and retrieval - EpisodeRecorder: Handles episode creation, importance scoring - EpisodeRetriever: Multiple retrieval strategies (recency, semantic, outcome, importance, task type) Key features: - Records task completions with context, actions, outcomes - Calculates importance scores based on outcome, duration, lessons - Semantic search with fallback to recency when embeddings unavailable - Full CRUD operations with statistics and summarization - Comprehensive unit tests (50 tests, all passing) Closes #90 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 02:08:16 +01:00
Felipe Cardoso	bd988f76b0	fix(memory): address review findings from Issue #88 Fixes based on multi-agent review: Model Improvements: - Remove duplicate index ix_procedures_agent_type (already indexed via Column) - Fix postgresql_where to use text() instead of string literal in Fact model - Add thread-safety to Procedure.success_rate property (snapshot values) Data Integrity Constraints: - Add CheckConstraint for Episode: importance_score 0-1, duration >= 0, tokens >= 0 - Add CheckConstraint for Fact: confidence 0-1 - Add CheckConstraint for Procedure: success_count >= 0, failure_count >= 0 Migration Updates: - Add check constraints creation in upgrade() - Add check constraints removal in downgrade() Note: SQLAlchemy Column default=list is correct (callable factory pattern) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 01:54:51 +01:00
Felipe Cardoso	4974233169	feat(memory): add working memory implementation (Issue #89 ) Implements session-scoped ephemeral memory with: Storage Backends: - InMemoryStorage: Thread-safe fallback with TTL support and capacity limits - RedisStorage: Primary storage with connection pooling and JSON serialization - Auto-fallback from Redis to in-memory when unavailable WorkingMemory Class: - Key-value storage with TTL and reserved key protection - Task state tracking with progress updates - Scratchpad for reasoning steps with timestamps - Checkpoint/snapshot support for recovery - Factory methods for auto-configured storage Tests: - 55 unit tests covering all functionality - Tests for basic ops, TTL, capacity, concurrency - Tests for task state, scratchpad, checkpoints 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 01:51:03 +01:00
Felipe Cardoso	c9d8c0835c	feat(memory): add database schema and storage layer (Issue #88 ) Add SQLAlchemy models for the Agent Memory System: - WorkingMemory: Key-value storage with TTL for active sessions - Episode: Experiential memories from task executions - Fact: Semantic knowledge triples with confidence scores - Procedure: Learned skills and procedures with success tracking - MemoryConsolidationLog: Tracks consolidation jobs between memory tiers Create enums for memory system: - ScopeType: global, project, agent_type, agent_instance, session - EpisodeOutcome: success, failure, partial - ConsolidationType: working_to_episodic, episodic_to_semantic, etc. - ConsolidationStatus: pending, running, completed, failed Add Alembic migration (0005) for all memory tables with: - Foreign key relationships to projects, agent_instances, agent_types - Comprehensive indexes for query patterns - Unique constraints for key lookups and triple uniqueness - Vector embedding column placeholders (Text fallback until pgvector enabled) Fix timezone-naive datetime.now() in types.py TaskState (review feedback) Includes 30 unit tests for models and enums. Closes #88 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 01:37:58 +01:00
Felipe Cardoso	085a748929	feat(memory): #87 project setup & core architecture Implements Sub-Issue #87 of Issue #62 (Agent Memory System). Core infrastructure: - memory/types.py: Type definitions for all memory types (Working, Episodic, Semantic, Procedural) with enums for MemoryType, ScopeLevel, Outcome - memory/config.py: MemorySettings with MEM_ env prefix, thread-safe singleton - memory/exceptions.py: Comprehensive exception hierarchy for memory operations - memory/manager.py: MemoryManager facade with placeholder methods Directory structure: - working/: Working memory (Redis/in-memory) - to be implemented in #89 - episodic/: Episodic memory (experiences) - to be implemented in #90 - semantic/: Semantic memory (facts) - to be implemented in #91 - procedural/: Procedural memory (skills) - to be implemented in #92 - scoping/: Scope management - to be implemented in #93 - indexing/: Vector indexing - to be implemented in #94 - consolidation/: Memory consolidation - to be implemented in #95 Tests: 71 unit tests for config, types, and exceptions Docs: Comprehensive implementation plan at docs/architecture/memory-system-plan.md 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 01:27:36 +01:00
Felipe Cardoso	4b149b8a52	feat(tests): add unit tests for Context Management API routes - Added detailed unit tests for `/context` endpoints, covering health checks, context assembly, token counting, budget retrieval, and cache invalidation. - Included edge cases, error handling, and input validation for context-related operations. - Improved test coverage for the Context Management module with mocked dependencies and integration scenarios.	2026-01-05 01:02:49 +01:00
Felipe Cardoso	ad0c06851d	feat(tests): add comprehensive E2E tests for MCP and Agent workflows - Introduced end-to-end tests for MCP workflows, including server discovery, authentication, context engine operations, error handling, and input validation. - Added full lifecycle tests for agent workflows, covering type management, instance spawning, status transitions, and admin-only operations. - Enhanced test coverage for real-world MCP and Agent scenarios across PostgreSQL and async environments.	2026-01-05 01:02:41 +01:00
Felipe Cardoso	49359b1416	feat(api): add Context Management API and routes - Introduced a new `context` module and its endpoints for Context Management. - Added `/context` route to the API router for assembling LLM context, token counting, budget management, and cache invalidation. - Implemented health checks, context assembly, token counting, and caching operations in the Context Management Engine. - Included schemas for request/response models and tightened error handling for context-related operations.	2026-01-05 01:02:33 +01:00
Felipe Cardoso	911d950c15	feat(tests): add comprehensive integration tests for MCP stack - Introduced integration tests covering backend, LLM Gateway, Knowledge Base, and Context Engine. - Includes health checks, tool listing, token counting, and end-to-end MCP flows. - Added `RUN_INTEGRATION_TESTS` environment flag to enable selective test execution. - Includes a quick health check script to verify service availability before running tests.	2026-01-05 01:02:22 +01:00
Felipe Cardoso	b2a3ac60e0	feat: add integration testing target to Makefile - Introduced `test-integration` command for MCP integration tests. - Expanded help section with details about running integration tests. - Improved Makefile's testing capabilities for enhanced developer workflows.	2026-01-05 01:02:16 +01:00
Felipe Cardoso	dea092e1bb	feat: extend Makefile with testing and validation commands, expand help section - Added new targets for testing (`test`, `test-backend`, `test-mcp`, `test-frontend`, etc.) and validation (`validate`, `validate-all`). - Enhanced help section to reflect updates, including detailed descriptions for testing, validation, and new MCP-specific commands. - Improved developer workflow by centralizing testing and linting processes in the Makefile.	2026-01-05 01:02:09 +01:00
Felipe Cardoso	4154dd5268	feat: enhance database transactions, add Makefiles, and improve Docker setup - Refactored database batch operations to ensure transaction atomicity and simplify nested structure. - Added `Makefile` for `knowledge-base` and `llm-gateway` modules to streamline development workflows. - Simplified `Dockerfile` for `llm-gateway` by removing multi-stage builds and optimizing dependencies. - Improved code readability in `collection_manager` and `failover` modules with refined logic. - Minor fixes in `test_server` and Redis health check handling for better diagnostics.	2026-01-05 00:49:19 +01:00
Felipe Cardoso	db12937495	feat: integrate MCP servers into Docker Compose files for development and deployment - Added `mcp-llm-gateway` and `mcp-knowledge-base` services to `docker-compose.dev.yml`, `docker-compose.deploy.yml`, and `docker-compose.yml` for AI agent capabilities. - Configured health checks, environment variables, and dependencies for MCP services. - Included updated resource limits and deployment settings for production environments. - Connected backend and agent services to the MCP servers.	2026-01-05 00:49:10 +01:00
Felipe Cardoso	81e1456631	test(activity): fix flaky test by generating fresh events for today group - Resolves timezone and day boundary issues by creating fresh "today" events in the test case.	2026-01-05 00:30:36 +01:00
Felipe Cardoso	58e78d8700	docs(workflow): add pre-commit hooks documentation Document the pre-commit hook setup, behavior, and rationale for protecting only main/dev branches while allowing flexibility on feature branches. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-04 19:49:45 +01:00
Felipe Cardoso	5e80139afa	chore: add pre-commit hook for protected branch validation Adds a git hook that: - Blocks commits to main/dev if validation fails - Runs `make validate` for backend changes - Runs `npm run validate` for frontend changes - Skips validation for feature branches (can run manually) To enable: git config core.hooksPath .githooks 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-04 19:42:53 +01:00
Felipe Cardoso	60ebeaa582	test(safety): add comprehensive tests for safety framework modules Add tests to improve backend coverage from 85% to 93%: - test_audit.py: 60 tests for AuditLogger (20% -> 99%) - Hash chain integrity, sanitization, retention, handlers - Fixed bug: hash chain modification after event creation - Fixed bug: verification not using correct prev_hash - test_hitl.py: Tests for HITL manager (0% -> 100%) - test_permissions.py: Tests for permissions manager (0% -> 99%) - test_rollback.py: Tests for rollback manager (0% -> 100%) - test_metrics.py: Tests for metrics collector (0% -> 100%) - test_mcp_integration.py: Tests for MCP safety wrapper (0% -> 100%) - test_validation.py: Additional cache and edge case tests (76% -> 100%) - test_scoring.py: Lock cleanup and edge case tests (78% -> 91%) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-04 19:41:54 +01:00
Felipe Cardoso	758052dcff	feat(context): improve budget validation and XML safety in ranking and Claude adapter - Added stricter budget validation in ContextRanker with explicit error handling for invalid configurations. - Introduced `_get_valid_token_count()` helper to validate and safeguard token counts. - Enhanced XML escaping in Claude adapter to prevent injection risks from scores and unhandled content.	2026-01-04 16:02:18 +01:00
Felipe Cardoso	1628eacf2b	feat(context): enhance timeout handling, tenant isolation, and budget management - Added timeout enforcement for token counting, scoring, and compression with detailed error handling. - Introduced tenant isolation in context caching using project and agent identifiers. - Enhanced budget management with stricter checks for critical context overspending and buffer limitations. - Optimized per-context locking with cleanup to prevent memory leaks in concurrent environments. - Updated default assembly timeout settings for improved performance and reliability. - Improved XML escaping in Claude adapter for safety against injection attacks. - Standardized token estimation using model-specific ratios.	2026-01-04 15:52:50 +01:00
Felipe Cardoso	2bea057fb1	chore(context): refactor for consistency, optimize formatting, and simplify logic - Cleaned up unnecessary comments in `__all__` definitions for better readability. - Adjusted indentation and formatting across modules for improved clarity (e.g., long lines, logical grouping). - Simplified conditional expressions and inline comments for context scoring and ranking. - Replaced some hard-coded values with type-safe annotations (e.g., `ClassVar`). - Removed unused imports and ensured consistent usage across test files. - Updated `test_score_not_cached_on_context` to clarify caching behavior. - Improved truncation strategy logic and marker handling.	2026-01-04 15:23:14 +01:00
Felipe Cardoso	9e54f16e56	test(context): add edge case tests for truncation and scoring concurrency - Add tests for truncation edge cases, including zero tokens, short content, and marker handling. - Add concurrency tests for scoring to verify per-context locking and handling of multiple contexts.	2026-01-04 12:38:04 +01:00
Felipe Cardoso	96e6400bd8	feat(context): enhance performance, caching, and settings management - Replace hard-coded limits with configurable settings (e.g., cache memory size, truncation strategy, relevance settings). - Optimize parallel execution in token counting, scoring, and reranking for source diversity. - Improve caching logic: - Add per-context locks for safe parallel scoring. - Reuse precomputed fingerprints for cache efficiency. - Make truncation, scoring, and ranker behaviors fully configurable via settings. - Add support for middle truncation, context hash-based hashing, and dynamic token limiting. - Refactor methods for scalability and better error handling. Tests: Updated all affected components with additional test cases.	2026-01-04 12:37:58 +01:00
Felipe Cardoso	6c7b72f130	chore(context): apply linter fixes and sort imports (#86 ) Phase 8 of Context Management Engine - Final Cleanup: - Sort __all__ exports alphabetically - Sort imports per isort conventions - Fix minor linting issues Final test results: - 311 context management tests passing - 2507 total backend tests passing - 85% code coverage Context Management Engine is complete with all 8 phases: 1. Foundation: Types, Config, Exceptions 2. Token Budget Management 3. Context Scoring & Ranking 4. Context Assembly Pipeline 5. Model Adapters (Claude, OpenAI) 6. Caching Layer (Redis + in-memory) 7. Main Engine & Integration 8. Testing & Documentation 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-04 02:46:56 +01:00
Felipe Cardoso	027ebfc332	feat(context): implement main ContextEngine with full integration (#85 ) Phase 7 of Context Management Engine - Main Engine: - Add ContextEngine as main orchestration class - Integrate all components: calculator, scorer, ranker, compressor, cache - Add high-level assemble_context() API with: - System prompt support - Task description support - Knowledge Base integration via MCP - Conversation history conversion - Tool results conversion - Custom contexts support - Add helper methods: - get_budget_for_model() - count_tokens() with caching - invalidate_cache() - get_stats() - Add create_context_engine() factory function Tests: 26 new tests, 311 total context tests passing 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-04 02:44:40 +01:00
Felipe Cardoso	c2466ab401	feat(context): implement Redis-based caching layer (#84 ) Phase 6 of Context Management Engine - Caching Layer: - Add ContextCache with Redis integration - Support fingerprint-based assembled context caching - Support token count caching (model-specific) - Support score caching (scorer + context + query) - Add in-memory fallback with LRU eviction - Add cache invalidation with pattern matching - Add cache statistics reporting Key features: - Hierarchical cache key structure (ctx:type:hash) - Automatic TTL expiration - Memory cache for fast repeated access - Graceful degradation when Redis unavailable Tests: 29 new tests, 285 total context tests passing 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-04 02:41:21 +01:00
Felipe Cardoso	7828d35e06	feat(context): implement model adapters for Claude and OpenAI (#83 ) Phase 5 of Context Management Engine - Model Adapters: - Add ModelAdapter abstract base class with model matching - Add DefaultAdapter for unknown models (plain text) - Add ClaudeAdapter with XML-based formatting: - <system_instructions> for system context - <reference_documents>/<document> for knowledge - <conversation_history>/<message> for chat - <tool_results>/<tool_result> for tool outputs - XML escaping for special characters - Add OpenAIAdapter with markdown formatting: - ## headers for sections - ### Source headers for documents - ROLE bold labels for conversation - Code blocks for tool outputs - Add get_adapter() factory function for model selection Tests: 33 new tests, 256 total context tests passing 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-04 02:36:32 +01:00
Felipe Cardoso	6b07e62f00	feat(context): implement assembly pipeline and compression (#82 ) Phase 4 of Context Management Engine - Assembly Pipeline: - Add TruncationStrategy with end/middle/sentence-aware truncation - Add TruncationResult dataclass for tracking compression metrics - Add ContextCompressor for type-specific compression - Add ContextPipeline orchestrating full assembly workflow: - Token counting for all contexts - Scoring and ranking via ContextRanker - Optional compression when budget threshold exceeded - Model-specific formatting (XML for Claude, markdown for OpenAI) - Add PipelineMetrics for performance tracking - Update AssembledContext with new fields (model, contexts, metadata) - Add backward compatibility aliases for renamed fields Tests: 34 new tests, 223 total context tests passing 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-04 02:32:25 +01:00

1 2 3 4 5 ...

473 Commits