- Added timeout enforcement for token counting, scoring, and compression with detailed error handling.
- Introduced tenant isolation in context caching using project and agent identifiers.
- Enhanced budget management with stricter checks for critical context overspending and buffer limitations.
- Optimized per-context locking with cleanup to prevent memory leaks in concurrent environments.
- Updated default assembly timeout settings for improved performance and reliability.
- Improved XML escaping in Claude adapter for safety against injection attacks.
- Standardized token estimation using model-specific ratios.
- Cleaned up unnecessary comments in `__all__` definitions for better readability.
- Adjusted indentation and formatting across modules for improved clarity (e.g., long lines, logical grouping).
- Simplified conditional expressions and inline comments for context scoring and ranking.
- Replaced some hard-coded values with type-safe annotations (e.g., `ClassVar`).
- Removed unused imports and ensured consistent usage across test files.
- Updated `test_score_not_cached_on_context` to clarify caching behavior.
- Improved truncation strategy logic and marker handling.
- Add tests for truncation edge cases, including zero tokens, short content, and marker handling.
- Add concurrency tests for scoring to verify per-context locking and handling of multiple contexts.
Phase 7 of Context Management Engine - Main Engine:
- Add ContextEngine as main orchestration class
- Integrate all components: calculator, scorer, ranker, compressor, cache
- Add high-level assemble_context() API with:
- System prompt support
- Task description support
- Knowledge Base integration via MCP
- Conversation history conversion
- Tool results conversion
- Custom contexts support
- Add helper methods:
- get_budget_for_model()
- count_tokens() with caching
- invalidate_cache()
- get_stats()
- Add create_context_engine() factory function
Tests: 26 new tests, 311 total context tests passing
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Phase 5 of Context Management Engine - Model Adapters:
- Add ModelAdapter abstract base class with model matching
- Add DefaultAdapter for unknown models (plain text)
- Add ClaudeAdapter with XML-based formatting:
- <system_instructions> for system context
- <reference_documents>/<document> for knowledge
- <conversation_history>/<message> for chat
- <tool_results>/<tool_result> for tool outputs
- XML escaping for special characters
- Add OpenAIAdapter with markdown formatting:
- ## headers for sections
- ### Source headers for documents
- **ROLE** bold labels for conversation
- Code blocks for tool outputs
- Add get_adapter() factory function for model selection
Tests: 33 new tests, 256 total context tests passing
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add TokenCalculator with LLM Gateway integration for accurate token
counting with in-memory caching and fallback character-based estimation.
Implement TokenBudget for tracking allocations per context type with
budget enforcement, and BudgetAllocator for creating budgets based on
model context window sizes.
- TokenCalculator: MCP integration, caching, model-specific ratios
- TokenBudget: allocation tracking, can_fit/allocate/deallocate/reset
- BudgetAllocator: model context sizes, budget creation and adjustment
- 35 comprehensive tests covering all budget functionality
Part of #61 - Context Management Engine
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Added `record_action` in `RateLimiter` for precise tracking of slot consumption post-validation.
- Introduced deduplication mechanism for warning alerts in `CostController` to prevent spamming.
- Refactored `CostController`'s session and daily budget alert handling for improved clarity.
- Implemented test suites for `CostController` and `SafetyGuardian` to validate changes.
- Expanded integration testing to cover deduplication, validation, and loop detection edge cases.
Improved code readability and uniformity by standardizing line breaks, indentation, and inline conditions across safety-related services, models, and tests, including content filters, validation rules, and emergency controls.
The delay2 and delay3 variables were calculated but never asserted,
causing lint warnings. Added assertions to verify all delays are
positive and within max bounds.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Core MCP client implementation with comprehensive tooling:
**Services:**
- MCPClientManager: Main facade for all MCP operations
- MCPServerRegistry: Thread-safe singleton for server configs
- ConnectionPool: Connection pooling with auto-reconnection
- ToolRouter: Automatic tool routing with circuit breaker
- AsyncCircuitBreaker: Custom async-compatible circuit breaker
**Configuration:**
- YAML-based config with Pydantic models
- Environment variable expansion support
- Transport types: HTTP, SSE, STDIO
**API Endpoints:**
- GET /mcp/servers - List all MCP servers
- GET /mcp/servers/{name}/tools - List server tools
- GET /mcp/tools - List all tools from all servers
- GET /mcp/health - Health check all servers
- POST /mcp/call - Execute tool (admin only)
- GET /mcp/circuit-breakers - Circuit breaker status
- POST /mcp/circuit-breakers/{name}/reset - Reset circuit breaker
- POST /mcp/servers/{name}/reconnect - Force reconnection
**Testing:**
- 156 unit tests with comprehensive coverage
- Tests for all services, routes, and error handling
- Proper mocking and async test support
**Documentation:**
- MCP_CLIENT.md with usage examples
- Phase 2+ workflow documentation
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Added tests for OAuth provider admin and consent endpoints covering edge cases.
- Extended agent-related tests to handle incorrect project associations and lifecycle state transitions.
- Introduced tests for sprint status transitions and validation checks.
- Improved multiline formatting consistency across all test functions.
Bug Fixes:
- bulk_terminate_by_project now unassigns issues before terminating agents
to prevent orphaned issue assignments
- PATCH /issues/{id} now validates sprint status - cannot assign issues
to COMPLETED or CANCELLED sprints
- archive_project now performs cascading cleanup:
- Terminates all active agent instances
- Cancels all planned/active sprints
- Unassigns issues from terminated agents
Added edge case tests for all fixed bugs (19 new tests total):
- TestBulkTerminateEdgeCases
- TestSprintStatusValidation
- TestArchiveProjectCleanup
- TestDataIntegrityEdgeCases (IDOR protection)
Coverage: 93% (1836 tests passing)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit fixes 4 production bugs found via edge case testing:
1. BUG: System allowed assigning issues to terminated agents
- Added validation in issue creation endpoint
- Added validation in issue update endpoint
- Added validation in issue assign endpoint
2. BUG: Issues remained orphaned when agent was terminated
- Agent termination now auto-unassigns all issues from that agent
These bugs could lead to issues being assigned to non-functional agents
that would never work on them, causing work to stall silently.
Tests added in tests/api/routes/syndarix/test_edge_cases.py to verify:
- Cannot assign issue to terminated agent (3 variations)
- Issues are auto-unassigned when agent is terminated
- Various other edge cases (sprints, projects, IDOR protection)
Coverage: 88% → 93% (1830 tests passing)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add 22 tests for agents API covering:
- CRUD operations (spawn, list, get, update, delete)
- Lifecycle management (pause, resume)
- Agent metrics (single and project-level)
- Authorization and access control
- Status filtering
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add 24 tests for issues API covering:
- CRUD operations (create, list, get, update, delete)
- Status and priority filtering
- Search functionality
- Issue statistics
- Authorization and access control
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add 46 tests for projects API covering:
- CRUD operations (create, list, get, update, archive)
- Lifecycle management (pause, resume)
- Authorization and access control
- Pagination and filtering
- All autonomy levels
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Infrastructure:
- Add Redis and Celery workers to all docker-compose files
- Fix celery migration race condition in entrypoint.sh
- Add healthchecks and resource limits to dev compose
- Update .env.template with Redis/Celery variables
Backend Models & Schemas:
- Rename Sprint.completed_points to velocity (per requirements)
- Add AgentInstance.name as required field
- Rename Issue external tracker fields for consistency
- Add IssueSource and TrackerType enums
- Add Project.default_tracker_type field
Backend Fixes:
- Add Celery retry configuration with exponential backoff
- Remove unused sequence counter from EventBus
- Add mypy overrides for test dependencies
- Fix test file using wrong schema (UserUpdate -> dict)
Frontend Fixes:
- Fix memory leak in useProjectEvents (proper cleanup)
- Fix race condition with stale closure in reconnection
- Sync TokenWithUser type with regenerated API client
- Fix expires_in null handling in useAuth
- Clean up unused imports in prototype pages
- Add ESLint relaxed rules for prototype files
CI/CD:
- Add E2E testing stage with Testcontainers
- Add security scanning with Trivy and pip-audit
- Add dependency caching for faster builds
Tests:
- Update all tests to use renamed fields (velocity, name, etc.)
- Fix 14 schema test failures
- All 1500 tests pass with 91% coverage
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add Project model with slug, description, autonomy level, and settings
- Add AgentType model for agent templates with model config and failover
- Add AgentInstance model for running agents with status and memory
- Add Issue model with external tracker sync (Gitea/GitHub/GitLab)
- Add Sprint model with velocity tracking and lifecycle management
- Add comprehensive Pydantic schemas with validation
- Add full CRUD operations for all models with filtering/sorting
- Add 280+ tests for models, schemas, and CRUD operations
Implements #23, #24, #25, #26, #27🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Renamed unused `code_verifier` variables to `_code_verifier` for clarity.
- Improved test readability by reformatting long lines and assertions.
- Streamlined `get` request calls by consolidating parameters into single lines.
- Implemented OAuth 2.0 Authorization Server endpoints per RFCs, including token, introspection, revocation, and metadata discovery.
- Added user consent submission, listing, and revocation APIs alongside frontend integration for improved UX.
- Enforced stricter OAuth security measures (PKCE, state validation, scopes).
- Refactored schemas and services for consistency and expanded coverage of OAuth workflows.
- Updated documentation and type definitions for new API behaviors.
- Reformatted headers in E2E tests to improve readability and ensure consistent style.
- Updated confidential client fixture to use bcrypt for secret hashing, enhancing security and testing backward compatibility with legacy SHA-256 hashes.
- Added new test cases for PKCE verification, rejecting insecure 'plain' methods, and improved error handling.
- Refined session workflows and user agent handling in E2E tests for session management.
- Consolidated schema operation tests and fixed minor formatting inconsistencies.
- Implemented stricter OAuth security measures, including CSRF protection via state parameter validation and redirect_uri checks.
- Updated OAuth models to support timezone-aware datetime comparisons, replacing deprecated `utcnow`.
- Enhanced logging for malformed Basic auth headers during token, introspect, and revoke requests.
- Added allowlist validation for OAuth provider domains to prevent open redirect attacks.
- Improved nonce validation for OpenID Connect tokens, ensuring token integrity during Google provider flows.
- Updated E2E and unit tests to cover new security features and expanded OAuth state handling scenarios.
- Introduced E2E tests for admin user and organization management workflows: user listing, creation, updates, bulk actions, and organization membership management.
- Added comprehensive tests for organization CRUD operations, membership visibility, roles, and permission validation.
- Expanded fixtures for superuser and member setup to streamline testing of admin-specific operations.
- Verified pagination, filtering, and action consistency across admin endpoints.
- Introduced full OAuth 2.0 Authorization Server functionality for MCP clients.
- Updated documentation with details on endpoints, scopes, and consent management.
- Added a new frontend OAuth consent page for user authorization flows.
- Implemented database models for authorization codes, refresh tokens, and user consents.
- Created unit tests for service methods (PKCE verification, client validation, scope handling).
- Included comprehensive integration tests for OAuth provider workflows.
- Introduced comprehensive E2E tests for organization workflows: creation, membership management, and updates.
- Added tests for user management workflows: profile viewing, updates, password changes, and settings.
- Implemented session management tests, including listing, revocation, multi-device handling, and cleanup.
- Included API contract validation tests using Schemathesis, covering protected endpoints and schema structure.
- Enhanced E2E testing infrastructure with full PostgreSQL support and detailed workflow coverage.
- Reformatted assertions in `test_database_workflows.py` for better readability.
- Refactored `postgres_url` transformation logic in `conftest.py` for improved clarity.
- Adjusted import handling in `test_api_contracts.py` to streamline usage of Hypothesis and Schemathesis libraries.
- Introduced make commands for E2E tests using Testcontainers and Schemathesis.
- Updated `.env.demo` with configurable OAuth settings for Google and GitHub.
- Enhanced `README.md` with updated environment setup instructions.
- Added E2E testing dependencies and markers in `pyproject.toml` for real PostgreSQL and API contract validation.
- Included new libraries (`arrow`, `attrs`, `docker`, etc.) for testing and schema validation workflows.
- Extended OAuth callback tests to cover various scenarios (e.g., account linking, user creation, inactive users, and token/user info failures).
- Added `app/init_db.py` to the excluded files in `pyproject.toml`.
- Added models for `OAuthClient`, `OAuthState`, and `OAuthAccount`.
- Created Pydantic schemas to support OAuth flows, client management, and linked accounts.
- Implemented skeleton endpoints for OAuth Provider mode: authorization, token, and revocation.
- Updated router imports to include new `/oauth` and `/oauth/provider` routes.
- Added Alembic migration script to create OAuth-related database tables.
- Enhanced `users` table to allow OAuth-only accounts by making `password_hash` nullable.
- Deleted `admin.ts`, `auth.ts`, and `users.ts` MSW handler files to streamline demo mode setup.
- Updated demo credentials logic in `DemoCredentialsModal` and `DemoModeBanner` for stronger password requirements (≥12 characters).
- Refined documentation in `CLAUDE.md` to align with new credential standards and auto-generated MSW workflows.
- Replaced `SUPPORTED_LOCALES` with `supported_locales` for naming consistency.
- Applied formatting improvements to multiline statements for better readability.
- Cleaned up redundant comments and streamlined test assertions.
- Introduced `locale` field in user model and schemas with BCP 47 format validation.
- Created Alembic migration to add `locale` column to the `users` table with indexing for better query performance.
- Implemented `get_locale` dependency to detect locale using user preference, `Accept-Language` header, or default to English.
- Added extensive tests for locale validation, dependency logic, and fallback handling.
- Enhanced documentation and comments detailing the locale detection workflow and SUPPORTED_LOCALES configuration.
- Introduced `pyproject.toml` to centralize backend tool configurations (e.g., Ruff, mypy, coverage, pytest).
- Replaced Black, isort, and Flake8 with Ruff for linting, formatting, and import sorting.
- Updated `requirements.txt` to include Ruff and remove replaced tools.
- Added `Makefile` to streamline development workflows with commands for linting, formatting, type-checking, testing, and cleanup.
- Introduced `/api/v1/admin/sessions` endpoint to fetch paginated session data for admin monitoring.
- Added `AdminSessionResponse` schema to include user details in session responses.
- Implemented session data retrieval with filtering and pagination in `session_crud`.
- Created comprehensive test suite for session management, covering success, filtering, pagination, and unauthorized access scenarios.