feat(llm-gateway): implement LLM Gateway MCP Server (#56) #71

Closed
cardosofelipe wants to merge 0 commits from feature/56-llm-gateway-mcp-server into dev

Summary

  • Implements a complete LLM Gateway MCP Server that provides unified LLM access via LiteLLM
  • Adds 4 MCP tools: chat_completion, list_models, get_usage, count_tokens
  • Implements multi-provider failover chains with circuit breaker pattern
  • Adds Redis-based cost tracking per project/agent with budget limits
  • Achieves 92% test coverage with 209 tests
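
As a rough illustration of per-project/agent cost tracking with a budget limit, the sketch below uses an in-memory dictionary in place of Redis. The names `BudgetTracker` and `BudgetExceeded`, the `project:agent` key format, and the choice to enforce the budget per project are all assumptions made for this example; none of them are taken from `cost_tracking.py`.

```python
from collections import defaultdict


class BudgetExceeded(Exception):
    """Hypothetical error raised when a project would exceed its budget."""


class BudgetTracker:
    """In-memory stand-in for Redis-based usage tracking (illustrative only)."""

    def __init__(self, budget_usd: float) -> None:
        self.budget_usd = budget_usd
        # Maps "project:agent" -> accumulated cost in USD (assumed key format).
        self._spend = defaultdict(float)

    def project_total(self, project_id: str) -> float:
        """Sum the recorded cost of every agent in one project."""
        prefix = f"{project_id}:"
        return sum(cost for key, cost in self._spend.items() if key.startswith(prefix))

    def record(self, project_id: str, agent_id: str, cost_usd: float) -> float:
        """Record a cost for one agent, refusing it if the project budget would be exceeded."""
        new_total = self.project_total(project_id) + cost_usd
        if new_total > self.budget_usd:
            raise BudgetExceeded(f"{project_id} would reach ${new_total:.2f}")
        self._spend[f"{project_id}:{agent_id}"] += cost_usd
        return new_total
```

A Redis-backed version would presumably replace the dictionary with atomic increments so that concurrent requests cannot race past the limit; that detail is outside what this sketch shows.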

Changes

New Files

  • config.py - Pydantic Settings with environment variables
  • models.py - Model configurations, groups, and data models
  • exceptions.py - Comprehensive exception hierarchy
  • providers.py - LiteLLM Router configuration
  • routing.py - Model selection with failover logic
  • failover.py - Circuit breaker implementation
  • cost_tracking.py - Redis-based usage tracking
  • streaming.py - Async streaming support
  • Dockerfile - Container configuration
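
For readers unfamiliar with the circuit breaker pattern named for `failover.py`, here is a minimal sketch of the general technique. It is not the code in this PR: the class shape, the `failure_threshold` and `reset_timeout` parameters, and their defaults are assumptions chosen for illustration.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Minimal breaker: closed, then open after N failures, then half-open after a cooldown."""

    def __init__(
        self,
        failure_threshold: int = 3,
        reset_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock  # injectable so the cooldown can be tested without sleeping
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow_request(self) -> bool:
        """Return True if a call to the provider may be attempted."""
        if self._opened_at is None:
            return True  # closed: traffic flows normally
        # Open: block until the cooldown elapses, then allow a trial request (half-open).
        return self._clock() - self._opened_at >= self.reset_timeout

    def record_success(self) -> None:
        """Close the circuit and clear the failure count."""
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        """Count a failure; open (or re-open) the circuit once the threshold is reached."""
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()
```

In a gateway like this one, a breaker of this kind would typically be kept per provider or per model, so that a failing provider is skipped and the next entry in the fallback chain is tried instead.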

Model Groups (per ADR-004)

| Group | Primary | Fallback Chain |
|-------|---------|----------------|
| reasoning | claude-opus-4 | gpt-4.1 → gemini-2.5-pro |
| code | claude-sonnet-4 | gpt-4.1 → deepseek-coder |
| fast | claude-haiku | gpt-4.1-mini → gemini-2.0-flash |
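
The fallback order in the table above can be sketched as an ordered list per group, with selection walking the list until it finds a model that is not marked unavailable. The model names come from the table; the `MODEL_GROUPS` mapping, the `select_model` function, and the `unavailable` parameter are hypothetical and only illustrate the idea, not the actual `routing.py` logic.

```python
from typing import AbstractSet

# Primary model first, then the fallback chain, in the order given by the table.
MODEL_GROUPS = {
    "reasoning": ["claude-opus-4", "gpt-4.1", "gemini-2.5-pro"],
    "code": ["claude-sonnet-4", "gpt-4.1", "deepseek-coder"],
    "fast": ["claude-haiku", "gpt-4.1-mini", "gemini-2.0-flash"],
}


def select_model(group: str, unavailable: AbstractSet[str] = frozenset()) -> str:
    """Return the first model in the group's chain that is not marked unavailable."""
    chain = MODEL_GROUPS[group]  # raises KeyError for an unknown group
    for model in chain:
        if model not in unavailable:
            return model
    raise RuntimeError(f"no available model in group {group!r}")
```

For example, `select_model("code", {"claude-sonnet-4"})` skips the primary and returns `gpt-4.1`. In practice the `unavailable` set would presumably be derived from circuit breaker state rather than passed in by hand.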

Test Coverage

  • 209 tests passing
  • 92.35% code coverage (exceeds 90% target)
  • All ruff linting checks pass
  • Multi-sweep code review completed

Test plan

  • [x] All unit tests pass (IS_TEST=True uv run pytest)
  • [x] Code coverage exceeds 90%
  • [x] Ruff linting passes
  • [x] Multi-sweep code review (5 sweeps) completed
  • [ ] Manual integration test with backend MCP client

Closes #56

🤖 Generated with Claude Code

cardosofelipe added 1 commit 2026-01-03 19:31:42 +00:00
Implements complete LLM Gateway MCP Server with:
- FastMCP server with 4 tools: chat_completion, list_models, get_usage, count_tokens
- LiteLLM Router with multi-provider failover chains
- Circuit breaker pattern for fault tolerance
- Redis-based cost tracking per project/agent
- Comprehensive test suite (209 tests, 92% coverage)

Model groups defined per ADR-004:
- reasoning: claude-opus-4 → gpt-4.1 → gemini-2.5-pro
- code: claude-sonnet-4 → gpt-4.1 → deepseek-coder
- fast: claude-haiku → gpt-4.1-mini → gemini-2.0-flash

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
cardosofelipe added 1 commit 2026-01-03 19:56:13 +00:00
- Add type annotations for mypy compliance
- Use UTC-aware datetimes consistently (datetime.now(UTC))
- Add `# type: ignore` comments for LiteLLM's incomplete type stubs
- Fix import ordering and formatting
- Update pyproject.toml mypy configuration

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
cardosofelipe closed this pull request 2026-01-03 19:56:56 +00:00
cardosofelipe deleted branch feature/56-llm-gateway-mcp-server 2026-01-03 19:56:56 +00:00
