From 2ab69f8561e7f619dd77c1109f12d9f7fcb205a9 Mon Sep 17 00:00:00 2001
From: Felipe Cardoso
Date: Sun, 4 Jan 2026 01:37:04 +0100
Subject: [PATCH] docs(mcp): add comprehensive MCP server documentation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Add docs/architecture/MCP_SERVERS.md with full architecture overview
- Add README.md for LLM Gateway with quick start, tools, and model groups
- Add README.md for Knowledge Base with search types, chunking strategies
- Include API endpoints, security guidelines, and testing instructions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5
---
 docs/architecture/MCP_SERVERS.md     | 192 +++++++++++++++++++++++++++
 mcp-servers/knowledge-base/README.md | 178 +++++++++++++++++++++++++
 mcp-servers/llm-gateway/README.md    | 129 ++++++++++++++++++
 3 files changed, 499 insertions(+)
 create mode 100644 docs/architecture/MCP_SERVERS.md
 create mode 100644 mcp-servers/knowledge-base/README.md
 create mode 100644 mcp-servers/llm-gateway/README.md

diff --git a/docs/architecture/MCP_SERVERS.md b/docs/architecture/MCP_SERVERS.md
new file mode 100644
index 0000000..c583d0f
--- /dev/null
+++ b/docs/architecture/MCP_SERVERS.md
@@ -0,0 +1,192 @@
+# MCP Servers Architecture
+
+This document describes the Model Context Protocol (MCP) server architecture in Syndarix.
+
+## Overview
+
+Syndarix uses MCP servers to provide specialized capabilities to AI agents. Each MCP server exposes tools via JSON-RPC 2.0 that agents can invoke through the MCPClientManager.
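To make the wire format concrete, here is a minimal sketch of building such a JSON-RPC 2.0 tool call; the helper name and the parameter values are illustrative, not the actual MCPClientManager API:

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC request ids should be unique per connection

def build_tool_call(method: str, params: dict) -> str:
    """Serialize a JSON-RPC 2.0 request like the one MCPClientManager sends."""
    return json.dumps({
        "jsonrpc": "2.0",
        "method": method,
        "params": params,
        "id": next(_request_ids),
    })

# Hypothetical call that would be routed to the LLM Gateway's count_tokens tool:
body = build_tool_call("count_tokens", {
    "project_id": "proj-123",
    "agent_id": "agent-456",
    "text": "Hello, world!",
})
```

The resulting `body` is what gets POSTed to a server's `/mcp` endpoint, as detailed under Communication Protocol below.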
+
+## Architecture Diagram
+
+```
+┌───────────────────────────────────────────────────────────────────────┐
+│                           Backend (FastAPI)                           │
+│  ┌─────────────────────────────────────────────────────────────────┐  │
+│  │                        MCPClientManager                         │  │
+│  │     - Connection pooling   - Health checks   - Tool routing     │  │
+│  └────────────────────────────────┬────────────────────────────────┘  │
+└───────────────────────────────────┼───────────────────────────────────┘
+                                    │ HTTP/JSON-RPC 2.0
+          ┌─────────────────────────┼─────────────────────────┐
+          │                         │                         │
+          ▼                         ▼                         ▼
+┌───────────────────┐     ┌───────────────────┐     ┌───────────────────┐
+│    LLM Gateway    │     │  Knowledge Base   │     │    Future MCP     │
+│     Port 8001     │     │     Port 8002     │     │      Servers      │
+│                   │     │                   │     │                   │
+│ - chat_completion │     │ - search          │     │ - git_ops         │
+│ - count_tokens    │     │ - ingest          │     │ - issues          │
+│ - list_models     │     │ - delete          │     │ - etc.            │
+│ - get_usage       │     │ - update          │     │                   │
+└─────────┬─────────┘     └─────────┬─────────┘     └───────────────────┘
+          │                         │
+          ▼                         ▼
+┌───────────────────┐     ┌───────────────────┐
+│      LiteLLM      │     │    PostgreSQL     │
+│    (Anthropic,    │     │    + pgvector     │
+│   OpenAI, etc.)   │     │                   │
+└───────────────────┘     └───────────────────┘
+```
+
+## MCP Servers
+
+### 1. LLM Gateway (`mcp-servers/llm-gateway/`)
+
+**Purpose**: Unified access to multiple LLM providers with failover, streaming, and cost tracking.
+
+**Port**: 8001
+
+**Tools**:
+
+| Tool | Description |
+|------|-------------|
+| `chat_completion` | Generate completions with automatic failover |
+| `count_tokens` | Count tokens in text using tiktoken |
+| `list_models` | List available models by group |
+| `get_usage` | Get token/cost usage statistics |
+
+**Model Groups**:
+- `reasoning`: Claude Opus 4.5 → GPT-4.1 → Gemini 2.5 Pro
+- `code`: Claude Sonnet 4 → GPT-4.1 → DeepSeek Coder
+- `fast`: Claude Haiku → GPT-4.1 Mini → Gemini Flash
+- `vision`: Claude Sonnet 4 → GPT-4.1 → Gemini 2.5 Pro
+- `embedding`: text-embedding-3-large → voyage-3
+
+**Features**:
+- Circuit breaker for provider failures (5 failures → 30s cooldown)
+- Redis-based cost tracking per project/agent
+- Streaming support via SSE
+- Automatic failover chain
+
+### 2. Knowledge Base (`mcp-servers/knowledge-base/`)
+
+**Purpose**: RAG capabilities with pgvector for semantic search, intelligent chunking, and collection management.
+
+**Port**: 8002
+
+**Tools**:
+
+| Tool | Description |
+|------|-------------|
+| `search_knowledge` | Semantic, keyword, or hybrid search |
+| `ingest_content` | Add content with automatic chunking |
+| `delete_content` | Remove by source, collection, or IDs |
+| `list_collections` | List collections in a project |
+| `get_collection_stats` | Get collection statistics |
+| `update_document` | Atomically replace document content |
+
+**Chunking Strategies**:
+- **Code**: AST-aware for Python, tree-sitter for JS/TS/Go/Rust
+- **Markdown**: Heading-hierarchy aware, preserves structure
+- **Text**: Sentence-based with configurable overlap
+
+**Search Types**:
+- **Semantic**: pgvector cosine similarity (HNSW index)
+- **Keyword**: PostgreSQL full-text search (ts_rank)
+- **Hybrid**: Reciprocal Rank Fusion (RRF) combining both
+
+**Features**:
+- Redis caching for embedding deduplication
+- 1536-dimension embeddings via LLM Gateway
+- Atomic document updates (delete + insert in transaction)
+- Per-project collection isolation
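To illustrate the hybrid search mode, a weighted Reciprocal Rank Fusion merge can be sketched as follows; `k = 60` is the conventional RRF constant and an assumption here, while the 0.7/0.3 weights mirror the defaults documented in the Knowledge Base README:

```python
def rrf_merge(semantic_hits, keyword_hits, k=60, w_semantic=0.7, w_keyword=0.3):
    """Merge two ranked lists of document ids with weighted Reciprocal Rank Fusion.

    Each document earns weight / (k + rank) per list it appears in, so documents
    ranked highly by both search types float to the top of the fused ranking.
    """
    scores = {}
    for weight, ranking in ((w_semantic, semantic_hits), (w_keyword, keyword_hits)):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# "b" ranks second semantically but first by keyword, so it wins the fusion:
print(rrf_merge(["a", "b", "c"], ["b", "d"]))  # ['b', 'a', 'c', 'd']
```

In the real server the two input rankings would come from the pgvector and full-text queries respectively; only the fusion step is shown here.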
+
+## Communication Protocol
+
+All MCP servers use JSON-RPC 2.0 over HTTP:
+
+### Tool Discovery
+```
+GET /mcp/tools
+Response: { "tools": [{ "name": "...", "description": "...", "inputSchema": {...} }] }
+```
+
+### Tool Execution
+```
+POST /mcp
+Request: {
+  "jsonrpc": "2.0",
+  "method": "tool_name",
+  "params": { "project_id": "...", "agent_id": "...", ... },
+  "id": 1
+}
+Response: {
+  "jsonrpc": "2.0",
+  "result": { "success": true, ... },
+  "id": 1
+}
+```
+
+### Health Check
+```
+GET /health
+Response: { "status": "healthy", "dependencies": {...} }
+```
+
+## Configuration
+
+### Environment Variables
+
+**LLM Gateway**:
+```bash
+LLM_GATEWAY_HOST=0.0.0.0
+LLM_GATEWAY_PORT=8001
+LLM_GATEWAY_REDIS_URL=redis://redis:6379/1
+ANTHROPIC_API_KEY=...
+OPENAI_API_KEY=...
+```
+
+**Knowledge Base**:
+```bash
+KB_HOST=0.0.0.0
+KB_PORT=8002
+KB_DATABASE_URL=postgresql://...
+KB_REDIS_URL=redis://redis:6379/2
+KB_LLM_GATEWAY_URL=http://llm-gateway:8001
+```
+
+## Security
+
+### Input Validation
+- `project_id`, `agent_id`: Alphanumeric + hyphens/underscores (1-128 chars)
+- `collection`: Alphanumeric + hyphens/underscores (1-64 chars)
+- `source_path`: No path traversal (`..`), no null bytes, max 4096 chars
+- `content`: Max size limit (configurable, default 10MB)
+
+### Error Codes
+
+| Code | Meaning |
+|------|---------|
+| `INVALID_REQUEST` | Input validation failed |
+| `NOT_FOUND` | Resource not found |
+| `INTERNAL_ERROR` | Unexpected server error |
+| `EMBEDDING_ERROR` | Embedding generation failed |
+| `SEARCH_ERROR` | Search operation failed |
+
+## Testing
+
+```bash
+# Run LLM Gateway tests
+cd mcp-servers/llm-gateway
+IS_TEST=True uv run pytest -v --cov=.
+
+# Run Knowledge Base tests
+cd mcp-servers/knowledge-base
+IS_TEST=True uv run pytest -v --cov=.
+```
+
+## Adding New MCP Servers
+
+1. Create directory under `mcp-servers//`
+2. Use FastMCP for tool registration
+3. Implement `/health`, `/mcp/tools`, `/mcp` endpoints
+4. Add Docker configuration
+5. Register in MCPClientManager config
+6. Add tests (>90% coverage target)
diff --git a/mcp-servers/knowledge-base/README.md b/mcp-servers/knowledge-base/README.md
new file mode 100644
index 0000000..f625c3c
--- /dev/null
+++ b/mcp-servers/knowledge-base/README.md
@@ -0,0 +1,178 @@
+# Knowledge Base MCP Server
+
+RAG capabilities with pgvector for semantic search, intelligent chunking, and collection management.
+
+## Features
+
+- **Semantic Search**: pgvector cosine similarity with HNSW indexing
+- **Keyword Search**: PostgreSQL full-text search
+- **Hybrid Search**: Reciprocal Rank Fusion combining both
+- **Intelligent Chunking**: Code-aware, markdown-aware, and text chunking
+- **Collection Management**: Per-project knowledge organization
+- **Embedding Caching**: Redis deduplication for efficiency
+
+## Quick Start
+
+```bash
+# Install dependencies
+uv sync
+
+# Run tests
+IS_TEST=True uv run pytest -v
+
+# Start server
+uv run python server.py
+```
+
+## Configuration
+
+Environment variables:
+```bash
+KB_HOST=0.0.0.0
+KB_PORT=8002
+KB_DEBUG=false
+KB_DATABASE_URL=postgresql://user:pass@localhost:5432/syndarix
+KB_REDIS_URL=redis://localhost:6379/2
+KB_LLM_GATEWAY_URL=http://localhost:8001
+```
+
+## MCP Tools
+
+### search_knowledge
+
+Search the knowledge base.
+
+```json
+{
+  "project_id": "proj-123",
+  "agent_id": "agent-456",
+  "query": "authentication flow",
+  "search_type": "hybrid",
+  "collection": "code",
+  "limit": 10,
+  "threshold": 0.7,
+  "file_types": ["python", "typescript"]
+}
+```
+
+### ingest_content
+
+Add content to the knowledge base.
+
+```json
+{
+  "project_id": "proj-123",
+  "agent_id": "agent-456",
+  "content": "def authenticate(user): ...",
+  "source_path": "/src/auth.py",
+  "collection": "code",
+  "chunk_type": "code",
+  "file_type": "python"
+}
+```
+
+### delete_content
+
+Remove content from the knowledge base.
+
+```json
+{
+  "project_id": "proj-123",
+  "agent_id": "agent-456",
+  "source_path": "/src/old_file.py"
+}
+```
+
+### list_collections
+
+List all collections in a project.
+
+```json
+{
+  "project_id": "proj-123",
+  "agent_id": "agent-456"
+}
+```
+
+### get_collection_stats
+
+Get detailed collection statistics.
+
+```json
+{
+  "project_id": "proj-123",
+  "agent_id": "agent-456",
+  "collection": "code"
+}
+```
+
+### update_document
+
+Atomically replace document content.
+
+```json
+{
+  "project_id": "proj-123",
+  "agent_id": "agent-456",
+  "source_path": "/src/auth.py",
+  "content": "def authenticate_v2(user): ...",
+  "collection": "code",
+  "chunk_type": "code",
+  "file_type": "python"
+}
+```
+
+## Chunking Strategies
+
+### Code Chunking
+- **Python**: AST-based (functions, classes, methods)
+- **JavaScript/TypeScript**: Tree-sitter based
+- **Go/Rust**: Tree-sitter based
+- Target: ~500 tokens, 50 token overlap
+
+### Markdown Chunking
+- Heading-hierarchy aware
+- Preserves code blocks
+- Target: ~800 tokens, 100 token overlap
+
+### Text Chunking
+- Sentence-based splitting
+- Target: ~400 tokens, 50 token overlap
+
+## Search Types
+
+### Semantic Search
+Uses pgvector cosine similarity with HNSW indexing for fast approximate nearest neighbor search.
+
+### Keyword Search
+Uses PostgreSQL full-text search with ts_rank scoring.
+
+### Hybrid Search
+Combines semantic and keyword results using Reciprocal Rank Fusion (RRF):
+- Default weights: 70% semantic, 30% keyword
+- Configurable via settings
+
+## Security
+
+- Input validation for all IDs and paths
+- Path traversal prevention
+- Content size limits (default 10MB)
+- Per-project data isolation
+
+## Testing
+
+```bash
+# Full test suite with coverage
+IS_TEST=True uv run pytest -v --cov=. --cov-report=term-missing
+
+# Specific test file
+IS_TEST=True uv run pytest tests/test_server.py -v
+```
+
+## API Endpoints
+
+| Endpoint | Method | Description |
+|----------|--------|-------------|
+| `/health` | GET | Health check with dependency status |
+| `/mcp/tools` | GET | List available tools |
+| `/mcp` | POST | JSON-RPC 2.0 tool execution |
diff --git a/mcp-servers/llm-gateway/README.md b/mcp-servers/llm-gateway/README.md
new file mode 100644
index 0000000..f3d2f58
--- /dev/null
+++ b/mcp-servers/llm-gateway/README.md
@@ -0,0 +1,129 @@
+# LLM Gateway MCP Server
+
+Unified LLM access with failover chains, cost tracking, and streaming support.
+
+## Features
+
+- **Multi-Provider Support**: Anthropic, OpenAI, Google, DeepSeek
+- **Automatic Failover**: Circuit breaker with configurable thresholds
+- **Cost Tracking**: Redis-based per-project/agent usage tracking
+- **Streaming**: SSE support for real-time token delivery
+- **Model Groups**: Pre-configured chains for different use cases
+
+## Quick Start
+
+```bash
+# Install dependencies
+uv sync
+
+# Run tests
+IS_TEST=True uv run pytest -v
+
+# Start server
+uv run python server.py
+```
+
+## Configuration
+
+Environment variables:
+```bash
+LLM_GATEWAY_HOST=0.0.0.0
+LLM_GATEWAY_PORT=8001
+LLM_GATEWAY_DEBUG=false
+LLM_GATEWAY_REDIS_URL=redis://localhost:6379/1
+
+# Provider API keys
+ANTHROPIC_API_KEY=sk-ant-...
+OPENAI_API_KEY=sk-...
+GOOGLE_API_KEY=...
+DEEPSEEK_API_KEY=...
+```
+
+## MCP Tools
+
+### chat_completion
+
+Generate completions with automatic failover.
+
+```json
+{
+  "project_id": "proj-123",
+  "agent_id": "agent-456",
+  "messages": [{"role": "user", "content": "Hello"}],
+  "model_group": "reasoning",
+  "max_tokens": 4096,
+  "temperature": 0.7,
+  "stream": false
+}
+```
+
+### count_tokens
+
+Count tokens in text using tiktoken.
+
+```json
+{
+  "project_id": "proj-123",
+  "agent_id": "agent-456",
+  "text": "Hello, world!",
+  "model": "gpt-4"
+}
+```
+
+### list_models
+
+List available models by group.
+
+```json
+{
+  "project_id": "proj-123",
+  "agent_id": "agent-456",
+  "model_group": "code"
+}
+```
+
+### get_usage
+
+Get usage statistics.
+
+```json
+{
+  "project_id": "proj-123",
+  "agent_id": "agent-456",
+  "period": "day"
+}
+```
+
+## Model Groups
+
+| Group | Primary | Fallback 1 | Fallback 2 |
+|-------|---------|------------|------------|
+| reasoning | claude-opus-4-5 | gpt-4.1 | gemini-2.5-pro |
+| code | claude-sonnet-4 | gpt-4.1 | deepseek-coder |
+| fast | claude-haiku | gpt-4.1-mini | gemini-flash |
+| vision | claude-sonnet-4 | gpt-4.1 | gemini-2.5-pro |
+| embedding | text-embedding-3-large | voyage-3 | - |
+
+## Circuit Breaker
+
+- **Threshold**: 5 consecutive failures
+- **Cooldown**: 30 seconds
+- **Half-Open**: After cooldown, allows one test request
+
+## Testing
+
+```bash
+# Full test suite with coverage
+IS_TEST=True uv run pytest -v --cov=. --cov-report=term-missing

+# Specific test file
+IS_TEST=True uv run pytest tests/test_server.py -v
+```
+
+## API Endpoints
+
+| Endpoint | Method | Description |
+|----------|--------|-------------|
+| `/health` | GET | Health check |
+| `/mcp/tools` | GET | List available tools |
+| `/mcp` | POST | JSON-RPC 2.0 tool execution |
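The circuit-breaker lifecycle described above (5 consecutive failures open the breaker, a 30-second cooldown, then a half-open probe) can be sketched as follows; this is a simplified model for illustration, not the gateway's actual implementation:

```python
import time

class CircuitBreaker:
    """Per-provider breaker: opens after `threshold` consecutive failures,
    blocks calls for `cooldown` seconds, then admits one probe (half-open)."""

    def __init__(self, threshold=5, cooldown=30.0):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True  # closed: traffic flows normally
        if time.monotonic() - self.opened_at >= self.cooldown:
            # Half-open: admit one probe and re-arm the timer so further
            # requests stay blocked until the probe reports success.
            self.opened_at = time.monotonic()
            return True
        return False

    def record_success(self):
        self.failures = 0
        self.opened_at = None  # close the breaker

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = time.monotonic()  # open the breaker
```

On each provider call the gateway would check `allow()` first and, when it returns `False`, fall through to the next model in the group's failover chain.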