# MCP Servers Architecture
This document describes the Model Context Protocol (MCP) server architecture in Syndarix.
## Overview
Syndarix uses MCP servers to provide specialized capabilities to AI agents. Each MCP server exposes tools over JSON-RPC 2.0, which agents invoke through the MCPClientManager.
## Architecture Diagram
```
┌───────────────────────────────────────────────────────────────┐
│                       Backend (FastAPI)                       │
│ ┌───────────────────────────────────────────────────────────┐ │
│ │                     MCPClientManager                      │ │
│ │  - Connection pooling   - Health checks   - Tool routing  │ │
│ └─────────────────────────────┬─────────────────────────────┘ │
└───────────────────────────────┼───────────────────────────────┘
                                │ HTTP/JSON-RPC 2.0
          ┌─────────────────────┼─────────────────────┐
          │                     │                     │
          ▼                     ▼                     ▼
┌───────────────────┐ ┌───────────────────┐ ┌───────────────────┐
│    LLM Gateway    │ │  Knowledge Base   │ │    Future MCP     │
│     Port 8001     │ │     Port 8002     │ │      Servers      │
│                   │ │                   │ │                   │
│ - chat_completion │ │ - search          │ │ - git_ops         │
│ - count_tokens    │ │ - ingest          │ │ - issues          │
│ - list_models     │ │ - delete          │ │ - etc.            │
│ - get_usage       │ │ - update          │ │                   │
└─────────┬─────────┘ └─────────┬─────────┘ └───────────────────┘
          │                     │
          ▼                     ▼
┌───────────────────┐ ┌───────────────────┐
│      LiteLLM      │ │    PostgreSQL     │
│   (Anthropic,     │ │    + pgvector     │
│   OpenAI, etc.)   │ │                   │
└───────────────────┘ └───────────────────┘
```
## MCP Servers
### 1. LLM Gateway (`mcp-servers/llm-gateway/`)
**Purpose**: Unified access to multiple LLM providers with failover, streaming, and cost tracking.
**Port**: 8001
**Tools**:
| Tool | Description |
|------|-------------|
| `chat_completion` | Generate completions with automatic failover |
| `count_tokens` | Count tokens in text using tiktoken |
| `list_models` | List available models by group |
| `get_usage` | Get token/cost usage statistics |
**Model Groups**:
- `reasoning`: Claude Opus 4.5 → GPT-4.1 → Gemini 2.5 Pro
- `code`: Claude Sonnet 4 → Codex → DeepSeek Coder
- `fast`: Claude Haiku → GPT-4.1 Mini → Gemini Flash
- `vision`: Claude Opus 4.5 → GPT-4.1 Vision
- `embedding`: text-embedding-3-large → voyage-3
**Features**:
- Circuit breaker for provider failures (5 failures → 30s cooldown; sketched after this list)
- Redis-based cost tracking per project/agent
- Streaming support via SSE
- Automatic failover chain
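The failover path and circuit breaker are simple to picture. Below is a minimal sketch, not the gateway's actual implementation: the names `CircuitBreaker`, `complete_with_failover`, and `call_provider` are hypothetical, and the real server delegates provider calls to LiteLLM.
```python
import time


class CircuitBreaker:
    """Illustrative breaker: 5 consecutive failures trip a 30s cooldown."""

    def __init__(self, threshold: int = 5, cooldown: float = 30.0):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at: float | None = None

    def available(self) -> bool:
        if self.opened_at is None:
            return True
        if time.monotonic() - self.opened_at >= self.cooldown:
            # Cooldown elapsed: reset and allow a retry (half-open).
            self.opened_at = None
            self.failures = 0
            return True
        return False

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = time.monotonic()

    def record_success(self) -> None:
        self.failures = 0


def call_provider(model: str, prompt: str) -> str:
    """Stand-in for the actual LiteLLM provider call."""
    raise NotImplementedError


def complete_with_failover(prompt: str, chain: list[str],
                           breakers: dict[str, CircuitBreaker]) -> str:
    """Walk the model group's chain, skipping providers whose breaker is open."""
    for model in chain:
        breaker = breakers.setdefault(model, CircuitBreaker())
        if not breaker.available():
            continue
        try:
            result = call_provider(model, prompt)
            breaker.record_success()
            return result
        except Exception:
            breaker.record_failure()
    raise RuntimeError("all providers in the chain are unavailable")
```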
### 2. Knowledge Base (`mcp-servers/knowledge-base/`)
**Purpose**: RAG capabilities with pgvector for semantic search, intelligent chunking, and collection management.
**Port**: 8002
**Tools**:
| Tool | Description |
|------|-------------|
| `search_knowledge` | Semantic, keyword, or hybrid search |
| `ingest_content` | Add content with automatic chunking |
| `delete_content` | Remove by source, collection, or IDs |
| `list_collections` | List collections in a project |
| `get_collection_stats` | Get collection statistics |
| `update_document` | Atomically replace document content |
**Chunking Strategies**:
- **Code**: AST-aware for Python, tree-sitter for JS/TS/Go/Rust
- **Markdown**: Heading-hierarchy aware, preserves structure
- **Text**: Sentence-based with configurable overlap
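As a rough illustration of the text strategy, a greedy sentence packer with overlap could look like this sketch (the function name, the 1000-character budget, and the 2-sentence overlap are made-up defaults, not the server's):
```python
import re


def chunk_sentences(text: str, max_chars: int = 1000, overlap: int = 2) -> list[str]:
    """Greedy sentence packing: fill each chunk up to max_chars, then
    start the next chunk by repeating the last `overlap` sentences."""
    # Naive split on terminal punctuation; real sentence splitters are smarter.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for sentence in sentences:
        if current and size + len(sentence) > max_chars:
            chunks.append(" ".join(current))
            current = current[-overlap:]  # carry the overlap into the next chunk
            size = sum(len(s) + 1 for s in current)
        current.append(sentence)
        size += len(sentence) + 1
    if current:
        chunks.append(" ".join(current))
    return chunks
```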
**Search Types**:
- **Semantic**: pgvector cosine similarity (HNSW index)
- **Keyword**: PostgreSQL full-text search (ts_rank)
- **Hybrid**: Reciprocal Rank Fusion (RRF) combining both
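RRF itself fits in a few lines: each document's fused score is the sum of 1/(k + rank) over the result lists, with k = 60 by convention. The sketch below is illustrative; the server's actual fusion code and constants are not shown here.
```python
def rrf_merge(semantic: list[str], keyword: list[str], k: int = 60) -> list[str]:
    """Fuse two ranked ID lists with Reciprocal Rank Fusion."""
    scores: dict[str, float] = {}
    for results in (semantic, keyword):
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


# "b" ranks well in both lists, so it wins the fused ranking.
print(rrf_merge(["a", "b", "c"], ["b", "d", "a"]))  # ['b', 'a', 'd', 'c']
```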
**Features**:
- Redis caching for embedding deduplication
- 1536-dimension embeddings via LLM Gateway
- Atomic document updates (delete + insert in transaction)
- Per-project collection isolation
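The embedding deduplication amounts to keying a Redis cache by a hash of the chunk text, along the lines of this sketch (the key format, the TTL, and the `fetch_embedding` helper are assumptions for illustration):
```python
import hashlib
import json

import redis  # pip install redis

r = redis.Redis.from_url("redis://redis:6379/2")


def embed_with_cache(text: str) -> list[float]:
    """Reuse a cached vector if this exact text was embedded before."""
    key = "emb:" + hashlib.sha256(text.encode("utf-8")).hexdigest()
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)
    vector = fetch_embedding(text)
    r.set(key, json.dumps(vector), ex=86400)  # TTL is a made-up choice
    return vector


def fetch_embedding(text: str) -> list[float]:
    """Stand-in for a JSON-RPC call to the LLM Gateway's embedding group."""
    raise NotImplementedError
```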
## Communication Protocol
All MCP servers use JSON-RPC 2.0 over HTTP:
### Tool Discovery
```
GET /mcp/tools
Response: { "tools": [{ "name": "...", "description": "...", "inputSchema": {...} }] }
```
### Tool Execution
```
POST /mcp
Request: {
  "jsonrpc": "2.0",
  "method": "tool_name",
  "params": { "project_id": "...", "agent_id": "...", ... },
  "id": 1
}
Response: {
  "jsonrpc": "2.0",
  "result": { "success": true, ... },
  "id": 1
}
```
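From Python, then, a tool call is a single POST; the MCPClientManager adds pooling, health checks, and routing on top of this. In the sketch below, the `messages` parameter shape for `chat_completion` is a guess, not the server's documented schema:
```python
import httpx  # pip install httpx

payload = {
    "jsonrpc": "2.0",
    "method": "chat_completion",
    "params": {
        "project_id": "demo-project",
        "agent_id": "demo-agent",
        "messages": [{"role": "user", "content": "Hello"}],  # assumed shape
    },
    "id": 1,
}
response = httpx.post("http://llm-gateway:8001/mcp", json=payload, timeout=30.0)
response.raise_for_status()
print(response.json()["result"])
```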
### Health Check
```
GET /health
Response: { "status": "healthy", "dependencies": {...} }
```
## Configuration
### Environment Variables
**LLM Gateway**:
```bash
LLM_GATEWAY_HOST=0.0.0.0
LLM_GATEWAY_PORT=8001
LLM_GATEWAY_REDIS_URL=redis://redis:6379/1
ANTHROPIC_API_KEY=...
OPENAI_API_KEY=...
```
**Knowledge Base**:
```bash
KB_HOST=0.0.0.0
KB_PORT=8002
KB_DATABASE_URL=postgresql://...
KB_REDIS_URL=redis://redis:6379/2
KB_LLM_GATEWAY_URL=http://llm-gateway:8001
```
## Security
### Input Validation
- `project_id`, `agent_id`: Alphanumeric + hyphens/underscores (1-128 chars)
- `collection`: Alphanumeric + hyphens/underscores (1-64 chars)
- `source_path`: No path traversal (`..`), no null bytes, max 4096 chars
- `content`: Max size limit (configurable, default 10MB)
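A sketch of how these rules translate into checks (the patterns mirror the constraints above; the names are illustrative):
```python
import re

ID_RE = re.compile(r"^[A-Za-z0-9_-]{1,128}$")         # project_id / agent_id
COLLECTION_RE = re.compile(r"^[A-Za-z0-9_-]{1,64}$")  # collection names


def validate_source_path(path: str) -> None:
    """Reject traversal segments, null bytes, and oversized paths."""
    if len(path) > 4096:
        raise ValueError("source_path too long")
    if "\x00" in path:
        raise ValueError("source_path contains a null byte")
    if ".." in path.split("/"):
        raise ValueError("source_path contains a traversal segment")
```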
### Error Codes
| Code | Meaning |
|------|---------|
| `INVALID_REQUEST` | Input validation failed |
| `NOT_FOUND` | Resource not found |
| `INTERNAL_ERROR` | Unexpected server error |
| `EMBEDDING_ERROR` | Embedding generation failed |
| `SEARCH_ERROR` | Search operation failed |
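How these codes appear on the wire is not pinned down above; one plausible shape, assuming the standard JSON-RPC 2.0 error object with the symbolic code carried in `error.data`, would be:
```
Response: {
  "jsonrpc": "2.0",
  "error": {
    "code": -32602,
    "message": "project_id must be 1-128 alphanumeric characters",
    "data": { "error_code": "INVALID_REQUEST" }
  },
  "id": 1
}
```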
## Testing
```bash
# Run LLM Gateway tests
cd mcp-servers/llm-gateway
IS_TEST=True uv run pytest -v --cov=.

# Run Knowledge Base tests
cd ../knowledge-base
IS_TEST=True uv run pytest -v --cov=.
```
## Adding New MCP Servers
1. Create directory under `mcp-servers/<name>/`
2. Use FastMCP for tool registration
3. Implement `/health`, `/mcp/tools`, `/mcp` endpoints
4. Add Docker configuration
5. Register in MCPClientManager config
6. Add tests (>90% coverage target)
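As a skeletal starting point for steps 2-3, assuming the FastMCP 2.x API (the server name, tool, and port below are placeholders, and the `/health` and `/mcp/tools` conventions still need to be wired up to match the existing servers):
```python
from fastmcp import FastMCP

mcp = FastMCP("git-ops")  # placeholder server name


@mcp.tool
def git_ops(project_id: str, command: str) -> dict:
    """Placeholder tool: validate inputs, run the operation, return a result."""
    return {"success": True, "project_id": project_id, "command": command}


if __name__ == "__main__":
    # Transport/host/port options vary by FastMCP version; values are illustrative.
    mcp.run(transport="http", host="0.0.0.0", port=8003)
```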