forked from cardosofelipe/fast-next-template
docs(mcp): add comprehensive MCP server documentation
- Add docs/architecture/MCP_SERVERS.md with full architecture overview
- Add README.md for LLM Gateway with quick start, tools, and model groups
- Add README.md for Knowledge Base with search types, chunking strategies
- Include API endpoints, security guidelines, and testing instructions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

192 docs/architecture/MCP_SERVERS.md Normal file

@@ -0,0 +1,192 @@
# MCP Servers Architecture

This document describes the Model Context Protocol (MCP) server architecture in Syndarix.

## Overview

Syndarix uses MCP servers to provide specialized capabilities to AI agents. Each MCP server exposes tools via JSON-RPC 2.0 that agents can invoke through the MCPClientManager.

## Architecture Diagram

```
┌─────────────────────────────────────────────────────────────────────┐
│                          Backend (FastAPI)                          │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │                       MCPClientManager                        │  │
│  │   - Connection pooling   - Health checks   - Tool routing     │  │
│  └───────────────────────────────┬───────────────────────────────┘  │
└──────────────────────────────────┼──────────────────────────────────┘
                                   │ HTTP/JSON-RPC 2.0
               ┌───────────────────┼───────────────────┐
               │                   │                   │
               ▼                   ▼                   ▼
     ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
     │   LLM Gateway   │ │ Knowledge Base  │ │   Future MCP    │
     │   Port 8001     │ │   Port 8002     │ │    Servers      │
     │                 │ │                 │ │                 │
     │ - chat_complete │ │ - search        │ │ - git_ops       │
     │ - count_tokens  │ │ - ingest        │ │ - issues        │
     │ - list_models   │ │ - delete        │ │ - etc.          │
     │ - get_usage     │ │ - update        │ │                 │
     └────────┬────────┘ └────────┬────────┘ └─────────────────┘
              │                   │
              ▼                   ▼
     ┌─────────────────┐ ┌─────────────────┐
     │    LiteLLM      │ │   PostgreSQL    │
     │  (Anthropic,    │ │   + pgvector    │
     │   OpenAI, etc.) │ │                 │
     └─────────────────┘ └─────────────────┘
```

## MCP Servers

### 1. LLM Gateway (`mcp-servers/llm-gateway/`)

**Purpose**: Unified access to multiple LLM providers with failover, streaming, and cost tracking.

**Port**: 8001

**Tools**:
| Tool | Description |
|------|-------------|
| `chat_completion` | Generate completions with automatic failover |
| `count_tokens` | Count tokens in text using tiktoken |
| `list_models` | List available models by group |
| `get_usage` | Get token/cost usage statistics |

**Model Groups**:
- `reasoning`: Claude Opus 4.5 → GPT-4.1 → Gemini 2.5 Pro
- `code`: Claude Sonnet 4 → Codex → DeepSeek Coder
- `fast`: Claude Haiku → GPT-4.1 Mini → Gemini Flash
- `vision`: Claude Opus 4.5 → GPT-4.1 Vision
- `embedding`: text-embedding-3-large → voyage-3

**Features**:
- Circuit breaker for provider failures (5 failures → 30s cooldown)
- Redis-based cost tracking per project/agent
- Streaming support via SSE
- Automatic failover chain
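
To make the cost-tracking feature concrete, here is a minimal sketch of per-project/agent usage counters in Redis; the key layout and field names are assumptions for illustration, not the gateway's actual schema.

```python
# Hypothetical sketch of per-project/agent usage counters in Redis.
# Key layout and field names are illustrative, not the gateway's actual schema.
import redis.asyncio as redis

async def record_usage(r: redis.Redis, project_id: str, agent_id: str,
                       model: str, prompt_tokens: int, completion_tokens: int,
                       cost_usd: float) -> None:
    """Accumulate token and cost counters for one completed request."""
    key = f"usage:{project_id}:{agent_id}:day"  # rolled up per day
    async with r.pipeline(transaction=True) as pipe:
        pipe.hincrby(key, f"{model}:prompt_tokens", prompt_tokens)
        pipe.hincrby(key, f"{model}:completion_tokens", completion_tokens)
        pipe.hincrbyfloat(key, f"{model}:cost_usd", cost_usd)
        pipe.expire(key, 60 * 60 * 24 * 30)  # keep roughly 30 days of history
        await pipe.execute()
```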

### 2. Knowledge Base (`mcp-servers/knowledge-base/`)

**Purpose**: RAG capabilities with pgvector for semantic search, intelligent chunking, and collection management.

**Port**: 8002

**Tools**:
| Tool | Description |
|------|-------------|
| `search_knowledge` | Semantic, keyword, or hybrid search |
| `ingest_content` | Add content with automatic chunking |
| `delete_content` | Remove by source, collection, or IDs |
| `list_collections` | List collections in a project |
| `get_collection_stats` | Get collection statistics |
| `update_document` | Atomically replace document content |

**Chunking Strategies**:
- **Code**: AST-aware for Python, tree-sitter for JS/TS/Go/Rust
- **Markdown**: Heading-hierarchy aware, preserves structure
- **Text**: Sentence-based with configurable overlap

**Search Types**:
- **Semantic**: pgvector cosine similarity (HNSW index)
- **Keyword**: PostgreSQL full-text search (ts_rank)
- **Hybrid**: Reciprocal Rank Fusion (RRF) combining both

**Features**:
- Redis caching for embedding deduplication
- 1536-dimension embeddings via LLM Gateway
- Atomic document updates (delete + insert in transaction)
- Per-project collection isolation
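
The embedding-deduplication feature can be sketched as a content-hash lookup in Redis before calling the LLM Gateway; the key names and the `embed` callable below are illustrative assumptions, not the server's exact code.

```python
# Illustrative sketch of embedding deduplication via a Redis cache keyed by a
# content hash; key names and the embed() call are assumptions.
import hashlib
import json
import redis.asyncio as redis

async def get_or_create_embedding(r: redis.Redis, text: str, embed) -> list[float]:
    """Return a cached embedding for identical text, otherwise compute and cache it."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    key = f"emb:{digest}"
    cached = await r.get(key)
    if cached is not None:
        return json.loads(cached)
    vector = await embed(text)  # e.g. a call to the LLM Gateway embedding group
    await r.set(key, json.dumps(vector), ex=7 * 24 * 3600)
    return vector
```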

## Communication Protocol

All MCP servers use JSON-RPC 2.0 over HTTP:

### Tool Discovery
```
GET /mcp/tools
Response: { "tools": [{ "name": "...", "description": "...", "inputSchema": {...} }] }
```

### Tool Execution
```
POST /mcp
Request: {
  "jsonrpc": "2.0",
  "method": "tool_name",
  "params": { "project_id": "...", "agent_id": "...", ... },
  "id": 1
}
Response: {
  "jsonrpc": "2.0",
  "result": { "success": true, ... },
  "id": 1
}
```
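
As an illustration of the request/response shape above, a client along the lines of the MCPClientManager could invoke a tool like this. This is a minimal sketch using `httpx`; the real manager adds connection pooling, health checks, and tool routing.

```python
# Minimal sketch of a JSON-RPC 2.0 tool call against an MCP server.
# Uses httpx directly; the real MCPClientManager adds pooling and routing.
import httpx

async def call_tool(base_url: str, method: str, params: dict, request_id: int = 1) -> dict:
    payload = {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    async with httpx.AsyncClient(timeout=30.0) as client:
        resp = await client.post(f"{base_url}/mcp", json=payload)
        resp.raise_for_status()
        body = resp.json()
    if "error" in body:
        raise RuntimeError(f"MCP error: {body['error']}")
    return body["result"]

# Example: search the Knowledge Base server on port 8002
# result = await call_tool("http://localhost:8002", "search_knowledge",
#                          {"project_id": "proj-123", "agent_id": "agent-456",
#                           "query": "authentication flow"})
```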

### Health Check
```
GET /health
Response: { "status": "healthy", "dependencies": {...} }
```

## Configuration

### Environment Variables

**LLM Gateway**:
```bash
LLM_GATEWAY_HOST=0.0.0.0
LLM_GATEWAY_PORT=8001
LLM_GATEWAY_REDIS_URL=redis://redis:6379/1
ANTHROPIC_API_KEY=...
OPENAI_API_KEY=...
```

**Knowledge Base**:
```bash
KB_HOST=0.0.0.0
KB_PORT=8002
KB_DATABASE_URL=postgresql://...
KB_REDIS_URL=redis://redis:6379/2
KB_LLM_GATEWAY_URL=http://llm-gateway:8001
```

## Security

### Input Validation
- `project_id`, `agent_id`: Alphanumeric + hyphens/underscores (1-128 chars)
- `collection`: Alphanumeric + hyphens/underscores (1-64 chars)
- `source_path`: No path traversal (`..`), no null bytes, max 4096 chars
- `content`: Max size limit (configurable, default 10MB)
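
A minimal sketch of these validation rules in Python; the patterns mirror the documented constraints but are illustrative rather than the servers' exact code.

```python
# Sketch of the validation rules listed above; patterns mirror the documented
# constraints but are illustrative, not the servers' actual implementation.
import re

ID_RE = re.compile(r"^[A-Za-z0-9_-]{1,128}$")         # project_id, agent_id
COLLECTION_RE = re.compile(r"^[A-Za-z0-9_-]{1,64}$")   # collection names
MAX_CONTENT_BYTES = 10 * 1024 * 1024                   # default 10MB limit

def validate_ids(project_id: str, agent_id: str) -> None:
    if not ID_RE.match(project_id) or not ID_RE.match(agent_id):
        raise ValueError("invalid project_id or agent_id")

def validate_source_path(path: str) -> None:
    if len(path) > 4096 or "\x00" in path or ".." in path:
        raise ValueError("invalid source_path")

def validate_content(content: str) -> None:
    if len(content.encode("utf-8")) > MAX_CONTENT_BYTES:
        raise ValueError("content exceeds size limit")
```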

### Error Codes
| Code | Meaning |
|------|---------|
| `INVALID_REQUEST` | Input validation failed |
| `NOT_FOUND` | Resource not found |
| `INTERNAL_ERROR` | Unexpected server error |
| `EMBEDDING_ERROR` | Embedding generation failed |
| `SEARCH_ERROR` | Search operation failed |

## Testing

```bash
# Run LLM Gateway tests
cd mcp-servers/llm-gateway
IS_TEST=True uv run pytest -v --cov=.

# Run Knowledge Base tests
cd mcp-servers/knowledge-base
IS_TEST=True uv run pytest -v --cov=.
```

## Adding New MCP Servers

1. Create directory under `mcp-servers/<name>/`
2. Use FastMCP for tool registration (see the sketch after this list)
3. Implement `/health`, `/mcp/tools`, `/mcp` endpoints
4. Add Docker configuration
5. Register in MCPClientManager config
6. Add tests (>90% coverage target)
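
A minimal sketch of step 2 using the `fastmcp` package; the server name and `echo` tool are placeholders, and the `/health`, `/mcp/tools`, and `/mcp` endpoints from step 3 are wired up separately in the real servers.

```python
# Hypothetical skeleton for a new MCP server using FastMCP tool registration.
# The echo tool is a placeholder; real servers also expose the /health,
# /mcp/tools, and /mcp endpoints described in this document.
from fastmcp import FastMCP

mcp = FastMCP("example-server")

@mcp.tool()
def echo(project_id: str, agent_id: str, message: str) -> dict:
    """Return the message back to the caller (placeholder tool)."""
    return {"success": True, "echo": message}

if __name__ == "__main__":
    mcp.run()
```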

178 mcp-servers/knowledge-base/README.md Normal file

@@ -0,0 +1,178 @@
# Knowledge Base MCP Server

RAG capabilities with pgvector for semantic search, intelligent chunking, and collection management.

## Features

- **Semantic Search**: pgvector cosine similarity with HNSW indexing
- **Keyword Search**: PostgreSQL full-text search
- **Hybrid Search**: Reciprocal Rank Fusion combining both
- **Intelligent Chunking**: Code-aware, markdown-aware, and text chunking
- **Collection Management**: Per-project knowledge organization
- **Embedding Caching**: Redis deduplication for efficiency

## Quick Start

```bash
# Install dependencies
uv sync

# Run tests
IS_TEST=True uv run pytest -v

# Start server
uv run python server.py
```

## Configuration

Environment variables:
```bash
KB_HOST=0.0.0.0
KB_PORT=8002
KB_DEBUG=false
KB_DATABASE_URL=postgresql://user:pass@localhost:5432/syndarix
KB_REDIS_URL=redis://localhost:6379/2
KB_LLM_GATEWAY_URL=http://localhost:8001
```

## MCP Tools

### search_knowledge

Search the knowledge base.

```json
{
  "project_id": "proj-123",
  "agent_id": "agent-456",
  "query": "authentication flow",
  "search_type": "hybrid",
  "collection": "code",
  "limit": 10,
  "threshold": 0.7,
  "file_types": ["python", "typescript"]
}
```

### ingest_content

Add content to the knowledge base.

```json
{
  "project_id": "proj-123",
  "agent_id": "agent-456",
  "content": "def authenticate(user): ...",
  "source_path": "/src/auth.py",
  "collection": "code",
  "chunk_type": "code",
  "file_type": "python"
}
```

### delete_content

Remove content from the knowledge base.

```json
{
  "project_id": "proj-123",
  "agent_id": "agent-456",
  "source_path": "/src/old_file.py"
}
```

### list_collections

List all collections in a project.

```json
{
  "project_id": "proj-123",
  "agent_id": "agent-456"
}
```

### get_collection_stats

Get detailed collection statistics.

```json
{
  "project_id": "proj-123",
  "agent_id": "agent-456",
  "collection": "code"
}
```

### update_document

Atomically replace document content.

```json
{
  "project_id": "proj-123",
  "agent_id": "agent-456",
  "source_path": "/src/auth.py",
  "content": "def authenticate_v2(user): ...",
  "collection": "code",
  "chunk_type": "code",
  "file_type": "python"
}
```

## Chunking Strategies

### Code Chunking
- **Python**: AST-based (functions, classes, methods)
- **JavaScript/TypeScript**: Tree-sitter based
- **Go/Rust**: Tree-sitter based
- Target: ~500 tokens, 50 token overlap
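
As a rough illustration of the Python strategy, AST-based chunking can be sketched as splitting a module into top-level function and class chunks. This is simplified; the real chunker also enforces the token targets and overlap listed above.

```python
# Simplified illustration of AST-aware chunking for Python source.
# The actual chunker additionally enforces ~500-token chunks with 50-token overlap.
import ast

def chunk_python_source(source: str) -> list[dict]:
    """Split a Python module into one chunk per top-level function/class."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "name": node.name,
                "start_line": node.lineno,
                "end_line": node.end_lineno,
                "text": ast.get_source_segment(source, node),
            })
    return chunks
```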

### Markdown Chunking
- Heading-hierarchy aware
- Preserves code blocks
- Target: ~800 tokens, 100 token overlap

### Text Chunking
- Sentence-based splitting
- Target: ~400 tokens, 50 token overlap

## Search Types

### Semantic Search
Uses pgvector cosine similarity with HNSW indexing for fast approximate nearest neighbor search.
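
A minimal sketch of such a query using `asyncpg`; the `chunks` table and column names are assumptions for illustration, not the server's actual schema.

```python
# Illustrative semantic search query with pgvector cosine distance (<=>).
# Table and column names are assumptions, not the server's actual schema.
import asyncpg

async def semantic_search(conn: asyncpg.Connection, query_vec: list[float],
                          project_id: str, limit: int = 10) -> list[asyncpg.Record]:
    vec_literal = "[" + ",".join(str(x) for x in query_vec) + "]"
    return await conn.fetch(
        """
        SELECT source_path, content,
               1 - (embedding <=> $1::vector) AS similarity
        FROM chunks
        WHERE project_id = $2
        ORDER BY embedding <=> $1::vector
        LIMIT $3
        """,
        vec_literal, project_id, limit,
    )
```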

### Keyword Search
Uses PostgreSQL full-text search with ts_rank scoring.

### Hybrid Search
Combines semantic and keyword results using Reciprocal Rank Fusion (RRF):
- Default weights: 70% semantic, 30% keyword
- Configurable via settings
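
A short sketch of weighted RRF over two ranked result lists; the 0.7/0.3 split mirrors the default weights, and `k=60` is the conventional RRF constant (the server's exact fusion code may differ).

```python
# Sketch of weighted Reciprocal Rank Fusion over two ranked ID lists.
# k=60 is the conventional RRF constant; weights mirror the documented defaults.
def rrf_fuse(semantic_ids: list[str], keyword_ids: list[str],
             w_semantic: float = 0.7, w_keyword: float = 0.3, k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for rank, doc_id in enumerate(semantic_ids, start=1):
        scores[doc_id] = scores.get(doc_id, 0.0) + w_semantic / (k + rank)
    for rank, doc_id in enumerate(keyword_ids, start=1):
        scores[doc_id] = scores.get(doc_id, 0.0) + w_keyword / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```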

## Security

- Input validation for all IDs and paths
- Path traversal prevention
- Content size limits (default 10MB)
- Per-project data isolation

## Testing

```bash
# Full test suite with coverage
IS_TEST=True uv run pytest -v --cov=. --cov-report=term-missing

# Specific test file
IS_TEST=True uv run pytest tests/test_server.py -v
```

## API Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/health` | GET | Health check with dependency status |
| `/mcp/tools` | GET | List available tools |
| `/mcp` | POST | JSON-RPC 2.0 tool execution |

129 mcp-servers/llm-gateway/README.md Normal file

@@ -0,0 +1,129 @@
# LLM Gateway MCP Server

Unified LLM access with failover chains, cost tracking, and streaming support.

## Features

- **Multi-Provider Support**: Anthropic, OpenAI, Google, DeepSeek
- **Automatic Failover**: Circuit breaker with configurable thresholds
- **Cost Tracking**: Redis-based per-project/agent usage tracking
- **Streaming**: SSE support for real-time token delivery
- **Model Groups**: Pre-configured chains for different use cases

## Quick Start

```bash
# Install dependencies
uv sync

# Run tests
IS_TEST=True uv run pytest -v

# Start server
uv run python server.py
```

## Configuration

Environment variables:
```bash
LLM_GATEWAY_HOST=0.0.0.0
LLM_GATEWAY_PORT=8001
LLM_GATEWAY_DEBUG=false
LLM_GATEWAY_REDIS_URL=redis://localhost:6379/1

# Provider API keys
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
GOOGLE_API_KEY=...
DEEPSEEK_API_KEY=...
```

## MCP Tools

### chat_completion

Generate completions with automatic failover.

```json
{
  "project_id": "proj-123",
  "agent_id": "agent-456",
  "messages": [{"role": "user", "content": "Hello"}],
  "model_group": "reasoning",
  "max_tokens": 4096,
  "temperature": 0.7,
  "stream": false
}
```

### count_tokens

Count tokens in text using tiktoken.

```json
{
  "project_id": "proj-123",
  "agent_id": "agent-456",
  "text": "Hello, world!",
  "model": "gpt-4"
}
```
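
Conceptually this is just tiktoken's encoder for the requested model, as in this minimal sketch:

```python
# What count_tokens does conceptually: tokenize with tiktoken and count.
import tiktoken

def count_tokens(text: str, model: str = "gpt-4") -> int:
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(text))

print(count_tokens("Hello, world!"))  # e.g. 4
```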

### list_models

List available models by group.

```json
{
  "project_id": "proj-123",
  "agent_id": "agent-456",
  "model_group": "code"
}
```

### get_usage

Get usage statistics.

```json
{
  "project_id": "proj-123",
  "agent_id": "agent-456",
  "period": "day"
}
```

## Model Groups

| Group | Primary | Fallback 1 | Fallback 2 |
|-------|---------|------------|------------|
| reasoning | claude-opus-4-5 | gpt-4.1 | gemini-2.5-pro |
| code | claude-sonnet-4 | gpt-4.1 | deepseek-coder |
| fast | claude-haiku | gpt-4.1-mini | gemini-flash |
| vision | claude-sonnet-4 | gpt-4.1 | gemini-2.5-pro |
| embedding | text-embedding-3-large | voyage-3 | - |

## Circuit Breaker

- **Threshold**: 5 consecutive failures
- **Cooldown**: 30 seconds
- **Half-Open**: After cooldown, allows one test request
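
A minimal sketch of this state machine (illustrative, not the gateway's exact implementation):

```python
# Sketch of the circuit-breaker behavior described above: 5 consecutive
# failures open the circuit; after a 30s cooldown a trial request is allowed.
import time

class CircuitBreaker:
    def __init__(self, threshold: int = 5, cooldown: float = 30.0):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at: float | None = None

    def allow_request(self) -> bool:
        if self.opened_at is None:
            return True                                    # closed: normal traffic
        if time.monotonic() - self.opened_at >= self.cooldown:
            return True                                    # half-open: test request
        return False                                       # open: fail fast

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = time.monotonic()
```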

## Testing

```bash
# Full test suite with coverage
IS_TEST=True uv run pytest -v --cov=. --cov-report=term-missing

# Specific test file
IS_TEST=True uv run pytest tests/test_server.py -v
```

## API Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/health` | GET | Health check |
| `/mcp/tools` | GET | List available tools |
| `/mcp` | POST | JSON-RPC 2.0 tool execution |