docs(mcp): add comprehensive MCP server documentation

- Add docs/architecture/MCP_SERVERS.md with full architecture overview
- Add README.md for LLM Gateway with quick start, tools, and model groups
- Add README.md for Knowledge Base with search types, chunking strategies
- Include API endpoints, security guidelines, and testing instructions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-04 01:37:04 +01:00
parent 95342cc94d
commit 2ab69f8561
3 changed files with 499 additions and 0 deletions
--- a/mcp-servers/knowledge-base/README.md
+++ b/mcp-servers/knowledge-base/README.md
@@ -0,0 +1,178 @@
+# Knowledge Base MCP Server
+
+RAG capabilities with pgvector for semantic search, intelligent chunking, and collection management.
+
+## Features
+
+- **Semantic Search**: pgvector cosine similarity with HNSW indexing
+- **Keyword Search**: PostgreSQL full-text search
+- **Hybrid Search**: Reciprocal Rank Fusion combining both
+- **Intelligent Chunking**: Code-aware, markdown-aware, and text chunking
+- **Collection Management**: Per-project knowledge organization
+- **Embedding Caching**: Redis deduplication for efficiency
+
+## Quick Start
+
+```bash
+# Install dependencies
+uv sync
+
+# Run tests
+IS_TEST=True uv run pytest -v
+
+# Start server
+uv run python server.py
+```
+
+## Configuration
+
+Environment variables:
+```bash
+KB_HOST=0.0.0.0
+KB_PORT=8002
+KB_DEBUG=false
+KB_DATABASE_URL=postgresql://user:pass@localhost:5432/syndarix
+KB_REDIS_URL=redis://localhost:6379/2
+KB_LLM_GATEWAY_URL=http://localhost:8001
+```
+
+## MCP Tools
+
+### search_knowledge
+
+Search a project's knowledge base with semantic, keyword, or hybrid search.
+
+```json
+{
+  "project_id": "proj-123",
+  "agent_id": "agent-456",
+  "query": "authentication flow",
+  "search_type": "hybrid",
+  "collection": "code",
+  "limit": 10,
+  "threshold": 0.7,
+  "file_types": ["python", "typescript"]
+}
+```
+
+### ingest_content
+
+Add content to the knowledge base.
+
+```json
+{
+  "project_id": "proj-123",
+  "agent_id": "agent-456",
+  "content": "def authenticate(user): ...",
+  "source_path": "/src/auth.py",
+  "collection": "code",
+  "chunk_type": "code",
+  "file_type": "python"
+}
+```
+
+### delete_content
+
+Remove content from the knowledge base.
+
+```json
+{
+  "project_id": "proj-123",
+  "agent_id": "agent-456",
+  "source_path": "/src/old_file.py"
+}
+```
+
+### list_collections
+
+List all collections in a project.
+
+```json
+{
+  "project_id": "proj-123",
+  "agent_id": "agent-456"
+}
+```
+
+### get_collection_stats
+
+Get detailed collection statistics.
+
+```json
+{
+  "project_id": "proj-123",
+  "agent_id": "agent-456",
+  "collection": "code"
+}
+```
+
+### update_document
+
+Atomically replace document content.
+
+```json
+{
+  "project_id": "proj-123",
+  "agent_id": "agent-456",
+  "source_path": "/src/auth.py",
+  "content": "def authenticate_v2(user): ...",
+  "collection": "code",
+  "chunk_type": "code",
+  "file_type": "python"
+}
+```
+
+## Chunking Strategies
+
+### Code Chunking
+- **Python**: AST-based (functions, classes, methods)
+- **JavaScript/TypeScript**: Tree-sitter based
+- **Go/Rust**: Tree-sitter based
+- Target: ~500 tokens, 50-token overlap
+
+### Markdown Chunking
+- Heading-hierarchy aware
+- Preserves code blocks
+- Target: ~800 tokens, 100-token overlap
+
+### Text Chunking
+- Sentence-based splitting
+- Target: ~400 tokens, 50-token overlap
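The target sizes and overlaps above can be illustrated with a plain sliding window. This is only a sketch of how a target size and an overlap interact: it does not reproduce the AST, Tree-sitter, heading, or sentence splitting described in this README. The function name `chunk_with_overlap` is hypothetical, and the input is assumed to be an already tokenized list.

```python
def chunk_with_overlap(tokens, target_size=400, overlap=50):
    """Split a token list into windows of target_size that share `overlap` tokens."""
    if target_size <= 0 or overlap < 0 or overlap >= target_size:
        raise ValueError("need 0 <= overlap < target_size")
    step = target_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + target_size])
        if start + target_size >= len(tokens):
            break  # the last window already reaches the end of the input
    return chunks


# Stand-in for real tokenizer output (assumption: integers instead of tokens).
tokens = list(range(1000))
chunks = chunk_with_overlap(tokens, target_size=400, overlap=50)
```

With 1000 tokens, a 400-token target, and a 50-token overlap, each window starts 350 tokens after the previous one, so the tail of one chunk is repeated at the head of the next.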
+
+## Search Types
+
+### Semantic Search
+Uses pgvector cosine similarity with HNSW indexing for fast approximate nearest neighbor search.
+
+### Keyword Search
+Uses PostgreSQL full-text search with ts_rank scoring.
+
+### Hybrid Search
+Combines semantic and keyword results using Reciprocal Rank Fusion (RRF):
+- Default weights: 70% semantic, 30% keyword
+- Configurable via settings
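A minimal sketch of weighted Reciprocal Rank Fusion, using the 70/30 default weights listed above. The function name `weighted_rrf` and the smoothing constant `k=60` are assumptions (60 is a commonly used RRF constant, not a value taken from this server's settings), and the inputs are assumed to be ranked lists of chunk IDs.

```python
from collections import defaultdict


def weighted_rrf(semantic_ids, keyword_ids,
                 semantic_weight=0.7, keyword_weight=0.3, k=60):
    """Fuse two ranked ID lists; each list contributes weight / (k + rank)."""
    scores = defaultdict(float)
    for rank, doc_id in enumerate(semantic_ids, start=1):
        scores[doc_id] += semantic_weight / (k + rank)
    for rank, doc_id in enumerate(keyword_ids, start=1):
        scores[doc_id] += keyword_weight / (k + rank)
    # Highest fused score first; ties are broken by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical ranked results from the two search types.
fused = weighted_rrf(["a", "b", "c"], ["c", "a", "d"])
```

Because RRF works on ranks rather than raw scores, the cosine similarity and `ts_rank` values never need to be normalized against each other.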
+
+## Security
+
+- Input validation for all IDs and paths
+- Path traversal prevention
+- Content size limits (default 10MB)
+- Per-project data isolation
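One way a path traversal check could look is sketched below. The function name `is_safe_source_path` is hypothetical and this is not the server's actual validation logic; it only rejects empty paths, NUL bytes, and any `..` path component.

```python
from pathlib import PurePosixPath


def is_safe_source_path(source_path):
    """Return True if the path has no '..' component and no NUL byte."""
    if not source_path or "\x00" in source_path:
        return False
    # Compare whole components so names like '..hidden' are still allowed.
    return ".." not in PurePosixPath(source_path).parts


ok = is_safe_source_path("/src/auth.py")
blocked = is_safe_source_path("/src/../etc/passwd")
```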
+
+## Testing
+
+```bash
+# Full test suite with coverage
+IS_TEST=True uv run pytest -v --cov=. --cov-report=term-missing
+
+# Specific test file
+IS_TEST=True uv run pytest tests/test_server.py -v
+```
+
+## API Endpoints
+
+| Endpoint | Method | Description |
+|----------|--------|-------------|
+| `/health` | GET | Health check with dependency status |
+| `/mcp/tools` | GET | List available tools |
+| `/mcp` | POST | JSON-RPC 2.0 tool execution |
--- a/mcp-servers/llm-gateway/README.md
+++ b/mcp-servers/llm-gateway/README.md
@@ -0,0 +1,129 @@
+# LLM Gateway MCP Server
+
+Unified LLM access with failover chains, cost tracking, and streaming support.
+
+## Features
+
+- **Multi-Provider Support**: Anthropic, OpenAI, Google, DeepSeek
+- **Automatic Failover**: Circuit breaker with configurable thresholds
+- **Cost Tracking**: Redis-based per-project/agent usage tracking
+- **Streaming**: SSE support for real-time token delivery
+- **Model Groups**: Pre-configured chains for different use cases
+
+## Quick Start
+
+```bash
+# Install dependencies
+uv sync
+
+# Run tests
+IS_TEST=True uv run pytest -v
+
+# Start server
+uv run python server.py
+```
+
+## Configuration
+
+Environment variables:
+```bash
+LLM_GATEWAY_HOST=0.0.0.0
+LLM_GATEWAY_PORT=8001
+LLM_GATEWAY_DEBUG=false
+LLM_GATEWAY_REDIS_URL=redis://localhost:6379/1
+
+# Provider API keys
+ANTHROPIC_API_KEY=sk-ant-...
+OPENAI_API_KEY=sk-...
+GOOGLE_API_KEY=...
+DEEPSEEK_API_KEY=...
+```
+
+## MCP Tools
+
+### chat_completion
+
+Generate completions with automatic failover.
+
+```json
+{
+  "project_id": "proj-123",
+  "agent_id": "agent-456",
+  "messages": [{"role": "user", "content": "Hello"}],
+  "model_group": "reasoning",
+  "max_tokens": 4096,
+  "temperature": 0.7,
+  "stream": false
+}
+```
+
+### count_tokens
+
+Count tokens in text using tiktoken.
+
+```json
+{
+  "project_id": "proj-123",
+  "agent_id": "agent-456",
+  "text": "Hello, world!",
+  "model": "gpt-4"
+}
+```
+
+### list_models
+
+List available models by group.
+
+```json
+{
+  "project_id": "proj-123",
+  "agent_id": "agent-456",
+  "model_group": "code"
+}
+```
+
+### get_usage
+
+Get usage statistics for a project and agent over a given period.
+
+```json
+{
+  "project_id": "proj-123",
+  "agent_id": "agent-456",
+  "period": "day"
+}
+```
+
+## Model Groups
+
+| Group | Primary | Fallback 1 | Fallback 2 |
+|-------|---------|------------|------------|
+| reasoning | claude-opus-4-5 | gpt-4.1 | gemini-2.5-pro |
+| code | claude-sonnet-4 | gpt-4.1 | deepseek-coder |
+| fast | claude-haiku | gpt-4.1-mini | gemini-flash |
+| vision | claude-sonnet-4 | gpt-4.1 | gemini-2.5-pro |
+| embedding | text-embedding-3-large | voyage-3 | - |
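The table above reads as an ordered chain per group: try the primary, then each fallback in turn. A minimal sketch of that idea follows, with two groups copied from the table. `complete_with_failover` and the `call_model` callback are hypothetical names, and the fake provider exists only to make the example runnable.

```python
MODEL_GROUPS = {
    "reasoning": ["claude-opus-4-5", "gpt-4.1", "gemini-2.5-pro"],
    "code": ["claude-sonnet-4", "gpt-4.1", "deepseek-coder"],
}


def complete_with_failover(group, prompt, call_model):
    """Try each model in the group's chain and return the first success."""
    errors = []
    for model in MODEL_GROUPS[group]:
        try:
            return model, call_model(model, prompt)
        except Exception as exc:  # a real gateway would catch narrower errors
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


def fake_call_model(model, prompt):
    """Stand-in provider call: the primary model is simulated as unavailable."""
    if model == "claude-opus-4-5":
        raise ConnectionError("simulated outage")
    return f"[{model}] echo: {prompt}"


used_model, reply = complete_with_failover("reasoning", "Hello", fake_call_model)
```

In a fuller implementation the circuit breaker would decide whether a model is tried at all, so that a provider known to be failing is skipped without waiting for another error.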
+
+## Circuit Breaker
+
+- **Threshold**: 5 consecutive failures
+- **Cooldown**: 30 seconds
+- **Half-Open**: After cooldown, allows one test request
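The three settings above describe a standard closed / open / half-open state machine. The sketch below uses the documented defaults (5 consecutive failures, 30-second cooldown, one test request); the class and method names are hypothetical, and the injectable `clock` is an assumption added so the behavior can be exercised without waiting.

```python
import time


class CircuitBreaker:
    def __init__(self, failure_threshold=5, cooldown_seconds=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock
        self._failures = 0            # consecutive failures seen so far
        self._opened_at = None        # time the breaker last opened, if any
        self._probe_in_flight = False

    @property
    def state(self):
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.cooldown_seconds:
            return "half_open"
        return "open"

    def allow_request(self):
        state = self.state
        if state == "closed":
            return True
        if state == "half_open" and not self._probe_in_flight:
            self._probe_in_flight = True  # let exactly one test request through
            return True
        return False

    def record_success(self):
        self._failures = 0
        self._opened_at = None
        self._probe_in_flight = False

    def record_failure(self):
        self._probe_in_flight = False
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()  # open, or re-open after a failed probe


# Simulated clock so the cooldown can be advanced by hand.
now = [0.0]
breaker = CircuitBreaker(clock=lambda: now[0])
for _ in range(5):
    breaker.record_failure()
```

After the fifth failure the breaker is open; once the simulated clock passes 30 seconds it reports half-open and admits a single request, whose outcome either closes it or restarts the cooldown.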
+
+## Testing
+
+```bash
+# Full test suite with coverage
+IS_TEST=True uv run pytest -v --cov=. --cov-report=term-missing
+
+# Specific test file
+IS_TEST=True uv run pytest tests/test_server.py -v
+```
+
+## API Endpoints
+
+| Endpoint | Method | Description |
+|----------|--------|-------------|
+| `/health` | GET | Health check |
+| `/mcp/tools` | GET | List available tools |
+| `/mcp` | POST | JSON-RPC 2.0 tool execution |
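A sketch of the JSON-RPC 2.0 body a client might POST to `/mcp`. The envelope fields (`jsonrpc`, `id`, `method`, `params`) come from JSON-RPC 2.0 itself; the `tools/call` method name and the `name` / `arguments` parameter layout follow the general MCP convention and are an assumption here, since this README does not spell out the request shape. `build_tool_call` is a hypothetical helper.

```python
import json


def build_tool_call(tool_name, arguments, request_id=1):
    """Wrap a tool invocation in a JSON-RPC 2.0 request envelope."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",  # assumed method name, see note above
        "params": {"name": tool_name, "arguments": arguments},
    }


payload = build_tool_call(
    "chat_completion",
    {
        "project_id": "proj-123",
        "agent_id": "agent-456",
        "messages": [{"role": "user", "content": "Hello"}],
        "model_group": "reasoning",
    },
)
body = json.dumps(payload)  # request body for POST /mcp
```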