docs(mcp): add comprehensive MCP server documentation

- Add docs/architecture/MCP_SERVERS.md with full architecture overview
- Add README.md for LLM Gateway with quick start, tools, and model groups
- Add README.md for Knowledge Base with search types, chunking strategies
- Include API endpoints, security guidelines, and testing instructions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-04 01:37:04 +01:00
parent 95342cc94d
commit 2ab69f8561
3 changed files with 499 additions and 0 deletions


@@ -0,0 +1,178 @@
# Knowledge Base MCP Server
An MCP server providing retrieval-augmented generation (RAG) capabilities: pgvector-backed semantic search, intelligent chunking, and collection management.
## Features
- **Semantic Search**: pgvector cosine similarity with HNSW indexing
- **Keyword Search**: PostgreSQL full-text search
- **Hybrid Search**: Reciprocal Rank Fusion combining both
- **Intelligent Chunking**: Code-aware, markdown-aware, and text chunking
- **Collection Management**: Per-project knowledge organization
- **Embedding Caching**: Redis deduplication for efficiency
## Quick Start
```bash
# Install dependencies
uv sync
# Run tests
IS_TEST=True uv run pytest -v
# Start server
uv run python server.py
```
## Configuration
Environment variables:
```bash
KB_HOST=0.0.0.0
KB_PORT=8002
KB_DEBUG=false
KB_DATABASE_URL=postgresql://user:pass@localhost:5432/syndarix
KB_REDIS_URL=redis://localhost:6379/2
KB_LLM_GATEWAY_URL=http://localhost:8001
```
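As a rough illustration of how these variables might be consumed, the sketch below reads them into a typed settings object. The `KBSettings` class and `load_settings` function are hypothetical names introduced here, and the fallback values simply repeat the examples above; the server's actual settings module may look different.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class KBSettings:
    """Hypothetical settings holder; fields mirror the KB_* variables above."""
    host: str
    port: int
    debug: bool
    database_url: str
    redis_url: str
    llm_gateway_url: str


def load_settings(env=os.environ) -> KBSettings:
    """Read KB_* variables, falling back to the example values shown above."""
    debug_raw = env.get("KB_DEBUG", "false").strip().lower()
    return KBSettings(
        host=env.get("KB_HOST", "0.0.0.0"),
        port=int(env.get("KB_PORT", "8002")),
        # Assumption: common truthy spellings are accepted for booleans.
        debug=debug_raw in {"1", "true", "yes"},
        database_url=env.get(
            "KB_DATABASE_URL", "postgresql://user:pass@localhost:5432/syndarix"
        ),
        redis_url=env.get("KB_REDIS_URL", "redis://localhost:6379/2"),
        llm_gateway_url=env.get("KB_LLM_GATEWAY_URL", "http://localhost:8001"),
    )
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to exercise in tests.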
## MCP Tools
### search_knowledge
Search a project's knowledge base using the search type given in `search_type` (see Search Types).
```json
{
"project_id": "proj-123",
"agent_id": "agent-456",
"query": "authentication flow",
"search_type": "hybrid",
"collection": "code",
"limit": 10,
"threshold": 0.7,
"file_types": ["python", "typescript"]
}
```
### ingest_content
Add content to the knowledge base.
```json
{
"project_id": "proj-123",
"agent_id": "agent-456",
"content": "def authenticate(user): ...",
"source_path": "/src/auth.py",
"collection": "code",
"chunk_type": "code",
"file_type": "python"
}
```
### delete_content
Remove content from the knowledge base.
```json
{
"project_id": "proj-123",
"agent_id": "agent-456",
"source_path": "/src/old_file.py"
}
```
### list_collections
List all collections in a project.
```json
{
"project_id": "proj-123",
"agent_id": "agent-456"
}
```
### get_collection_stats
Get detailed collection statistics.
```json
{
"project_id": "proj-123",
"agent_id": "agent-456",
"collection": "code"
}
```
### update_document
Atomically replace document content.
```json
{
"project_id": "proj-123",
"agent_id": "agent-456",
"source_path": "/src/auth.py",
"content": "def authenticate_v2(user): ...",
"collection": "code",
"chunk_type": "code",
"file_type": "python"
}
```
## Chunking Strategies
### Code Chunking
- **Python**: AST-based (functions, classes, methods)
- **JavaScript/TypeScript**: Tree-sitter based
- **Go/Rust**: Tree-sitter based
- Target: ~500 tokens, 50-token overlap
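To make the AST-based idea concrete, here is a minimal sketch that splits Python source into one chunk per top-level function or class using the standard `ast` module. The function name `chunk_python_source` and the chunk dictionary layout are assumptions for illustration; the sketch ignores the token target and overlap, which the real chunker presumably enforces.

```python
import ast


def chunk_python_source(source: str) -> list:
    """Return one chunk per top-level function or class definition."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if not isinstance(
            node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
        ):
            continue  # imports, assignments, etc. are skipped in this sketch
        # Start at the first decorator, if any, so it stays with its definition.
        start = min([node.lineno] + [d.lineno for d in node.decorator_list])
        end = node.end_lineno  # inclusive, available on Python 3.8+
        chunks.append(
            {
                "name": node.name,
                "start_line": start,
                "end_line": end,
                "text": "\n".join(lines[start - 1 : end]),
            }
        )
    return chunks
```

Splitting on syntactic boundaries keeps each function or class intact, which tends to give more coherent embeddings than cutting at a fixed character count.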
### Markdown Chunking
- Heading-hierarchy aware
- Preserves code blocks
- Target: ~800 tokens, 100-token overlap
### Text Chunking
- Sentence-based splitting
- Target: ~400 tokens, 50-token overlap
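Sentence-based splitting with overlap can be sketched as follows. This is an illustration only: tokens are approximated by whitespace-separated words, the sentence splitter is a simple regular expression, and `chunk_sentences` is a hypothetical name rather than the server's API.

```python
import re


def chunk_sentences(text: str, target_tokens: int = 400, overlap_tokens: int = 50) -> list:
    """Group sentences into chunks of roughly target_tokens words.

    Trailing sentences of each chunk are repeated at the start of the next
    one, up to overlap_tokens words, so context carries across boundaries.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks = []
    current = []
    current_len = 0
    for sentence in sentences:
        n = len(sentence.split())  # crude stand-in for a real tokenizer
        if current and current_len + n > target_tokens:
            chunks.append(" ".join(current))
            # Walk backwards, carrying sentences until the overlap budget is used.
            carried = []
            carried_len = 0
            for prev in reversed(current):
                prev_len = len(prev.split())
                if carried_len + prev_len > overlap_tokens:
                    break
                carried.insert(0, prev)
                carried_len += prev_len
            current, current_len = carried, carried_len
        current.append(sentence)
        current_len += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Because the overlap is measured in whole sentences, the repeated portion may be shorter than `overlap_tokens` when the last sentence of a chunk is long.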
## Search Types
### Semantic Search
Uses pgvector cosine similarity with HNSW indexing for fast approximate nearest neighbor search.
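For reference, cosine similarity itself is easy to state in plain Python. The sketch below computes it exactly and filters by a minimum score; it assumes the `threshold` field of `search_knowledge` is a minimum similarity, which this README does not state explicitly. In pgvector the `<=>` operator returns cosine distance, so similarity is `1 - distance`, and the HNSW index approximates the ranking rather than scanning every row as this sketch does.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention for zero vectors in this sketch
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates: dict, threshold: float = 0.7) -> list:
    """Return (id, score) pairs at or above threshold, best first."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in candidates.items()]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```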
### Keyword Search
Uses PostgreSQL full-text search with ts_rank scoring.
### Hybrid Search
Combines semantic and keyword results using Reciprocal Rank Fusion (RRF):
- Default weights: 70% semantic, 30% keyword
- Configurable via settings
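A weighted form of RRF can be sketched as below. Each document scores `weight / (k + rank)` in every ranking where it appears, and the scores are summed. The constant `k = 60` is the value commonly used with RRF, and applying the 70/30 weights as multipliers is an assumption about how this server combines the two lists; the function name is hypothetical.

```python
def reciprocal_rank_fusion(
    semantic_ids,
    keyword_ids,
    semantic_weight: float = 0.7,
    keyword_weight: float = 0.3,
    k: int = 60,
) -> list:
    """Fuse two ranked ID lists into one list of (id, score), best first."""
    scores = {}
    for weight, ranking in (
        (semantic_weight, semantic_ids),
        (keyword_weight, keyword_ids),
    ):
        # Ranks are 1-based: the top result contributes weight / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

RRF uses only rank positions, so the cosine scores and `ts_rank` scores never need to be normalized onto a shared scale.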
## Security
- Input validation for all IDs and paths
- Path traversal prevention
- Content size limits (default 10 MB)
- Per-project data isolation
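The checks listed above might look roughly like the following sketch. The ID pattern, the interpretation of 10 MB as 10 × 1024 × 1024 bytes, and all function names are assumptions made for illustration; the server's real validation rules are not shown in this README.

```python
import posixpath
import re

# Assumption: IDs are short ASCII slugs such as "proj-123".
_ID_PATTERN = re.compile(r"^[A-Za-z0-9_-]{1,128}$")
# Assumption: the 10 MB default is counted in binary megabytes.
MAX_CONTENT_BYTES = 10 * 1024 * 1024


def validate_id(value: str) -> str:
    """Reject IDs containing anything other than letters, digits, '_' or '-'."""
    if not _ID_PATTERN.match(value):
        raise ValueError(f"invalid id: {value!r}")
    return value


def validate_source_path(path: str) -> str:
    """Reject NUL bytes and '..' segments, then return a normalized path."""
    if "\x00" in path:
        raise ValueError("path contains a NUL byte")
    segments = path.replace("\\", "/").split("/")
    if ".." in segments:
        raise ValueError("path traversal is not allowed")
    return posixpath.normpath(path)


def validate_content(content: str) -> str:
    """Enforce the size limit on the UTF-8 encoded content."""
    if len(content.encode("utf-8")) > MAX_CONTENT_BYTES:
        raise ValueError("content exceeds the size limit")
    return content
```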
## Testing
```bash
# Full test suite with coverage
IS_TEST=True uv run pytest -v --cov=. --cov-report=term-missing
# Specific test file
IS_TEST=True uv run pytest tests/test_server.py -v
```
## API Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/health` | GET | Health check with dependency status |
| `/mcp/tools` | GET | List available tools |
| `/mcp` | POST | JSON-RPC 2.0 tool execution |
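A tool call against the `/mcp` endpoint could be issued as in the sketch below, using only the standard library. The `tools/call` method name and the `params` layout follow the general MCP convention and are assumptions here; check the server's handler for the exact shape it expects. Both function names are hypothetical.

```python
import json
import urllib.request


def build_tool_call(tool: str, arguments: dict, request_id: int = 1) -> dict:
    """Wrap tool arguments in a JSON-RPC 2.0 envelope.

    Assumption: method "tools/call" with {"name", "arguments"} params.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


def call_tool(base_url: str, tool: str, arguments: dict, timeout: float = 10.0) -> dict:
    """POST the envelope to <base_url>/mcp and decode the JSON response."""
    body = json.dumps(build_tool_call(tool, arguments)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/mcp",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running locally on the configured port, `call_tool("http://localhost:8002", "list_collections", {"project_id": "proj-123", "agent_id": "agent-456"})` would send the `list_collections` example shown earlier.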


@@ -0,0 +1,129 @@
# LLM Gateway MCP Server
Unified LLM access with failover chains, cost tracking, and streaming support.
## Features
- **Multi-Provider Support**: Anthropic, OpenAI, Google, DeepSeek
- **Automatic Failover**: Circuit breaker with configurable thresholds
- **Cost Tracking**: Redis-based per-project/agent usage tracking
- **Streaming**: SSE support for real-time token delivery
- **Model Groups**: Pre-configured chains for different use cases
## Quick Start
```bash
# Install dependencies
uv sync
# Run tests
IS_TEST=True uv run pytest -v
# Start server
uv run python server.py
```
## Configuration
Environment variables:
```bash
LLM_GATEWAY_HOST=0.0.0.0
LLM_GATEWAY_PORT=8001
LLM_GATEWAY_DEBUG=false
LLM_GATEWAY_REDIS_URL=redis://localhost:6379/1
# Provider API keys
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
GOOGLE_API_KEY=...
DEEPSEEK_API_KEY=...
```
## MCP Tools
### chat_completion
Generate completions with automatic failover.
```json
{
"project_id": "proj-123",
"agent_id": "agent-456",
"messages": [{"role": "user", "content": "Hello"}],
"model_group": "reasoning",
"max_tokens": 4096,
"temperature": 0.7,
"stream": false
}
```
### count_tokens
Count tokens in text using tiktoken.
```json
{
"project_id": "proj-123",
"agent_id": "agent-456",
"text": "Hello, world!",
"model": "gpt-4"
}
```
### list_models
List available models by group.
```json
{
"project_id": "proj-123",
"agent_id": "agent-456",
"model_group": "code"
}
```
### get_usage
Get usage statistics for a project and agent over a given period.
```json
{
"project_id": "proj-123",
"agent_id": "agent-456",
"period": "day"
}
```
## Model Groups
| Group | Primary | Fallback 1 | Fallback 2 |
|-------|---------|------------|------------|
| reasoning | claude-opus-4-5 | gpt-4.1 | gemini-2.5-pro |
| code | claude-sonnet-4 | gpt-4.1 | deepseek-coder |
| fast | claude-haiku | gpt-4.1-mini | gemini-flash |
| vision | claude-sonnet-4 | gpt-4.1 | gemini-2.5-pro |
| embedding | text-embedding-3-large | voyage-3 | - |
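The failover behaviour implied by the table can be sketched as a loop over each group's chain. The model names are copied from the table; `complete_with_failover` and the `call_model` callback are hypothetical names, and the real gateway presumably also consults the circuit breaker before trying a provider.

```python
from typing import Callable

# Chains copied from the table above, in priority order.
MODEL_GROUPS = {
    "reasoning": ["claude-opus-4-5", "gpt-4.1", "gemini-2.5-pro"],
    "code": ["claude-sonnet-4", "gpt-4.1", "deepseek-coder"],
    "fast": ["claude-haiku", "gpt-4.1-mini", "gemini-flash"],
    "vision": ["claude-sonnet-4", "gpt-4.1", "gemini-2.5-pro"],
    "embedding": ["text-embedding-3-large", "voyage-3"],
}


def complete_with_failover(model_group: str, call_model: Callable[[str], str]) -> tuple:
    """Try each model in the group's chain; return (model, result) on first success."""
    errors = []
    for model in MODEL_GROUPS[model_group]:
        try:
            return model, call_model(model)
        except Exception as exc:  # any provider error triggers the next fallback
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))
```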
## Circuit Breaker
- **Threshold**: 5 consecutive failures
- **Cooldown**: 30 seconds
- **Half-Open**: After cooldown, allows one test request
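The three states above can be sketched as a small class. The defaults match the values listed; what happens after the test request (closing on success, reopening on failure) is the usual circuit breaker convention and an assumption about this gateway. The class and method names are hypothetical, and the clock is injectable so the behaviour can be tested without sleeping.

```python
import time


class CircuitBreaker:
    """Minimal closed / open / half-open breaker for one provider."""

    def __init__(self, failure_threshold=5, cooldown_seconds=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed
        self._probe_in_flight = False

    def allow_request(self) -> bool:
        if self._opened_at is None:
            return True  # closed: traffic flows normally
        if self._clock() - self._opened_at < self.cooldown_seconds:
            return False  # open: still cooling down
        if self._probe_in_flight:
            return False  # half-open: the single test request is already out
        self._probe_in_flight = True
        return True  # half-open: let exactly one test request through

    def record_success(self) -> None:
        # Assumption: any success closes the circuit and resets the counter.
        self._failures = 0
        self._opened_at = None
        self._probe_in_flight = False

    def record_failure(self) -> None:
        self._failures += 1
        # Open on reaching the threshold, or reopen if the test request failed.
        if self._opened_at is not None or self._failures >= self.failure_threshold:
            self._opened_at = self._clock()
            self._probe_in_flight = False
```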
## Testing
```bash
# Full test suite with coverage
IS_TEST=True uv run pytest -v --cov=. --cov-report=term-missing
# Specific test file
IS_TEST=True uv run pytest tests/test_server.py -v
```
## API Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/health` | GET | Health check |
| `/mcp/tools` | GET | List available tools |
| `/mcp` | POST | JSON-RPC 2.0 tool execution |