# MCP Servers Architecture
This document describes the Model Context Protocol (MCP) server architecture in Syndarix.
## Overview
Syndarix uses MCP servers to provide specialized capabilities to AI agents. Each MCP server exposes tools over JSON-RPC 2.0, and agents invoke them through the MCPClientManager.
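For orientation, a tool invocation routed through the manager might look like the sketch below; the method and parameter names are illustrative, not the actual `MCPClientManager` API:

```python
# Illustrative only -- the real MCPClientManager interface may differ.
result = await mcp_client_manager.call_tool(
    server="knowledge-base",      # resolved to the Knowledge Base server (port 8002)
    tool="search_knowledge",      # becomes the JSON-RPC "method" field
    params={
        "project_id": "demo-project",
        "agent_id": "agent-1",
        "query": "authentication flow",
    },
)
```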
## Architecture Diagram
```
┌─────────────────────────────────────────────────────────────────────┐
│                          Backend (FastAPI)                          │
│  ┌─────────────────────────────────────────────────────────────┐    │
│  │                      MCPClientManager                       │    │
│  │   - Connection pooling   - Health checks   - Tool routing   │    │
│  └──────────────────────────┬──────────────────────────────────┘    │
└─────────────────────────────┼───────────────────────────────────────┘
                              │ HTTP/JSON-RPC 2.0
          ┌───────────────────┼───────────────────┐
          │                   │                   │
          ▼                   ▼                   ▼
 ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
 │   LLM Gateway   │ │ Knowledge Base  │ │   Future MCP    │
 │   Port 8001     │ │   Port 8002     │ │    Servers      │
 │                 │ │                 │ │                 │
 │ - chat_complete │ │ - search        │ │ - git_ops       │
 │ - count_tokens  │ │ - ingest        │ │ - issues        │
 │ - list_models   │ │ - delete        │ │ - etc.          │
 │ - get_usage     │ │ - update        │ │                 │
 └────────┬────────┘ └────────┬────────┘ └─────────────────┘
          │                   │
          ▼                   ▼
 ┌─────────────────┐ ┌─────────────────┐
 │     LiteLLM     │ │   PostgreSQL    │
 │  (Anthropic,    │ │   + pgvector    │
 │   OpenAI, etc.) │ │                 │
 └─────────────────┘ └─────────────────┘
```
## MCP Servers
### 1. LLM Gateway (`mcp-servers/llm-gateway/`)

**Purpose:** Unified access to multiple LLM providers with failover, streaming, and cost tracking.

**Port:** 8001

**Tools:**
| Tool | Description |
|---|---|
| `chat_completion` | Generate completions with automatic failover |
| `count_tokens` | Count tokens in text using tiktoken |
| `list_models` | List available models by group |
| `get_usage` | Get token/cost usage statistics |
**Model Groups:**

- `reasoning`: Claude Opus 4.5 → GPT-4.1 → Gemini 2.5 Pro
- `code`: Claude Sonnet 4 → Codex → DeepSeek Coder
- `fast`: Claude Haiku → GPT-4.1 Mini → Gemini Flash
- `vision`: Claude Opus 4.5 → GPT-4.1 Vision
- `embedding`: text-embedding-3-large → voyage-3
**Features:**

- Circuit breaker for provider failures (5 failures → 30s cooldown; sketched after this list)
- Redis-based cost tracking per project/agent
- Streaming support via SSE
- Automatic failover chain
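A minimal sketch of how the failover chain and circuit breaker could interact, using the documented thresholds (5 failures, 30s cooldown). The class, the `call_model` parameter, and the model identifiers are illustrative, not the gateway's actual code:

```python
import time
from typing import Callable

# Illustrative model identifiers per group (see Model Groups above).
MODEL_GROUPS = {
    "reasoning": ["claude-opus-4-5", "gpt-4.1", "gemini-2.5-pro"],
    "fast": ["claude-haiku", "gpt-4.1-mini", "gemini-flash"],
}

class CircuitBreaker:
    """Skip a provider after 5 consecutive failures; retry after 30 s."""

    def __init__(self, threshold: int = 5, cooldown: float = 30.0) -> None:
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures: dict[str, int] = {}
        self.opened_at: dict[str, float] = {}

    def available(self, model: str) -> bool:
        if self.failures.get(model, 0) < self.threshold:
            return True
        # Half-open: allow one retry once the cooldown has elapsed.
        return time.monotonic() - self.opened_at[model] >= self.cooldown

    def record_failure(self, model: str) -> None:
        self.failures[model] = self.failures.get(model, 0) + 1
        if self.failures[model] >= self.threshold:
            self.opened_at[model] = time.monotonic()

    def record_success(self, model: str) -> None:
        self.failures.pop(model, None)
        self.opened_at.pop(model, None)

def complete_with_failover(group: str, prompt: str,
                           call_model: Callable[[str, str], str],
                           breaker: CircuitBreaker) -> str:
    """Try each model in the group's chain, skipping tripped providers."""
    for model in MODEL_GROUPS[group]:
        if not breaker.available(model):
            continue
        try:
            response = call_model(model, prompt)  # e.g. a LiteLLM completion call
            breaker.record_success(model)
            return response
        except Exception:
            breaker.record_failure(model)
    raise RuntimeError(f"All providers in group {group!r} are unavailable")
```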
### 2. Knowledge Base (`mcp-servers/knowledge-base/`)

**Purpose:** RAG capabilities with pgvector for semantic search, intelligent chunking, and collection management.

**Port:** 8002

**Tools:**
| Tool | Description |
|---|---|
| `search_knowledge` | Semantic, keyword, or hybrid search |
| `ingest_content` | Add content with automatic chunking |
| `delete_content` | Remove by source, collection, or IDs |
| `list_collections` | List collections in a project |
| `get_collection_stats` | Get collection statistics |
| `update_document` | Atomically replace document content |
**Chunking Strategies:**

- **Code**: AST-aware for Python, tree-sitter for JS/TS/Go/Rust
- **Markdown**: Heading-hierarchy aware, preserves structure
- **Text**: Sentence-based with configurable overlap (see the sketch below)
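As an illustration of the text strategy, a sentence-based splitter with overlap might look like this simplified sketch (the server's real chunker is presumably more involved):

```python
import re

def chunk_sentences(text: str, max_chars: int = 1000, overlap: int = 1) -> list[str]:
    """Greedily pack sentences into chunks of up to max_chars,
    carrying `overlap` trailing sentences into the next chunk."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for sentence in sentences:
        if current and size + len(sentence) > max_chars:
            chunks.append(" ".join(current))
            current = current[-overlap:] if overlap else []  # keep trailing context
            size = sum(len(s) + 1 for s in current)
        current.append(sentence)
        size += len(sentence) + 1
    if current:
        chunks.append(" ".join(current))
    return chunks
```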
**Search Types:**

- **Semantic**: pgvector cosine similarity (HNSW index)
- **Keyword**: PostgreSQL full-text search (`ts_rank`)
- **Hybrid**: Reciprocal Rank Fusion (RRF) combining both (sketched below)
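Reciprocal Rank Fusion scores each document by summing `1 / (k + rank)` over its rank in each result list; `k = 60` is the common choice, though the server's exact constant is not specified here. A minimal sketch:

```python
def rrf_merge(result_lists: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists of document IDs: each appearance
    contributes 1 / (k + rank) to the document's score."""
    scores: dict[str, float] = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Fusing a semantic and a keyword result list:
print(rrf_merge([["a", "b", "c"], ["b", "c", "d"]]))  # ['b', 'c', 'a', 'd']
```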
**Features:**

- Redis caching for embedding deduplication (sketched after this list)
- 1536-dimension embeddings via LLM Gateway
- Atomic document updates (delete + insert in transaction)
- Per-project collection isolation
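The embedding deduplication can be keyed on a hash of the chunk text so that identical chunks never trigger a second embedding call. In the sketch below, `embed_via_gateway` is a hypothetical helper, and the key format and TTL are invented for illustration:

```python
import hashlib
import json

async def get_embedding(redis, chunk: str) -> list[float]:
    """Return a cached embedding for this exact text, or embed and cache it."""
    key = "emb:" + hashlib.sha256(chunk.encode("utf-8")).hexdigest()
    cached = await redis.get(key)
    if cached is not None:
        return json.loads(cached)            # cache hit: no embedding call needed
    vector = await embed_via_gateway(chunk)  # hypothetical LLM Gateway call
    await redis.set(key, json.dumps(vector), ex=86400)  # assumed 24 h TTL
    return vector
```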
## Communication Protocol
All MCP servers use JSON-RPC 2.0 over HTTP:
### Tool Discovery
```
GET /mcp/tools

Response:
{ "tools": [{ "name": "...", "description": "...", "inputSchema": {...} }] }
```
### Tool Execution
```
POST /mcp

Request:
{
  "jsonrpc": "2.0",
  "method": "tool_name",
  "params": { "project_id": "...", "agent_id": "...", ... },
  "id": 1
}

Response:
{
  "jsonrpc": "2.0",
  "result": { "success": true, ... },
  "id": 1
}
```
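Putting discovery and execution together, a client round-trip might look like this sketch (using `httpx` against the Knowledge Base server; parameter names beyond `project_id` and `agent_id` are illustrative):

```python
import httpx

BASE = "http://localhost:8002"  # Knowledge Base MCP server

with httpx.Client() as client:
    # 1. Discover the tools this server exposes.
    tools = client.get(f"{BASE}/mcp/tools").json()["tools"]
    print([t["name"] for t in tools])

    # 2. Invoke one of them via JSON-RPC 2.0.
    reply = client.post(f"{BASE}/mcp", json={
        "jsonrpc": "2.0",
        "method": "search_knowledge",
        "params": {"project_id": "demo", "agent_id": "agent-1", "query": "vector search"},
        "id": 1,
    }).json()
    result = reply["result"]  # {"success": true, ...}
```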
### Health Check
```
GET /health

Response:
{ "status": "healthy", "dependencies": {...} }
```
## Configuration
### Environment Variables
**LLM Gateway:**

```bash
LLM_GATEWAY_HOST=0.0.0.0
LLM_GATEWAY_PORT=8001
LLM_GATEWAY_REDIS_URL=redis://redis:6379/1
ANTHROPIC_API_KEY=...
OPENAI_API_KEY=...
```
**Knowledge Base:**

```bash
KB_HOST=0.0.0.0
KB_PORT=8002
KB_DATABASE_URL=postgresql://...
KB_REDIS_URL=redis://redis:6379/2
KB_LLM_GATEWAY_URL=http://llm-gateway:8001
```
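The consistent `KB_`/`LLM_GATEWAY_` prefixes lend themselves to prefix-based settings loading, for example with pydantic-settings; this is an assumption about the mechanism, shown here for the Knowledge Base:

```python
from pydantic_settings import BaseSettings, SettingsConfigDict

class KBSettings(BaseSettings):
    # Assumed config style: every field is read from a KB_-prefixed env var.
    model_config = SettingsConfigDict(env_prefix="KB_")

    host: str = "0.0.0.0"
    port: int = 8002
    database_url: str
    redis_url: str
    llm_gateway_url: str = "http://llm-gateway:8001"

settings = KBSettings()  # reads KB_HOST, KB_PORT, KB_DATABASE_URL, ...
```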
## Security
### Input Validation
- `project_id`, `agent_id`: Alphanumeric + hyphens/underscores (1-128 chars)
- `collection`: Alphanumeric + hyphens/underscores (1-64 chars)
- `source_path`: No path traversal (`..`), no null bytes, max 4096 chars
- `content`: Max size limit (configurable, default 10MB)
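A sketch of these checks in plain Python (the servers' actual validators may differ in detail):

```python
import re

IDENTIFIER_RE = re.compile(r"^[A-Za-z0-9_-]{1,128}$")  # project_id / agent_id
COLLECTION_RE = re.compile(r"^[A-Za-z0-9_-]{1,64}$")
MAX_CONTENT_BYTES = 10 * 1024 * 1024                   # default limit, configurable

def validate_source_path(path: str) -> None:
    # Reject traversal segments, null bytes, and oversized paths.
    if len(path) > 4096 or "\x00" in path or ".." in path.split("/"):
        raise ValueError("invalid source_path")

def validate_content(content: str) -> None:
    if len(content.encode("utf-8")) > MAX_CONTENT_BYTES:
        raise ValueError("content exceeds maximum size")
```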
### Error Codes
| Code | Meaning |
|---|---|
| `INVALID_REQUEST` | Input validation failed |
| `NOT_FOUND` | Resource not found |
| `INTERNAL_ERROR` | Unexpected server error |
| `EMBEDDING_ERROR` | Embedding generation failed |
| `SEARCH_ERROR` | Search operation failed |
## Testing
```bash
# Run LLM Gateway tests
cd mcp-servers/llm-gateway
IS_TEST=True uv run pytest -v --cov=.

# Run Knowledge Base tests
cd mcp-servers/knowledge-base
IS_TEST=True uv run pytest -v --cov=.
```
## Adding New MCP Servers
- Create a directory under `mcp-servers/<name>/`
- Use FastMCP for tool registration
- Implement the `/health`, `/mcp/tools`, and `/mcp` endpoints (see the skeleton below)
- Add Docker configuration
- Register the server in the MCPClientManager config
- Add tests (>90% coverage target)
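A skeletal server satisfying the endpoint contract might look like the following. This sketch uses plain FastAPI for brevity, whereas the repo's convention is to register tools through FastMCP:

```python
from fastapi import FastAPI, Request

app = FastAPI()

# A single illustrative tool with its JSON Schema input description.
TOOLS = [{
    "name": "echo",
    "description": "Return the input text unchanged",
    "inputSchema": {
        "type": "object",
        "properties": {"text": {"type": "string"}},
        "required": ["text"],
    },
}]

@app.get("/health")
async def health():
    return {"status": "healthy", "dependencies": {}}

@app.get("/mcp/tools")
async def list_tools():
    return {"tools": TOOLS}

@app.post("/mcp")
async def handle_rpc(request: Request):
    body = await request.json()
    if body.get("method") == "echo":
        result = {"success": True, "text": body["params"]["text"]}
        return {"jsonrpc": "2.0", "result": result, "id": body.get("id")}
    # Standard JSON-RPC 2.0 "method not found" error.
    return {
        "jsonrpc": "2.0",
        "error": {"code": -32601, "message": "Method not found"},
        "id": body.get("id"),
    }
```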