# MCP Servers Architecture
This document describes the Model Context Protocol (MCP) server architecture in Syndarix.
## Overview
Syndarix uses MCP servers to provide specialized capabilities to AI agents. Each MCP server exposes tools over JSON-RPC 2.0, and agents invoke them through the MCPClientManager.
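For orientation, a tool invocation routed through the manager might look like the sketch below; the method and parameter names are illustrative, not the actual `MCPClientManager` API:

```python
# Illustrative only -- the real MCPClientManager interface may differ.
result = await mcp_client_manager.call_tool(
    server="knowledge-base",      # resolved to the Knowledge Base server (port 8002)
    tool="search_knowledge",      # becomes the JSON-RPC "method" field
    params={
        "project_id": "demo-project",
        "agent_id": "agent-1",
        "query": "authentication flow",
    },
)
```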
## Architecture Diagram
```
┌─────────────────────────────────────────────────────────────────────┐
│                          Backend (FastAPI)                          │
│  ┌─────────────────────────────────────────────────────────────┐    │
│  │                      MCPClientManager                       │    │
│  │   - Connection pooling   - Health checks   - Tool routing   │    │
│  └──────────────────────────┬──────────────────────────────────┘    │
└─────────────────────────────┼───────────────────────────────────────┘
                              │ HTTP/JSON-RPC 2.0
          ┌───────────────────┼───────────────────┐
          │                   │                   │
          ▼                   ▼                   ▼
 ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
 │   LLM Gateway   │ │ Knowledge Base  │ │   Future MCP    │
 │   Port 8001     │ │   Port 8002     │ │    Servers      │
 │                 │ │                 │ │                 │
 │ - chat_complete │ │ - search        │ │ - git_ops       │
 │ - count_tokens  │ │ - ingest        │ │ - issues        │
 │ - list_models   │ │ - delete        │ │ - etc.          │
 │ - get_usage     │ │ - update        │ │                 │
 └────────┬────────┘ └────────┬────────┘ └─────────────────┘
          │                   │
          ▼                   ▼
 ┌─────────────────┐ ┌─────────────────┐
 │     LiteLLM     │ │   PostgreSQL    │
 │  (Anthropic,    │ │   + pgvector    │
 │   OpenAI, etc.) │ │                 │
 └─────────────────┘ └─────────────────┘
```
## MCP Servers
### 1. LLM Gateway (`mcp-servers/llm-gateway/`)

**Purpose:** Unified access to multiple LLM providers with failover, streaming, and cost tracking.

**Port:** 8001

**Tools:**
| Tool | Description |
|---|---|
| `chat_completion` | Generate completions with automatic failover |
| `count_tokens` | Count tokens in text using tiktoken |
| `list_models` | List available models by group |
| `get_usage` | Get token/cost usage statistics |
**Model Groups:**

- `reasoning`: Claude Opus 4.5 → GPT-4.1 → Gemini 2.5 Pro
- `code`: Claude Sonnet 4 → Codex → DeepSeek Coder
- `fast`: Claude Haiku → GPT-4.1 Mini → Gemini Flash
- `vision`: Claude Opus 4.5 → GPT-4.1 Vision
- `embedding`: text-embedding-3-large → voyage-3
**Features:**

- Circuit breaker for provider failures (5 failures → 30s cooldown; sketched after this list)
- Redis-based cost tracking per project/agent
- Streaming support via SSE
- Automatic failover chain
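A minimal sketch of how the failover chain and circuit breaker could interact, using the documented thresholds (5 failures, 30s cooldown). The class, the `call_model` parameter, and the model identifiers are illustrative, not the gateway's actual code:

```python
import time
from typing import Callable

# Illustrative model identifiers per group (see Model Groups above).
MODEL_GROUPS = {
    "reasoning": ["claude-opus-4-5", "gpt-4.1", "gemini-2.5-pro"],
    "fast": ["claude-haiku", "gpt-4.1-mini", "gemini-flash"],
}

class CircuitBreaker:
    """Skip a provider after 5 consecutive failures; retry after 30 s."""

    def __init__(self, threshold: int = 5, cooldown: float = 30.0) -> None:
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures: dict[str, int] = {}
        self.opened_at: dict[str, float] = {}

    def available(self, model: str) -> bool:
        if self.failures.get(model, 0) < self.threshold:
            return True
        # Half-open: allow one retry once the cooldown has elapsed.
        return time.monotonic() - self.opened_at[model] >= self.cooldown

    def record_failure(self, model: str) -> None:
        self.failures[model] = self.failures.get(model, 0) + 1
        if self.failures[model] >= self.threshold:
            self.opened_at[model] = time.monotonic()

    def record_success(self, model: str) -> None:
        self.failures.pop(model, None)
        self.opened_at.pop(model, None)

def complete_with_failover(group: str, prompt: str,
                           call_model: Callable[[str, str], str],
                           breaker: CircuitBreaker) -> str:
    """Try each model in the group's chain, skipping tripped providers."""
    for model in MODEL_GROUPS[group]:
        if not breaker.available(model):
            continue
        try:
            response = call_model(model, prompt)  # e.g. a LiteLLM completion call
            breaker.record_success(model)
            return response
        except Exception:
            breaker.record_failure(model)
    raise RuntimeError(f"All providers in group {group!r} are unavailable")
```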
### 2. Knowledge Base (`mcp-servers/knowledge-base/`)

**Purpose:** RAG capabilities with pgvector for semantic search, intelligent chunking, and collection management.

**Port:** 8002

**Tools:**
| Tool | Description |
|---|---|
| `search_knowledge` | Semantic, keyword, or hybrid search |
| `ingest_content` | Add content with automatic chunking |
| `delete_content` | Remove by source, collection, or IDs |
| `list_collections` | List collections in a project |
| `get_collection_stats` | Get collection statistics |
| `update_document` | Atomically replace document content |
**Chunking Strategies:**

- **Code**: AST-aware for Python, tree-sitter for JS/TS/Go/Rust
- **Markdown**: Heading-hierarchy aware, preserves structure
- **Text**: Sentence-based with configurable overlap (see the sketch below)
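As an illustration of the text strategy, a sentence-based splitter with overlap might look like this simplified sketch (the server's real chunker is presumably more involved):

```python
import re

def chunk_sentences(text: str, max_chars: int = 1000, overlap: int = 1) -> list[str]:
    """Greedily pack sentences into chunks of up to max_chars,
    carrying `overlap` trailing sentences into the next chunk."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for sentence in sentences:
        if current and size + len(sentence) > max_chars:
            chunks.append(" ".join(current))
            current = current[-overlap:] if overlap else []  # keep trailing context
            size = sum(len(s) + 1 for s in current)
        current.append(sentence)
        size += len(sentence) + 1
    if current:
        chunks.append(" ".join(current))
    return chunks
```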
**Search Types:**

- **Semantic**: pgvector cosine similarity (HNSW index)
- **Keyword**: PostgreSQL full-text search (`ts_rank`)
- **Hybrid**: Reciprocal Rank Fusion (RRF) combining both (sketched below)
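Reciprocal Rank Fusion scores each document by summing `1 / (k + rank)` over its rank in each result list; `k = 60` is the common choice, though the server's exact constant is not specified here. A minimal sketch:

```python
def rrf_merge(result_lists: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists of document IDs: each appearance
    contributes 1 / (k + rank) to the document's score."""
    scores: dict[str, float] = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Fusing a semantic and a keyword result list:
print(rrf_merge([["a", "b", "c"], ["b", "c", "d"]]))  # ['b', 'c', 'a', 'd']
```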
**Features:**

- Redis caching for embedding deduplication (sketched after this list)
- 1536-dimension embeddings via LLM Gateway
- Atomic document updates (delete + insert in transaction)
- Per-project collection isolation
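The embedding deduplication can be keyed on a hash of the chunk text so that identical chunks never trigger a second embedding call. In the sketch below, `embed_via_gateway` is a hypothetical helper, and the key format and TTL are invented for illustration:

```python
import hashlib
import json

async def get_embedding(redis, chunk: str) -> list[float]:
    """Return a cached embedding for this exact text, or embed and cache it."""
    key = "emb:" + hashlib.sha256(chunk.encode("utf-8")).hexdigest()
    cached = await redis.get(key)
    if cached is not None:
        return json.loads(cached)            # cache hit: no embedding call needed
    vector = await embed_via_gateway(chunk)  # hypothetical LLM Gateway call
    await redis.set(key, json.dumps(vector), ex=86400)  # assumed 24 h TTL
    return vector
```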
## Communication Protocol
All MCP servers use JSON-RPC 2.0 over HTTP:
### Tool Discovery
```
GET /mcp/tools

Response:
{ "tools": [{ "name": "...", "description": "...", "inputSchema": {...} }] }
```
### Tool Execution
```
POST /mcp

Request:
{
  "jsonrpc": "2.0",
  "method": "tool_name",
  "params": { "project_id": "...", "agent_id": "...", ... },
  "id": 1
}

Response:
{
  "jsonrpc": "2.0",
  "result": { "success": true, ... },
  "id": 1
}
```
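Putting discovery and execution together, a client round-trip might look like this sketch (using `httpx` against the Knowledge Base server; parameter names beyond `project_id` and `agent_id` are illustrative):

```python
import httpx

BASE = "http://localhost:8002"  # Knowledge Base MCP server

with httpx.Client() as client:
    # 1. Discover the tools this server exposes.
    tools = client.get(f"{BASE}/mcp/tools").json()["tools"]
    print([t["name"] for t in tools])

    # 2. Invoke one of them via JSON-RPC 2.0.
    reply = client.post(f"{BASE}/mcp", json={
        "jsonrpc": "2.0",
        "method": "search_knowledge",
        "params": {"project_id": "demo", "agent_id": "agent-1", "query": "vector search"},
        "id": 1,
    }).json()
    result = reply["result"]  # {"success": true, ...}
```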
### Health Check
```
GET /health

Response:
{ "status": "healthy", "dependencies": {...} }
```
## Configuration
### Environment Variables
**LLM Gateway:**

```bash
LLM_GATEWAY_HOST=0.0.0.0
LLM_GATEWAY_PORT=8001
LLM_GATEWAY_REDIS_URL=redis://redis:6379/1
ANTHROPIC_API_KEY=...
OPENAI_API_KEY=...
```
**Knowledge Base:**

```bash
KB_HOST=0.0.0.0
KB_PORT=8002
KB_DATABASE_URL=postgresql://...
KB_REDIS_URL=redis://redis:6379/2
KB_LLM_GATEWAY_URL=http://llm-gateway:8001
```
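The consistent `KB_`/`LLM_GATEWAY_` prefixes lend themselves to prefix-based settings loading, for example with pydantic-settings; this is an assumption about the mechanism, shown here for the Knowledge Base:

```python
from pydantic_settings import BaseSettings, SettingsConfigDict

class KBSettings(BaseSettings):
    # Assumed config style: every field is read from a KB_-prefixed env var.
    model_config = SettingsConfigDict(env_prefix="KB_")

    host: str = "0.0.0.0"
    port: int = 8002
    database_url: str
    redis_url: str
    llm_gateway_url: str = "http://llm-gateway:8001"

settings = KBSettings()  # reads KB_HOST, KB_PORT, KB_DATABASE_URL, ...
```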
## Security
### Input Validation
- `project_id`, `agent_id`: Alphanumeric + hyphens/underscores (1-128 chars)
- `collection`: Alphanumeric + hyphens/underscores (1-64 chars)
- `source_path`: No path traversal (`..`), no null bytes, max 4096 chars
- `content`: Max size limit (configurable, default 10MB)
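A sketch of these checks in plain Python (the servers' actual validators may differ in detail):

```python
import re

IDENTIFIER_RE = re.compile(r"^[A-Za-z0-9_-]{1,128}$")  # project_id / agent_id
COLLECTION_RE = re.compile(r"^[A-Za-z0-9_-]{1,64}$")
MAX_CONTENT_BYTES = 10 * 1024 * 1024                   # default limit, configurable

def validate_source_path(path: str) -> None:
    # Reject traversal segments, null bytes, and oversized paths.
    if len(path) > 4096 or "\x00" in path or ".." in path.split("/"):
        raise ValueError("invalid source_path")

def validate_content(content: str) -> None:
    if len(content.encode("utf-8")) > MAX_CONTENT_BYTES:
        raise ValueError("content exceeds maximum size")
```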
### Error Codes
| Code | Meaning |
|---|---|
| `INVALID_REQUEST` | Input validation failed |
| `NOT_FOUND` | Resource not found |
| `INTERNAL_ERROR` | Unexpected server error |
| `EMBEDDING_ERROR` | Embedding generation failed |
| `SEARCH_ERROR` | Search operation failed |
## Testing
```bash
# Run LLM Gateway tests
cd mcp-servers/llm-gateway
IS_TEST=True uv run pytest -v --cov=.

# Run Knowledge Base tests
cd mcp-servers/knowledge-base
IS_TEST=True uv run pytest -v --cov=.
```
## Adding New MCP Servers
- Create a directory under `mcp-servers/<name>/`
- Use FastMCP for tool registration
- Implement the `/health`, `/mcp/tools`, and `/mcp` endpoints (see the skeleton below)
- Add Docker configuration
- Register the server in the MCPClientManager config
- Add tests (>90% coverage target)
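A skeletal server satisfying the endpoint contract might look like the following. This sketch uses plain FastAPI for brevity, whereas the repo's convention is to register tools through FastMCP:

```python
from fastapi import FastAPI, Request

app = FastAPI()

# A single illustrative tool with its JSON Schema input description.
TOOLS = [{
    "name": "echo",
    "description": "Return the input text unchanged",
    "inputSchema": {
        "type": "object",
        "properties": {"text": {"type": "string"}},
        "required": ["text"],
    },
}]

@app.get("/health")
async def health():
    return {"status": "healthy", "dependencies": {}}

@app.get("/mcp/tools")
async def list_tools():
    return {"tools": TOOLS}

@app.post("/mcp")
async def handle_rpc(request: Request):
    body = await request.json()
    if body.get("method") == "echo":
        result = {"success": True, "text": body["params"]["text"]}
        return {"jsonrpc": "2.0", "result": result, "id": body.get("id")}
    # Standard JSON-RPC 2.0 "method not found" error.
    return {
        "jsonrpc": "2.0",
        "error": {"code": -32601, "message": "Method not found"},
        "id": body.get("id"),
    }
```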