syndarix/docs/architecture/ARCHITECTURE_OVERVIEW.md
Felipe Cardoso 6e3cdebbfb docs: add architecture decision records (ADRs) for key technical choices
- Added the following ADRs to `docs/adrs/` directory:
  - ADR-001: MCP Integration Architecture
  - ADR-002: Real-time Communication Architecture
  - ADR-003: Background Task Architecture
  - ADR-004: LLM Provider Abstraction
  - ADR-005: Technology Stack Selection
- Each ADR details the context, decision drivers, considered options, final decisions, and implementation plans.
- Documentation aligns technical choices with architecture principles and system requirements for Syndarix.
2025-12-29 13:16:02 +01:00


# Syndarix Architecture Overview
**Version:** 1.0
**Date:** 2025-12-29
**Status:** Draft
---
## Table of Contents
1. [Executive Summary](#1-executive-summary)
2. [System Context](#2-system-context)
3. [High-Level Architecture](#3-high-level-architecture)
4. [Core Components](#4-core-components)
5. [Data Architecture](#5-data-architecture)
6. [Integration Architecture](#6-integration-architecture)
7. [Security Architecture](#7-security-architecture)
8. [Deployment Architecture](#8-deployment-architecture)
9. [Cross-Cutting Concerns](#9-cross-cutting-concerns)
10. [Architecture Decisions](#10-architecture-decisions)
---
## 1. Executive Summary
Syndarix is an AI-powered software consulting agency platform that orchestrates specialized AI agents to deliver complete software solutions autonomously. This document describes the technical architecture that enables:
- **Multi-Agent Orchestration:** 10 specialized agent roles collaborating on projects
- **MCP-First Integration:** All external tools via Model Context Protocol
- **Real-time Visibility:** SSE-based event streaming for progress tracking
- **Autonomous Workflows:** Configurable autonomy levels, from full human control to fully autonomous operation
- **Full Artifact Delivery:** Code, documentation, tests, and ADRs
### Architecture Principles
1. **MCP-First:** All integrations through unified MCP servers
2. **Event-Driven:** Async communication via Redis Pub/Sub
3. **Type-Safe:** Full typing in Python and TypeScript
4. **Stateless Services:** Horizontal scaling through stateless design
5. **Explicit Scoping:** All operations scoped to project/agent
---
## 2. System Context
### Context Diagram
```
┌─────────────────────────────────────────────────────────────────────────────┐
│                               EXTERNAL ACTORS                               │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐   │
│  │   Client    │    │    Admin    │    │  LLM APIs   │    │  Git Hosts  │   │
│  │   (Human)   │    │   (Human)   │    │ (Anthropic) │    │   (Gitea)   │   │
│  └──────┬──────┘    └──────┬──────┘    └──────┬──────┘    └──────┬──────┘   │
│         │                  │                  │                  │          │
└─────────│──────────────────│──────────────────│──────────────────│──────────┘
          │                  │                  │                  │
          │ Web UI           │ Admin UI         │ API              │ API
          │ SSE              │                  │                  │
          ▼                  ▼                  ▼                  ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                                                                             │
│                              SYNDARIX PLATFORM                              │
│                                                                             │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                         Agent Orchestration                         │   │
│   │   ┌────────┐  ┌────────┐  ┌────────┐  ┌────────┐  ┌────────┐        │   │
│   │   │   PO   │  │   PM   │  │  Arch  │  │  Eng   │  │   QA   │  ...   │   │
│   │   └────────┘  └────────┘  └────────┘  └────────┘  └────────┘        │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
          │                  │                  │                  │
          │ Storage          │ Events           │ Tasks            │
          ▼                  ▼                  ▼                  ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                               INFRASTRUCTURE                                │
├─────────────────────────────────────────────────────────────────────────────┤
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐   │
│  │ PostgreSQL  │    │    Redis    │    │   Celery    │    │ MCP Servers │   │
│  │ + pgvector  │    │   Pub/Sub   │    │   Workers   │    │  (7 types)  │   │
│  └─────────────┘    └─────────────┘    └─────────────┘    └─────────────┘   │
└─────────────────────────────────────────────────────────────────────────────┘
```
### Key Actors
| Actor | Type | Interaction |
|-------|------|-------------|
| Client | Human | Web UI, approvals, feedback |
| Admin | Human | Configuration, monitoring |
| LLM Providers | External | Claude, GPT-4, local models |
| Git Hosts | External | Gitea, GitHub, GitLab |
| CI/CD Systems | External | Gitea Actions, etc. |
---
## 3. High-Level Architecture
### Layered Architecture
```
┌───────────────────────────────────────────────────────────────────┐
│ PRESENTATION LAYER │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ Next.js 16 Frontend │ │
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │
│ │ │Dashboard │ │ Projects │ │ Agents │ │ Issues │ │ │
│ │ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │ │
│ └─────────────────────────────────────────────────────────────┘ │
└───────────────────────────────────────────────────────────────────┘
│ REST + SSE + WebSocket
┌───────────────────────────────────────────────────────────────────┐
│ APPLICATION LAYER │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ FastAPI Backend │ │
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │
│ │ │ Auth │ │ API │ │ Services │ │ Events │ │ │
│ │ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │ │
│ └─────────────────────────────────────────────────────────────┘ │
└───────────────────────────────────────────────────────────────────┘
┌───────────────────────────────────────────────────────────────────┐
│ ORCHESTRATION LAYER │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ ┌───────────────┐ ┌───────────────┐ ┌───────────────┐ │ │
│ │ │ Agent │ │ Workflow │ │ Project │ │ │
│ │ │ Orchestrator │ │ Engine │ │ Manager │ │ │
│ │ └───────────────┘ └───────────────┘ └───────────────┘ │ │
│ └─────────────────────────────────────────────────────────────┘ │
└───────────────────────────────────────────────────────────────────┘
┌───────────────────────────────────────────────────────────────────┐
│ INTEGRATION LAYER │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ MCP Client Manager │ │
│ │ Connects to: LLM, Git, KB, Issues, FS, Code, CI/CD MCPs │ │
│ └─────────────────────────────────────────────────────────────┘ │
└───────────────────────────────────────────────────────────────────┘
┌───────────────────────────────────────────────────────────────────┐
│ DATA LAYER │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ PostgreSQL │ │ Redis │ │ File Store │ │
│ │ + pgvector │ │ │ │ │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└───────────────────────────────────────────────────────────────────┘
```
---
## 4. Core Components
### 4.1 Agent Orchestrator
**Purpose:** Manages agent lifecycle, spawning, communication, and coordination.
**Responsibilities:**
- Spawn agent instances from type definitions
- Route messages between agents
- Manage agent context and memory
- Handle agent failover
- Track resource usage
**Key Patterns:**
- Type-Instance pattern (types define templates, instances are runtime)
- Message routing with priority queues
- Context compression for long-running agents
See: [ADR-006: Agent Orchestration](../adrs/ADR-006-agent-orchestration.md)
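The Type-Instance pattern and priority-based message routing can be illustrated with a minimal sketch. The class names, the fields, and the convention that a lower number means higher priority are assumptions made for this example; they are not taken from the Syndarix codebase or from ADR-006.

```python
import heapq
import itertools
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass(frozen=True)
class AgentType:
    """Template shared by every instance of a role (hypothetical fields)."""
    name: str
    base_model: str
    system_prompt: str


@dataclass
class AgentInstance:
    """Runtime agent created from a type and bound to one project."""
    agent_type: AgentType
    project_id: str
    instance_id: str = field(default_factory=lambda: uuid4().hex)


class MessageRouter:
    """Delivers the lowest priority number first, FIFO within a priority."""

    def __init__(self) -> None:
        self._heap = []
        self._counter = itertools.count()  # tie-breaker that preserves send order

    def send(self, recipient_id: str, body: str, priority: int = 5) -> None:
        heapq.heappush(self._heap, (priority, next(self._counter), recipient_id, body))

    def next_message(self):
        """Return (recipient_id, body), or None when the queue is empty."""
        if not self._heap:
            return None
        _, _, recipient_id, body = heapq.heappop(self._heap)
        return recipient_id, body
```

Two instances spawned from the same `AgentType` share the template but receive distinct `instance_id` values, which is what lets the router address them individually.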
### 4.2 Workflow Engine
**Purpose:** Orchestrates multi-step workflows and agent collaboration.
**Responsibilities:**
- Execute workflow templates (requirements discovery, sprint, etc.)
- Track workflow state and progress
- Handle branching and conditions
- Manage approval gates
**Workflow Types:**
- Requirements Discovery
- Architecture Spike
- Sprint Planning
- Implementation
- Sprint Demo
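As a rough illustration of state tracking and approval gates, the sketch below models a workflow as an ordered list of steps. The `Workflow`, `Step`, and `StepStatus` names are hypothetical, and branching and conditions are deliberately left out.

```python
from dataclasses import dataclass
from enum import Enum


class StepStatus(Enum):
    PENDING = "pending"
    AWAITING_APPROVAL = "awaiting_approval"
    DONE = "done"


@dataclass
class Step:
    name: str
    requires_approval: bool = False
    status: StepStatus = StepStatus.PENDING


class Workflow:
    """Linear workflow that pauses at steps marked as approval gates."""

    def __init__(self, name: str, steps: list) -> None:
        self.name = name
        self.steps = steps

    def current(self):
        """First step that is not done, or None when the workflow is finished."""
        return next((s for s in self.steps if s.status is not StepStatus.DONE), None)

    def advance(self):
        """Complete the current step, or park it at its approval gate."""
        step = self.current()
        if step is None or step.status is StepStatus.AWAITING_APPROVAL:
            return step  # finished, or blocked until approve() is called
        step.status = (
            StepStatus.AWAITING_APPROVAL if step.requires_approval else StepStatus.DONE
        )
        return step

    def approve(self) -> None:
        """Release the gate on the current step, if it is waiting."""
        step = self.current()
        if step is not None and step.status is StepStatus.AWAITING_APPROVAL:
            step.status = StepStatus.DONE

    def progress(self) -> float:
        """Fraction of steps completed, between 0.0 and 1.0."""
        done = sum(s.status is StepStatus.DONE for s in self.steps)
        return done / len(self.steps) if self.steps else 1.0
```

Calling `advance()` on a gated step never completes it; only `approve()` does, which keeps the human decision explicit in the workflow state.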
### 4.3 Project Manager (Component)
**Purpose:** Manages project lifecycle, configuration, and state.
**Responsibilities:**
- Create and configure projects
- Manage complexity levels
- Track project status
- Generate reports
### 4.4 LLM Gateway
**Purpose:** Unified LLM access with failover and cost tracking.
**Implementation:** LiteLLM-based router with:
- Multiple model groups (high-reasoning, fast-response)
- Automatic failover chain
- Per-agent token tracking
- Redis-backed caching
See: [ADR-004: LLM Provider Abstraction](../adrs/ADR-004-llm-provider-abstraction.md)
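The failover chain and per-agent token tracking can be sketched independently of any provider library. The example below does not use the LiteLLM API; the model names, the `Gateway` class, and the injected `call_model` function are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical model groups; names and ordering are illustrative only.
MODEL_GROUPS = {
    "high-reasoning": ["primary-large", "secondary-large", "local-large"],
    "fast-response": ["primary-small", "local-small"],
}


class AllModelsFailed(RuntimeError):
    """Raised when every model in a group has been tried without success."""


class Gateway:
    def __init__(self, call_model) -> None:
        # call_model(model, prompt) must return (text, tokens_used) or raise.
        self._call_model = call_model
        self.tokens_by_agent = defaultdict(int)

    def complete(self, group: str, prompt: str, agent_id: str) -> str:
        """Try each model in the group in order and return the first success."""
        errors = []
        for model in MODEL_GROUPS[group]:
            try:
                text, tokens = self._call_model(model, prompt)
            except Exception as exc:  # any provider error triggers failover
                errors.append(f"{model}: {exc}")
                continue
            self.tokens_by_agent[agent_id] += tokens
            return text
        raise AllModelsFailed("; ".join(errors))
```

Injecting `call_model` keeps the failover logic testable without network access; a real gateway would plug the provider client in at that point.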
### 4.5 MCP Client Manager
**Purpose:** Connects to all MCP servers and routes tool calls.
**Implementation:**
- SSE connections to 7 MCP server types
- Automatic reconnection
- Request/response correlation
- Scoped tool calls with project_id/agent_id
See: [ADR-001: MCP Integration Architecture](../adrs/ADR-001-mcp-integration-architecture.md)
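Scoped tool calls and request/response correlation might look like the following sketch. It models only the bookkeeping, not the MCP wire protocol, and the `ScopedToolCaller` and `ToolCall` names and the tool names in the usage note are hypothetical.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    request_id: int
    server: str
    tool: str
    arguments: dict


class ScopedToolCaller:
    """Stamps every call with project_id/agent_id and tracks pending requests."""

    def __init__(self, project_id: str, agent_id: str) -> None:
        self.project_id = project_id
        self.agent_id = agent_id
        self._ids = itertools.count(1)
        self._pending = {}

    def build(self, server: str, tool: str, **arguments) -> ToolCall:
        # Scope values are written last so caller arguments cannot override them.
        scoped = {**arguments, "project_id": self.project_id, "agent_id": self.agent_id}
        call = ToolCall(next(self._ids), server, tool, scoped)
        self._pending[call.request_id] = call
        return call

    def resolve(self, request_id: int, result):
        """Match a response to its request; raises KeyError for unknown ids."""
        call = self._pending.pop(request_id)
        return call, result
```

For example, `ScopedToolCaller("proj-1", "agent-7").build("git", "create_branch", name="feature/x")` yields a call whose arguments carry both scope identifiers alongside `name`.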
### 4.6 Event Bus
**Purpose:** Real-time event distribution using Redis Pub/Sub.
**Channels:**
- `project:{project_id}` - Project-scoped events
- `agent:{agent_id}` - Agent-specific events
- `system` - System-wide announcements
See: [ADR-002: Real-time Communication](../adrs/ADR-002-realtime-communication.md)
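The channel naming scheme above is simple enough to capture in a few helpers. The sketch below also formats a payload as a Server-Sent Events frame; the helper names and the `agent_status` event type are assumptions, and no Redis client is involved.

```python
import json

SYSTEM_CHANNEL = "system"


def project_channel(project_id: str) -> str:
    return f"project:{project_id}"


def agent_channel(agent_id: str) -> str:
    return f"agent:{agent_id}"


def channels_for(project_id: str, agent_ids: list) -> list:
    """Channels a client watching one project and some agents would subscribe to."""
    return [project_channel(project_id), *map(agent_channel, agent_ids), SYSTEM_CHANNEL]


def to_sse(event_type: str, payload: dict) -> str:
    """Serialize an event as an SSE frame: 'event:' and 'data:' lines, then a blank line."""
    return f"event: {event_type}\ndata: {json.dumps(payload)}\n\n"
```

Keeping channel construction in one place avoids subtle mismatches between publishers and subscribers, such as `project-{id}` versus `project:{id}`.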
---
## 5. Data Architecture
### 5.1 Entity Model
```
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ User │───1:N─│ Project │───1:N─│ Sprint │
└─────────────┘ └─────────────┘ └─────────────┘
│ 1:N │ 1:N
│ │
┌──────┴──────┐ ┌──────┴──────┐
│ │ │ │
┌──────┴──────┐ ┌────┴────┐ │ ┌─────┴─────┐
│ AgentInstance│ │Repository│ │ │ Issue │
└─────────────┘ └─────────┘ │ └───────────┘
│ │ │ │
│ 1:N │ 1:N │ │ 1:N
┌──────┴──────┐ ┌──────┴────┐│ ┌──────┴──────┐
│ Message │ │PullRequest│└───────│IssueComment │
└─────────────┘ └───────────┘ └─────────────┘
```
### 5.2 Key Entities
| Entity | Purpose | Key Fields |
|--------|---------|------------|
| User | Human users | email, auth |
| Project | Work containers | name, complexity, autonomy_level |
| AgentType | Agent templates | base_model, expertise, system_prompt |
| AgentInstance | Running agents | name, project_id, context |
| Issue | Work items | type, status, external_tracker_fields |
| Sprint | Time-boxed iterations | goal, velocity |
| Repository | Git repos | provider, clone_url |
| KnowledgeDocument | RAG documents | content, embedding_id |
### 5.3 Vector Storage
**pgvector** extension for:
- Document embeddings (RAG)
- Semantic search across knowledge base
- Agent context similarity
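To make the idea of semantic search concrete, the sketch below ranks documents by cosine similarity in plain Python. In the platform itself this comparison would be delegated to pgvector inside PostgreSQL; the in-memory dictionary and the function names here are purely illustrative.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list, documents: dict, k: int = 3) -> list:
    """Return the ids of the k documents whose embeddings are closest to the query."""
    scored = [(cosine_similarity(query, emb), doc_id) for doc_id, emb in documents.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]
```

With embeddings `{"a": [1, 0], "b": [0, 1], "c": [1, 1]}` and query `[1, 0.1]`, the two nearest documents are `a` and then `c`.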
---
## 6. Integration Architecture
### 6.1 MCP Server Registry
| Server | Port | Purpose | Priority Providers |
|--------|------|---------|-------------------|
| LLM Gateway | 9001 | LLM routing | Anthropic, OpenAI, Ollama |
| Git MCP | 9002 | Git operations | Gitea, GitHub, GitLab |
| Knowledge Base | 9003 | RAG search | pgvector |
| Issues MCP | 9004 | Issue tracking | Gitea, GitHub, GitLab |
| File System | 9005 | Workspace files | Local FS |
| Code Analysis | 9006 | Static analysis | Ruff, ESLint |
| CI/CD MCP | 9007 | Pipelines | Gitea Actions |
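The registry above could be expressed as data so that clients look servers up by name rather than hard-coding ports. The short server keys, the `localhost` default, and the plain `http` scheme are assumptions for this sketch; only the ports and purposes come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class McpServer:
    name: str
    port: int
    purpose: str


MCP_SERVERS = [
    McpServer("llm-gateway", 9001, "LLM routing"),
    McpServer("git", 9002, "Git operations"),
    McpServer("knowledge-base", 9003, "RAG search"),
    McpServer("issues", 9004, "Issue tracking"),
    McpServer("file-system", 9005, "Workspace files"),
    McpServer("code-analysis", 9006, "Static analysis"),
    McpServer("ci-cd", 9007, "Pipelines"),
]


def find_server(name: str) -> McpServer:
    """Look a server up by its (hypothetical) short name."""
    for server in MCP_SERVERS:
        if server.name == name:
            return server
    raise KeyError(name)


def base_url(server: McpServer, host: str = "localhost") -> str:
    """Build a base URL; host and scheme are placeholders for local development."""
    return f"http://{host}:{server.port}"
```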
### 6.2 External Integration Diagram
```
┌─────────────────────────────────────────────────────────────────┐
│ Syndarix Backend │
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ MCP Client Manager │ │
│ │ │ │
│ │ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ │ │
│ │ │ LLM │ │ Git │ │ KB │ │ Issues │ │ CI/CD │ │ │
│ │ │ Client │ │ Client │ │ Client │ │ Client │ │ Client │ │ │
│ │ └───┬────┘ └───┬────┘ └───┬────┘ └───┬────┘ └───┬────┘ │ │
│ └──────│──────────│──────────│──────────│──────────│──────┘ │
└─────────│──────────│──────────│──────────│──────────│──────────┘
│ │ │ │ │
│ SSE │ SSE │ SSE │ SSE │ SSE
▼ ▼ ▼ ▼ ▼
┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐
│ LLM │ │ Git │ │ KB │ │ Issues │ │ CI/CD │
│ MCP │ │ MCP │ │ MCP │ │ MCP │ │ MCP │
│ Server │ │ Server │ │ Server │ │ Server │ │ Server │
└───┬────┘ └───┬────┘ └───┬────┘ └───┬────┘ └───┬────┘
│ │ │ │ │
▼ ▼ ▼ ▼ ▼
┌─────────┐ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐
│Anthropic│ │ Gitea │ │pgvector│ │ Gitea │ │ Gitea │
│ OpenAI │ │ GitHub │ │ │ │ Issues │ │Actions │
│ Ollama │ │ GitLab │ │ │ │ │ │ │
└─────────┘ └────────┘ └────────┘ └────────┘ └────────┘
```
---
## 7. Security Architecture
### 7.1 Authentication
- **JWT Dual-Token:** Access token (15 min) + Refresh token (7 days)
- **OAuth 2.0 Provider:** For MCP client authentication
- **Service Tokens:** Internal service-to-service auth
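The dual-token lifetimes can be demonstrated with a standard-library-only HS256 sketch. This is an illustration of the idea, not the platform's authentication code: the claim names, the function names, and the choice of HS256 are assumptions, and a production system would normally rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json

ACCESS_TTL = 15 * 60            # 15 minutes, in seconds
REFRESH_TTL = 7 * 24 * 60 * 60  # 7 days, in seconds


def _b64(data: bytes) -> str:
    """URL-safe base64 without padding, as used by JWT."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _sign(secret: bytes, signing_input: str) -> str:
    return _b64(hmac.new(secret, signing_input.encode(), hashlib.sha256).digest())


def encode(claims: dict, secret: bytes) -> str:
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64(json.dumps(claims).encode())
    return f"{header}.{body}.{_sign(secret, header + '.' + body)}"


def decode(token: str, secret: bytes, now: int) -> dict:
    """Verify signature and expiry; raises ValueError on failure."""
    header, body, signature = token.split(".")
    if not hmac.compare_digest(signature, _sign(secret, header + "." + body)):
        raise ValueError("bad signature")
    claims = json.loads(base64.urlsafe_b64decode(body + "=" * (-len(body) % 4)))
    if claims["exp"] <= now:
        raise ValueError("token expired")
    return claims


def issue_pair(user_id: str, secret: bytes, now: int) -> tuple:
    """Return (access_token, refresh_token) with the lifetimes listed above."""
    access = encode({"sub": user_id, "type": "access", "exp": now + ACCESS_TTL}, secret)
    refresh = encode({"sub": user_id, "type": "refresh", "exp": now + REFRESH_TTL}, secret)
    return access, refresh
```

Passing `now` explicitly, rather than reading the clock inside the functions, keeps expiry behavior deterministic in tests.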
### 7.2 Authorization
- **RBAC:** Role-based access control
- **Project Scoping:** All operations scoped to projects
- **Agent Permissions:** Agents operate within project scope
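A combined role and project-scope check can be sketched as below. The role names, permission strings, and the `Principal` type are hypothetical; the point is only that both conditions must hold before an operation is allowed.

```python
from dataclasses import dataclass

# Hypothetical role-to-permission mapping, for illustration only.
ROLE_PERMISSIONS = {
    "admin": {"project:read", "project:write", "agent:spawn"},
    "client": {"project:read", "issue:approve"},
}


@dataclass(frozen=True)
class Principal:
    """A user or agent, with the projects it is allowed to act within."""
    id: str
    role: str
    project_ids: frozenset


def is_allowed(principal: Principal, permission: str, project_id: str) -> bool:
    """Allow only if the project is in scope AND the role grants the permission."""
    in_scope = project_id in principal.project_ids
    has_permission = permission in ROLE_PERMISSIONS.get(principal.role, set())
    return in_scope and has_permission
```

Unknown roles fall through to an empty permission set, so the check fails closed.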
### 7.3 Data Protection
- **TLS 1.3:** All external communications
- **Encryption at Rest:** Database encryption
- **Secrets Management:** Environment-based, never in code
---
## 8. Deployment Architecture
### 8.1 Container Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│ Docker Compose │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Frontend │ │ Backend │ │ Workers │ │ Flower │ │
│ │ (Next.js)│ │ (FastAPI)│ │ (Celery) │ │(Monitor) │ │
│ │ :3000 │ │ :8000 │ │ │ │ :5555 │ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ LLM MCP │ │ Git MCP │ │ KB MCP │ │Issues MCP│ │
│ │ :9001 │ │ :9002 │ │ :9003 │ │ :9004 │ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ FS MCP │ │ Code MCP │ │CI/CD MCP │ │
│ │ :9005 │ │ :9006 │ │ :9007 │ │
│ └──────────┘ └──────────┘ └──────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Infrastructure │ │
│ │ ┌──────────┐ ┌──────────┐ │ │
│ │ │PostgreSQL│ │ Redis │ │ │
│ │ │ :5432 │ │ :6379 │ │ │
│ │ └──────────┘ └──────────┘ │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
```
### 8.2 Scaling Strategy
| Component | Scaling | Strategy |
|-----------|---------|----------|
| Frontend | Horizontal | Stateless, behind LB |
| Backend | Horizontal | Stateless, behind LB |
| Celery Workers | Horizontal | Queue-based routing |
| MCP Servers | Horizontal | Stateless singletons |
| PostgreSQL | Vertical + Read Replicas | Primary/replica |
| Redis | Cluster | Sentinel or Cluster mode |
---
## 9. Cross-Cutting Concerns
### 9.1 Logging
- **Format:** Structured JSON
- **Correlation:** Request IDs across services
- **Levels:** DEBUG, INFO, WARNING, ERROR, CRITICAL
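Structured JSON logging with a correlated request ID can be achieved with the standard `logging` and `contextvars` modules. The field names and the `request_id_var` variable are assumptions for this sketch.

```python
import contextvars
import json
import logging

# Holds the current request ID; "-" is used when no request is active.
request_id_var = contextvars.ContextVar("request_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps(
            {
                "level": record.levelname,
                "logger": record.name,
                "message": record.getMessage(),
                "request_id": request_id_var.get(),
            }
        )


def configure_logging(level: int = logging.INFO) -> None:
    """Attach the JSON formatter to the root logger."""
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    root = logging.getLogger()
    root.addHandler(handler)
    root.setLevel(level)
```

A request middleware would call `request_id_var.set(...)` at the start of each request; because context variables follow async tasks, every log line emitted while handling that request carries the same ID.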
### 9.2 Monitoring
- **Metrics:** Prometheus-compatible export
- **Traces:** OpenTelemetry (future)
- **Dashboards:** Grafana (optional)
### 9.3 Error Handling
- **Agent Errors:** Logged and published to clients via SSE
- **Task Failures:** Celery retry with backoff
- **Integration Errors:** Circuit breaker pattern
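Two of the strategies above, retry with backoff and the circuit breaker, can be sketched in a few lines. Neither snippet uses Celery's API; the thresholds, delays, and class names are illustrative assumptions.

```python
import time


def backoff_delays(base=1.0, factor=2.0, max_retries=5, cap=60.0):
    """Exponential delays in seconds for successive retries, capped at `cap`."""
    return [min(cap, base * factor ** n) for n in range(max_retries)]


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    """Stops calling a failing integration until a cool-down period has passed."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self._threshold = failure_threshold
        self._reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_after:
                raise CircuitOpenError("circuit open; call skipped")
            self._opened_at = None  # half-open: allow one trial call
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self._threshold:
                self._opened_at = self._clock()
            raise
        self._failures = 0  # any success closes the circuit fully
        return result
```

With the defaults, `backoff_delays()` yields 1, 2, 4, 8, and 16 seconds. A failed trial call in the half-open state reopens the circuit immediately, because the failure count is only cleared on success.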
---
## 10. Architecture Decisions
### Summary of ADRs
| ADR | Title | Status |
|-----|-------|--------|
| [ADR-001](../adrs/ADR-001-mcp-integration-architecture.md) | MCP Integration Architecture | Accepted |
| [ADR-002](../adrs/ADR-002-realtime-communication.md) | Real-time Communication | Accepted |
| [ADR-003](../adrs/ADR-003-background-task-architecture.md) | Background Task Architecture | Accepted |
| [ADR-004](../adrs/ADR-004-llm-provider-abstraction.md) | LLM Provider Abstraction | Accepted |
| [ADR-005](../adrs/ADR-005-tech-stack-selection.md) | Tech Stack Selection | Accepted |
| [ADR-006](../adrs/ADR-006-agent-orchestration.md) | Agent Orchestration | Accepted |
### Key Decisions Summary
1. **Unified Singleton MCP Servers** with project/agent scoping
2. **SSE for real-time events**, WebSocket only for chat
3. **Celery + Redis** for background tasks
4. **LiteLLM** for unified LLM abstraction with failover
5. **PragmaStack** as foundation with Syndarix extensions
6. **Type-Instance pattern** for agent orchestration
---
## Appendix A: Technology Stack Quick Reference
| Layer | Technology |
|-------|------------|
| Frontend | Next.js 16, React 19, TypeScript, Tailwind, shadcn/ui |
| Backend | FastAPI, Python 3.11+, SQLAlchemy 2.0, Pydantic 2.0 |
| Database | PostgreSQL 15+ with pgvector |
| Cache/Queue | Redis 7.0+ |
| Task Queue | Celery 5.3+ |
| MCP | FastMCP 2.0 |
| LLM | LiteLLM (Claude, GPT-4, Ollama) |
| Testing | pytest, Jest, Playwright |
| Container | Docker, Docker Compose |
---
## Appendix B: Port Reference
| Service | Port |
|---------|------|
| Frontend | 3000 |
| Backend | 8000 |
| PostgreSQL | 5432 |
| Redis | 6379 |
| Flower | 5555 |
| LLM MCP | 9001 |
| Git MCP | 9002 |
| KB MCP | 9003 |
| Issues MCP | 9004 |
| FS MCP | 9005 |
| Code MCP | 9006 |
| CI/CD MCP | 9007 |
---
*This document provides a comprehensive overview of the Syndarix architecture. For detailed decisions, see the individual ADRs.*