## Model Stack Updates (User's Actual Models)

Updated all documentation to reflect production models:

- Claude Opus 4.5 (primary reasoning)
- GPT 5.1 Codex max (code generation specialist)
- Gemini 3 Pro/Flash (multimodal, fast inference)
- Qwen3-235B (cost-effective, self-hostable)
- DeepSeek V3.2 (self-hosted, open weights)

### Files Updated:

- ADR-004: Full model groups, failover chains, cost tables
- ADR-007: Code example with correct model identifiers
- ADR-012: Cost tracking with new model prices
- ARCHITECTURE.md: Model groups, failover diagram
- IMPLEMENTATION_ROADMAP.md: External services list

## Architecture Diagram Updates

- Added LangGraph Runtime to orchestration layer
- Added technology labels (Type-Instance, transitions)

## Self-Hostability Table Expanded

Added entries for:

- LangGraph (MIT)
- transitions (MIT)
- DeepSeek V3.2 (MIT)
- Qwen3-235B (Apache 2.0)

## Metric Alignments

- Response time: Split into API (<200ms) and Agent (<10s/<60s)
- Cost per project: Adjusted to $100/sprint for Opus 4.5 pricing
- Added concurrent projects (10+) and agents (50+) metrics

## Infrastructure Updates

- Celery workers: 4-8 instances (was 2-4) across 4 queues
- MCP servers: Clarified Phase 2 + Phase 5 deployment
- Sync interval: Clarified 60s fallback + 15min reconciliation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Syndarix Architecture
Version: 1.0
Date: 2025-12-29
Status: Approved
Executive Summary
Syndarix is an autonomous AI-powered software consulting platform that orchestrates specialized AI agents to deliver complete software solutions. This document describes the chosen architecture, key decisions, and component interactions.
Core Principles
- Self-Hostable First: All components are fully self-hostable under permissive open-source licenses (MIT, BSD, Apache 2.0, PostgreSQL)
- Production-Ready: Use battle-tested technologies, not experimental frameworks
- Hybrid Architecture: Combine best-in-class tools rather than monolithic frameworks
- Auditability: Every agent action is logged and traceable
- Human-in-the-Loop: Configurable autonomy with approval checkpoints
Architecture Overview
┌─────────────────────────────────────────────────────────────────────────────────┐
│ SYNDARIX PLATFORM │
├─────────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────────────────────────────────────────────────────────────────┐ │
│ │ FRONTEND (Next.js 16) │ │
│ │ ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────┐ │ │
│ │ │ Dashboard │ │ Project │ │ Agent │ │ Approval │ │ │
│ │ │ Pages │ │ Views │ │ Monitor │ │ Queue │ │ │
│ │ └────────────┘ └────────────┘ └────────────┘ └────────────┘ │ │
│ └──────────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ REST + SSE + WebSocket │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────────────┐ │
│ │ BACKEND (FastAPI) │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────────────────────────────┐ │ │
│ │ │ ORCHESTRATION LAYER │ │ │
│ │ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌───────────┐ │ │ │
│ │ │ │ Agent │ │ Workflow │ │ Approval │ │ LangGraph │ │ │ │
│ │ │ │ Orchestrator│ │ Engine │ │ Service │ │ Runtime │ │ │ │
│ │ │ │(Type-Inst.) │ │(transitions)│ │ │ │ │ │ │ │
│ │ │ └─────────────┘ └─────────────┘ └─────────────┘ └───────────┘ │ │ │
│ │ └─────────────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────────────────────────────┐ │ │
│ │ │ INTEGRATION LAYER │ │ │
│ │ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │ │
│ │ │ │ LLM Gateway │ │ MCP Client │ │ Event │ │ │ │
│ │ │ │ (LiteLLM) │ │ Manager │ │ Bus │ │ │ │
│ │ │ └─────────────┘ └─────────────┘ └─────────────┘ │ │ │
│ │ └─────────────────────────────────────────────────────────────────────┘ │ │
│ └──────────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌───────────────────────────┼───────────────────────────┐ │
│ ▼ ▼ ▼ │
│ ┌────────────────┐ ┌────────────────┐ ┌────────────────┐ │
│ │ PostgreSQL │ │ Redis │ │ Celery Workers│ │
│ │ + pgvector │ │ (Cache/Queue) │ │ (Background) │ │
│ └────────────────┘ └────────────────┘ └────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────────────┐ │
│ │ MCP SERVERS │ │
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │
│ │ │ LLM │ │Knowledge │ │ Git │ │ Issues │ │ File │ │ │
│ │ │ Gateway │ │ Base │ │ MCP │ │ MCP │ │ System │ │ │
│ │ └──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │ │
│ └──────────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────────┘
Key Architecture Decisions
ADR Summary Matrix
| ADR | Decision | Key Technology |
|---|---|---|
| ADR-001 | MCP Integration | FastMCP 2.0, Unified Singletons |
| ADR-002 | Real-time Communication | SSE primary, WebSocket for chat |
| ADR-003 | Background Tasks | Celery + Redis |
| ADR-004 | LLM Provider | LiteLLM with failover |
| ADR-005 | Tech Stack | PragmaStack + extensions |
| ADR-006 | Agent Orchestration | Type-Instance pattern |
| ADR-007 | Framework Selection | Hybrid (LangGraph + transitions + Celery) |
| ADR-008 | Knowledge Base | pgvector for RAG |
| ADR-009 | Agent Communication | Structured messages + Redis Streams |
| ADR-010 | Workflows | transitions + PostgreSQL + Celery |
| ADR-011 | Issue Sync | Webhook-first + polling fallback |
| ADR-012 | Cost Tracking | LiteLLM callbacks + Redis budgets |
| ADR-013 | Audit Logging | Structlog + hash chaining |
| ADR-014 | Client Approval | Checkpoint-based + notifications |
Component Deep Dives
1. Agent Orchestration
Pattern: Type-Instance
- Agent Types: Templates defining model, expertise, personality, capabilities
- Agent Instances: Runtime instances spawned from types, assigned to projects
- Orchestrator: Manages lifecycle, routing, and resource tracking
Agent Type (Template) Agent Instance (Runtime)
┌─────────────────────┐ ┌─────────────────────┐
│ name: "Engineer" │───spawn───▶│ id: "eng-001" │
│ model: "sonnet" │ │ name: "Dave" │
│ expertise: [py, js] │ │ project: "proj-123" │
│ capabilities: [...] │ │ context: {...} │
└─────────────────────┘ │ status: ACTIVE │
└─────────────────────┘
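The Type-Instance split in the diagram can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the class names, the `spawn` signature, the `TERMINATED` status, and the ID scheme (first three letters of the type name plus a counter) are all assumptions chosen to reproduce the `eng-001` example above.

```python
from dataclasses import dataclass, field
from enum import Enum


class AgentStatus(Enum):
    ACTIVE = "active"
    TERMINATED = "terminated"  # hypothetical; only ACTIVE appears in the diagram


@dataclass(frozen=True)
class AgentType:
    """Template shared by every instance of a role."""
    name: str
    model: str
    expertise: tuple = ()
    capabilities: tuple = ()


@dataclass
class AgentInstance:
    """Runtime agent spawned from a type and bound to one project."""
    id: str
    name: str
    agent_type: AgentType
    project: str
    context: dict = field(default_factory=dict)
    status: AgentStatus = AgentStatus.ACTIVE


class Orchestrator:
    """Tracks live instances; the ID format is an assumption for illustration."""

    def __init__(self) -> None:
        self._next_seq: dict = {}
        self.instances: dict = {}

    def spawn(self, agent_type: AgentType, name: str, project: str) -> AgentInstance:
        seq = self._next_seq.get(agent_type.name, 1)
        self._next_seq[agent_type.name] = seq + 1
        instance_id = f"{agent_type.name[:3].lower()}-{seq:03d}"
        instance = AgentInstance(instance_id, name, agent_type, project)
        self.instances[instance_id] = instance
        return instance


engineer = AgentType(name="Engineer", model="sonnet", expertise=("py", "js"))
orchestrator = Orchestrator()
dave = orchestrator.spawn(engineer, name="Dave", project="proj-123")
```

Keeping the type immutable (`frozen=True`) means a template cannot be changed by accident through one of its running instances.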
2. LLM Gateway (LiteLLM)
Failover Chain:
Claude Opus 4.5 (Primary)
│
▼ (on failure/rate limit)
GPT 5.1 Codex max (Code specialist)
│
▼ (on failure/rate limit)
Gemini 3 Pro (Multimodal)
│
▼ (on failure)
Qwen3-235B / DeepSeek V3.2 (Self-hosted)
Model Groups:
| Group | Use Case | Primary Model | Fallback |
|---|---|---|---|
| high-reasoning | Architecture, complex analysis | Claude Opus 4.5 | GPT 5.1 Codex max |
| code-generation | Code writing, refactoring | GPT 5.1 Codex max | Claude Opus 4.5 |
| fast-response | Quick tasks, status updates | Gemini 3 Flash | Qwen3-235B |
| cost-optimized | High-volume, non-critical | Qwen3-235B | DeepSeek V3.2 |
| self-hosted | Privacy-sensitive, air-gapped | DeepSeek V3.2 | Qwen3-235B |
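The group-to-fallback mapping in the table can be expressed as an ordered list per group that is tried in sequence. The sketch below is plain Python and does not use LiteLLM's own routing configuration; `ModelUnavailable`, `complete_with_failover`, and the injected `call_model` callable are hypothetical names, and the display names from the table stand in for real provider model identifiers.

```python
from typing import Callable, Dict, List, Tuple

# Group -> ordered candidates, copied from the Model Groups table.
MODEL_GROUPS: Dict[str, List[str]] = {
    "high-reasoning": ["Claude Opus 4.5", "GPT 5.1 Codex max"],
    "code-generation": ["GPT 5.1 Codex max", "Claude Opus 4.5"],
    "fast-response": ["Gemini 3 Flash", "Qwen3-235B"],
    "cost-optimized": ["Qwen3-235B", "DeepSeek V3.2"],
    "self-hosted": ["DeepSeek V3.2", "Qwen3-235B"],
}


class ModelUnavailable(Exception):
    """Hypothetical error for a failed or rate-limited model call."""


def complete_with_failover(
    group: str,
    prompt: str,
    call_model: Callable[[str, str], str],
) -> Tuple[str, str]:
    """Try each model in the group in order; return (model_used, reply)."""
    errors = []
    for model in MODEL_GROUPS[group]:
        try:
            return model, call_model(model, prompt)
        except ModelUnavailable as exc:
            errors.append(f"{model}: {exc}")
    raise ModelUnavailable("all models failed: " + "; ".join(errors))


def flaky_backend(model: str, prompt: str) -> str:
    # Stand-in for a real gateway call: the primary model is rate limited.
    if model == "Claude Opus 4.5":
        raise ModelUnavailable("rate limited")
    return f"{model} answered"


used_model, reply = complete_with_failover(
    "high-reasoning", "Design the schema", flaky_backend
)
```

Returning the model that actually served the request lets cost tracking and audit logging record the fallback rather than the model that was requested.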
3. Knowledge Base (RAG)
Stack: pgvector + LiteLLM embeddings
Chunking Strategy:
| Content | Strategy | Model |
|---|---|---|
| Code | AST-based (function/class) | voyage-code-3 |
| Docs | Heading-based | text-embedding-3-small |
| Conversations | Turn-based | text-embedding-3-small |
Search: Hybrid ranking that weights vector similarity at 70% and keyword match at 30%
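One way to read the 70/30 split is as a weighted sum of two relevance scores. The sketch below assumes both scores are already normalized to the range 0 to 1; the function names and the sample scores are illustrative and not taken from the project.

```python
from typing import Dict, List, Tuple


def hybrid_score(
    vector_score: float, keyword_score: float, vector_weight: float = 0.7
) -> float:
    """Blend two normalized scores; the keyword weight is the remainder."""
    return vector_weight * vector_score + (1.0 - vector_weight) * keyword_score


def rank(candidates: Dict[str, Tuple[float, float]]) -> List[str]:
    """Order chunk IDs by blended score, best first."""
    return sorted(
        candidates,
        key=lambda chunk_id: hybrid_score(*candidates[chunk_id]),
        reverse=True,
    )


# chunk_id -> (vector similarity, keyword match), made-up values
results = rank({
    "chunk-a": (0.9, 0.1),  # strong semantic match, weak keyword match
    "chunk-b": (0.5, 1.0),  # moderate semantic match, exact keyword match
    "chunk-c": (0.2, 0.2),
})
```

With these weights a strong semantic match narrowly outranks an exact keyword hit with moderate similarity, which is the behavior a vector-heavy blend is meant to produce.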
4. Workflow Engine
Stack: transitions library + PostgreSQL + Celery
Core Workflows:
- Sprint Workflow: planning → active → review → done
- Story Workflow: analysis → design → implementation → review → testing → done
- PR Workflow: submitted → reviewing → changes_requested → approved → merged
Durability: Event sourcing with state persistence to PostgreSQL
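The durability idea, where state is derived from a persisted event log, can be illustrated with the Story workflow. This is a dependency-free sketch: it does not use the transitions library, the in-memory `events` list stands in for a PostgreSQL table, and the class and field names are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Linear Story workflow as listed under Core Workflows.
STORY_STATES = ["analysis", "design", "implementation", "review", "testing", "done"]


@dataclass
class StoryWorkflow:
    story_id: str
    events: List[Dict[str, str]] = field(default_factory=list)

    @property
    def state(self) -> str:
        # Current state is whatever the last recorded transition produced.
        return self.events[-1]["to"] if self.events else STORY_STATES[0]

    def advance(self) -> str:
        """Move to the next state and append the transition to the log."""
        current = self.state
        index = STORY_STATES.index(current)
        if index == len(STORY_STATES) - 1:
            raise ValueError(f"story {self.story_id} is already done")
        target = STORY_STATES[index + 1]
        self.events.append(
            {"story_id": self.story_id, "from": current, "to": target}
        )
        return target


workflow = StoryWorkflow("story-1")
workflow.advance()  # analysis -> design
workflow.advance()  # design -> implementation

# After a restart, the same state is recovered by reloading the event log.
restored = StoryWorkflow("story-1", events=list(workflow.events))
```

Because the log is append-only, the same records can also feed the audit trail without a separate write path.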
5. Real-time Communication
SSE (90% of use cases):
- Agent activity streams
- Project progress updates
- Approval notifications
- Issue change notifications
WebSocket (10% - bidirectional):
- Interactive chat with agents
- Real-time debugging
Event Bus: Redis Pub/Sub for cross-instance distribution
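For the SSE side, each event that reaches the browser is a small text frame in the `text/event-stream` format. The helper below is a sketch of that framing only; the function name, event name, and payload fields are illustrative, and wiring it to a FastAPI streaming response or to Redis Pub/Sub is left out.

```python
import json
from typing import Any, Dict, Optional


def format_sse(
    event: str, data: Dict[str, Any], event_id: Optional[str] = None
) -> str:
    """Serialize one event as a Server-Sent Events frame.

    A frame is a set of "field: value" lines terminated by a blank line.
    json.dumps without indent emits no newlines, so one data line suffices.
    """
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    lines.append(f"event: {event}")
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"


frame = format_sse(
    "agent_activity",
    {"agent": "eng-001", "action": "commit"},
    event_id="42",
)
```

Including an `id` line lets a reconnecting client send `Last-Event-ID`, so the server can resume the stream instead of replaying it from the start.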
6. Issue Synchronization
Architecture: Webhook-first + polling fallback
Supported Providers:
- Gitea (primary)
- GitHub
- GitLab
Conflict Resolution: Last-Writer-Wins with version vectors
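The document does not spell out how version vectors and Last-Writer-Wins combine. One plausible reading, sketched below, is that the version vector decides whenever one side has strictly seen more updates, and the timestamp breaks ties only for truly concurrent edits. All names and the sample data are assumptions.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class IssueVersion:
    clock: Dict[str, int]  # replica name -> update counter
    updated_at: float      # wall-clock timestamp of the last write
    payload: dict


def dominates(a: Dict[str, int], b: Dict[str, int]) -> bool:
    """True if a has seen everything b has, plus at least one more update."""
    keys = a.keys() | b.keys()
    at_least = all(a.get(k, 0) >= b.get(k, 0) for k in keys)
    strictly = any(a.get(k, 0) > b.get(k, 0) for k in keys)
    return at_least and strictly


def resolve(local: IssueVersion, remote: IssueVersion) -> IssueVersion:
    if dominates(local.clock, remote.clock):
        return local
    if dominates(remote.clock, local.clock):
        return remote
    # Concurrent or identical histories: fall back to Last-Writer-Wins.
    return local if local.updated_at >= remote.updated_at else remote


local = IssueVersion({"syndarix": 2, "gitea": 1}, 100.0, {"title": "Local edit"})
stale = IssueVersion({"syndarix": 1, "gitea": 1}, 200.0, {"title": "Stale remote"})
concurrent = IssueVersion({"syndarix": 1, "gitea": 2}, 200.0, {"title": "Remote edit"})

causal_winner = resolve(local, stale)
lww_winner = resolve(local, concurrent)
```

In the first case the local copy wins despite its older timestamp, because the remote copy never saw the second local update; only the second case is decided by the clock.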
7. Cost Tracking
Real-time Pipeline:
LLM Request → LiteLLM Callback → Redis INCR → Budget Check
│
Async Queue → PostgreSQL → SSE Dashboard Update
Budget Enforcement:
- Soft limits: Alerts + model downgrade
- Hard limits: Block requests
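The soft and hard limits map naturally onto a three-way decision made after each spend update. In this sketch an in-memory dictionary stands in for the Redis counter, spend is tracked in integer cents so that it mirrors an integer increment, and the class names and limit values are hypothetical.

```python
from enum import Enum
from typing import Dict


class BudgetDecision(Enum):
    ALLOW = "allow"
    DOWNGRADE = "downgrade"  # soft limit reached: alert and use a cheaper model
    BLOCK = "block"          # hard limit reached: reject the request


class BudgetTracker:
    def __init__(self, soft_limit_cents: int, hard_limit_cents: int) -> None:
        if soft_limit_cents > hard_limit_cents:
            raise ValueError("soft limit must not exceed hard limit")
        self.soft_limit_cents = soft_limit_cents
        self.hard_limit_cents = hard_limit_cents
        self._spent: Dict[str, int] = {}  # stand-in for per-project Redis keys

    def record(self, project_id: str, cost_cents: int) -> int:
        """Add the cost of one LLM call and return the new running total."""
        total = self._spent.get(project_id, 0) + cost_cents
        self._spent[project_id] = total
        return total

    def check(self, project_id: str) -> BudgetDecision:
        spent = self._spent.get(project_id, 0)
        if spent >= self.hard_limit_cents:
            return BudgetDecision.BLOCK
        if spent >= self.soft_limit_cents:
            return BudgetDecision.DOWNGRADE
        return BudgetDecision.ALLOW


tracker = BudgetTracker(soft_limit_cents=8_000, hard_limit_cents=10_000)
tracker.record("proj-123", 5_000)
before_soft = tracker.check("proj-123")
tracker.record("proj-123", 3_500)
after_soft = tracker.check("proj-123")
tracker.record("proj-123", 2_000)
after_hard = tracker.check("proj-123")
```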
8. Audit Logging
Immutability: SHA-256 hash chaining
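Hash chaining makes the log tamper-evident by including each record's predecessor hash in its own hash. The sketch below uses only the standard library; the record layout, the all-zero genesis value, and the function names are assumptions rather than the project's schema.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS_HASH = "0" * 64  # assumed placeholder for the first record's predecessor


def entry_hash(prev_hash: str, event: Dict[str, Any]) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the event."""
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(log: List[Dict[str, Any]], event: Dict[str, Any]) -> Dict[str, Any]:
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    record = {
        "event": event,
        "prev_hash": prev_hash,
        "hash": entry_hash(prev_hash, event),
    }
    log.append(record)
    return record


def verify_chain(log: List[Dict[str, Any]]) -> bool:
    """Recompute every hash; any edited or reordered record breaks the chain."""
    prev_hash = GENESIS_HASH
    for record in log:
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != entry_hash(prev_hash, record["event"]):
            return False
        prev_hash = record["hash"]
    return True


audit_log: List[Dict[str, Any]] = []
append_event(audit_log, {"agent": "eng-001", "action": "create_branch"})
append_event(audit_log, {"agent": "eng-001", "action": "open_pr"})
intact = verify_chain(audit_log)

audit_log[0]["event"]["action"] = "delete_branch"  # simulate tampering
tampered_ok = verify_chain(audit_log)
```

Serializing with `sort_keys=True` keeps the hash stable regardless of the key order in which an event dictionary was built.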
Storage Tiers:
| Tier | Storage | Retention |
|---|---|---|
| Hot | PostgreSQL | 0-90 days |
| Cold | S3/MinIO | 90+ days |
9. Client Approval Flow
Autonomy Levels:
| Level | Description |
|---|---|
| FULL_CONTROL | Approve every action |
| MILESTONE | Approve sprint boundaries |
| AUTONOMOUS | Only critical decisions |
Notifications: SSE + Email + Mobile Push
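The autonomy table can be read as a gate evaluated before each agent action. The three level names below come from the table; the `ActionKind` categories, and the choice that critical decisions also need approval at the MILESTONE level, are assumptions made for this sketch.

```python
from enum import Enum


class AutonomyLevel(Enum):
    FULL_CONTROL = "full_control"  # approve every action
    MILESTONE = "milestone"        # approve sprint boundaries
    AUTONOMOUS = "autonomous"      # only critical decisions


class ActionKind(Enum):
    """Hypothetical classification of agent actions."""
    ROUTINE = "routine"
    SPRINT_BOUNDARY = "sprint_boundary"
    CRITICAL = "critical"


def requires_approval(level: AutonomyLevel, action: ActionKind) -> bool:
    if level is AutonomyLevel.FULL_CONTROL:
        return True
    if level is AutonomyLevel.MILESTONE:
        # Assumption: critical decisions are gated here as well.
        return action in (ActionKind.SPRINT_BOUNDARY, ActionKind.CRITICAL)
    return action is ActionKind.CRITICAL


routine_under_full_control = requires_approval(
    AutonomyLevel.FULL_CONTROL, ActionKind.ROUTINE
)
routine_under_milestone = requires_approval(
    AutonomyLevel.MILESTONE, ActionKind.ROUTINE
)
critical_under_autonomous = requires_approval(
    AutonomyLevel.AUTONOMOUS, ActionKind.CRITICAL
)
```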
Technology Stack
Core Technologies
| Layer | Technology | Version | License |
|---|---|---|---|
| Backend | FastAPI | 0.115+ | MIT |
| Frontend | Next.js | 16 | MIT |
| Database | PostgreSQL + pgvector | 15+ | PostgreSQL |
| Cache/Queue | Redis | 7.0+ | BSD-3 (through 7.2) |
| Task Queue | Celery | 5.3+ | BSD-3 |
| LLM Gateway | LiteLLM | Latest | MIT |
| MCP Framework | FastMCP | 2.0+ | Apache 2.0 |
Self-Hostability Guarantee
All components are fully self-hostable with no mandatory subscriptions:
| Component | License | Self-Hosted | Managed Alternative (Optional) |
|---|---|---|---|
| PostgreSQL | PostgreSQL | Yes | RDS, Neon, Supabase |
| Redis | BSD-3 (through 7.2) | Yes | Redis Cloud |
| LiteLLM | MIT | Yes | LiteLLM Enterprise |
| Celery | BSD-3 | Yes | - |
| FastMCP | Apache 2.0 | Yes | - |
| LangGraph | MIT | Yes | LangSmith (observability only) |
| transitions | MIT | Yes | - |
| DeepSeek V3.2 | MIT | Yes | API available |
| Qwen3-235B | Apache 2.0 | Yes | Alibaba Cloud |
Data Flow Diagrams
Agent Task Execution
1. Client creates story in Syndarix
│
▼
2. Story workflow transitions to "implementation"
│
▼
3. Agent Orchestrator spawns Engineer instance
│
▼
4. Engineer queries Knowledge Base (RAG)
│
▼
5. Engineer calls LLM Gateway for code generation
│
▼
6. Engineer calls Git MCP to create branch & commit
│
▼
7. Engineer creates PR via Git MCP
│
▼
8. Workflow transitions to "review"
│
▼
9. If autonomy_level != AUTONOMOUS:
└── Approval request created
└── Client notified via SSE + email
│
▼
10. Client approves → PR merged → Workflow to "testing"
Real-time Event Flow
Agent Action
│
▼
Event Bus (Redis Pub/Sub)
│
├──▶ SSE Endpoint ──▶ Frontend Dashboard
│
├──▶ Audit Logger ──▶ PostgreSQL
│
└──▶ Other Backend Instances (horizontal scaling)
Security Architecture
Authentication Flow
- Users: JWT dual-token (access + refresh) via PragmaStack
- Agents: Service tokens for MCP communication
- MCP Servers: Internal network only, validated service tokens
Multi-Tenancy
- Project Isolation: All queries scoped by project_id
- Row-Level Security: PostgreSQL RLS for knowledge base
- Agent Scoping: Every MCP tool requires project_id + agent_id
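The agent-scoping rule can be enforced at the tool boundary so that no tool body runs without both identifiers. The decorator below is a sketch in plain Python and does not use FastMCP's API; `scoped_tool`, `ScopeError`, and the example `list_files` tool are hypothetical.

```python
from functools import wraps
from typing import Any, Callable


class ScopeError(Exception):
    """Raised when a tool call lacks a usable project or agent scope."""


def scoped_tool(func: Callable[..., Any]) -> Callable[..., Any]:
    """Require non-empty project_id and agent_id keyword arguments."""

    @wraps(func)
    def wrapper(*, project_id: str, agent_id: str, **kwargs: Any) -> Any:
        if not project_id or not agent_id:
            raise ScopeError("project_id and agent_id must be non-empty")
        return func(project_id=project_id, agent_id=agent_id, **kwargs)

    return wrapper


@scoped_tool
def list_files(*, project_id: str, agent_id: str, path: str = ".") -> str:
    # Stand-in for a real File System tool: it only echoes the scoped request.
    return f"{project_id}/{agent_id}:{path}"


listing = list_files(project_id="proj-123", agent_id="eng-001", path="src")
```

Making both identifiers keyword-only means a call that omits one fails immediately with a `TypeError`, before any project data is touched.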
Audit Trail
- Hash Chaining: Tamper-evident event log
- Complete Coverage: All agent actions, LLM calls, MCP tool invocations
Scalability Considerations
Horizontal Scaling
| Component | Scaling Strategy |
|---|---|
| FastAPI | Multiple instances behind load balancer |
| Celery Workers | Add workers per queue as needed |
| PostgreSQL | Read replicas, connection pooling |
| Redis | Cluster mode for high availability |
Expected Scale
| Metric | Target |
|---|---|
| Concurrent Projects | 50+ |
| Concurrent Agent Instances | 200+ |
| Background Jobs/minute | 500+ |
| SSE Connections | 200+ |
Deployment Architecture
Local Development
docker-compose up
├── PostgreSQL (+ pgvector)
├── Redis
├── FastAPI Backend
├── Next.js Frontend
├── Celery Workers (agent, git, sync queues)
├── Celery Beat (scheduler)
├── Flower (monitoring)
└── MCP Servers (7 containers)
Production
┌─────────────────────────────────────────────────────────────────┐
│ Load Balancer │
└─────────────────────────────┬───────────────────────────────────┘
│
┌────────────────────┼────────────────────┐
▼ ▼ ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ API Instance 1 │ │ API Instance 2 │ │ API Instance N │
└─────────────────┘ └─────────────────┘ └─────────────────┘
│ │ │
└────────────────────┼────────────────────┘
│
┌────────────────────┼────────────────────┐
▼ ▼ ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ PostgreSQL │ │ Redis Cluster │ │ Celery Workers │
│ (Primary + │ │ │ │ (Auto-scaled) │
│ Replicas) │ │ │ │ │
└─────────────────┘ └─────────────────┘ └─────────────────┘
Related Documents
- Implementation Roadmap
- Architecture Deep Analysis
- ADRs - All architecture decision records
- Spikes - Research documents
Appendix: Full ADR List
- ADR-001: MCP Integration Architecture
- ADR-002: Real-time Communication
- ADR-003: Background Task Architecture
- ADR-004: LLM Provider Abstraction
- ADR-005: Technology Stack Selection
- ADR-006: Agent Orchestration
- ADR-007: Agentic Framework Selection
- ADR-008: Knowledge Base and RAG
- ADR-009: Agent Communication Protocol
- ADR-010: Workflow State Machine
- ADR-011: Issue Synchronization
- ADR-012: Cost Tracking
- ADR-013: Audit Logging
- ADR-014: Client Approval Flow
This document serves as the authoritative architecture reference for Syndarix.