# Syndarix Architecture

**Version:** 1.0
**Date:** 2025-12-29
**Status:** Approved

---

## Executive Summary

Syndarix is an autonomous AI-powered software consulting platform that orchestrates specialized AI agents to deliver complete software solutions. This document describes the chosen architecture, key decisions, and component interactions.

### Core Principles

1. **Self-Hostable First:** All components are fully self-hostable with permissive licenses (MIT, BSD, Apache 2.0, PostgreSQL)
2. **Production-Ready:** Use battle-tested technologies, not experimental frameworks
3. **Hybrid Architecture:** Combine best-in-class tools rather than adopting one monolithic framework
4. **Auditability:** Every agent action is logged and traceable
5. **Human-in-the-Loop:** Configurable autonomy with approval checkpoints

---

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────────────────────────┐
│                                SYNDARIX PLATFORM                                │
├─────────────────────────────────────────────────────────────────────────────────┤
│                                                                                 │
│  ┌──────────────────────────────────────────────────────────────────────────┐   │
│  │                          FRONTEND (Next.js 16)                           │   │
│  │      ┌────────────┐  ┌────────────┐  ┌────────────┐  ┌────────────┐      │   │
│  │      │ Dashboard  │  │ Project    │  │ Agent      │  │ Approval   │      │   │
│  │      │   Pages    │  │   Views    │  │  Monitor   │  │   Queue    │      │   │
│  │      └────────────┘  └────────────┘  └────────────┘  └────────────┘      │   │
│  └──────────────────────────────────────────────────────────────────────────┘   │
│                                        │                                        │
│                             REST + SSE + WebSocket                              │
│                                        ▼                                        │
│  ┌──────────────────────────────────────────────────────────────────────────┐   │
│  │                            BACKEND (FastAPI)                             │   │
│  │                                                                          │   │
│  │  ┌────────────────────────────────────────────────────────────────────┐  │   │
│  │  │                        ORCHESTRATION LAYER                         │  │   │
│  │  │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌───────────┐  │  │   │
│  │  │  │ Agent       │  │ Workflow    │  │ Approval    │  │ LangGraph │  │  │   │
│  │  │  │ Orchestrator│  │   Engine    │  │   Service   │  │  Runtime  │  │  │   │
│  │  │  │(Type-Inst.) │  │(transitions)│  │             │  │           │  │  │   │
│  │  │  └─────────────┘  └─────────────┘  └─────────────┘  └───────────┘  │  │   │
│  │  └────────────────────────────────────────────────────────────────────┘  │   │
│  │                                                                          │   │
│  │  ┌────────────────────────────────────────────────────────────────────┐  │   │
│  │  │                         INTEGRATION LAYER                          │  │   │
│  │  │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐                 │  │   │
│  │  │  │ LLM Gateway │  │ MCP Client  │  │   Event     │                 │  │   │
│  │  │  │  (LiteLLM)  │  │   Manager   │  │    Bus      │                 │  │   │
│  │  │  └─────────────┘  └─────────────┘  └─────────────┘                 │  │   │
│  │  └────────────────────────────────────────────────────────────────────┘  │   │
│  └──────────────────────────────────────────────────────────────────────────┘   │
│                                        │                                        │
│            ┌───────────────────────────┼───────────────────────────┐            │
│            ▼                           ▼                           ▼            │
│    ┌────────────────┐          ┌────────────────┐          ┌────────────────┐   │
│    │   PostgreSQL   │          │     Redis      │          │ Celery Workers │   │
│    │   + pgvector   │          │ (Cache/Queue)  │          │  (Background)  │   │
│    └────────────────┘          └────────────────┘          └────────────────┘   │
│                                        │                                        │
│                                        ▼                                        │
│  ┌──────────────────────────────────────────────────────────────────────────┐   │
│  │                               MCP SERVERS                                │   │
│  │   ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐   │   │
│  │   │   LLM    │  │Knowledge │  │   Git    │  │  Issues  │  │   File   │   │   │
│  │   │ Gateway  │  │   Base   │  │   MCP    │  │   MCP    │  │  System  │   │   │
│  │   └──────────┘  └──────────┘  └──────────┘  └──────────┘  └──────────┘   │   │
│  └──────────────────────────────────────────────────────────────────────────┘   │
│                                                                                 │
└─────────────────────────────────────────────────────────────────────────────────┘
```

---

## Key Architecture Decisions

### ADR Summary Matrix

| ADR | Decision | Key Technology |
|-----|----------|----------------|
| ADR-001 | MCP Integration | FastMCP 2.0, Unified Singletons |
| ADR-002 | Real-time Communication | SSE primary, WebSocket for chat |
| ADR-003 | Background Tasks | Celery + Redis |
| ADR-004 | LLM Provider | LiteLLM with failover |
| ADR-005 | Tech Stack | PragmaStack + extensions |
| ADR-006 | Agent Orchestration | Type-Instance pattern |
| ADR-007 | Framework Selection | Hybrid (LangGraph + transitions + Celery) |
| ADR-008 | Knowledge Base | pgvector for RAG |
| ADR-009 | Agent Communication | Structured messages + Redis Streams |
| ADR-010 | Workflows | transitions + PostgreSQL + Celery |
| ADR-011 | Issue Sync | Webhook-first + polling fallback |
| ADR-012 | Cost Tracking | LiteLLM callbacks + Redis budgets |
| ADR-013 | Audit Logging | Structlog + hash chaining |
| ADR-014 | Client Approval | Checkpoint-based + notifications |

---

## Component Deep Dives

### 1. Agent Orchestration

**Pattern:** Type-Instance

- **Agent Types:** Templates defining model, expertise, personality, capabilities
- **Agent Instances:** Runtime instances spawned from types, assigned to projects
- **Orchestrator:** Manages lifecycle, routing, and resource tracking

```
Agent Type (Template)              Agent Instance (Runtime)
┌─────────────────────┐            ┌─────────────────────┐
│ name: "Engineer"    │───spawn───▶│ id: "eng-001"       │
│ model: "sonnet"     │            │ name: "Dave"        │
│ expertise: [py, js] │            │ project: "proj-123" │
│ capabilities: [...] │            │ context: {...}      │
└─────────────────────┘            │ status: ACTIVE      │
                                   └─────────────────────┘
```

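The Type-Instance split above can be sketched with plain dataclasses. This is a minimal illustration rather than the Syndarix implementation: the class names, the `spawn` and `terminate` methods, the `TERMINATED` status, and the id scheme are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from itertools import count
from typing import Dict, Tuple


class AgentStatus(Enum):
    ACTIVE = "active"
    TERMINATED = "terminated"  # hypothetical second state for the sketch


@dataclass(frozen=True)
class AgentType:
    """Immutable template: what kind of agent this is."""
    name: str
    model: str
    expertise: Tuple[str, ...] = ()
    capabilities: Tuple[str, ...] = ()


@dataclass
class AgentInstance:
    """Mutable runtime agent, bound to exactly one project."""
    id: str
    name: str
    agent_type: AgentType
    project: str
    context: dict = field(default_factory=dict)
    status: AgentStatus = AgentStatus.ACTIVE


class Orchestrator:
    """Tracks the lifecycle of instances spawned from types."""

    def __init__(self) -> None:
        self._instances: Dict[str, AgentInstance] = {}
        self._sequence = count(1)

    def spawn(self, agent_type: AgentType, name: str, project: str) -> AgentInstance:
        # Assumed id scheme: first three letters of the type name plus a counter.
        instance_id = f"{agent_type.name[:3].lower()}-{next(self._sequence):03d}"
        instance = AgentInstance(instance_id, name, agent_type, project)
        self._instances[instance_id] = instance
        return instance

    def terminate(self, instance_id: str) -> None:
        self._instances[instance_id].status = AgentStatus.TERMINATED


engineer = AgentType("Engineer", "sonnet", ("py", "js"))
dave = Orchestrator().spawn(engineer, name="Dave", project="proj-123")
```

Freezing the type while leaving the instance mutable mirrors the diagram: many instances can share one template, and only the instance carries project context and status.
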
### 2. LLM Gateway (LiteLLM)

**Failover Chain:**
```
Claude Opus 4.5 (Primary)
        │
        ▼ (on failure/rate limit)
GPT 5.1 Codex max (Code specialist)
        │
        ▼ (on failure/rate limit)
Gemini 3 Pro (Multimodal)
        │
        ▼ (on failure)
Qwen3-235B / DeepSeek V3.2 (Self-hosted)
```

**Model Groups:**
| Group | Use Case | Primary Model | Fallback |
|-------|----------|---------------|----------|
| high-reasoning | Architecture, complex analysis | Claude Opus 4.5 | GPT 5.1 Codex max |
| code-generation | Code writing, refactoring | GPT 5.1 Codex max | Claude Opus 4.5 |
| fast-response | Quick tasks, status updates | Gemini 3 Flash | Qwen3-235B |
| cost-optimized | High-volume, non-critical | Qwen3-235B | DeepSeek V3.2 |
| self-hosted | Privacy-sensitive, air-gapped | DeepSeek V3.2 | Qwen3-235B |

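The failover behaviour can be illustrated without any gateway library. The sketch below is plain Python, not LiteLLM's API: the model identifier strings, `complete_with_failover`, and the `call_model` callback are assumptions chosen for the example.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical identifiers, ordered like the failover chain above.
FAILOVER_CHAIN: List[str] = [
    "claude-opus-4.5",
    "gpt-5.1-codex-max",
    "gemini-3-pro",
    "qwen3-235b",
    "deepseek-v3.2",
]


class AllModelsFailed(RuntimeError):
    """Raised when every model in the chain has failed."""


def complete_with_failover(
    prompt: str,
    call_model: Callable[[str, str], str],
    chain: Sequence[str] = FAILOVER_CHAIN,
) -> Tuple[str, str]:
    """Try each model in order and return (model, reply) from the first success."""
    errors = []
    for model in chain:
        try:
            return model, call_model(model, prompt)
        except Exception as exc:  # a real gateway would catch narrower error types
            errors.append(f"{model}: {exc}")
    raise AllModelsFailed("; ".join(errors))
```

Returning the model name alongside the reply lets the caller record which provider actually served the request, which matters for cost tracking.
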
### 3. Knowledge Base (RAG)

**Stack:** pgvector + LiteLLM embeddings

**Chunking Strategy:**
| Content | Strategy | Model |
|---------|----------|-------|
| Code | AST-based (function/class) | voyage-code-3 |
| Docs | Heading-based | text-embedding-3-small |
| Conversations | Turn-based | text-embedding-3-small |

**Search:** Hybrid (70% vector + 30% keyword)

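One way to read the 70/30 hybrid split is as a weighted sum of two per-chunk scores. The sketch below assumes both scores are already normalized to the range 0 to 1; the function names and the candidate layout are illustrative and not taken from the codebase.

```python
from typing import Dict, List, Tuple


def hybrid_score(vector_score: float, keyword_score: float,
                 vector_weight: float = 0.7) -> float:
    """Blend a vector-similarity score with a keyword score (both assumed in [0, 1])."""
    return vector_weight * vector_score + (1.0 - vector_weight) * keyword_score


def rank(candidates: Dict[str, Tuple[float, float]]) -> List[str]:
    """Order chunk ids by blended score, best first.

    candidates maps chunk id -> (vector_score, keyword_score).
    """
    return sorted(candidates, key=lambda cid: hybrid_score(*candidates[cid]),
                  reverse=True)
```

For example, a chunk scoring 0.9 on vector similarity and 0.1 on keywords blends to about 0.66, which outranks a chunk scoring 0.4 and 1.0 (about 0.58).
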
### 4. Workflow Engine

**Stack:** transitions library + PostgreSQL + Celery

**Core Workflows:**
- **Sprint Workflow:** planning → active → review → done
- **Story Workflow:** analysis → design → implementation → review → testing → done
- **PR Workflow:** submitted → reviewing → changes_requested → approved → merged

**Durability:** Event sourcing with state persistence to PostgreSQL

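The Story workflow and its event-sourced durability can be sketched in plain Python. This deliberately does not use the `transitions` library, so that the example stays dependency-free; the class, the exception, and the in-memory event list (standing in for a PostgreSQL table) are assumptions.

```python
from typing import Dict, List

STORY_STATES = ["analysis", "design", "implementation", "review", "testing", "done"]


class InvalidTransition(Exception):
    pass


class StoryWorkflow:
    """Linear state machine that records every transition as an event."""

    def __init__(self) -> None:
        self.state = STORY_STATES[0]
        self.events: List[Dict[str, str]] = []  # append-only; stands in for a DB table

    def advance(self) -> str:
        index = STORY_STATES.index(self.state)
        if index == len(STORY_STATES) - 1:
            raise InvalidTransition(f"{self.state!r} is a terminal state")
        target = STORY_STATES[index + 1]
        self.events.append({"from": self.state, "to": target})
        self.state = target
        return target

    @classmethod
    def replay(cls, events: List[Dict[str, str]]) -> "StoryWorkflow":
        """Rebuild a workflow from stored events, as after a worker restart."""
        workflow = cls()
        for event in events:
            if event["from"] != workflow.state:
                raise InvalidTransition(f"event {event} does not follow {workflow.state!r}")
            workflow.advance()
        return workflow
```

Because the current state is derived entirely from the event list, a crashed worker can recover by replaying the stored events instead of trusting in-memory state.
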
### 5. Real-time Communication

**SSE (90% of use cases):**
- Agent activity streams
- Project progress updates
- Approval notifications
- Issue change notifications

**WebSocket (10% - bidirectional):**
- Interactive chat with agents
- Real-time debugging

**Event Bus:** Redis Pub/Sub for cross-instance distribution

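For reference, an SSE message is a small block of `field: value` lines terminated by a blank line. The helper below is a minimal sketch of that wire format; the function name and the event name in the usage line are illustrative assumptions.

```python
import json
from typing import Optional


def format_sse(event: str, data: dict, event_id: Optional[str] = None) -> str:
    """Serialize one Server-Sent Events message.

    json.dumps emits a single line by default, so one data field is enough.
    """
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    lines.append(f"event: {event}")
    lines.append(f"data: {json.dumps(data)}")
    # A blank line marks the end of the message.
    return "\n".join(lines) + "\n\n"


message = format_sse("agent_activity", {"agent": "eng-001"})
```

A streaming endpoint would yield strings like `message` with the `text/event-stream` content type.
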
### 6. Issue Synchronization

**Architecture:** Webhook-first + polling fallback

**Supported Providers:**
- Gitea (primary)
- GitHub
- GitLab

**Conflict Resolution:** Last-Writer-Wins with version vectors

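One plausible reading of "Last-Writer-Wins with version vectors" is that the vectors detect whether two edits are causally ordered, and the timestamp only breaks ties between truly concurrent edits. The sketch below follows that assumption; the record layout (`vector`, `updated_at`) and both function names are hypothetical.

```python
from typing import Dict

VersionVector = Dict[str, int]


def compare(a: VersionVector, b: VersionVector) -> str:
    """Return 'equal', 'before', 'after', or 'concurrent' for a relative to b."""
    keys = set(a) | set(b)
    a_le_b = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    b_le_a = all(b.get(k, 0) <= a.get(k, 0) for k in keys)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


def resolve(local: dict, remote: dict) -> dict:
    """Pick the surviving record; each has 'vector' and 'updated_at' keys."""
    order = compare(local["vector"], remote["vector"])
    if order == "before":
        return remote  # remote already includes every local change
    if order in ("after", "equal"):
        return local
    # Concurrent edits: fall back to last-writer-wins on the timestamp.
    return local if local["updated_at"] >= remote["updated_at"] else remote
```
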
### 7. Cost Tracking

**Real-time Pipeline:**
```
LLM Request → LiteLLM Callback → Redis INCR → Budget Check
                     │
               Async Queue → PostgreSQL → SSE Dashboard Update
```

**Budget Enforcement:**
- Soft limits: Alerts + model downgrade
- Hard limits: Block requests

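The soft and hard limits can be sketched as a small decision function over a running total. In this illustration a dict stands in for the Redis counters, and the class name, the decision names, and the thresholds are assumptions.

```python
from enum import Enum
from typing import Dict


class BudgetDecision(Enum):
    ALLOW = "allow"
    DOWNGRADE = "downgrade"  # soft limit reached: alert and use a cheaper model
    BLOCK = "block"          # hard limit reached: reject the request


class BudgetTracker:
    def __init__(self, soft_limit: float, hard_limit: float) -> None:
        if soft_limit > hard_limit:
            raise ValueError("soft_limit must not exceed hard_limit")
        self.soft_limit = soft_limit
        self.hard_limit = hard_limit
        self._spent: Dict[str, float] = {}  # stands in for per-project Redis counters

    def record(self, project_id: str, cost: float) -> float:
        """Add the cost of one LLM call and return the new running total."""
        total = self._spent.get(project_id, 0.0) + cost
        self._spent[project_id] = total
        return total

    def check(self, project_id: str) -> BudgetDecision:
        spent = self._spent.get(project_id, 0.0)
        if spent >= self.hard_limit:
            return BudgetDecision.BLOCK
        if spent >= self.soft_limit:
            return BudgetDecision.DOWNGRADE
        return BudgetDecision.ALLOW
```

Checking the hard limit first keeps the ordering unambiguous when both thresholds have been crossed.
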
### 8. Audit Logging

**Immutability:** SHA-256 hash chaining

**Storage Tiers:**
| Tier | Storage | Retention |
|------|---------|-----------|
| Hot | PostgreSQL | 0-90 days |
| Cold | S3/MinIO | 90+ days |

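Hash chaining makes a log tamper-evident because each entry's hash covers the previous entry's hash. The following is a minimal sketch using only the standard library; the entry layout, the all-zero genesis value, and the function names are assumptions, not the schema from ADR-013.

```python
import hashlib
import json
from typing import Dict, List

GENESIS_HASH = "0" * 64  # assumed placeholder hash for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log: List[Dict], payload: dict) -> Dict:
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    entry = {"prev_hash": prev_hash, "payload": payload,
             "hash": entry_hash(prev_hash, payload)}
    log.append(entry)
    return entry


def verify_chain(log: List[Dict]) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Serializing with `sort_keys=True` matters: without a canonical form, the same payload could hash differently depending on key order.
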
### 9. Client Approval Flow

**Autonomy Levels:**
| Level | Description |
|-------|-------------|
| FULL_CONTROL | Approve every action |
| MILESTONE | Approve sprint boundaries |
| AUTONOMOUS | Only critical decisions |

**Notifications:** SSE + Email + Mobile Push

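The three autonomy levels can be read as a gate in front of each agent action. The sketch below is an assumption-laden illustration: the `ActionKind` categories are invented for the example, and treating critical decisions as requiring approval under MILESTONE as well is a guess rather than something the table states.

```python
from enum import Enum


class AutonomyLevel(Enum):
    FULL_CONTROL = "full_control"
    MILESTONE = "milestone"
    AUTONOMOUS = "autonomous"


class ActionKind(Enum):
    """Hypothetical action categories, used only for this illustration."""
    ROUTINE = "routine"
    SPRINT_BOUNDARY = "sprint_boundary"
    CRITICAL = "critical"


def requires_approval(level: AutonomyLevel, action: ActionKind) -> bool:
    if level is AutonomyLevel.FULL_CONTROL:
        return True  # approve every action
    if level is AutonomyLevel.MILESTONE:
        # Assumption: critical decisions are also gated at this level.
        return action in (ActionKind.SPRINT_BOUNDARY, ActionKind.CRITICAL)
    return action is ActionKind.CRITICAL  # AUTONOMOUS: only critical decisions
```
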
---

## Technology Stack

### Core Technologies

| Layer | Technology | Version | License |
|-------|------------|---------|---------|
| Backend | FastAPI | 0.115+ | MIT |
| Frontend | Next.js | 16 | MIT |
| Database | PostgreSQL + pgvector | 15+ | PostgreSQL |
| Cache/Queue | Redis | 7.0+ | BSD-3 (through 7.2) |
| Task Queue | Celery | 5.3+ | BSD-3 |
| LLM Gateway | LiteLLM | Latest | MIT |
| MCP Framework | FastMCP | 2.0+ | MIT |

### Self-Hostability Guarantee

**All components are fully self-hostable with no mandatory subscriptions:**

| Component | License | Self-Hosted | Managed Alternative (Optional) |
|-----------|---------|-------------|--------------------------------|
| PostgreSQL | PostgreSQL | Yes | RDS, Neon, Supabase |
| Redis | BSD-3 (through 7.2) | Yes | Redis Cloud |
| LiteLLM | MIT | Yes | LiteLLM Enterprise |
| Celery | BSD-3 | Yes | - |
| FastMCP | MIT | Yes | - |
| LangGraph | MIT | Yes | LangSmith (observability only) |
| transitions | MIT | Yes | - |
| DeepSeek V3.2 | MIT | Yes | API available |
| Qwen3-235B | Apache 2.0 | Yes | Alibaba Cloud |

---

## Data Flow Diagrams

### Agent Task Execution

```
1. Client creates story in Syndarix
   │
   ▼
2. Story workflow transitions to "implementation"
   │
   ▼
3. Agent Orchestrator spawns Engineer instance
   │
   ▼
4. Engineer queries Knowledge Base (RAG)
   │
   ▼
5. Engineer calls LLM Gateway for code generation
   │
   ▼
6. Engineer calls Git MCP to create branch & commit
   │
   ▼
7. Engineer creates PR via Git MCP
   │
   ▼
8. Workflow transitions to "review"
   │
   ▼
9. If autonomy_level != AUTONOMOUS:
   └── Approval request created
   └── Client notified via SSE + email
   │
   ▼
10. Client approves → PR merged → Workflow to "testing"
```

### Real-time Event Flow

```
Agent Action
    │
    ▼
Event Bus (Redis Pub/Sub)
    │
    ├──▶ SSE Endpoint ──▶ Frontend Dashboard
    │
    ├──▶ Audit Logger ──▶ PostgreSQL
    │
    └──▶ Other Backend Instances (horizontal scaling)
```

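The fan-out in this diagram is ordinary publish/subscribe: one published event reaches every subscriber on the channel. The in-memory bus below is a stand-in for Redis Pub/Sub, meant only to show the shape of the flow; the class name, channel name, and handlers are assumptions.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[dict], None]


class InMemoryEventBus:
    """Single-process stand-in for a Redis Pub/Sub channel."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Handler) -> None:
        self._subscribers[channel].append(handler)

    def publish(self, channel: str, event: dict) -> int:
        """Deliver the event to every subscriber; return how many received it."""
        handlers = list(self._subscribers.get(channel, []))
        for handler in handlers:
            handler(event)
        return len(handlers)


bus = InMemoryEventBus()
sse_outbox: List[dict] = []   # would feed the SSE endpoint
audit_rows: List[dict] = []   # would feed the audit logger
bus.subscribe("project:proj-123", sse_outbox.append)
bus.subscribe("project:proj-123", audit_rows.append)
delivered = bus.publish("project:proj-123", {"agent": "eng-001", "action": "commit"})
```

Like Redis Pub/Sub, this delivery is fire-and-forget: a subscriber that is not connected when the event is published never sees it, which is why the audit path also persists to PostgreSQL.
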
---

## Security Architecture

### Authentication Flow

- **Users:** JWT dual-token (access + refresh) via PragmaStack
- **Agents:** Service tokens for MCP communication
- **MCP Servers:** Internal network only, validated service tokens

### Multi-Tenancy

- **Project Isolation:** All queries scoped by project_id
- **Row-Level Security:** PostgreSQL RLS for knowledge base
- **Agent Scoping:** Every MCP tool requires project_id + agent_id

### Audit Trail

- **Hash Chaining:** Tamper-evident event log
- **Complete Coverage:** All agent actions, LLM calls, MCP tool invocations

---

## Scalability Considerations

### Horizontal Scaling

| Component | Scaling Strategy |
|-----------|-----------------|
| FastAPI | Multiple instances behind load balancer |
| Celery Workers | Add workers per queue as needed |
| PostgreSQL | Read replicas, connection pooling |
| Redis | Cluster mode for high availability |

### Expected Scale

| Metric | Target |
|--------|--------|
| Concurrent Projects | 50+ |
| Concurrent Agent Instances | 200+ |
| Background Jobs/minute | 500+ |
| SSE Connections | 200+ |

---

## Deployment Architecture

### Local Development

```
docker-compose up
├── PostgreSQL (+ pgvector)
├── Redis
├── FastAPI Backend
├── Next.js Frontend
├── Celery Workers (agent, git, sync queues)
├── Celery Beat (scheduler)
├── Flower (monitoring)
└── MCP Servers (7 containers)
```

### Production

```
┌─────────────────────────────────────────────────────────────────┐
│                          Load Balancer                          │
└────────────────────────────────┬────────────────────────────────┘
                                 │
            ┌────────────────────┼────────────────────┐
            ▼                    ▼                    ▼
   ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐
   │ API Instance 1  │  │ API Instance 2  │  │ API Instance N  │
   └─────────────────┘  └─────────────────┘  └─────────────────┘
            │                    │                    │
            └────────────────────┼────────────────────┘
                                 │
            ┌────────────────────┼────────────────────┐
            ▼                    ▼                    ▼
   ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐
   │   PostgreSQL    │  │  Redis Cluster  │  │ Celery Workers  │
   │   (Primary +    │  │                 │  │  (Auto-scaled)  │
   │    Replicas)    │  │                 │  │                 │
   └─────────────────┘  └─────────────────┘  └─────────────────┘
```

---

## Related Documents

- [Implementation Roadmap](./IMPLEMENTATION_ROADMAP.md)
- [Architecture Deep Analysis](./ARCHITECTURE_DEEP_ANALYSIS.md)
- [ADRs](../adrs/) - All architecture decision records
- [Spikes](../spikes/) - Research documents

---

## Appendix: Full ADR List

1. [ADR-001: MCP Integration Architecture](../adrs/ADR-001-mcp-integration-architecture.md)
2. [ADR-002: Real-time Communication](../adrs/ADR-002-realtime-communication.md)
3. [ADR-003: Background Task Architecture](../adrs/ADR-003-background-task-architecture.md)
4. [ADR-004: LLM Provider Abstraction](../adrs/ADR-004-llm-provider-abstraction.md)
5. [ADR-005: Technology Stack Selection](../adrs/ADR-005-tech-stack-selection.md)
6. [ADR-006: Agent Orchestration](../adrs/ADR-006-agent-orchestration.md)
7. [ADR-007: Agentic Framework Selection](../adrs/ADR-007-agentic-framework-selection.md)
8. [ADR-008: Knowledge Base and RAG](../adrs/ADR-008-knowledge-base-rag.md)
9. [ADR-009: Agent Communication Protocol](../adrs/ADR-009-agent-communication-protocol.md)
10. [ADR-010: Workflow State Machine](../adrs/ADR-010-workflow-state-machine.md)
11. [ADR-011: Issue Synchronization](../adrs/ADR-011-issue-synchronization.md)
12. [ADR-012: Cost Tracking](../adrs/ADR-012-cost-tracking.md)
13. [ADR-013: Audit Logging](../adrs/ADR-013-audit-logging.md)
14. [ADR-014: Client Approval Flow](../adrs/ADR-014-client-approval-flow.md)

---

*This document serves as the authoritative architecture reference for Syndarix.*