syndarix/docs/architecture/ARCHITECTURE.md
Felipe Cardoso 88cf4e0abc feat: Update to production model stack and fix remaining inconsistencies
## Model Stack Updates (User's Actual Models)

Updated all documentation to reflect production models:
- Claude Opus 4.5 (primary reasoning)
- GPT 5.1 Codex max (code generation specialist)
- Gemini 3 Pro/Flash (multimodal, fast inference)
- Qwen3-235B (cost-effective, self-hostable)
- DeepSeek V3.2 (self-hosted, open weights)

### Files Updated:
- ADR-004: Full model groups, failover chains, cost tables
- ADR-007: Code example with correct model identifiers
- ADR-012: Cost tracking with new model prices
- ARCHITECTURE.md: Model groups, failover diagram
- IMPLEMENTATION_ROADMAP.md: External services list

## Architecture Diagram Updates

- Added LangGraph Runtime to orchestration layer
- Added technology labels (Type-Instance, transitions)

## Self-Hostability Table Expanded

Added entries for:
- LangGraph (MIT)
- transitions (MIT)
- DeepSeek V3.2 (MIT)
- Qwen3-235B (Apache 2.0)

## Metric Alignments

- Response time: Split into API (<200ms) and Agent (<10s/<60s)
- Cost per project: Adjusted to $100/sprint for Opus 4.5 pricing
- Added concurrent projects (10+) and agents (50+) metrics

## Infrastructure Updates

- Celery workers: 4-8 instances (was 2-4) across 4 queues
- MCP servers: Clarified Phase 2 + Phase 5 deployment
- Sync interval: Clarified 60s fallback + 15min reconciliation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-29 23:35:51 +01:00


Syndarix Architecture

Version: 1.0
Date: 2025-12-29
Status: Approved


Executive Summary

Syndarix is an autonomous AI-powered software consulting platform that orchestrates specialized AI agents to deliver complete software solutions. This document describes the chosen architecture, key decisions, and component interactions.

Core Principles

  1. Self-Hostable First: All platform components are fully self-hostable under permissive open-source licenses (MIT, BSD-3, Apache 2.0, PostgreSQL)
  2. Production-Ready: Use battle-tested technologies, not experimental frameworks
  3. Hybrid Architecture: Combine best-in-class tools rather than monolithic frameworks
  4. Auditability: Every agent action is logged and traceable
  5. Human-in-the-Loop: Configurable autonomy with approval checkpoints

Architecture Overview

┌─────────────────────────────────────────────────────────────────────────────────┐
│                              SYNDARIX PLATFORM                                   │
├─────────────────────────────────────────────────────────────────────────────────┤
│                                                                                  │
│  ┌──────────────────────────────────────────────────────────────────────────┐   │
│  │                         FRONTEND (Next.js 16)                             │   │
│  │  ┌────────────┐  ┌────────────┐  ┌────────────┐  ┌────────────┐         │   │
│  │  │ Dashboard  │  │  Project   │  │  Agent     │  │  Approval  │         │   │
│  │  │   Pages    │  │   Views    │  │  Monitor   │  │   Queue    │         │   │
│  │  └────────────┘  └────────────┘  └────────────┘  └────────────┘         │   │
│  └──────────────────────────────────────────────────────────────────────────┘   │
│                                       │                                          │
│                          REST + SSE + WebSocket                                  │
│                                       ▼                                          │
│  ┌──────────────────────────────────────────────────────────────────────────┐   │
│  │                         BACKEND (FastAPI)                                 │   │
│  │                                                                           │   │
│  │  ┌─────────────────────────────────────────────────────────────────────┐ │   │
│  │  │                    ORCHESTRATION LAYER                               │ │   │
│  │  │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌───────────┐  │ │   │
│  │  │  │   Agent     │  │  Workflow   │  │  Approval   │  │ LangGraph │  │ │   │
│  │  │  │ Orchestrator│  │   Engine    │  │   Service   │  │  Runtime  │  │ │   │
│  │  │  │(Type-Inst.) │  │(transitions)│  │             │  │           │  │ │   │
│  │  │  └─────────────┘  └─────────────┘  └─────────────┘  └───────────┘  │ │   │
│  │  └─────────────────────────────────────────────────────────────────────┘ │   │
│  │                                                                           │   │
│  │  ┌─────────────────────────────────────────────────────────────────────┐ │   │
│  │  │                    INTEGRATION LAYER                                 │ │   │
│  │  │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐                  │ │   │
│  │  │  │ LLM Gateway │  │  MCP Client │  │   Event     │                  │ │   │
│  │  │  │  (LiteLLM)  │  │   Manager   │  │    Bus      │                  │ │   │
│  │  │  └─────────────┘  └─────────────┘  └─────────────┘                  │ │   │
│  │  └─────────────────────────────────────────────────────────────────────┘ │   │
│  └──────────────────────────────────────────────────────────────────────────┘   │
│                                       │                                          │
│           ┌───────────────────────────┼───────────────────────────┐             │
│           ▼                           ▼                           ▼             │
│  ┌────────────────┐          ┌────────────────┐          ┌────────────────┐    │
│  │   PostgreSQL   │          │     Redis      │          │  Celery Workers│    │
│  │   + pgvector   │          │  (Cache/Queue) │          │  (Background)  │    │
│  └────────────────┘          └────────────────┘          └────────────────┘    │
│                                       │                                          │
│                                       ▼                                          │
│  ┌──────────────────────────────────────────────────────────────────────────┐   │
│  │                         MCP SERVERS                                       │   │
│  │  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐   │   │
│  │  │   LLM    │  │Knowledge │  │   Git    │  │  Issues  │  │   File   │   │   │
│  │  │ Gateway  │  │   Base   │  │   MCP    │  │   MCP    │  │  System  │   │   │
│  │  └──────────┘  └──────────┘  └──────────┘  └──────────┘  └──────────┘   │   │
│  └──────────────────────────────────────────────────────────────────────────┘   │
│                                                                                  │
└─────────────────────────────────────────────────────────────────────────────────┘

Key Architecture Decisions

ADR Summary Matrix

| ADR | Decision | Key Technology |
|-----|----------|----------------|
| ADR-001 | MCP Integration | FastMCP 2.0, Unified Singletons |
| ADR-002 | Real-time Communication | SSE primary, WebSocket for chat |
| ADR-003 | Background Tasks | Celery + Redis |
| ADR-004 | LLM Provider | LiteLLM with failover |
| ADR-005 | Tech Stack | PragmaStack + extensions |
| ADR-006 | Agent Orchestration | Type-Instance pattern |
| ADR-007 | Framework Selection | Hybrid (LangGraph + transitions + Celery) |
| ADR-008 | Knowledge Base | pgvector for RAG |
| ADR-009 | Agent Communication | Structured messages + Redis Streams |
| ADR-010 | Workflows | transitions + PostgreSQL + Celery |
| ADR-011 | Issue Sync | Webhook-first + polling fallback |
| ADR-012 | Cost Tracking | LiteLLM callbacks + Redis budgets |
| ADR-013 | Audit Logging | Structlog + hash chaining |
| ADR-014 | Client Approval | Checkpoint-based + notifications |

Component Deep Dives

1. Agent Orchestration

Pattern: Type-Instance

  • Agent Types: Templates defining model, expertise, personality, capabilities
  • Agent Instances: Runtime instances spawned from types, assigned to projects
  • Orchestrator: Manages lifecycle, routing, and resource tracking
Agent Type (Template)              Agent Instance (Runtime)
┌─────────────────────┐            ┌─────────────────────┐
│ name: "Engineer"    │───spawn───▶│ id: "eng-001"       │
│ model: "sonnet"     │            │ name: "Dave"        │
│ expertise: [py, js] │            │ project: "proj-123" │
│ capabilities: [...] │            │ context: {...}      │
└─────────────────────┘            │ status: ACTIVE      │
                                   └─────────────────────┘
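
The Type-Instance split above can be sketched with two dataclasses and a small orchestrator. This is a minimal illustration, not the Syndarix implementation: the class names, the `AgentStatus` values other than `ACTIVE`, and the ID format derived from the type name are assumptions chosen to match the diagram.

```python
from dataclasses import dataclass, field
from enum import Enum
from itertools import count


class AgentStatus(Enum):
    ACTIVE = "active"
    IDLE = "idle"  # assumed extra state; the diagram only shows ACTIVE


@dataclass(frozen=True)
class AgentType:
    """Template shared by every instance of a role."""
    name: str
    model: str
    expertise: tuple[str, ...]
    capabilities: tuple[str, ...] = ()


@dataclass
class AgentInstance:
    """Runtime agent spawned from a type and bound to one project."""
    id: str
    name: str
    agent_type: AgentType
    project: str
    context: dict = field(default_factory=dict)
    status: AgentStatus = AgentStatus.ACTIVE


class Orchestrator:
    """Hypothetical orchestrator: spawns instances and tracks them by ID."""

    def __init__(self) -> None:
        self._instances: dict[str, AgentInstance] = {}
        self._seq = count(1)

    def spawn(self, agent_type: AgentType, name: str, project: str) -> AgentInstance:
        # "eng-001" style ID: first three letters of the type name plus a counter.
        instance_id = f"{agent_type.name[:3].lower()}-{next(self._seq):03d}"
        instance = AgentInstance(instance_id, name, agent_type, project)
        self._instances[instance_id] = instance
        return instance

    def for_project(self, project: str) -> list[AgentInstance]:
        return [i for i in self._instances.values() if i.project == project]


engineer = AgentType(name="Engineer", model="sonnet", expertise=("py", "js"))
orchestrator = Orchestrator()
dave = orchestrator.spawn(engineer, name="Dave", project="proj-123")
```

Keeping `AgentType` frozen means a template cannot be mutated by one running instance and silently affect the others.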

2. LLM Gateway (LiteLLM)

Failover Chain:

    Claude Opus 4.5 (Primary)
         │
         ▼ (on failure/rate limit)
    GPT 5.1 Codex max (Code specialist)
         │
         ▼ (on failure/rate limit)
    Gemini 3 Pro (Multimodal)
         │
         ▼ (on failure)
    Qwen3-235B / DeepSeek V3.2 (Self-hosted)

Model Groups:

| Group | Use Case | Primary Model | Fallback |
|-------|----------|---------------|----------|
| high-reasoning | Architecture, complex analysis | Claude Opus 4.5 | GPT 5.1 Codex max |
| code-generation | Code writing, refactoring | GPT 5.1 Codex max | Claude Opus 4.5 |
| fast-response | Quick tasks, status updates | Gemini 3 Flash | Qwen3-235B |
| cost-optimized | High-volume, non-critical | Qwen3-235B | DeepSeek V3.2 |
| self-hosted | Privacy-sensitive, air-gapped | DeepSeek V3.2 | Qwen3-235B |
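
The group-to-model mapping and the failover behavior can be illustrated without any gateway dependency. The sketch below is not LiteLLM configuration: the model strings are the display names from the table rather than provider identifiers, and `ModelUnavailable`, `complete_with_failover`, and `flaky_backend` are hypothetical names used only for illustration.

```python
from typing import Callable

# Primary model first, then its fallback, as listed in the Model Groups table.
MODEL_GROUPS: dict[str, list[str]] = {
    "high-reasoning": ["Claude Opus 4.5", "GPT 5.1 Codex max"],
    "code-generation": ["GPT 5.1 Codex max", "Claude Opus 4.5"],
    "fast-response": ["Gemini 3 Flash", "Qwen3-235B"],
    "cost-optimized": ["Qwen3-235B", "DeepSeek V3.2"],
    "self-hosted": ["DeepSeek V3.2", "Qwen3-235B"],
}


class ModelUnavailable(Exception):
    """Raised by the caller-supplied backend on failure or rate limit."""


def complete_with_failover(
    group: str,
    prompt: str,
    call_model: Callable[[str, str], str],
) -> tuple[str, str]:
    """Try each model in the group in order; return (model_used, response)."""
    errors = []
    for model in MODEL_GROUPS[group]:
        try:
            return model, call_model(model, prompt)
        except ModelUnavailable as exc:
            errors.append(f"{model}: {exc}")
    raise RuntimeError(f"all models in group {group!r} failed: " + "; ".join(errors))


def flaky_backend(model: str, prompt: str) -> str:
    """Stand-in for a real LLM call: the primary is always rate limited."""
    if model == "Claude Opus 4.5":
        raise ModelUnavailable("rate limited")
    return f"[{model}] {prompt}"


used_model, reply = complete_with_failover(
    "high-reasoning", "Review this design", flaky_backend
)
```

In this run the request falls through to the group's fallback, so `used_model` is `"GPT 5.1 Codex max"`.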

3. Knowledge Base (RAG)

Stack: pgvector + LiteLLM embeddings

Chunking Strategy:

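| Content | Chunking Strategy | Embedding Model |
|---------|-------------------|-----------------|
| Code | AST-based (function/class) | voyage-code-3 |
| Docs | Heading-based | text-embedding-3-small |
| Conversations | Turn-based | text-embedding-3-small |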
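
As an illustration of AST-based chunking, the sketch below splits Python source into one chunk per top-level function or class using the standard `ast` module. It covers Python input only; how Syndarix chunks other languages is not described here, and the chunk dictionary keys are assumptions.

```python
import ast


def chunk_python_source(source: str) -> list[dict]:
    """Return one chunk per top-level function or class definition."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append(
                {
                    "name": node.name,
                    "kind": type(node).__name__,
                    "start_line": node.lineno,
                    "end_line": node.end_lineno,
                    # Exact source text of the definition, ready for embedding.
                    "text": ast.get_source_segment(source, node),
                }
            )
    return chunks


sample = '''import os

def greet(name):
    return f"hello {name}"

class Repo:
    def path(self):
        return os.getcwd()
'''

chunks = chunk_python_source(sample)
```

Each chunk keeps its line range so a search hit can be traced back to the file location it came from.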

Search: Hybrid (70% vector + 30% keyword)
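
One way to read the 70/30 split is as a weighted sum of two normalized scores per document. The sketch below assumes both scores are already scaled to the range 0 to 1 and that a document missing from one result set scores 0 there; neither assumption is stated in this document.

```python
VECTOR_WEIGHT = 0.7
KEYWORD_WEIGHT = 0.3


def hybrid_rank(
    vector_scores: dict[str, float],
    keyword_scores: dict[str, float],
) -> list[tuple[str, float]]:
    """Combine per-document scores and return them best first."""
    doc_ids = set(vector_scores) | set(keyword_scores)
    combined = {
        doc_id: VECTOR_WEIGHT * vector_scores.get(doc_id, 0.0)
        + KEYWORD_WEIGHT * keyword_scores.get(doc_id, 0.0)
        for doc_id in doc_ids
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


ranked = hybrid_rank(
    vector_scores={"a": 0.9, "b": 0.4},
    keyword_scores={"b": 1.0, "c": 0.5},
)
```

Here document `a` ranks first on vector similarity alone (0.63), ahead of `b` (0.58), which matches on both signals, and `c` (0.15), which matches only on keywords.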

4. Workflow Engine

Stack: transitions library + PostgreSQL + Celery

Core Workflows:

  • Sprint Workflow: planning → active → review → done
  • Story Workflow: analysis → design → implementation → review → testing → done
  • PR Workflow: submitted → reviewing → changes_requested → approved → merged

Durability: Event sourcing with state persistence to PostgreSQL
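
The Story Workflow and its event-sourced durability can be sketched with the standard library alone. This stand-in does not use the transitions library, and the in-memory `events` list takes the place of the PostgreSQL event table; `StoryWorkflow`, `InvalidTransition`, and `replay` are hypothetical names.

```python
from dataclasses import dataclass, field

STORY_STATES = ("analysis", "design", "implementation", "review", "testing", "done")


class InvalidTransition(Exception):
    """Raised when a transition does not follow the linear story workflow."""


@dataclass
class StoryWorkflow:
    state: str = STORY_STATES[0]
    # Each event is (source_state, target_state); persisted rows in production.
    events: list[tuple[str, str]] = field(default_factory=list)

    def advance(self) -> str:
        """Move to the next state and record the transition as an event."""
        index = STORY_STATES.index(self.state)
        if index == len(STORY_STATES) - 1:
            raise InvalidTransition("story is already done")
        target = STORY_STATES[index + 1]
        self.events.append((self.state, target))
        self.state = target
        return target

    @classmethod
    def replay(cls, events: list[tuple[str, str]]) -> "StoryWorkflow":
        """Rebuild a workflow from its stored events, e.g. after a restart."""
        workflow = cls()
        for source, target in events:
            if workflow.state != source:
                raise InvalidTransition(f"expected {workflow.state!r}, got {source!r}")
            workflow.state = target
            workflow.events.append((source, target))
        return workflow


story = StoryWorkflow()
story.advance()  # analysis -> design
story.advance()  # design -> implementation
restored = StoryWorkflow.replay(story.events)
```

Because state is derived from the event list, a worker that crashes mid-story can reconstruct the workflow from storage instead of relying on process memory.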

5. Real-time Communication

SSE (90% of use cases):

  • Agent activity streams
  • Project progress updates
  • Approval notifications
  • Issue change notifications

WebSocket (10% of use cases, bidirectional):

  • Interactive chat with agents
  • Real-time debugging

Event Bus: Redis Pub/Sub for cross-instance distribution
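
Each SSE update reaches the browser as a plain-text frame in the `text/event-stream` format. The helper below shows that wire format; the event name `agent.activity` and the payload fields are illustrative assumptions, not the Syndarix event schema.

```python
import json
from typing import Optional


def format_sse(event: str, data: dict, event_id: Optional[str] = None) -> str:
    """Serialize one event as a Server-Sent Events frame."""
    lines = []
    if event_id is not None:
        # The browser echoes this back as Last-Event-ID when it reconnects.
        lines.append(f"id: {event_id}")
    lines.append(f"event: {event}")
    # json.dumps emits a single line by default, so one data field is enough.
    lines.append(f"data: {json.dumps(data)}")
    # A blank line terminates the frame.
    return "\n".join(lines) + "\n\n"


frame = format_sse(
    "agent.activity",
    {"agent_id": "eng-001", "status": "ACTIVE"},
    event_id="42",
)
```

A streaming endpoint would yield frames like this one for every message it receives from the event bus.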

6. Issue Synchronization

Architecture: Webhook-first + polling fallback

Supported Providers:

  • Gitea (primary)
  • GitHub
  • GitLab

Conflict Resolution: Last-Writer-Wins with version vectors
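
How last-writer-wins and version vectors combine is not spelled out here. One plausible reading, sketched below, uses the version vectors to detect whether two copies of an issue are causally ordered and falls back to the later timestamp only when they are concurrent. All names (`IssueVersion`, `compare`, `resolve`) and the replica labels are assumptions.

```python
from dataclasses import dataclass


@dataclass
class IssueVersion:
    fields: dict
    vector: dict[str, int]  # replica name -> number of updates seen from it
    updated_at: float       # wall-clock timestamp of the last write


def compare(a: dict[str, int], b: dict[str, int]) -> str:
    """Return 'before', 'after', 'equal', or 'concurrent' for vector a vs b."""
    keys = set(a) | set(b)
    a_ahead = any(a.get(k, 0) > b.get(k, 0) for k in keys)
    b_ahead = any(b.get(k, 0) > a.get(k, 0) for k in keys)
    if a_ahead and b_ahead:
        return "concurrent"
    if a_ahead:
        return "after"
    if b_ahead:
        return "before"
    return "equal"


def resolve(local: IssueVersion, remote: IssueVersion) -> IssueVersion:
    """Pick the causally newer version; on a true conflict, last writer wins."""
    order = compare(local.vector, remote.vector)
    if order in ("after", "equal"):
        return local
    if order == "before":
        return remote
    winner = local if local.updated_at >= remote.updated_at else remote
    # Merge the vectors so the result is causally after both inputs.
    keys = set(local.vector) | set(remote.vector)
    merged = {k: max(local.vector.get(k, 0), remote.vector.get(k, 0)) for k in keys}
    return IssueVersion(dict(winner.fields), merged, winner.updated_at)


local = IssueVersion({"title": "Fix login"}, {"syndarix": 2, "gitea": 1}, 100.0)
remote = IssueVersion({"title": "Fix login bug"}, {"syndarix": 1, "gitea": 2}, 105.0)
resolved = resolve(local, remote)
```

Using the vectors first avoids discarding a causally newer write just because its clock was behind.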

7. Cost Tracking

Real-time Pipeline:

LLM Request → LiteLLM Callback → Redis INCR → Budget Check
                    │
              Async Queue → PostgreSQL → SSE Dashboard Update

Budget Enforcement:

  • Soft limits: Alerts + model downgrade
  • Hard limits: Block requests
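
The soft and hard limits can be expressed as a three-way decision on the running total. In this sketch an in-memory dictionary stands in for the Redis counters, amounts are tracked in integer cents, and the limit values in the usage lines are illustrative only.

```python
from enum import Enum


class BudgetDecision(Enum):
    ALLOW = "allow"
    DOWNGRADE = "downgrade"  # soft limit reached: alert and use a cheaper model
    BLOCK = "block"          # hard limit reached: reject the request


class BudgetTracker:
    """Hypothetical per-project budget tracker."""

    def __init__(self, soft_limit_cents: int, hard_limit_cents: int) -> None:
        if soft_limit_cents > hard_limit_cents:
            raise ValueError("soft limit must not exceed hard limit")
        self.soft_limit_cents = soft_limit_cents
        self.hard_limit_cents = hard_limit_cents
        self._spent: dict[str, int] = {}  # stands in for Redis counters

    def record(self, project_id: str, cost_cents: int) -> int:
        """Add the cost of a finished LLM call and return the new total."""
        self._spent[project_id] = self._spent.get(project_id, 0) + cost_cents
        return self._spent[project_id]

    def check(self, project_id: str) -> BudgetDecision:
        """Decide what to do with the next request for this project."""
        spent = self._spent.get(project_id, 0)
        if spent >= self.hard_limit_cents:
            return BudgetDecision.BLOCK
        if spent >= self.soft_limit_cents:
            return BudgetDecision.DOWNGRADE
        return BudgetDecision.ALLOW


tracker = BudgetTracker(soft_limit_cents=8_000, hard_limit_cents=10_000)
tracker.record("proj-123", 5_000)
first = tracker.check("proj-123")   # under both limits
tracker.record("proj-123", 3_500)
second = tracker.check("proj-123")  # past the soft limit
tracker.record("proj-123", 2_000)
third = tracker.check("proj-123")   # past the hard limit
```

Integer cents keep the counter compatible with an atomic integer increment and avoid floating-point drift in the totals.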

8. Audit Logging

Immutability: SHA-256 hash chaining
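
Hash chaining makes the log tamper-evident because every entry's hash covers the previous entry's hash. The following is a minimal sketch using `hashlib`; the entry layout, the all-zero genesis hash, and the canonical JSON encoding are assumptions rather than the format used by Syndarix.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder for the first entry's predecessor


def entry_hash(prev_hash: str, payload: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON encoding."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(log: list[dict], payload: dict) -> dict:
    """Append a new entry linked to the current tail of the log."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    entry = {
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": entry_hash(prev_hash, payload),
    }
    log.append(entry)
    return entry


def verify(log: list[dict]) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


audit_log: list[dict] = []
append(audit_log, {"agent_id": "eng-001", "action": "create_branch"})
append(audit_log, {"agent_id": "eng-001", "action": "open_pr"})
intact = verify(audit_log)

audit_log[0]["payload"]["action"] = "delete_repo"  # simulate tampering
tampered_ok = verify(audit_log)
```

After the simulated edit, `verify` returns `False`, because the stored hash of the first entry no longer matches its payload.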

Storage Tiers:

| Tier | Storage | Retention |
|------|---------|-----------|
| Hot | PostgreSQL | 0-90 days |
| Cold | S3/MinIO | 90+ days |

9. Client Approval Flow

Autonomy Levels:

| Level | Description |
|-------|-------------|
| FULL_CONTROL | Approve every action |
| MILESTONE | Approve sprint boundaries |
| AUTONOMOUS | Only critical decisions |

Notifications: SSE + Email + Mobile Push
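
The three autonomy levels can be read as a gate in front of each agent action. The sketch below is one interpretation of the level descriptions: the two boolean flags are hypothetical inputs, and treating critical decisions as requiring approval at every level is an assumption.

```python
from enum import Enum


class AutonomyLevel(Enum):
    FULL_CONTROL = "full_control"
    MILESTONE = "milestone"
    AUTONOMOUS = "autonomous"


def requires_approval(
    level: AutonomyLevel,
    *,
    is_sprint_boundary: bool = False,
    is_critical: bool = False,
) -> bool:
    """Return True when the client must approve before the action proceeds."""
    if is_critical:
        # Assumption: critical decisions are gated at every autonomy level.
        return True
    if level is AutonomyLevel.FULL_CONTROL:
        return True
    if level is AutonomyLevel.MILESTONE:
        return is_sprint_boundary
    return False  # AUTONOMOUS and not critical


routine_commit = requires_approval(AutonomyLevel.MILESTONE)
sprint_close = requires_approval(AutonomyLevel.MILESTONE, is_sprint_boundary=True)
risky_change = requires_approval(AutonomyLevel.AUTONOMOUS, is_critical=True)
```

Which actions count as critical or as a sprint boundary would be decided by the workflow engine, not by this function.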


Technology Stack

Core Technologies

| Layer | Technology | Version | License |
|-------|------------|---------|---------|
| Backend | FastAPI | 0.115+ | MIT |
| Frontend | Next.js | 16 | MIT |
| Database | PostgreSQL + pgvector | 15+ | PostgreSQL |
| Cache/Queue | Redis | 7.0+ | BSD-3 |
| Task Queue | Celery | 5.3+ | BSD-3 |
| LLM Gateway | LiteLLM | Latest | MIT |
| MCP Framework | FastMCP | 2.0+ | MIT |

Self-Hostability Guarantee

All components are fully self-hostable with no mandatory subscriptions:

| Component | License | Self-Hosted | Managed Alternative (Optional) |
|-----------|---------|-------------|--------------------------------|
| PostgreSQL | PostgreSQL | Yes | RDS, Neon, Supabase |
| Redis | BSD-3 | Yes | Redis Cloud |
| LiteLLM | MIT | Yes | LiteLLM Enterprise |
| Celery | BSD-3 | Yes | - |
| FastMCP | MIT | Yes | - |
| LangGraph | MIT | Yes | LangSmith (observability only) |
| transitions | MIT | Yes | - |
| DeepSeek V3.2 | MIT | Yes | API available |
| Qwen3-235B | Apache 2.0 | Yes | Alibaba Cloud |

Data Flow Diagrams

Agent Task Execution

1. Client creates story in Syndarix
         │
         ▼
2. Story workflow transitions to "implementation"
         │
         ▼
3. Agent Orchestrator spawns Engineer instance
         │
         ▼
4. Engineer queries Knowledge Base (RAG)
         │
         ▼
5. Engineer calls LLM Gateway for code generation
         │
         ▼
6. Engineer calls Git MCP to create branch & commit
         │
         ▼
7. Engineer creates PR via Git MCP
         │
         ▼
8. Workflow transitions to "review"
         │
         ▼
9. If autonomy_level != AUTONOMOUS:
   └── Approval request created
   └── Client notified via SSE + email
         │
         ▼
10. Client approves → PR merged → Workflow to "testing"

Real-time Event Flow

Agent Action
     │
     ▼
Event Bus (Redis Pub/Sub)
     │
     ├──▶ SSE Endpoint ──▶ Frontend Dashboard
     │
     ├──▶ Audit Logger ──▶ PostgreSQL
     │
     └──▶ Other Backend Instances (horizontal scaling)

Security Architecture

Authentication Flow

  • Users: JWT dual-token (access + refresh) via PragmaStack
  • Agents: Service tokens for MCP communication
  • MCP Servers: Internal network only, validated service tokens

Multi-Tenancy

  • Project Isolation: All queries scoped by project_id
  • Row-Level Security: PostgreSQL RLS for knowledge base
  • Agent Scoping: Every MCP tool requires project_id + agent_id
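
The agent-scoping rule can be enforced at the boundary of every tool function. The decorator below is a hypothetical sketch in plain Python, not FastMCP API: it forces `project_id` and `agent_id` to be passed as keyword arguments and rejects empty values before the tool body runs.

```python
import functools
from typing import Any, Callable


class ScopeError(ValueError):
    """Raised when a tool call lacks a usable project or agent scope."""


def scoped_tool(func: Callable[..., Any]) -> Callable[..., Any]:
    """Require non-empty project_id and agent_id on every call to func."""

    @functools.wraps(func)
    def wrapper(*, project_id: str, agent_id: str, **kwargs: Any) -> Any:
        # Omitting either argument raises TypeError; empty values raise ScopeError.
        if not project_id or not agent_id:
            raise ScopeError("project_id and agent_id must be non-empty")
        return func(project_id=project_id, agent_id=agent_id, **kwargs)

    return wrapper


@scoped_tool
def read_file(*, project_id: str, agent_id: str, path: str) -> str:
    """Illustrative tool body: returns the scoped path it would read."""
    return f"{project_id}/{agent_id}:{path}"


result = read_file(project_id="proj-123", agent_id="eng-001", path="README.md")
```

Centralizing the check in a decorator keeps individual tools from forgetting it, and the same two values can then feed the `project_id` filter on every query.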

Audit Trail

  • Hash Chaining: Tamper-evident event log
  • Complete Coverage: All agent actions, LLM calls, MCP tool invocations

Scalability Considerations

Horizontal Scaling

| Component | Scaling Strategy |
|-----------|------------------|
| FastAPI | Multiple instances behind load balancer |
| Celery Workers | Add workers per queue as needed |
| PostgreSQL | Read replicas, connection pooling |
| Redis | Cluster mode for high availability |

Expected Scale

| Metric | Target |
|--------|--------|
| Concurrent Projects | 50+ |
| Concurrent Agent Instances | 200+ |
| Background Jobs/minute | 500+ |
| SSE Connections | 200+ |

Deployment Architecture

Local Development

docker-compose up
├── PostgreSQL (+ pgvector)
├── Redis
├── FastAPI Backend
├── Next.js Frontend
├── Celery Workers (agent, git, sync queues)
├── Celery Beat (scheduler)
├── Flower (monitoring)
└── MCP Servers (7 containers)

Production

┌─────────────────────────────────────────────────────────────────┐
│                        Load Balancer                             │
└─────────────────────────────┬───────────────────────────────────┘
                              │
         ┌────────────────────┼────────────────────┐
         ▼                    ▼                    ▼
┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐
│  API Instance 1 │  │  API Instance 2 │  │  API Instance N │
└─────────────────┘  └─────────────────┘  └─────────────────┘
         │                    │                    │
         └────────────────────┼────────────────────┘
                              │
         ┌────────────────────┼────────────────────┐
         ▼                    ▼                    ▼
┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐
│   PostgreSQL    │  │  Redis Cluster  │  │  Celery Workers │
│   (Primary +    │  │                 │  │  (Auto-scaled)  │
│    Replicas)    │  │                 │  │                 │
└─────────────────┘  └─────────────────┘  └─────────────────┘


Appendix: Full ADR List

  1. ADR-001: MCP Integration Architecture
  2. ADR-002: Real-time Communication
  3. ADR-003: Background Task Architecture
  4. ADR-004: LLM Provider Abstraction
  5. ADR-005: Technology Stack Selection
  6. ADR-006: Agent Orchestration
  7. ADR-007: Agentic Framework Selection
  8. ADR-008: Knowledge Base and RAG
  9. ADR-009: Agent Communication Protocol
  10. ADR-010: Workflow State Machine
  11. ADR-011: Issue Synchronization
  12. ADR-012: Cost Tracking
  13. ADR-013: Audit Logging
  14. ADR-014: Client Approval Flow

This document serves as the authoritative architecture reference for Syndarix.