syndarix/docs/architecture/IMPLEMENTATION_ROADMAP.md
Felipe Cardoso 88cf4e0abc feat: Update to production model stack and fix remaining inconsistencies
2025-12-29 23:35:51 +01:00


Syndarix Implementation Roadmap

Version: 1.0
Date: 2025-12-29
Status: Draft


Executive Summary

This roadmap describes the implementation of Syndarix in seven phases (Phase 0 through Phase 6) spanning 24 weeks. Foundational infrastructure comes before advanced features, and each phase builds on the previous one, with clear milestones and deliverables.


Phase 0: Foundation (Weeks 1-2)

Goal: Establish development infrastructure and basic platform

0.1 Repository Setup

  • Fork PragmaStack to Syndarix
  • Create spike backlog in Gitea (12 issues)
  • Complete architecture documentation
  • Complete all spike research (SPIKE-001 through SPIKE-012)
  • Create all ADRs (ADR-001 through ADR-014)
  • Rebrand codebase (all URLs, names, configs updated)
  • Configure CI/CD pipelines
  • Set up development environment documentation

0.2 Core Infrastructure

  • Configure Redis for cache + pub/sub
  • Set up Celery worker infrastructure
  • Configure pgvector extension
  • Create MCP server directory structure
  • Set up Docker Compose for local development

Deliverables

  • Fully branded Syndarix repository
  • Complete architecture documentation (ARCHITECTURE.md)
  • All spike research completed (12 spikes)
  • All ADRs documented (14 ADRs)
  • Working local development environment (Docker Compose)
  • CI/CD pipeline running tests

Phase 1: Core Platform (Weeks 3-6)

Goal: Basic project and agent management without LLM integration

1.1 Data Model

  • Create Project entity and CRUD
  • Create AgentType entity and CRUD
  • Create AgentInstance entity and CRUD
  • Create Issue entity with external tracker fields
  • Create Sprint entity and CRUD
  • Database migrations with Alembic

1.2 API Layer

  • Project management endpoints
  • Agent type configuration endpoints
  • Agent instance management endpoints
  • Issue CRUD endpoints
  • Sprint management endpoints

1.3 Real-time Infrastructure

  • Implement EventBus with Redis Pub/Sub
  • Create SSE endpoint for project events
  • Implement event types enum
  • Add keepalive mechanism
  • Client-side SSE handling
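
The real-time items above can be illustrated with a minimal sketch of how one event might be framed for an SSE stream. The event names, the channel naming scheme, and the helper functions are assumptions made for illustration; the roadmap only states that an event types enum, a keepalive mechanism, and Redis Pub/Sub are planned. The sketch uses only the standard library and leaves out the Redis subscription itself.

```python
import json
from enum import Enum


class EventType(str, Enum):
    # Hypothetical event names; the roadmap only says an event-types enum exists.
    AGENT_STARTED = "agent.started"
    AGENT_MESSAGE = "agent.message"
    ISSUE_UPDATED = "issue.updated"


def project_channel(project_id: str) -> str:
    """Pub/Sub channel name for one project's events (naming is assumed)."""
    return f"project:{project_id}:events"


def format_sse(event_type: EventType, payload: dict, event_id: int) -> str:
    """Serialize one event as a Server-Sent Events frame."""
    data = json.dumps(payload, separators=(",", ":"))
    return f"id: {event_id}\nevent: {event_type.value}\ndata: {data}\n\n"


def keepalive_frame() -> str:
    """An SSE comment line; clients ignore it, but it keeps idle connections open."""
    return ": keepalive\n\n"
```

An SSE endpoint would emit `format_sse(...)` for each message received on `project_channel(project_id)` and send `keepalive_frame()` whenever the stream has been idle for some interval.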

1.4 Frontend Foundation

  • Project dashboard page
  • Agent configuration UI
  • Issue list and detail views
  • Real-time activity feed component
  • Basic navigation and layout

Deliverables

  • CRUD operations for all core entities
  • Real-time event streaming working
  • Basic admin UI for configuration

Phase 2: MCP Integration (Weeks 7-10)

Goal: Build MCP servers for external integrations

2.1 MCP Client Infrastructure

  • Create MCPClientManager class
  • Implement server registry
  • Add connection management with reconnection
  • Create tool call routing
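
As a rough sketch of the server registry and tool call routing listed above, the following shows one possible shape. `ToolRegistry`, `ServerEntry`, the "server.tool" naming scheme, and `backoff_delays` are hypothetical names, not the planned MCPClientManager API, and plain callables stand in for real MCP sessions.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

ToolHandler = Callable[..., Any]


@dataclass
class ServerEntry:
    name: str
    tools: Dict[str, ToolHandler] = field(default_factory=dict)
    connected: bool = True


class ToolRegistry:
    """Maps 'server.tool' names to handlers; a stand-in for real MCP sessions."""

    def __init__(self) -> None:
        self._servers: Dict[str, ServerEntry] = {}

    def register(self, server: str, tool: str, handler: ToolHandler) -> None:
        entry = self._servers.setdefault(server, ServerEntry(server))
        entry.tools[tool] = handler

    def call(self, qualified_name: str, **kwargs: Any) -> Any:
        server, _, tool = qualified_name.partition(".")
        entry = self._servers.get(server)
        if entry is None or tool not in entry.tools:
            raise KeyError(f"unknown tool: {qualified_name}")
        if not entry.connected:
            raise ConnectionError(f"server {server!r} is disconnected")
        return entry.tools[tool](**kwargs)


def backoff_delays(base: float = 0.5, cap: float = 30.0, attempts: int = 6) -> List[float]:
    """Exponential reconnection delays in seconds, capped at `cap`."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]
```

A reconnection loop could sleep for each value of `backoff_delays()` in turn before retrying a dropped server connection.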

2.2 LLM Gateway MCP (Priority 1)

  • Create FastMCP server structure
  • Implement LiteLLM integration
  • Add model group routing
  • Implement failover chain
  • Add cost tracking callbacks
  • Create token usage logging
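
The failover chain and token usage logging could look roughly like the sketch below. It does not use LiteLLM; providers are plain callables so the control flow stays visible, and `complete_with_failover`, `UsageLog`, and the `(text, tokens)` return shape are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A provider is any callable taking a prompt and returning (text, tokens_used).
Provider = Callable[[str], Tuple[str, int]]


@dataclass
class UsageLog:
    tokens_by_model: Dict[str, int] = field(default_factory=dict)

    def record(self, model: str, tokens: int) -> None:
        self.tokens_by_model[model] = self.tokens_by_model.get(model, 0) + tokens


def complete_with_failover(
    chain: List[Tuple[str, Provider]], prompt: str, usage: UsageLog
) -> Tuple[str, str]:
    """Try each (model, provider) in order; return (model_used, text)."""
    errors = []
    for model, provider in chain:
        try:
            text, tokens = provider(prompt)
        except Exception as exc:  # a real gateway would catch narrower error types
            errors.append(f"{model}: {exc}")
            continue
        usage.record(model, tokens)
        return model, text
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Only the provider that actually answered is charged in the usage log, which is what a cost tracking callback would need to aggregate per project.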

2.3 Knowledge Base MCP (Priority 2)

  • Create pgvector schema for embeddings
  • Implement document ingestion pipeline
  • Create chunking strategies (code, markdown, text)
  • Implement semantic search
  • Add hybrid search (vector + keyword)
  • Per-project collection isolation
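
The roadmap does not say how vector and keyword results are combined for hybrid search. One common option is reciprocal rank fusion, sketched here as an assumption rather than the chosen design; the function name and the default `k = 60` are illustrative.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken by document ID for determinism.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))
```

Because the method works on ranks rather than raw scores, it avoids having to normalize vector distances against keyword relevance scores.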

2.4 Git MCP (Priority 3)

  • Create Git operations wrapper
  • Implement clone, commit, push operations
  • Add branch management
  • Create PR operations
  • Add Gitea API integration
  • Implement GitHub/GitLab adapters

2.5 Issues MCP (Priority 4)

  • Create issue sync service
  • Implement Gitea issue operations
  • Add GitHub issue adapter
  • Add GitLab issue adapter
  • Implement bi-directional sync
  • Create conflict resolution logic
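
Conflict resolution for bi-directional sync is not specified in this roadmap. One possible policy, shown below as an assumption, is a field-level three-way merge against the last synced snapshot; `merge_issue` and the `prefer` parameter are hypothetical.

```python
from typing import Any, Dict


def merge_issue(
    base: Dict[str, Any],
    local: Dict[str, Any],
    remote: Dict[str, Any],
    prefer: str = "remote",
) -> Dict[str, Any]:
    """Three-way merge of issue fields against the last synced snapshot.

    A field changed on only one side takes that side's value. A field changed
    on both sides to different values is a conflict, resolved by `prefer`.
    """
    if prefer not in ("local", "remote"):
        raise ValueError("prefer must be 'local' or 'remote'")
    merged: Dict[str, Any] = {}
    for key in sorted(set(base) | set(local) | set(remote)):
        b, loc, rem = base.get(key), local.get(key), remote.get(key)
        if loc == rem or rem == b:
            merged[key] = loc      # same on both sides, or only local changed
        elif loc == b:
            merged[key] = rem      # only remote changed
        else:
            merged[key] = rem if prefer == "remote" else loc  # true conflict
    return merged
```

With this policy, a title edited in Syndarix and a state change made in the external tracker both survive a sync, and only fields edited on both sides need a tie-break rule.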

Deliverables

  • 4 working MCP servers
  • LLM calls routed through gateway
  • RAG search functional
  • Git operations working
  • Issue sync with external trackers

Phase 3: Agent Orchestration (Weeks 11-14)

Goal: Enable agents to perform autonomous work

3.1 Agent Runner

  • Create AgentRunner class
  • Implement context assembly
  • Add memory management (short-term, long-term)
  • Implement action execution
  • Add tool call handling
  • Create agent error handling

3.2 Agent Orchestrator

  • Implement spawn_agent method
  • Create terminate_agent method
  • Implement send_message routing
  • Add broadcast functionality
  • Create agent status tracking
  • Implement agent recovery
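
The orchestrator methods named above can be pictured with an in-memory stand-in. The method names come from the list; the class layout, the ID format, and the status values are assumptions, and the sketch leaves out persistence, Celery, and recovery.

```python
import itertools
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List


@dataclass
class AgentHandle:
    agent_id: str
    agent_type: str
    status: str = "idle"  # hypothetical status values: idle, terminated
    inbox: Deque[str] = field(default_factory=deque)


class AgentOrchestrator:
    """In-memory stand-in: no Celery, persistence, or recovery."""

    def __init__(self) -> None:
        self._agents: Dict[str, AgentHandle] = {}
        self._ids = itertools.count(1)

    def spawn_agent(self, agent_type: str) -> str:
        agent_id = f"{agent_type}-{next(self._ids)}"
        self._agents[agent_id] = AgentHandle(agent_id, agent_type)
        return agent_id

    def terminate_agent(self, agent_id: str) -> None:
        self._agents[agent_id].status = "terminated"

    def send_message(self, agent_id: str, text: str) -> bool:
        agent = self._agents.get(agent_id)
        if agent is None or agent.status == "terminated":
            return False
        agent.inbox.append(text)
        return True

    def broadcast(self, text: str) -> List[str]:
        """Deliver to every live agent; return the IDs that received it."""
        return [a for a in self._agents if self.send_message(a, text)]
```

Returning the list of recipients from `broadcast` makes it easy for status tracking to notice agents that silently stopped receiving messages.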

3.3 Inter-Agent Communication

  • Define message format schema
  • Implement message persistence
  • Create message routing logic
  • Add @mention parsing
  • Implement priority queues
  • Add conversation threading
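
A minimal sketch of the message schema, @mention parsing, and priority queue items follows. The field names, the rule that a lower number means higher urgency, and the mention syntax (a letter followed by word characters or hyphens) are assumptions; the roadmap does not define them.

```python
import heapq
import itertools
import re
from dataclasses import dataclass, field
from typing import List, Optional

# "@" must not follow a word character, so e-mail addresses are not mentions.
MENTION_RE = re.compile(r"(?<![\w@])@([A-Za-z][\w-]*)")


def parse_mentions(text: str) -> List[str]:
    """Return mentioned names in order of first appearance, without duplicates."""
    seen: List[str] = []
    for name in MENTION_RE.findall(text):
        if name not in seen:
            seen.append(name)
    return seen


@dataclass
class AgentMessage:
    sender: str
    body: str
    priority: int = 5                  # lower number = more urgent (assumed)
    thread_id: Optional[str] = None    # groups replies into one conversation
    mentions: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.mentions:
            self.mentions = parse_mentions(self.body)


class PriorityInbox:
    """Pops the most urgent message first; FIFO among equal priorities."""

    def __init__(self) -> None:
        self._heap: list = []
        self._seq = itertools.count()

    def push(self, msg: AgentMessage) -> None:
        # The sequence number breaks ties so messages are never compared.
        heapq.heappush(self._heap, (msg.priority, next(self._seq), msg))

    def pop(self) -> AgentMessage:
        return heapq.heappop(self._heap)[2]
```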

3.4 Background Task Integration

  • Create Celery task wrappers
  • Implement progress reporting
  • Add task chaining for workflows
  • Create agent queue routing
  • Implement task retry logic
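
Queue routing and retry delays can be sketched without Celery itself. The four queue names match the Infrastructure section of this roadmap; the task-name patterns, `route_task`, and `retry_delay` are hypothetical. Celery's `task_routes` setting can express a similar pattern-to-queue mapping.

```python
import random
from fnmatch import fnmatch

# Queue names come from the roadmap; the task-name patterns are hypothetical.
TASK_ROUTES = [
    ("agents.*", "agent"),
    ("git.*", "git"),
    ("sync.*", "sync"),
    ("cicd.*", "cicd"),
]


def route_task(task_name: str, default: str = "agent") -> str:
    """Return the queue for a task name; the first matching pattern wins."""
    for pattern, queue in TASK_ROUTES:
        if fnmatch(task_name, pattern):
            return queue
    return default


def retry_delay(attempt: int, base: float = 2.0, cap: float = 300.0,
                jitter: bool = True) -> float:
    """Exponential backoff in seconds for the given 0-based retry attempt."""
    delay = min(cap, base * 2 ** attempt)
    # Full jitter spreads retries out so failed tasks do not retry in lockstep.
    return random.uniform(0, delay) if jitter else delay
```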

Deliverables

  • Agents can be spawned and communicate
  • Agents can call MCP tools
  • Background tasks for long operations
  • Agent activity visible in real-time

Phase 4: Workflow Engine (Weeks 15-18)

Goal: Implement structured workflows for software delivery

4.1 State Machine Foundation

  • Create workflow state machine base
  • Implement state persistence
  • Add transition validation
  • Create state history logging
  • Implement compensation patterns
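
Transition validation and state history logging can be shown with a small hand-rolled state machine. This is a sketch under assumed state names, not the planned implementation, and it deliberately leaves out persistence and compensation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


class InvalidTransition(Exception):
    """Raised when a transition is not allowed from the current state."""


@dataclass
class WorkflowStateMachine:
    transitions: Dict[str, Set[str]]
    state: str
    history: List[Tuple[str, str]] = field(default_factory=list)

    def can_transition(self, target: str) -> bool:
        return target in self.transitions.get(self.state, set())

    def transition(self, target: str) -> None:
        if not self.can_transition(target):
            raise InvalidTransition(f"{self.state} -> {target} is not allowed")
        self.history.append((self.state, target))
        self.state = target


# Hypothetical states for a story workflow; the roadmap does not list them.
STORY_TRANSITIONS = {
    "planned": {"in_progress"},
    "in_progress": {"in_review", "planned"},
    "in_review": {"done", "in_progress"},
    "done": set(),
}
```

Persisting `state` and `history` after each successful `transition` call would give the state persistence and history logging items a natural hook.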

4.2 Core Workflows

  • Requirements Discovery workflow
  • Architecture Spike workflow
  • Sprint Planning workflow
  • Story Implementation workflow
  • Sprint Demo workflow

4.3 Approval Gates

  • Create approval checkpoint system
  • Implement approval UI components
  • Add notification triggers
  • Create timeout handling
  • Implement escalation logic
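
Timeout handling and escalation for an approval checkpoint might be modeled as below. The status names, the `ApprovalRequest` fields, and the rule that escalation happens before the final timeout are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ApprovalRequest:
    action: str
    requested_at: datetime
    timeout: timedelta
    escalation_after: timedelta
    decision: Optional[bool] = None  # None until a human approves or rejects

    def status(self, now: datetime) -> str:
        """Return approved, rejected, pending, escalated, or timed_out."""
        if self.decision is not None:
            return "approved" if self.decision else "rejected"
        waited = now - self.requested_at
        if waited >= self.timeout:
            return "timed_out"
        if waited >= self.escalation_after:
            return "escalated"
        return "pending"
```

Passing `now` in explicitly keeps the logic deterministic and easy to test; a periodic task could call `status` and fire notification triggers when the result changes.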

4.4 Autonomy Levels

  • Implement FULL_CONTROL mode
  • Implement MILESTONE mode
  • Implement AUTONOMOUS mode
  • Create autonomy configuration UI
  • Add per-action approval overrides
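
The three autonomy modes and per-action overrides could reduce to a single decision function. The mode names are taken from the list above; their exact semantics, as well as `requires_approval` and its parameters, are assumptions in this sketch.

```python
from enum import Enum
from typing import Dict, Optional


class AutonomyLevel(Enum):
    FULL_CONTROL = "full_control"  # assumed: every action needs approval
    MILESTONE = "milestone"        # assumed: only milestone actions need approval
    AUTONOMOUS = "autonomous"      # assumed: no approval unless overridden


def requires_approval(
    level: AutonomyLevel,
    action: str,
    is_milestone: bool = False,
    overrides: Optional[Dict[str, bool]] = None,
) -> bool:
    """Decide whether an action must wait at an approval gate.

    A per-action override, when present, takes precedence over the level.
    """
    if overrides and action in overrides:
        return overrides[action]
    if level is AutonomyLevel.FULL_CONTROL:
        return True
    if level is AutonomyLevel.MILESTONE:
        return is_milestone
    return False
```

For example, a project running in AUTONOMOUS mode could still force review of production deployments by setting `overrides={"deploy_production": True}`.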

Deliverables

  • Structured workflows executing
  • Approval gates working
  • Autonomy levels configurable
  • Full sprint cycle possible

Phase 5: Advanced Features (Weeks 19-22)

Goal: Polish and production readiness

5.1 Cost Management

  • Real-time cost tracking dashboard
  • Budget configuration per project
  • Alert threshold system
  • Cost optimization recommendations
  • Historical cost analytics
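
Budget configuration and alert thresholds can be sketched as follows. The default limit mirrors the $100/sprint target in Success Metrics; the threshold values, the class and function names, and the per-million-token pricing parameters are assumptions, and no real vendor prices are used.

```python
from dataclasses import dataclass
from typing import List, Tuple


def call_cost(input_tokens: int, output_tokens: int,
              usd_per_m_input: float, usd_per_m_output: float) -> float:
    """Cost in USD of one LLM call, given prices per million tokens."""
    return (input_tokens * usd_per_m_input
            + output_tokens * usd_per_m_output) / 1_000_000


@dataclass
class SprintBudget:
    limit_usd: float = 100.0                         # target from Success Metrics
    thresholds: Tuple[float, ...] = (0.5, 0.8, 1.0)  # assumed alert points
    spent_usd: float = 0.0

    def add(self, cost_usd: float) -> List[float]:
        """Record spend; return the thresholds crossed by this addition."""
        before = self.spent_usd / self.limit_usd
        self.spent_usd += cost_usd
        after = self.spent_usd / self.limit_usd
        return [t for t in self.thresholds if before < t <= after]
```

Each threshold is reported only once, on the call that crosses it, so an alerting hook can send one notification per level instead of one per LLM call.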

5.2 Audit & Compliance

  • Comprehensive action logging
  • Audit trail viewer UI
  • Export functionality
  • Retention policy implementation
  • Compliance report generation

5.3 Human-Agent Collaboration

  • Live activity dashboard
  • Intervention panel (pause, guide, undo)
  • Agent chat interface
  • Context inspector
  • Decision explainer

5.4 Additional MCP Servers

  • File System MCP
  • Code Analysis MCP
  • CI/CD MCP

Deliverables

  • Production-ready system
  • Full observability
  • Cost controls active
  • Audit compliance

Phase 6: Polish & Launch (Weeks 23-24)

Goal: Production deployment

6.1 Performance Optimization

  • Load testing
  • Query optimization
  • Caching optimization
  • Memory profiling

6.2 Security Hardening

  • Security audit
  • Penetration testing
  • Secrets management
  • Rate limiting tuning

6.3 Documentation

  • User documentation
  • API documentation
  • Deployment guide
  • Runbook

6.4 Deployment

  • Production environment setup
  • Monitoring & alerting
  • Backup & recovery
  • Launch checklist

Risk Register

| Risk | Impact | Probability | Mitigation |
|------|--------|-------------|------------|
| LLM API outages | High | Medium | Multi-provider failover |
| Cost overruns | High | Medium | Budget enforcement, local models |
| Agent hallucinations | High | Medium | Approval gates, code review |
| Performance bottlenecks | Medium | Medium | Load testing, caching |
| Integration failures | Medium | Low | Contract testing, mocks |

Success Metrics

| Metric | Target | Measurement |
|--------|--------|-------------|
| Agent task success rate | >90% | Completed tasks / total tasks |
| API response time (P95) | <200ms | Pure API latency (per NFR-101) |
| Agent response time | <10s simple, <60s code | End-to-end including LLM (per NFR-103) |
| Cost per project | <$100/sprint | LLM + compute costs (with Opus 4.5 pricing) |
| Time to first commit | <1 hour | From requirements to PR |
| Client satisfaction | >4/5 | Post-sprint survey |
| Concurrent projects | 10+ | Active projects in parallel |
| Concurrent agents | 50+ | Agent instances running |

Dependencies

Phase 0 ─────▶ Phase 1 ─────▶ Phase 2 ─────▶ Phase 3 ─────▶ Phase 4 ─────▶ Phase 5 ─────▶ Phase 6
Foundation    Core Platform   MCP Integration  Agent Orch    Workflows     Advanced       Launch
                                                  │
                                                  │
                                             Depends on:
                                             - LLM Gateway
                                             - Knowledge Base
                                             - Real-time events

Resource Requirements

Development Team

  • 1 Backend Engineer (Python/FastAPI)
  • 1 Frontend Engineer (React/Next.js)
  • 0.5 DevOps Engineer
  • 0.25 Product Manager

Infrastructure

  • PostgreSQL (managed or self-hosted)
  • Redis (managed or self-hosted)
  • Celery workers (4-8 instances across 4 queues: agent, git, sync, cicd)
  • MCP servers (7 containers, deployed in Phase 2 + Phase 5)
  • API server (2+ instances)
  • Frontend (static hosting or SSR)

External Services

  • Anthropic API (Claude Opus 4.5 - primary reasoning)
  • OpenAI API (GPT 5.1 Codex max - code generation)
  • Google API (Gemini 3 Pro/Flash - multimodal, fast)
  • Alibaba API (Qwen3-235B - cost-effective, or self-host)
  • DeepSeek V3.2 (self-hosted, open weights)
  • Gitea/GitHub/GitLab (Git hosting and issue tracking)

This roadmap will be refined as spikes complete and requirements evolve.