feat: Update to production model stack and fix remaining inconsistencies

## Model Stack Updates (User's Actual Models)

Updated all documentation to reflect production models:
- Claude Opus 4.5 (primary reasoning)
- GPT 5.1 Codex max (code generation specialist)
- Gemini 3 Pro/Flash (multimodal, fast inference)
- Qwen3-235B (cost-effective, self-hostable)
- DeepSeek V3.2 (self-hosted, open weights)

### Files Updated:
- ADR-004: Full model groups, failover chains, cost tables
- ADR-007: Code example with correct model identifiers
- ADR-012: Cost tracking with new model prices
- ARCHITECTURE.md: Model groups, failover diagram
- IMPLEMENTATION_ROADMAP.md: External services list

## Architecture Diagram Updates

- Added LangGraph Runtime to orchestration layer
- Added technology labels (Type-Instance, transitions)

## Self-Hostability Table Expanded

Added entries for:
- LangGraph (MIT)
- transitions (MIT)
- DeepSeek V3.2 (MIT)
- Qwen3-235B (Apache 2.0)

## Metric Alignments

- Response time: Split into API (<200ms) and Agent (<10s/<60s)
- Cost per project: Adjusted to $100/sprint for Opus 4.5 pricing
- Added concurrent projects (10+) and agents (50+) metrics

## Infrastructure Updates

- Celery workers: 4-8 instances (was 2-4) across 4 queues
- MCP servers: Clarified Phase 2 + Phase 5 deployment
- Sync interval: Clarified 60s fallback + 15min reconciliation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-29 23:35:51 +01:00
parent f138417486
commit 88cf4e0abc
6 changed files with 98 additions and 69 deletions


@@ -42,10 +42,11 @@ Syndarix is an autonomous AI-powered software consulting platform that orchestra
│ │ │ │
│ │ ┌─────────────────────────────────────────────────────────────────────┐ │ │
│ │ │ ORCHESTRATION LAYER │ │ │
-│ │ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │ │
-│ │ │ │ Agent │ │ Workflow │ │ Approval │ │ │ │
-│ │ │ │ Orchestrator│ │ Engine │ │ Service │ │ │ │
-│ │ │ └─────────────┘ └─────────────┘ └─────────────┘ │ │ │
+│ │ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌───────────┐ │ │ │
+│ │ │ │ Agent │ │ Workflow │ │ Approval │ │ LangGraph │ │ │ │
+│ │ │ │ Orchestrator│ │ Engine │ │ Service │ │ Runtime │ │ │ │
+│ │ │ │(Type-Inst.) │ │(transitions)│ │ │ │ │ │ │ │
+│ │ │ └─────────────┘ └─────────────┘ └─────────────┘ └───────────┘ │ │ │
│ │ └─────────────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────────────────────────────┐ │ │
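Review note: the `(Type-Inst.)` label on the Agent Orchestrator refers to the Type-Instance pattern, in which an Agent Type acts as a template and an Agent Instance is the runtime object created from it. A minimal sketch of that split might look as follows. The class names, fields, and the `spawn` helper are hypothetical illustrations, not identifiers taken from the Syndarix codebase.

```python
from dataclasses import dataclass, field
from itertools import count

# Hypothetical ID source; a real system would more likely use database keys.
_instance_ids = count(1)


@dataclass(frozen=True)
class AgentType:
    """Immutable template shared by every instance of one kind of agent."""
    name: str
    model_group: str  # assumed to name one of the model groups, e.g. "high-reasoning"


@dataclass
class AgentInstance:
    """Mutable runtime object created from an AgentType for one project."""
    agent_type: AgentType
    project_id: str
    instance_id: int = field(default_factory=lambda: next(_instance_ids))
    status: str = "idle"


def spawn(agent_type: AgentType, project_id: str) -> AgentInstance:
    """Create a new runtime instance from a template."""
    return AgentInstance(agent_type=agent_type, project_id=project_id)


architect = AgentType(name="architect", model_group="high-reasoning")
first = spawn(architect, "project-a")
second = spawn(architect, "project-b")
```

Both instances share the same frozen template, so a change to per-instance state such as `status` cannot leak into the type definition.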
@@ -126,21 +127,26 @@ Agent Type (Template) Agent Instance (Runtime)
**Failover Chain:**
```
-Claude 3.5 Sonnet (Primary)
+Claude Opus 4.5 (Primary)
+▼ (on failure/rate limit)
+GPT 5.1 Codex max (Code specialist)
+▼ (on failure/rate limit)
+Gemini 3 Pro (Multimodal)
 ▼ (on failure)
-GPT-4 Turbo (Fallback)
-▼ (on failure)
-Ollama/Llama 3 (Local)
+Qwen3-235B / DeepSeek V3.2 (Self-hosted)
```
**Model Groups:**
-| Group | Use Case | Primary Model |
-|-------|----------|---------------|
-| high-reasoning | Architecture, complex analysis | Claude 3.5 Sonnet |
-| fast-response | Quick tasks, status updates | Claude 3 Haiku |
-| cost-optimized | High-volume, non-critical | Local Llama 3 |
+| Group | Use Case | Primary Model | Fallback |
+|-------|----------|---------------|----------|
+| high-reasoning | Architecture, complex analysis | Claude Opus 4.5 | GPT 5.1 Codex max |
+| code-generation | Code writing, refactoring | GPT 5.1 Codex max | Claude Opus 4.5 |
+| fast-response | Quick tasks, status updates | Gemini 3 Flash | Qwen3-235B |
+| cost-optimized | High-volume, non-critical | Qwen3-235B | DeepSeek V3.2 |
+| self-hosted | Privacy-sensitive, air-gapped | DeepSeek V3.2 | Qwen3-235B |
### 3. Knowledge Base (RAG)
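Review note: the updated Model Groups table in this hunk pairs each group with a primary model and a fallback. One way to read that table is as a lookup followed by an ordered retry, sketched below in plain Python. The table entries are copied from the diff; `ModelUnavailable`, `complete_with_failover`, and the `call_model` callback are hypothetical names for illustration, and the sketch deliberately does not use the LiteLLM API or real provider model identifiers.

```python
from typing import Callable

# Group -> (primary, fallback), copied from the Model Groups table.
# These are display names, not provider model identifiers.
MODEL_GROUPS: dict[str, tuple[str, str]] = {
    "high-reasoning": ("Claude Opus 4.5", "GPT 5.1 Codex max"),
    "code-generation": ("GPT 5.1 Codex max", "Claude Opus 4.5"),
    "fast-response": ("Gemini 3 Flash", "Qwen3-235B"),
    "cost-optimized": ("Qwen3-235B", "DeepSeek V3.2"),
    "self-hosted": ("DeepSeek V3.2", "Qwen3-235B"),
}


class ModelUnavailable(Exception):
    """Hypothetical error standing in for a provider failure or rate limit."""


def complete_with_failover(
    group: str,
    prompt: str,
    call_model: Callable[[str, str], str],
) -> tuple[str, str]:
    """Try the group's primary model, then its fallback.

    Returns (model_used, response); raises ModelUnavailable if both fail.
    """
    last_error = None
    for model in MODEL_GROUPS[group]:
        try:
            return model, call_model(model, prompt)
        except ModelUnavailable as exc:
            last_error = exc  # remember the failure and try the next model
    raise ModelUnavailable(f"all models in group {group!r} failed") from last_error


def flaky_backend(model: str, prompt: str) -> str:
    """Test double: the primary reasoning model is rate limited."""
    if model == "Claude Opus 4.5":
        raise ModelUnavailable("rate limit")
    return f"{model} answered: {prompt}"


used, reply = complete_with_failover("high-reasoning", "review this ADR", flaky_backend)
```

In this run `used` is the fallback for the `high-reasoning` group, because the test double rejects the primary. A production setup would presumably delegate this to the routing layer rather than hand-rolling the loop.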
@@ -245,13 +251,17 @@ LLM Request → LiteLLM Callback → Redis INCR → Budget Check
**All components are fully self-hostable with no mandatory subscriptions:**
-| Component | Self-Hosted | Managed Alternative (Optional) |
-|-----------|-------------|--------------------------------|
-| PostgreSQL | Yes | RDS, Neon, Supabase |
-| Redis | Yes | Redis Cloud |
-| LiteLLM | Yes | LiteLLM Enterprise |
-| Celery | Yes | - |
-| FastMCP | Yes | - |
+| Component | License | Self-Hosted | Managed Alternative (Optional) |
+|-----------|---------|-------------|--------------------------------|
+| PostgreSQL | PostgreSQL | Yes | RDS, Neon, Supabase |
+| Redis | BSD-3 | Yes | Redis Cloud |
+| LiteLLM | MIT | Yes | LiteLLM Enterprise |
+| Celery | BSD-3 | Yes | - |
+| FastMCP | MIT | Yes | - |
+| LangGraph | MIT | Yes | LangSmith (observability only) |
+| transitions | MIT | Yes | - |
+| DeepSeek V3.2 | MIT | Yes | API available |
+| Qwen3-235B | Apache 2.0 | Yes | Alibaba Cloud |
---
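Review note: the hunk header above quotes the cost-tracking flow `LLM Request → LiteLLM Callback → Redis INCR → Budget Check`, and the commit message sets the cost target at $100 per sprint. The sketch below shows the last two steps of that flow under stated assumptions: an in-memory dictionary stands in for the Redis counter, costs are kept as integer micro-dollars so that an integer increment would suffice, and there is one counter per project. `BudgetTracker` and its methods are hypothetical names, not part of LiteLLM or the Syndarix code.

```python
from collections import defaultdict

SPRINT_BUDGET_USD = 100.0  # cost-per-project target from the commit message
MICRO = 1_000_000          # store money as integer micro-dollars


class BudgetTracker:
    """In-memory stand-in for a per-project Redis counter (assumption)."""

    def __init__(self, budget_usd: float = SPRINT_BUDGET_USD) -> None:
        self._budget_micro = round(budget_usd * MICRO)
        self._spent_micro: dict[str, int] = defaultdict(int)

    def record(self, project_id: str, cost_usd: float) -> bool:
        """Add one request's cost; return True while the project is within budget."""
        self._spent_micro[project_id] += round(cost_usd * MICRO)
        return self._spent_micro[project_id] <= self._budget_micro

    def spent_usd(self, project_id: str) -> float:
        """Total recorded spend for a project, in dollars."""
        return self._spent_micro[project_id] / MICRO


tracker = BudgetTracker(budget_usd=1.0)       # small budget to make the demo visible
within_first = tracker.record("demo", 0.60)   # 0.60 of 1.00 spent
within_second = tracker.record("demo", 0.50)  # 1.10 of 1.00 spent, over budget
```

Here the first call stays within budget and the second does not. A caller in the callback position could use the returned flag to block or downgrade further requests for that project.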