Felipe Cardoso
88cf4e0abc
feat: Update to production model stack and fix remaining inconsistencies
## Model Stack Updates (User's Actual Models)
Updated all documentation to reflect production models:
- Claude Opus 4.5 (primary reasoning)
- GPT 5.1 Codex max (code generation specialist)
- Gemini 3 Pro/Flash (multimodal, fast inference)
- Qwen3-235B (cost-effective, self-hostable)
- DeepSeek V3.2 (self-hosted, open weights)
### Files Updated:
- ADR-004: Full model groups, failover chains, cost tables
- ADR-007: Code example with correct model identifiers
- ADR-012: Cost tracking with new model prices
- ARCHITECTURE.md: Model groups, failover diagram
- IMPLEMENTATION_ROADMAP.md: External services list
## Architecture Diagram Updates
- Added LangGraph Runtime to orchestration layer
- Added technology labels (Type-Instance, transitions)
## Self-Hostability Table Expanded
Added entries for:
- LangGraph (MIT)
- transitions (MIT)
- DeepSeek V3.2 (MIT)
- Qwen3-235B (Apache 2.0)
## Metric Alignments
- Response time: Split into API (<200ms) and Agent (<10s/<60s)
- Cost per project: Adjusted to $100/sprint for Opus 4.5 pricing
- Added concurrent projects (10+) and agents (50+) metrics
## Infrastructure Updates
- Celery workers: 4-8 instances (was 2-4) across 4 queues
- MCP servers: Clarified Phase 2 + Phase 5 deployment
- Sync interval: Clarified 60s fallback + 15min reconciliation
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-29 23:35:51 +01:00
..
2025-12-29 04:48:25 +01:00
2025-12-29 13:16:02 +01:00
2025-12-29 13:16:02 +01:00
2025-12-29 13:16:02 +01:00
2025-12-29 23:35:51 +01:00
2025-12-29 13:16:02 +01:00
2025-12-29 13:16:02 +01:00
2025-12-29 23:35:51 +01:00
2025-12-29 14:13:26 +01:00
2025-12-29 13:54:43 +01:00
2025-12-29 13:54:43 +01:00
2025-12-29 23:35:51 +01:00
2025-12-29 23:35:51 +01:00
2025-12-29 13:54:43 +01:00
2025-12-29 14:13:26 +01:00