
Felipe Cardoso 6e3cdebbfb docs: add architecture decision records (ADRs) for key technical choices

- Added the following ADRs to `docs/adrs/` directory:
  - ADR-001: MCP Integration Architecture
  - ADR-002: Real-time Communication Architecture
  - ADR-003: Background Task Architecture
  - ADR-004: LLM Provider Abstraction
  - ADR-005: Technology Stack Selection
- Each ADR details the context, decision drivers, considered options, final decisions, and implementation plans.
- Documentation aligns technical choices with architecture principles and system requirements for Syndarix.

2025-12-29 13:16:02 +01:00


ADR-002: Real-time Communication Architecture

Status: Accepted
Date: 2025-12-29
Deciders: Architecture Team
Related Spikes: SPIKE-003

Context

Syndarix requires real-time communication for:

Agent activity streams
Project progress updates
Build/pipeline status
Client approval requests
Issue change notifications
Interactive chat with agents

We need to decide between WebSocket and Server-Sent Events (SSE) for real-time data delivery.

Decision Drivers

Simplicity: Minimize implementation complexity
Reliability: Built-in reconnection handling
Scalability: Support 200+ concurrent connections
Compatibility: Work through proxies and load balancers
Use Case Fit: Match communication patterns

Considered Options

Option 1: WebSocket Only

Use WebSocket for all real-time communication.

Pros:

Bidirectional communication
Single protocol to manage
Well-supported in FastAPI

Cons:

Manual reconnection logic required
More complex through proxies (the HTTP Upgrade handshake must be supported end to end)
Overkill for server-to-client streams

Option 2: SSE Only

Use Server-Sent Events for all real-time communication.

Pros:

Built-in automatic reconnection
Native HTTP (proxy-friendly)
Simpler implementation

Cons:

Unidirectional only
Browser connection limits per domain (about 6 over HTTP/1.1)

Option 3: SSE Primary + WebSocket for Chat (Selected)

Use SSE for server-to-client events, WebSocket for bidirectional chat.

Pros:

Best tool for each use case
SSE simplicity for 90% of needs
WebSocket only where truly needed

Cons:

Two protocols to manage

Decision

Adopt Option 3: SSE as primary transport, WebSocket for interactive chat.

SSE Use Cases (90%)

Agent activity streams
Project progress updates
Build/pipeline status
Approval request notifications
Issue change notifications

WebSocket Use Cases (10%)

Interactive chat with agents
Real-time debugging sessions
Future collaboration features

Implementation

Event Bus with Redis Pub/Sub

FastAPI Backend ──publish──> Redis Pub/Sub ──subscribe──> SSE Endpoints
                                   │
                                   └──> Other Backend Instances
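
The fan-out in the diagram above can be sketched without Redis. The following is a minimal in-process stand-in, assuming an event bus interface with `subscribe`, `get_event`, and `unsubscribe` as used by the SSE endpoint pattern in this ADR. The class names `InMemoryEventBus`, `Subscriber`, and `Event` are hypothetical, and a real deployment would back `publish` and `subscribe` with Redis Pub/Sub so that events reach other backend instances.

```python
import asyncio
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Event:
    type: str
    payload: dict


class Subscriber:
    """Receives events for one channel through its own queue."""

    def __init__(self, bus: "InMemoryEventBus", channel: str) -> None:
        self._bus = bus
        self._channel = channel
        self._queue: asyncio.Queue = asyncio.Queue()

    def deliver(self, event: Event) -> None:
        self._queue.put_nowait(event)

    async def get_event(self) -> Event:
        return await self._queue.get()

    async def unsubscribe(self) -> None:
        self._bus._subscribers[self._channel].discard(self)


class InMemoryEventBus:
    """Hypothetical stand-in for Redis Pub/Sub: fans events out per channel."""

    def __init__(self) -> None:
        self._subscribers: dict[str, set[Subscriber]] = defaultdict(set)

    async def subscribe(self, channel: str) -> Subscriber:
        subscriber = Subscriber(self, channel)
        self._subscribers[channel].add(subscriber)
        return subscriber

    async def publish(self, channel: str, event: Event) -> int:
        # Return how many subscribers were reached, mirroring Redis PUBLISH.
        targets = list(self._subscribers[channel])
        for subscriber in targets:
            subscriber.deliver(event)
        return len(targets)


async def demo() -> list[str]:
    bus = InMemoryEventBus()
    first = await bus.subscribe("project:42")
    second = await bus.subscribe("project:42")
    await bus.publish("project:42", Event("issue_created", {"id": 7}))
    received = [(await first.get_event()).type, (await second.get_event()).type]
    await first.unsubscribe()
    await second.unsubscribe()
    return received
```

Running `asyncio.run(demo())` delivers the same event to both subscribers, which is the behaviour each SSE connection relies on when several clients watch one project.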

SSE Endpoint Pattern

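import asyncio

from fastapi import APIRouter, Request
from fastapi.responses import StreamingResponse

router = APIRouter()

@router.get("/projects/{project_id}/events")
async def project_events(project_id: str, request: Request):
    async def event_generator():
        # event_bus is the application's Redis-backed event bus (defined elsewhere).
        subscriber = await event_bus.subscribe(f"project:{project_id}")
        try:
            while not await request.is_disconnected():
                try:
                    event = await asyncio.wait_for(
                        subscriber.get_event(), timeout=30.0
                    )
                except asyncio.TimeoutError:
                    # No event within 30 s: send an SSE comment as a keepalive,
                    # then re-check whether the client has disconnected.
                    yield ": keepalive\n\n"
                    continue
                yield f"event: {event.type}\ndata: {event.json()}\n\n"
        finally:
            # Always release the subscription, even if the client drops.
            await subscriber.unsubscribe()

    return StreamingResponse(
        event_generator(),
        media_type="text/event-stream"
    )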
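
The endpoint above builds each frame with an f-string, which is sufficient while the payload is a single line of JSON. As a sketch of the wire format for the general case, the helper below emits one `data:` field per payload line and an optional `id:` field. The function name `format_sse` and the `event_id` parameter are assumptions for illustration; this ADR does not specify event IDs.

```python
import json
from typing import Optional


def format_sse(event_type: str, data: str, event_id: Optional[str] = None) -> str:
    """Serialize one event as a Server-Sent Events frame."""
    lines = []
    if event_id is not None:
        # Browsers send the last seen id back as Last-Event-ID on reconnect.
        lines.append(f"id: {event_id}")
    lines.append(f"event: {event_type}")
    # A payload containing newlines must be split into one data field per line.
    for chunk in data.split("\n"):
        lines.append(f"data: {chunk}")
    # A blank line terminates the frame.
    return "\n".join(lines) + "\n\n"


frame = format_sse("issue_created", json.dumps({"id": 7}))
multi = format_sse("agent_activity", "line one\nline two", event_id="15")
```

If event IDs were adopted, a reconnecting client could resume from the last event it saw, but that would also require the server to retain recent events, which Redis Pub/Sub alone does not do.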

Event Types

Category	Event Types
Agent	`agent_started`, `agent_activity`, `agent_completed`, `agent_error`
Project	`issue_created`, `issue_updated`, `issue_closed`
Git	`branch_created`, `commit_pushed`, `pr_created`, `pr_merged`
Workflow	`approval_required`, `sprint_started`, `sprint_completed`
Pipeline	`pipeline_started`, `pipeline_completed`, `pipeline_failed`
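
One way to keep the table above in sync with code is a single registry that both publishers and consumers import. The sketch below is an assumption about how that could look; `EventCategory`, `EVENT_TYPES`, and `category_of` are hypothetical names, while the event type strings are taken from the table.

```python
from enum import Enum


class EventCategory(str, Enum):
    AGENT = "agent"
    PROJECT = "project"
    GIT = "git"
    WORKFLOW = "workflow"
    PIPELINE = "pipeline"


# Event type names as listed in the Event Types table.
EVENT_TYPES: dict[EventCategory, tuple[str, ...]] = {
    EventCategory.AGENT: (
        "agent_started", "agent_activity", "agent_completed", "agent_error",
    ),
    EventCategory.PROJECT: ("issue_created", "issue_updated", "issue_closed"),
    EventCategory.GIT: (
        "branch_created", "commit_pushed", "pr_created", "pr_merged",
    ),
    EventCategory.WORKFLOW: (
        "approval_required", "sprint_started", "sprint_completed",
    ),
    EventCategory.PIPELINE: (
        "pipeline_started", "pipeline_completed", "pipeline_failed",
    ),
}


def category_of(event_type: str) -> EventCategory:
    """Look up the category for an event type, rejecting unknown names."""
    for category, names in EVENT_TYPES.items():
        if event_type in names:
            return category
    raise ValueError(f"unknown event type: {event_type}")
```

Rejecting unknown names at publish time would catch typos before they reach clients that silently ignore unrecognized event types.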

Client Implementation

Single SSE connection per project
Event multiplexing through event types
Native EventSource API with automatic reconnect
Exponential backoff when reconnection attempts fail repeatedly
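
The multiplexing and backoff points above can be illustrated independently of the browser. The following is a Python sketch of the client-side logic, not the actual frontend code: in a browser, `EventSource` does the parsing and `addEventListener` does the per-type dispatch. The names `parse_sse`, `dispatch`, and `backoff_delay`, and the 1 s base and 30 s cap, are assumptions chosen for illustration.

```python
from typing import Callable, Dict, Iterable, Iterator, Tuple


def _field_value(line: str, name: str) -> str:
    # Drop "name:" and at most one following space.
    value = line[len(name) + 1:]
    return value[1:] if value.startswith(" ") else value


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_type, data) pairs from the lines of an SSE stream."""
    event_type, data = "message", []
    for raw in lines:
        line = raw.rstrip("\n")
        if line == "":
            # A blank line ends the current event; skip events without data.
            if data:
                yield event_type, "\n".join(data)
            event_type, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, for example a keepalive
        elif line.startswith("event:"):
            event_type = _field_value(line, "event")
        elif line.startswith("data:"):
            data.append(_field_value(line, "data"))


def dispatch(lines: Iterable[str], handlers: Dict[str, Callable[[str], None]]) -> int:
    """Route each event on one connection to the handler for its type."""
    handled = 0
    for event_type, data in parse_sse(lines):
        handler = handlers.get(event_type)
        if handler is not None:
            handler(data)
            handled += 1
    return handled


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Seconds to wait before reconnect attempt number `attempt` (0-based)."""
    return min(cap, base * (2 ** attempt))


stream = [
    "event: issue_created\n", 'data: {"id": 7}\n', "\n",
    ": keepalive\n", "\n",
    "event: pipeline_failed\n", "data: build 12\n", "\n",
]
seen = []
handled = dispatch(stream, {"issue_created": seen.append})
```

Here a single stream carries two event types, and only the one with a registered handler is processed, which is how one connection per project can serve several UI components.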

Consequences

Positive

Simpler implementation for server-to-client streams
Automatic reconnection reduces client complexity
Works through standard HTTP proxies and load balancers, provided response buffering is disabled for the event stream
Reduced server resource usage vs WebSocket

Negative

Two protocols to maintain
WebSocket requires manual reconnect logic
SSE limited to ~6 connections per domain (HTTP/1.1)

Mitigation

Use HTTP/2 where possible (streams are multiplexed over one connection, so the ~6-connection limit no longer applies)
Multiplex all project events on single connection
WebSocket only for interactive chat sessions

Compliance

This decision aligns with:

FR-105: Real-time agent activity monitoring
NFR-102: 200+ concurrent connections requirement
NFR-501: Responsive UI updates

This ADR supersedes any previous decisions regarding real-time communication.
