syndarix/docs/adrs/ADR-002-realtime-communication.md
Felipe Cardoso 6e3cdebbfb docs: add architecture decision records (ADRs) for key technical choices
- Added the following ADRs to `docs/adrs/` directory:
  - ADR-001: MCP Integration Architecture
  - ADR-002: Real-time Communication Architecture
  - ADR-003: Background Task Architecture
  - ADR-004: LLM Provider Abstraction
  - ADR-005: Technology Stack Selection
- Each ADR details the context, decision drivers, considered options, final decisions, and implementation plans.
- Documentation aligns technical choices with architecture principles and system requirements for Syndarix.
2025-12-29 13:16:02 +01:00

ADR-002: Real-time Communication Architecture

Status: Accepted
Date: 2025-12-29
Deciders: Architecture Team
Related Spikes: SPIKE-003


Context

Syndarix requires real-time communication for:

  • Agent activity streams
  • Project progress updates
  • Build/pipeline status
  • Client approval requests
  • Issue change notifications
  • Interactive chat with agents

We need to decide between WebSocket and Server-Sent Events (SSE) for real-time data delivery.

Decision Drivers

  • Simplicity: Minimize implementation complexity
  • Reliability: Built-in reconnection handling
  • Scalability: Support 200+ concurrent connections
  • Compatibility: Work through proxies and load balancers
  • Use Case Fit: Match communication patterns

Considered Options

Option 1: WebSocket Only

Use WebSocket for all real-time communication.

Pros:

  • Bidirectional communication
  • Single protocol to manage
  • Well-supported in FastAPI

Cons:

  • Manual reconnection logic required
  • More complex through proxies
  • Overkill for server-to-client streams

Option 2: SSE Only

Use Server-Sent Events for all real-time communication.

Pros:

  • Built-in automatic reconnection
  • Native HTTP (proxy-friendly)
  • Simpler implementation

Cons:

  • Unidirectional only
  • Browser connection limits per domain

Option 3: SSE Primary + WebSocket for Chat (Selected)

Use SSE for server-to-client events, WebSocket for bidirectional chat.

Pros:

  • Best tool for each use case
  • SSE simplicity for 90% of needs
  • WebSocket only where truly needed

Cons:

  • Two protocols to manage

Decision

Adopt Option 3: SSE as primary transport, WebSocket for interactive chat.

SSE Use Cases (90%)

  • Agent activity streams
  • Project progress updates
  • Build/pipeline status
  • Approval request notifications
  • Issue change notifications

WebSocket Use Cases (10%)

  • Interactive chat with agents
  • Real-time debugging sessions
  • Future collaboration features

Implementation

Event Bus with Redis Pub/Sub

FastAPI Backend ──publish──> Redis Pub/Sub ──subscribe──> SSE Endpoints
                                   │
                                   └──> Other Backend Instances
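
The fan-out in the diagram can be illustrated without a Redis server. The sketch below is an in-process stand-in, not the Syndarix event bus: the class name `InMemoryEventBus` and its methods are hypothetical, and a real deployment would replace the dictionary of queues with Redis channels so that other backend instances also receive the message.

```python
import asyncio
from collections import defaultdict


class InMemoryEventBus:
    """Hypothetical in-process stand-in for Redis Pub/Sub (channel -> queues)."""

    def __init__(self):
        self._channels = defaultdict(set)

    def subscribe(self, channel):
        # Each subscriber gets its own queue, like one SSE connection each.
        queue = asyncio.Queue()
        self._channels[channel].add(queue)
        return queue

    def unsubscribe(self, channel, queue):
        self._channels[channel].discard(queue)

    def publish(self, channel, message):
        # Fan out: every current subscriber of the channel receives the message.
        subscribers = self._channels[channel]
        for queue in subscribers:
            queue.put_nowait(message)
        return len(subscribers)


async def demo():
    bus = InMemoryEventBus()
    first = bus.subscribe("project:42")
    second = bus.subscribe("project:42")
    other = bus.subscribe("project:7")
    delivered = bus.publish("project:42", {"type": "agent_started"})
    return delivered, await first.get(), await second.get(), other.empty()
```

As with Redis Pub/Sub, a message published while a channel has no subscribers is simply dropped, so events that must survive a disconnect need a separate store.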

SSE Endpoint Pattern

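import asyncio

from fastapi import APIRouter, Request
from fastapi.responses import StreamingResponse

router = APIRouter()

# event_bus is the application's Redis-backed pub/sub wrapper, defined elsewhere.

@router.get("/projects/{project_id}/events")
async def project_events(project_id: str, request: Request):
    async def event_generator():
        subscriber = await event_bus.subscribe(f"project:{project_id}")
        try:
            while not await request.is_disconnected():
                try:
                    event = await asyncio.wait_for(
                        subscriber.get_event(), timeout=30.0
                    )
                except asyncio.TimeoutError:
                    # No event within 30 s: emit an SSE comment as a keepalive
                    # instead of letting the timeout end the stream.
                    yield ": keepalive\n\n"
                    continue
                yield f"event: {event.type}\ndata: {event.json()}\n\n"
        finally:
            # Always release the subscription, including on client disconnect.
            await subscriber.unsubscribe()

    return StreamingResponse(
        event_generator(),
        media_type="text/event-stream",
    )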
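
The endpoint writes each event in the `text/event-stream` wire format. As a minimal sketch of that framing, the helper below (the name `format_sse` and its parameters are assumptions, not part of the ADR) builds one frame and splits multi-line payloads into one `data:` field per line, which the format requires because a blank line terminates the event.

```python
import re


def format_sse(event_type, data, event_id=None):
    """Serialize one event into a text/event-stream frame (hypothetical helper)."""
    fields = []
    if event_id is not None:
        # The browser echoes the last id back as Last-Event-ID when it reconnects.
        fields.append(f"id: {event_id}")
    fields.append(f"event: {event_type}")
    # A raw newline inside the payload would end the field early,
    # so emit one "data:" field per payload line.
    for line in re.split(r"\r\n|\r|\n", data):
        fields.append(f"data: {line}")
    # A blank line marks the end of the event.
    return "\n".join(fields) + "\n\n"
```

For example, `format_sse("issue_updated", '{"id": 7}')` produces the same shape as the f-string in the endpoint above. Compact JSON contains no raw newlines, so the split only matters for pretty-printed or plain-text payloads.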

Event Types

| Category | Event Types |
|----------|-------------|
| Agent    | agent_started, agent_activity, agent_completed, agent_error |
| Project  | issue_created, issue_updated, issue_closed |
| Git      | branch_created, commit_pushed, pr_created, pr_merged |
| Workflow | approval_required, sprint_started, sprint_completed |
| Pipeline | pipeline_started, pipeline_completed, pipeline_failed |
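
One way to keep publishers and subscribers in agreement on these names is a single registry in code. The sketch below copies the table as-is; the names `EVENT_TYPES`, `CATEGORY_OF`, and `category_of` are illustrative assumptions rather than existing Syndarix identifiers.

```python
# Category -> event type names, copied from the table above.
EVENT_TYPES = {
    "Agent": ("agent_started", "agent_activity", "agent_completed", "agent_error"),
    "Project": ("issue_created", "issue_updated", "issue_closed"),
    "Git": ("branch_created", "commit_pushed", "pr_created", "pr_merged"),
    "Workflow": ("approval_required", "sprint_started", "sprint_completed"),
    "Pipeline": ("pipeline_started", "pipeline_completed", "pipeline_failed"),
}

# Reverse lookup: event type name -> category.
CATEGORY_OF = {
    name: category for category, names in EVENT_TYPES.items() for name in names
}


def category_of(event_type):
    """Return the category of a known event type, or raise ValueError."""
    try:
        return CATEGORY_OF[event_type]
    except KeyError:
        raise ValueError(f"unknown event type: {event_type!r}") from None
```

Rejecting unknown names at publish time would surface typos early, since an SSE client silently ignores event types it has no listener for.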

Client Implementation

  • Single SSE connection per project
  • Event multiplexing through event types
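  • Native EventSource API with automatic reconnect
  • Exponential backoff where the client must reconnect itself (EventSource stops retrying after a failed HTTP response; WebSocket never reconnects on its own)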
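
The backoff schedule can be sketched as a capped doubling of the delay. The production client would run in the browser; Python is used here only to show the arithmetic, and the function name and default values (`base`, `factor`, `cap`) are assumptions rather than figures from this ADR.

```python
def backoff_delays(attempts, base=1.0, factor=2.0, cap=30.0):
    """Return the wait in seconds before each reconnect attempt.

    The delay grows geometrically (base * factor**n) and is clamped to cap
    so that a long outage does not push retries arbitrarily far apart.
    """
    return [min(cap, base * factor ** n) for n in range(attempts)]
```

With the defaults, `backoff_delays(6)` gives 1, 2, 4, 8, 16, and 30 seconds. Adding random jitter to each delay is a common refinement so that many clients do not reconnect at the same instant after a server restart.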

Consequences

Positive

  • Simpler implementation for server-to-client streams
  • Automatic reconnection reduces client complexity
  • Works through standard HTTP proxies and load balancers, provided response buffering is disabled for the event stream
  • Reduced server resource usage vs WebSocket

Negative

  • Two protocols to maintain
  • WebSocket requires manual reconnect logic
  • SSE limited to ~6 connections per domain (HTTP/1.1)

Mitigation

  • Use HTTP/2 where possible (streams are multiplexed over one connection, so the ~6-connection limit of HTTP/1.1 no longer applies)
  • Multiplex all project events on single connection
  • WebSocket only for interactive chat sessions
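
Multiplexing on a single connection means the receiver routes each frame by its `event:` field. The sketch below shows that routing under simplifying assumptions: the function names are hypothetical, only the `event:` and `data:` fields and comment lines are handled, and the input is a complete string rather than a live stream.

```python
def parse_sse(stream_text):
    """Yield (event_type, data) pairs from text/event-stream content."""
    event_type, data_lines = "message", []
    for line in stream_text.split("\n"):
        if line == "":
            # A blank line ends the current event; skip frames with no data.
            if data_lines:
                yield event_type, "\n".join(data_lines)
            event_type, data_lines = "message", []
        elif line.startswith(":"):
            continue  # comment line, e.g. a keepalive
        elif line.startswith("event:"):
            event_type = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The format strips one leading space after the colon.
            data_lines.append(value[1:] if value.startswith(" ") else value)


def dispatch(stream_text, handlers):
    """Route each parsed event to the handler registered for its type."""
    handled = 0
    for event_type, data in parse_sse(stream_text):
        handler = handlers.get(event_type)
        if handler is not None:
            handler(data)
            handled += 1
    return handled
```

In the browser the same routing is done by registering one `addEventListener` callback per event type on a single `EventSource`, so no additional connections are opened per event category.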

Compliance

This decision aligns with:

  • FR-105: Real-time agent activity monitoring
  • NFR-102: 200+ concurrent connections requirement
  • NFR-501: Responsive UI updates

This ADR supersedes any previous decisions regarding real-time communication.