forked from cardosofelipe/fast-next-template

Felipe Cardoso 406b25cda0 docs: add remaining ADRs and comprehensive architecture documentation

Added 7 new Architecture Decision Records completing the full set:
- ADR-008: Knowledge Base and RAG (pgvector)
- ADR-009: Agent Communication Protocol (structured messages)
- ADR-010: Workflow State Machine (transitions + PostgreSQL)
- ADR-011: Issue Synchronization (webhook-first + polling)
- ADR-012: Cost Tracking (LiteLLM callbacks + Redis budgets)
- ADR-013: Audit Logging (hash chaining + tiered storage)
- ADR-014: Client Approval Flow (checkpoint-based)

Added comprehensive ARCHITECTURE.md that:
- Summarizes all 14 ADRs in decision matrix
- Documents full system architecture with diagrams
- Explains all component interactions
- Details technology stack with self-hostability guarantee
- Covers security, scalability, and deployment

Updated IMPLEMENTATION_ROADMAP.md to mark Phase 0 completed items.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2025-12-29 13:54:43 +01:00

4.9 KiB

ADR-009: Agent Communication Protocol

Status: Accepted
Date: 2025-12-29
Deciders: Architecture Team
Related Spikes: SPIKE-007

Context

Syndarix requires a robust protocol for inter-agent communication. Ten or more specialized AI agents must collaborate on software projects by sharing context, delegating tasks, and resolving conflicts.

Decision Drivers

Auditability: All communication must be traceable
Flexibility: Support various communication patterns
Performance: Low-latency for interactive collaboration
Reliability: Messages must not be lost

Considered Options

Option 1: Pure Natural Language

Agents communicate via free-form text messages.

Pros: Simple, flexible
Cons: Difficult to route, parse, and audit

Option 2: Rigid RPC Protocol

Strongly-typed function calls between agents.

Pros: Predictable, type-safe
Cons: Loses LLM reasoning flexibility

Option 3: Structured Envelope + Natural Language Payload (Selected)

JSON envelope for routing/auditing with natural language content.

Pros: Routable and auditable like an RPC protocol, while preserving the LLM's natural language capabilities
Cons: Slightly more complex than either alternative

Decision

Adopt structured message envelopes with natural language payloads, inspired by Google's A2A protocol concepts.

Implementation

Message Schema

from dataclasses import dataclass
from datetime import datetime
from typing import Literal
from uuid import UUID

# AgentIdentity and Attachment are project types defined elsewhere.


@dataclass
class AgentMessage:
    id: UUID                           # Unique message ID
    type: Literal["request", "response", "broadcast", "notification"]

    # Routing
    from_agent: AgentIdentity          # Source agent
    to_agent: AgentIdentity | None     # Target (None = broadcast)
    routing: Literal["direct", "role", "broadcast", "topic"]

    # Action
    action: str                        # e.g., "request_guidance", "task_handoff"
    priority: Literal["low", "normal", "high", "urgent"]

    # Context
    project_id: str
    conversation_id: str | None        # For threading
    correlation_id: UUID | None        # For request/response matching

    # Content
    content: str                       # Natural language message
    attachments: list[Attachment]      # Code snippets, files, etc.

    # Metadata
    created_at: datetime
    expires_at: datetime | None
    requires_response: bool
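
The message bus calls `message.to_dict()`, which the schema above does not define. Redis Streams entries are flat field/value maps, so the envelope has to be flattened before it is added to a stream. A minimal sketch of one way to do that, using a reduced envelope: `Envelope` and `to_stream_fields` are hypothetical names for illustration, not part of this ADR.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional
from uuid import UUID, uuid4


@dataclass
class Envelope:
    """Reduced, hypothetical stand-in for AgentMessage (subset of fields)."""
    from_agent: str
    to_agent: Optional[str]
    action: str
    project_id: str
    content: str
    id: UUID = field(default_factory=uuid4)
    correlation_id: Optional[UUID] = None
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def to_stream_fields(msg: Envelope) -> dict:
    """Flatten the envelope into a str -> str mapping.

    None values are dropped rather than serialized as the string "None".
    """
    out = {}
    for key, value in asdict(msg).items():
        if value is None:
            continue
        if isinstance(value, datetime):
            out[key] = value.isoformat()
        elif isinstance(value, (dict, list)):
            out[key] = json.dumps(value)  # nested data travels as JSON text
        else:
            out[key] = str(value)
    return out
```

Dropping `None` fields keeps the consumer from having to distinguish a missing target from the literal text "None"; whether the real `to_dict()` does the same is not stated in this ADR.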

Routing Strategies

Strategy	Syntax	Use Case
Direct	`to: "agent-123"`	Specific agent
Role-based	`to: "@engineers"`	All agents of role
Broadcast	`to: "@all"`	Project-wide
Topic-based	`to: "#auth-module"`	Subscribed agents
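
The address syntax in the table can be mapped to a routing strategy and a channel name with a small prefix check. This is a sketch, not the project's router: `resolve_channel` is a hypothetical helper, the `agent:`, `role:`, and `project:` prefixes follow the channel names used by the message bus, and the `topic:` prefix is an assumption.

```python
def resolve_channel(to: str, project_id: str) -> tuple:
    """Map a `to` address to (routing strategy, channel name)."""
    if to == "@all":
        # Project-wide broadcast
        return "broadcast", f"project:{project_id}"
    if to.startswith("@"):
        # All agents holding a role, e.g. "@engineers"
        return "role", f"role:{to[1:]}"
    if to.startswith("#"):
        # Agents subscribed to a topic; channel prefix is assumed
        return "topic", f"topic:{to[1:]}"
    # Anything else is treated as a specific agent ID
    return "direct", f"agent:{to}"
```

The `@all` check has to come before the generic `@` check, otherwise a broadcast would be routed to a role named "all".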

Communication Modes

from enum import Enum


class MessageMode(str, Enum):
    SYNC = "sync"              # Await response (< 30s)
    ASYNC = "async"            # Queue, callback later
    FIRE_AND_FORGET = "fire"   # No response expected
    STREAM = "stream"          # Continuous updates
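
SYNC mode implies that the sender waits for a reply matched by `correlation_id` and gives up after the timeout. One possible shape for that, sketched with plain `asyncio` futures: `PendingReplies` and its methods are hypothetical, and the 30-second default mirrors the limit in the enum comment.

```python
import asyncio
from uuid import UUID, uuid4


class PendingReplies:
    """Tracks in-flight SYNC requests by correlation ID (hypothetical helper)."""

    def __init__(self) -> None:
        self._waiting = {}  # correlation_id -> asyncio.Future

    def expect(self) -> tuple:
        """Register a new request; must be called from a running event loop."""
        correlation_id = uuid4()
        future = asyncio.get_running_loop().create_future()
        self._waiting[correlation_id] = future
        return correlation_id, future

    def resolve(self, correlation_id: UUID, reply: str) -> bool:
        """Deliver a reply; returns False for unknown or late replies."""
        future = self._waiting.pop(correlation_id, None)
        if future is None or future.done():
            return False
        future.set_result(reply)
        return True

    async def wait(self, correlation_id: UUID, future, timeout: float = 30.0) -> str:
        """Await the reply, raising asyncio.TimeoutError after `timeout` seconds."""
        try:
            return await asyncio.wait_for(future, timeout)
        finally:
            # Always forget the request so late replies are rejected
            self._waiting.pop(correlation_id, None)
```

A sender would call `expect()`, put the returned ID into the outgoing message's `correlation_id`, and then `await wait(...)`; the receive loop calls `resolve()` when a response carrying that ID arrives.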

Message Bus Implementation

from collections.abc import AsyncIterator


class AgentMessageBus:
    """Redis Streams-based message bus for agent communication."""

    async def send(self, message: AgentMessage) -> None:
        # Persist to PostgreSQL for audit
        await self.store.save(message)

        # Publish to Redis for real-time delivery
        channel = self._get_channel(message)
        await self.redis.xadd(channel, message.to_dict())

        # Publish SSE event for UI visibility (first 100 characters as preview)
        await self.event_bus.publish(
            f"project:{message.project_id}",
            {"type": "agent_message", "preview": message.content[:100]}
        )

    async def subscribe(self, agent_id: str) -> AsyncIterator[AgentMessage]:
        """Subscribe to messages for an agent."""
        # Look up the agent's role and project; `self.agents` is an assumed registry
        agent = await self.agents.get(agent_id)
        channels = [
            f"agent:{agent_id}",           # Direct messages
            f"role:{agent.role}",          # Role-based
            f"project:{agent.project_id}", # Broadcasts
        ]
        # ... Redis Streams consumer group logic

Context Hierarchy

Conversation Context (short-term): Current thread, last N exchanges
Session Context (medium-term): Sprint goals, recent decisions
Project Context (long-term): Architecture, requirements, knowledge base
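
The three tiers can be pictured as one container in which only the conversation tier is bounded. The sketch below is an illustration under assumptions: `AgentContext`, `render`, and the value of N (`MAX_EXCHANGES = 5`) are hypothetical and not specified by this ADR.

```python
from collections import deque
from dataclasses import dataclass, field

MAX_EXCHANGES = 5  # assumed value of N for the conversation tier


@dataclass
class AgentContext:
    project: list = field(default_factory=list)   # long-term
    session: list = field(default_factory=list)   # medium-term
    conversation: deque = field(                  # short-term, last N only
        default_factory=lambda: deque(maxlen=MAX_EXCHANGES)
    )

    def render(self) -> str:
        """Emit context from longest-lived to shortest-lived tier."""
        tiers = [
            ("project", self.project),
            ("session", self.session),
            ("conversation", self.conversation),
        ]
        return "\n".join(
            f"[{tier}] {item}" for tier, items in tiers for item in items
        )
```

A `deque` with `maxlen` discards the oldest exchange automatically when a new one is appended, which matches the "last N exchanges" wording without extra bookkeeping.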

Conflict Resolution

When agents disagree, resolution escalates in this order:

Peer Resolution: Agents attempt consensus (2 attempts)
Supervisor Escalation: Product Owner or Architect decides
Human Override: Client approval if configured
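
The escalation ladder can be expressed as a short function. This sketch makes one interpretive assumption that the ADR does not spell out: when a human override is configured, the human reviews the supervisor's decision and has the final say. `resolve_conflict` and its callback parameters are hypothetical names.

```python
from typing import Callable, Optional


def resolve_conflict(
    issue: str,
    peer_round: Callable[[str], Optional[str]],
    supervisor: Callable[[str], str],
    human_override: Optional[Callable[[str, str], str]] = None,
    peer_attempts: int = 2,
) -> tuple:
    """Return (level that decided, decision)."""
    # 1. Peer resolution: a round returns None when no consensus is reached
    for _ in range(peer_attempts):
        decision = peer_round(issue)
        if decision is not None:
            return "peer", decision

    # 2. Supervisor escalation: Product Owner or Architect decides
    decision = supervisor(issue)

    # 3. Human override, only if configured (assumed to review the
    #    supervisor's decision and return the final one)
    if human_override is not None:
        return "human", human_override(issue, decision)
    return "supervisor", decision
```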

Consequences

Positive

Full audit trail of all agent communication
Flexible routing supports various collaboration patterns
Natural language preserves LLM reasoning quality
Real-time UI visibility into agent collaboration

Negative

Additional complexity compared with simple function calls
Storage requirements for persisted messages

Mitigation

Archival policy for old messages
Compression for large attachments
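
As an illustration of the compression mitigation, attachments above a size threshold could be compressed with the standard library's `zlib` and tagged with their encoding. The 1 KiB threshold and the function names `pack_attachment` and `unpack_attachment` are assumptions; the ADR does not specify an algorithm or cutoff.

```python
import zlib

COMPRESS_THRESHOLD = 1024  # bytes; assumed cutoff for "large"


def pack_attachment(data: bytes) -> tuple:
    """Return (encoding, payload); compress only when it actually helps."""
    if len(data) < COMPRESS_THRESHOLD:
        return "raw", data
    packed = zlib.compress(data, level=6)
    if len(packed) >= len(data):
        # Already-compressed or random data can grow; keep the original
        return "raw", data
    return "zlib", packed


def unpack_attachment(encoding: str, payload: bytes) -> bytes:
    """Inverse of pack_attachment."""
    if encoding == "zlib":
        return zlib.decompress(payload)
    if encoding == "raw":
        return payload
    raise ValueError(f"unknown attachment encoding: {encoding}")
```

Storing the encoding next to the payload lets archived messages be read back even if the threshold or algorithm changes later.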

Compliance

This decision aligns with:

FR-104: Inter-agent communication
FR-105: Agent activity monitoring
NFR-602: Comprehensive audit logging

This ADR establishes the agent communication protocol for Syndarix.
