[SPIKE-007] Agent-to-Agent Communication Protocol #7

cardosofelipe · 2025-12-29T03:50:15Z

cardosofelipe commented

2025-12-29 03:50:15 +00:00

Objective

Define how agents communicate with each other during collaborative work.

Scenarios

Architect asks Developer for implementation estimate
Developer asks QA to review code
PO asks multiple agents for sprint planning input
Agents need to share context/artifacts

Key Questions

Synchronous vs asynchronous communication?
How do we structure messages between agents?
How do we handle "waiting for response" states?
How do we track conversation threads between agents?
How do we prevent circular dependencies?

Research Areas

Message queue patterns (direct, pub/sub, request-reply)
Protocol design for agent messages
Conversation/thread tracking
Timeout and retry handling

Expected Deliverables

Message protocol specification
Implementation using Celery or dedicated messaging
Example multi-agent workflow
ADR documenting the protocol

Acceptance Criteria

Agents can send messages to each other
Responses are correctly routed
Conversations are trackable
Timeouts handled gracefully
Works with parallel execution

Labels

spike, architecture, agents

cardosofelipe commented

2025-12-29 12:22:38 +00:00

SPIKE-007 Research Complete

The comprehensive spike document has been created at docs/spikes/SPIKE-007-agent-communication-protocol.md.

Executive Summary

After researching industry standards (Google A2A, IBM ACP, Anthropic MCP) and multi-agent system patterns, we recommend a hybrid message-based communication protocol that integrates with our existing infrastructure.

Key Decisions

Message Format: Structured JSON envelope with natural language content payload
Routing Strategies:
- Direct routing (to: "agent-123")
- Role-based routing (to: "@engineers")
- Broadcast routing (to: "@all")
- Topic-based routing (to: "#auth-module")
Communication Modes:
- Sync (request-response with timeout)
- Async (task queue + callback)
- Fire-and-forget (broadcasts)
- Streaming (long-running updates)
Infrastructure Mapping:
- Redis Pub/Sub for real-time delivery
- PostgreSQL for message persistence and audit
- Celery for async task delegation
- SSE for client notifications
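As a rough illustration of the envelope and the four routing strategies above, the sketch below uses standard-library dataclasses rather than the Pydantic schemas from the spike document, so that it runs without dependencies. The field names (other than `to` and `in_response_to`), the `resolve_recipients` helper, and the two lookup tables are hypothetical and are not taken from the spike itself.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AgentMessage:
    """Structured envelope wrapping a natural language payload (assumed fields)."""
    sender: str
    to: str                      # "agent-123", "@engineers", "@all" or "#auth-module"
    content: str                 # natural language body
    conversation_id: str
    in_response_to: str | None = None
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def resolve_recipients(
    to: str,
    agents_by_role: dict[str, set[str]],
    topic_subscribers: dict[str, set[str]],
) -> set[str]:
    """Map a `to` address onto concrete agent ids (hypothetical helper)."""
    if to == "@all":                                  # broadcast routing
        everyone: set[str] = set()
        for members in agents_by_role.values():
            everyone |= members
        return everyone
    if to.startswith("@"):                            # role-based routing
        return set(agents_by_role.get(to[1:], set()))
    if to.startswith("#"):                            # topic-based routing
        return set(topic_subscribers.get(to[1:], set()))
    return {to}                                       # direct routing


roles = {"engineers": {"agent-1", "agent-2"}, "qa": {"agent-3"}}
topics = {"auth-module": {"agent-2", "agent-3"}}
msg = AgentMessage(sender="agent-9", to="@engineers",
                   content="Can you estimate the login refactor?",
                   conversation_id="conv-1")
recipients = resolve_recipients(msg.to, roles, topics)
```

Checking `@all` before the generic `@` prefix keeps the broadcast address from being treated as a role named "all".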

Research Questions Answered

Question	Answer
Structured vs natural language?	Hybrid - structured envelope, natural language content
Async vs sync?	Pattern-based - sync for quick clarifications, async for tasks
Message routing?	Four strategies: direct, role-based, broadcast, topic-based
Context management?	Three-tier: conversation, session, project (RAG)
Conflict resolution?	Hierarchical escalation: negotiation -> expert -> human
Audit/logging?	Full message persistence with content hashing
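To make the "content hashing" answer concrete, here is a minimal sketch of how a persisted message might be fingerprinted for audit purposes. The choice of SHA-256 over a canonical JSON serialization and the `content_hash` function name are assumptions for illustration; the issue does not state which hash or canonical form the spike uses.

```python
import hashlib
import json


def content_hash(envelope: dict) -> str:
    """Return a stable hex digest for a message envelope (assumed scheme).

    Keys are sorted and whitespace is removed so that two envelopes with the
    same fields hash identically regardless of key order.
    """
    canonical = json.dumps(envelope, sort_keys=True,
                           separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


stored = {"to": "@all", "content": "Sprint planning at 10:00"}
digest = content_hash(stored)

# Later, an auditor recomputes the hash to detect tampering.
unchanged = content_hash({"content": "Sprint planning at 10:00", "to": "@all"}) == digest
tampered = content_hash({"to": "@all", "content": "Sprint planning at 11:00"}) == digest
```

Storing the digest alongside the row in PostgreSQL would let an audit job verify that the message body has not been altered since it was written.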

Acceptance Criteria Status

[x] Agents can send messages to each other - MessageRouter service defined
[x] Responses are correctly routed - Response linking with in_response_to field
[x] Conversations are trackable - Conversation model with thread support
[x] Timeouts handled gracefully - send_and_wait() with configurable timeout
[x] Works with parallel execution - Redis Pub/Sub + Celery integration
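The response linking and timeout behaviour listed above could be sketched as follows. This is an in-memory stand-in, not the MessageRouter from the spike document: the `deliver` method, the `outbox` list (standing in for a Redis publish), and returning `None` on timeout are all assumptions made for the example.

```python
from __future__ import annotations

import asyncio
import uuid


class MessageRouter:
    """In-memory sketch: correlates replies to requests via in_response_to."""

    def __init__(self) -> None:
        self._pending: dict[str, asyncio.Future] = {}
        self.outbox: list[dict] = []          # stand-in for Redis Pub/Sub

    async def send(self, message: dict) -> None:
        self.outbox.append(message)

    async def send_and_wait(self, message: dict, timeout: float = 30.0) -> dict | None:
        """Send a request and wait for the linked reply, or None on timeout."""
        message_id = message.setdefault("id", str(uuid.uuid4()))
        future = asyncio.get_running_loop().create_future()
        self._pending[message_id] = future
        try:
            await self.send(message)
            return await asyncio.wait_for(future, timeout)
        except asyncio.TimeoutError:
            return None                        # caller decides: retry or escalate
        finally:
            self._pending.pop(message_id, None)

    def deliver(self, message: dict) -> None:
        """Called for every incoming message; wakes the matching waiter, if any."""
        future = self._pending.get(message.get("in_response_to"))
        if future is not None and not future.done():
            future.set_result(message)


async def demo():
    router = MessageRouter()

    async def developer_replies():
        await asyncio.sleep(0.01)
        router.deliver({"id": "r1", "in_response_to": "m1", "content": "About 3 days"})

    task = asyncio.create_task(developer_replies())
    reply = await router.send_and_wait({"id": "m1", "content": "Estimate?"}, timeout=1.0)
    await task
    no_reply = await router.send_and_wait({"id": "m2", "content": "Anyone?"}, timeout=0.01)
    return reply, no_reply, router


reply, no_reply, router = asyncio.run(demo())
```

Because each waiter is keyed by its own message id, several `send_and_wait()` calls can be in flight at once without their replies being mixed up, which is the property the parallel-execution criterion depends on.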

Deliverables Included

Message Protocol Specification - Complete Pydantic schemas
Database Schema - AgentMessage, Conversation, MessageDelivery models
Code Examples - MessageRouter, AgentInbox, API endpoints
SSE Integration - Extended event types for message notifications
@Mentions Support - MentionParser with role and agent resolution
Priority Handling - Urgent message interrupts with PriorityMessageHandler
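For the @Mentions deliverable, a parser along the following lines would separate role mentions from agent mentions. The regular expression, the `parse_mentions` function, and the shape of the returned dictionary are hypothetical; the actual MentionParser in the spike document may resolve names differently.

```python
import re

# "@" followed by an identifier; the lookbehind skips e-mail addresses
# such as bob@example.com, where "@" follows a word character.
MENTION_RE = re.compile(r"(?<![\w@])@([A-Za-z0-9][A-Za-z0-9_-]*)")


def parse_mentions(text: str, known_roles: set, known_agents: set) -> dict:
    """Classify each @mention as a role, an agent, or unknown (assumed rules)."""
    result = {"roles": [], "agents": [], "unknown": []}
    for name in MENTION_RE.findall(text):
        if name in known_roles:
            bucket = "roles"
        elif name in known_agents:
            bucket = "agents"
        else:
            bucket = "unknown"
        if name not in result[bucket]:        # keep first occurrence only
            result[bucket].append(name)
    return result


mentions = parse_mentions(
    "@engineers please review; @agent-123 owns auth. "
    "Mail bob@example.com and cc @nobody.",
    known_roles={"engineers", "qa"},
    known_agents={"agent-123"},
)
```

Roles are checked before agents here, so a name that exists in both sets would resolve as a role; a real implementation would need an explicit rule for that collision.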

References

Google A2A Protocol: https://developers.googleblog.com/en/a2a-a-new-era-of-agent-interoperability/
A2A Linux Foundation Project: https://www.linuxfoundation.org/press/linux-foundation-launches-the-agent2agent-protocol-project-to-enable-secure-intelligent-communication-between-ai-agents
IBM ACP & Protocol Survey: https://arxiv.org/html/2505.02279v1
Multi-Agent Collaboration Mechanisms: https://arxiv.org/html/2501.06322v1

Next Steps

Create ADR-007 documenting the protocol decision
Implement AgentMessage model and migrations
Implement MessageRouter service
Add message-related API endpoints
Extend SSE events for message notifications
Integration testing with multi-agent workflows

cardosofelipe referenced this issue from a commit

2025-12-29 12:31:10 +00:00

docs: add architecture spikes and deep analysis documentation

cardosofelipe closed this issue

2025-12-29 12:31:47 +00:00

Reference: cardosofelipe/syndarix#7