syndarix/docs/adrs/ADR-011-issue-synchronization.md
Felipe Cardoso 406b25cda0 docs: add remaining ADRs and comprehensive architecture documentation
2025-12-29 13:54:43 +01:00

# ADR-011: Issue Synchronization Architecture
**Status:** Accepted

**Date:** 2025-12-29

**Deciders:** Architecture Team

**Related Spikes:** SPIKE-009

---
## Context
Syndarix must synchronize issues bidirectionally with external trackers (Gitea, GitHub, GitLab). Agents create and update issues internally, and those changes must be reflected in the external systems. Changes made in the external systems must flow back to Syndarix.
## Decision Drivers
- **Real-time:** Changes visible within seconds
- **Consistency:** Eventual consistency acceptable
- **Conflict Resolution:** Clear rules when edits conflict
- **Multi-provider:** Support Gitea (primary), GitHub, GitLab
- **Reliability:** Handle network failures gracefully
## Considered Options
### Option 1: Polling Only
Periodically fetch all issues from external trackers.

**Pros:** Simple, reliable

**Cons:** High latency (minutes), wastes API quota
### Option 2: Webhooks Only
Rely solely on external webhooks.

**Pros:** Real-time

**Cons:** May miss events during outages
### Option 3: Webhook-First + Polling Fallback (Selected)
Primary: webhooks for real-time updates. Secondary: polling for reconciliation.

**Pros:** Real-time with reliability

**Cons:** Slightly more complex
## Decision
**Adopt a webhook-first architecture with polling fallback** and Last-Writer-Wins (LWW) conflict resolution.

External trackers are the source of truth. Syndarix maintains local mirrors for unified agent access.
## Implementation
### Sync Architecture
```
External Trackers (Gitea/GitHub/GitLab)
┌─────────┴─────────┐
│     Webhooks      │ (real-time)
└─────────┬─────────┘
┌─────────┴─────────┐
│  Webhook Handler  │ → Redis Queue → Sync Engine
└───────────────────┘
┌─────────┴─────────┐
│  Polling Worker   │ (reconciliation every 15 min)
└───────────────────┘
┌─────────┴─────────┐
│    PostgreSQL     │
│ (issues, sync_log)│
└───────────────────┘
```
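The reconciliation pass run by the polling worker can be sketched as a comparison between what the tracker reports and what the local mirror last recorded. This is a minimal illustration under assumptions, not the Sync Engine's actual code: `RemoteIssue` and `find_stale` are hypothetical names, and the mirror is reduced to a mapping from external ID to the last seen `external_updated_at`.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class RemoteIssue:
    """Hypothetical, minimal view of an issue as returned by a tracker."""
    external_id: str
    title: str
    updated_at: datetime


def find_stale(
    local_updated_at: dict[str, datetime],
    remote: list[RemoteIssue],
) -> list[RemoteIssue]:
    """Return remote issues the mirror has never seen or holds in an older revision."""
    stale = []
    for issue in remote:
        seen = local_updated_at.get(issue.external_id)
        # Unknown issue, or the tracker has a newer revision than the mirror:
        # a webhook was probably missed, so the issue must be re-applied.
        if seen is None or issue.updated_at > seen:
            stale.append(issue)
    return stale


# Usage: issue "1" changed remotely, "2" is current, "3" was never mirrored.
t0 = datetime(2025, 12, 29, 12, 0, tzinfo=timezone.utc)
t1 = datetime(2025, 12, 29, 12, 5, tzinfo=timezone.utc)
mirror = {"1": t0, "2": t1}
fetched = [RemoteIssue("1", "a", t1), RemoteIssue("2", "b", t1), RemoteIssue("3", "c", t0)]
stale_ids = [issue.external_id for issue in find_stale(mirror, fetched)]
```

Issues that come back from a pass like this would go through the same processing path as webhook events, so that both paths share one set of conflict rules.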
### Provider Abstraction
```python
from __future__ import annotations  # defer evaluation of the model annotations

from abc import ABC, abstractmethod
from datetime import datetime

# ExternalIssue, IssueCreate, IssueUpdate, and WebhookEvent are not defined in this ADR.


class IssueProvider(ABC):
    """Abstract interface for issue tracker providers."""

    @abstractmethod
    async def get_issue(self, issue_id: str) -> ExternalIssue: ...

    @abstractmethod
    async def list_issues(self, repo: str, since: datetime) -> list[ExternalIssue]: ...

    @abstractmethod
    async def create_issue(self, repo: str, issue: IssueCreate) -> ExternalIssue: ...

    @abstractmethod
    async def update_issue(self, issue_id: str, issue: IssueUpdate) -> ExternalIssue: ...

    @abstractmethod
    def parse_webhook(self, payload: dict) -> WebhookEvent: ...


# Provider implementations
class GiteaProvider(IssueProvider): ...
class GitHubProvider(IssueProvider): ...
class GitLabProvider(IssueProvider): ...
```
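Later snippets in this ADR call `get_provider(...)`, which is not defined here. One way it could work is a small name-to-class registry, sketched below. The decorator `register_provider`, the `_PROVIDERS` dictionary, and the stand-in `GiteaProvider` class are all assumptions made for illustration. In the real code the registered classes would subclass `IssueProvider`.

```python
# Hypothetical registry keyed by the value stored in issues.external_provider.
_PROVIDERS: dict[str, type] = {}


def register_provider(name: str):
    """Class decorator that registers a provider class under a lowercase name."""
    def decorator(cls: type) -> type:
        _PROVIDERS[name.lower()] = cls
        return cls
    return decorator


def get_provider(name: str):
    """Instantiate the provider registered under `name`."""
    cls = _PROVIDERS.get(name.lower())
    if cls is None:
        # An unknown provider in a webhook URL should surface as a clear error.
        raise ValueError(f"Unsupported issue provider: {name!r}")
    return cls()


@register_provider("gitea")
class GiteaProvider:  # stand-in for illustration only
    pass


provider = get_provider("Gitea")  # lookup is case-insensitive
```

A registry like this keeps provider selection in one place, so adding a tracker would not require changes to the webhook handler or the outbox worker.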
### Conflict Resolution
| Scenario | Resolution |
|----------|------------|
| Same field, different timestamps | Last-Writer-Wins (LWW) |
| Same field, concurrent edits | Mark conflict, notify user |
| Different fields modified | Merge both changes |
| Delete vs Update | Delete wins (configurable) |
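
The first three rows of the table can be read as a field-level three-way merge against the last synced state. The sketch below is one possible interpretation, not the Sync Engine's actual implementation: `resolve`, `CONCURRENCY_WINDOW`, and the 5-second threshold for "concurrent" are hypothetical, and timestamps are assumed to be per issue rather than per field. Delete-vs-update handling is left out.

```python
from datetime import datetime, timedelta, timezone

# Assumption: edits closer together than this count as "concurrent".
CONCURRENCY_WINDOW = timedelta(seconds=5)


def resolve(base, local, remote, local_ts, remote_ts, window=CONCURRENCY_WINDOW):
    """Merge two edited copies of an issue against their last synced state.

    Returns (merged_fields, sorted list of conflicted field names).
    """
    merged, conflicts = dict(base), []
    for field in base.keys() | local.keys() | remote.keys():
        b, l, r = base.get(field), local.get(field), remote.get(field)
        if l == r:            # both sides agree, or neither changed
            merged[field] = l
        elif l == b:          # only the remote side changed this field
            merged[field] = r
        elif r == b:          # only the local side changed this field
            merged[field] = l
        elif abs(local_ts - remote_ts) <= window:
            merged[field] = b  # concurrent edit: keep base, flag for a user
            conflicts.append(field)
        else:                 # same field, different timestamps: LWW
            merged[field] = l if local_ts > remote_ts else r
    return merged, sorted(conflicts)


now = datetime(2025, 12, 29, 12, 0, tzinfo=timezone.utc)
base = {"title": "Bug", "state": "open", "body": "x"}
local = {"title": "Bug", "state": "closed", "body": "local"}
remote = {"title": "Crash", "state": "open", "body": "remote"}

# Remote edit is a minute newer: different fields merge, body goes to the later writer.
merged, conflicts = resolve(base, local, remote, now, now + timedelta(minutes=1))
# Remote edit is one second newer: body is treated as a concurrent edit.
merged_c, conflicts_c = resolve(base, local, remote, now, now + timedelta(seconds=1))
```

In a design like this, the conflicted field names are what would be written to `sync_conflict` before the user is notified.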
### Database Schema
```sql
CREATE TABLE issues (
    id UUID PRIMARY KEY,
    project_id UUID NOT NULL,
    external_id VARCHAR(100),
    external_provider VARCHAR(50),  -- 'gitea', 'github', 'gitlab'
    external_url VARCHAR(500),

    -- Canonical fields
    title VARCHAR(500) NOT NULL,
    body TEXT,
    state VARCHAR(50) NOT NULL,
    labels JSONB DEFAULT '[]',
    assignees JSONB DEFAULT '[]',

    -- Sync metadata
    external_updated_at TIMESTAMPTZ,
    local_updated_at TIMESTAMPTZ,
    sync_status VARCHAR(50) DEFAULT 'synced',
    sync_conflict JSONB,

    created_at TIMESTAMPTZ NOT NULL,
    updated_at TIMESTAMPTZ NOT NULL
);

CREATE TABLE issue_sync_log (
    id UUID PRIMARY KEY,
    issue_id UUID NOT NULL,
    direction VARCHAR(10) NOT NULL,  -- 'inbound', 'outbound'
    action VARCHAR(50) NOT NULL,     -- 'create', 'update', 'delete'
    before_state JSONB,
    after_state JSONB,
    provider VARCHAR(50) NOT NULL,
    sync_time TIMESTAMPTZ NOT NULL
);
```
### Webhook Handler
```python
import json

from fastapi import BackgroundTasks, HTTPException, Request

# `router`, `redis`, and `get_provider` are application-level objects not shown in this ADR.


@router.post("/webhooks/{provider}")
async def handle_webhook(
    provider: str,
    request: Request,
    background_tasks: BackgroundTasks,
):
    """Handle incoming webhooks from issue trackers."""
    # Read the raw bytes: signatures are computed over the exact body sent,
    # not over a re-serialized JSON object.
    raw_body = await request.body()

    # Validate signature
    provider_impl = get_provider(provider)
    if not provider_impl.verify_signature(request.headers, raw_body):
        raise HTTPException(401, "Invalid signature")

    # Queue for processing (deduplication in Redis)
    event = provider_impl.parse_webhook(json.loads(raw_body))
    await redis.xadd(
        f"sync:webhooks:{provider}",
        {"event": event.json()},
        id="*",
        maxlen=10000,
    )
    return {"status": "queued"}
```
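The handler calls `provider_impl.verify_signature`, which the ADR does not define. As a hedged sketch of what such a check could look like: GitHub sends an HMAC-SHA256 hex digest of the raw body in `X-Hub-Signature-256` (prefixed with `sha256=`), Gitea sends a bare hex digest in `X-Gitea-Signature`, and GitLab sends the configured secret verbatim in `X-Gitlab-Token`. The function names below are hypothetical. Checks of this kind must run on the raw request bytes, not on parsed JSON.

```python
import hashlib
import hmac


def verify_hmac_sha256(secret: bytes, raw_body: bytes, signature_header: str) -> bool:
    """Check a hex HMAC-SHA256 signature computed over the raw request body."""
    # GitHub prefixes the digest with "sha256="; Gitea sends the bare hex digest.
    received = signature_header.removeprefix("sha256=")
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, so the match position does not leak.
    return hmac.compare_digest(expected.encode(), received.encode())


def verify_gitlab_token(secret: str, token_header: str) -> bool:
    """GitLab sends the shared secret itself rather than a digest."""
    return hmac.compare_digest(secret.encode(), token_header.encode())


# Usage with a made-up secret and payload.
body = b'{"action": "opened"}'
signature = hmac.new(b"s3cret", body, hashlib.sha256).hexdigest()
ok_gitea = verify_hmac_sha256(b"s3cret", body, signature)
ok_github = verify_hmac_sha256(b"s3cret", body, "sha256=" + signature)
tampered = verify_hmac_sha256(b"s3cret", body + b" ", signature)
```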
### Outbox Pattern for Outbound Sync
```python
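import asyncio
import json
import logging
from datetime import datetime, timedelta, timezone

logger = logging.getLogger(__name__)


class SyncOutbox:
    """Reliable outbound sync with retry."""

    async def queue_update(self, issue_id: str, changes: dict):
        await db.execute("""
            INSERT INTO sync_outbox (issue_id, changes, status, created_at)
            VALUES ($1, $2, 'pending', NOW())
        """, issue_id, json.dumps(changes))


@celery_app.task
def process_sync_outbox():
    """Process pending outbound syncs with exponential backoff."""
    # Skip items whose backoff window has not elapsed yet.
    pending = db.query("""
        SELECT * FROM sync_outbox
        WHERE status = 'pending' AND (next_retry IS NULL OR next_retry <= NOW())
        LIMIT 100
    """)
    for item in pending:
        try:
            issue = db.get_issue(item.issue_id)
            provider = get_provider(issue.external_provider)
            # Celery tasks are synchronous, so drive the coroutine explicitly.
            asyncio.run(provider.update_issue(issue.external_id, item.changes))
            item.status = 'completed'
        except Exception:
            logger.exception("Outbound sync failed for issue %s", item.issue_id)
            item.retry_count += 1
            if item.retry_count > 5:
                item.status = 'failed'
            else:
                # Back off 2, 4, 8, 16, 32 minutes before the next attempt.
                item.next_retry = datetime.now(timezone.utc) + timedelta(minutes=2 ** item.retry_count)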
```
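To make the backoff arithmetic concrete, the schedule implied by `2 ** retry_count` minutes and the `retry_count > 5` cutoff can be worked out directly. The helper `retry_delay` and the constant `MAX_RETRIES` are illustrative names, not part of the codebase.

```python
from datetime import timedelta
from typing import Optional

MAX_RETRIES = 5  # matches the `retry_count > 5` cutoff above


def retry_delay(retry_count: int) -> Optional[timedelta]:
    """Delay before the next attempt, or None once the item should be marked failed."""
    if retry_count > MAX_RETRIES:
        return None
    return timedelta(minutes=2 ** retry_count)


# retry_count after the 1st through 6th consecutive failure.
schedule = [retry_delay(n) for n in range(1, 7)]
total_wait = sum((d for d in schedule if d is not None), timedelta())
```

Under these numbers an item waits 2, 4, 8, 16, and 32 minutes between attempts. It is marked `failed` on the sixth consecutive failure, after 62 minutes of accumulated backoff.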
## Consequences
### Positive
- Real-time sync via webhooks
- Reliable reconciliation via polling
- Clear conflict resolution rules
- Provider-agnostic design
### Negative
- Eventual consistency (brief inconsistency windows)
- Webhook infrastructure required
### Mitigation
- Manual refresh available in UI
- Conflict notification alerts users
## Compliance
This decision aligns with:
- FR-201 through FR-205: Issue tracking integration
- NFR-201: Multi-provider support
---
*This ADR establishes the issue synchronization architecture for Syndarix.*