Improved code formatting, line breaks, and indentation across chunking logic and multiple test modules to enhance code clarity and maintain consistent style. No functional changes made.
Added comprehensive unit and API tests to validate AgentType category and display fields:
- Category validation for valid, null, and invalid values
- Icon, color, and sort_order field constraints
- Typical tasks and collaboration hints handling (stripping, removing empty strings, normalization)
- New API tests for field creation, filtering, updating, and grouping
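The list handling described above (stripping, removing empty strings, normalization) can be sketched roughly as follows. This is a minimal illustration, not the project's validator: the helper names `normalize_string_list` and `normalize_slugs` are hypothetical, and lowercasing slugs is an assumption about what "normalization" means here.

```python
from typing import Iterable, Optional


def normalize_string_list(values: Optional[Iterable[str]]) -> list[str]:
    """Strip whitespace and drop entries that end up empty (hypothetical helper)."""
    if values is None:
        return []
    cleaned = []
    for value in values:
        stripped = value.strip()
        if stripped:  # skip entries that are empty after stripping
            cleaned.append(stripped)
    return cleaned


def normalize_slugs(values: Optional[Iterable[str]]) -> list[str]:
    """Assumed normalization for collaboration hints: stripped, lowercase slugs."""
    return [slug.lower() for slug in normalize_string_list(values)]
```

A test for such a validator would then assert that `["  write tests ", "", "   "]` collapses to a single cleaned entry.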
Add new "Category & Display" card in Basic Info tab with:
- Category dropdown to select agent category
- Sort order input for display ordering
- Icon text input with Lucide icon name
- Color picker with hex input and visual color selector
- Typical tasks tag input for agent capabilities
- Collaboration hints tag input for agent relationships
Updates include:
- TAB_FIELD_MAPPING with new field mappings
- State and handlers for typical_tasks and collaboration_hints
- Fix tests to use getAllByRole for multiple Add buttons
Frontend changes to support new AgentType category and display fields:
Types (agentTypes.ts):
- Add AgentTypeCategory union type with 8 categories
- Add CATEGORY_METADATA constant with labels, descriptions, colors
- Update all interfaces with new fields (category, icon, color, etc.)
- Add AgentTypeGroupedResponse type
Validation (agentType.ts):
- Add AGENT_TYPE_CATEGORIES constant with metadata
- Add AVAILABLE_ICONS constant for icon picker
- Add COLOR_PALETTE constant for color selection
- Update agentTypeFormSchema with new field validators
- Update defaultAgentTypeValues with new fields
Form updates:
- Transform function now maps category and display fields from API
Test updates:
- Add new fields to mock AgentTypeResponse objects
Add 6 new fields to AgentType for better organization and UI display:
- category: enum for grouping (development, design, quality, etc.)
- icon: Lucide icon identifier for UI
- color: hex color code for visual distinction
- sort_order: display ordering within categories
- typical_tasks: list of tasks the agent excels at
- collaboration_hints: agent slugs that work well together
Backend changes:
- Add AgentTypeCategory enum to enums.py
- Update AgentType model with 6 new columns and indexes
- Update schemas with validators for new fields
- Add category filter and /grouped endpoint to routes
- Update CRUD with get_grouped_by_category method
- Update seed data with categories for all 27 agents
- Add migration 0007
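As a rough illustration of what a grouped lookup such as get_grouped_by_category might return, the sketch below groups in-memory rows by category and orders them by sort_order. The `AgentTypeRow` dataclass, the `group_by_category` function, and the "uncategorized" bucket for null categories are all assumptions made for the example; the real method presumably queries the database.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentTypeRow:
    """Hypothetical stand-in for an AgentType record."""
    name: str
    category: Optional[str]
    sort_order: int = 0


def group_by_category(rows: list[AgentTypeRow]) -> dict[str, list[str]]:
    """Group agent names by category, ordered by sort_order then name."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for row in sorted(rows, key=lambda r: (r.sort_order, r.name)):
        # Assumption: rows with a null category land in a fallback bucket.
        grouped[row.category or "uncategorized"].append(row.name)
    return dict(grouped)
```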
Major revamp of agent types based on SOTA personality design research:
- Expanded from 6 to 27 specialized agent types
- Rich personality prompts following Anthropic and CrewAI best practices
- Each agent has a structured prompt with Core Identity, Expertise,
Principles, and Scenario Handling sections
Agent Categories:
- Core Development (8): Product Owner, PM, BA, Architect, Full Stack,
Backend, Frontend, Mobile Engineers
- Design (2): UI/UX Designer, UX Researcher
- Quality & Operations (3): QA, DevOps, Security Engineers
- AI/ML (5): AI/ML Engineer, Researcher, CV, NLP, MLOps Engineers
- Data (2): Data Scientist, Data Engineer
- Leadership (2): Technical Lead, Scrum Master
- Domain Specialists (5): Financial, Healthcare, Scientific,
Behavioral Psychology Experts, Technical Writer
Research applied:
- Anthropic Claude persona design guidelines
- CrewAI role/backstory/goal patterns
- Role prompting research on detailed vs generic personas
- Temperature tuning per agent type (0.2-0.7 based on role)
When the default value is null but the source has a value (e.g., the
description field), the merge discarded the source value because typeof null
is "object", which never matches the "string" type of the source value. The
merge now accepts source values for nullable fields.
Adds console.log statements throughout the form submission flow:
- Form submit triggered
- Current form values
- Form state (isDirty, isValid, isSubmitting, errors)
- Validation pass/fail
- onSubmit call and completion
This will help diagnose why the save button appears to do nothing.
Check browser console for '[AgentTypeForm]' logs.
Root cause: The demo data's model_params was missing `top_p`, but the
Zod schema required all three fields (temperature, max_tokens, top_p).
This caused silent validation failures when editing agent types.
Fixes:
1. Add getInitialValues() that ensures all required fields have defaults
2. Handle nested validation errors in handleFormError (e.g., model_params.top_p)
3. Add useEffect to reset form when agentType changes
4. Add console.error logging for debugging validation failures
5. Update demo data to include top_p in all agent types
The form now properly initializes with safe defaults for any missing
fields from the API response, preventing silent validation failures.
When form validation fails (e.g., personality_prompt is empty), the form
would silently not submit. Now it shows a toast with the first error
and navigates to the tab containing the error field.
When running in Docker, the frontend needs to use 'http://backend:8000'
as the backend URL for Next.js rewrites. This env var is set to use
the Docker service name for proper container-to-container communication.
The rewrite was using 'http://backend:8000', which only resolves inside the
Docker network. When running Next.js locally (npm run dev), the hostname
'backend' doesn't exist, causing ENOTFOUND errors.
Now uses NEXT_PUBLIC_API_BASE_URL env var with fallback to localhost:8000
for local development. In Docker, set NEXT_PUBLIC_API_BASE_URL=http://backend:8000.
Dashboard changes:
- Update useDashboard hook to fetch real projects from API
- Calculate stats (active projects, agents, issues) from real data
- Keep pending approvals as mock (no backend endpoint yet)
Demo data additions:
- API Gateway Modernization project (active, complex)
- Customer Analytics Dashboard project (completed)
- DevOps Pipeline Automation project (active, complex)
- Added sprints, agent instances, and issues for each new project
Total demo data: 6 projects, 14 agents, 22 issues
- Update demo_data.json to use "__admin__" as owner_email for all projects
- Add admin user lookup in load_demo_data() with special "__admin__" key
- Remove notification_email from project settings (not a valid field)
This ensures demo projects are visible to the admin user when logged in.
register_vector() requires the vector type to exist in PostgreSQL before
it can register the type codec. Move CREATE EXTENSION to a separate
_ensure_pgvector_extension() method that runs before pool creation.
This fixes the "unknown type: public.vector" error on fresh databases.
Add values_callable to all enum columns so SQLAlchemy serializes using
the enum's .value (lowercase) instead of .name (uppercase). PostgreSQL
enum types defined in migrations use lowercase values.
Fixes: invalid input value for enum autonomy_level: "MILESTONE"
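The name/value mismatch behind this error can be seen with a plain Python enum. In the sketch below only MILESTONE comes from the error message above; the second member and the helper name `enum_values` are illustrative assumptions. The SQLAlchemy column is shown as a comment so the example runs without a database.

```python
from enum import Enum


class AutonomyLevel(str, Enum):
    # Member names are uppercase, values are lowercase.
    MILESTONE = "milestone"
    AUTONOMOUS = "autonomous"  # illustrative member, not taken from the source


def enum_values(enum_cls):
    """Return member values; usable as SQLAlchemy's values_callable."""
    return [member.value for member in enum_cls]


# By default SQLAlchemy persists member names, which PostgreSQL rejects
# when the enum type was created with lowercase labels:
names = [member.name for member in AutonomyLevel]

# With values_callable it persists member values instead:
values = enum_values(AutonomyLevel)

# Assumed column definition (requires SQLAlchemy):
# Column(sqlalchemy.Enum(AutonomyLevel, values_callable=enum_values))
```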
SQLAlchemy's Enum() auto-generates type names from Python class names
(e.g., AutonomyLevel -> autonomylevel), but migrations defined them
with underscores (e.g., autonomy_level). This mismatch caused:
"type 'autonomylevel' does not exist"
Added explicit name parameters to all enum columns to match the
migration-defined type names:
- autonomy_level, project_status, project_complexity, client_mode
- agent_status, sprint_status
- issue_type, issue_status, issue_priority, sync_status
- Delete `demo_data.json`, replacing it with structured logic for better modularity.
- Add support for seeding default agent types and new demo data structure.
- Ensure demo mode only executes when explicitly enabled (settings.DEMO_MODE).
- Enhance logging for improved debugging during DB initialization.
- Replace default empty list with `deque` for `memory_retrieval_latency_seconds`
- Prevents unbounded memory growth by leveraging bounded circular buffer behavior
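A minimal sketch of the bounded-buffer behavior this relies on: a `deque` created with `maxlen` silently drops its oldest entry once full. The `LatencyHistogram` class and the default cap are assumptions for illustration, not the project's metrics code.

```python
from collections import deque

MAX_SAMPLES = 10_000  # assumed cap for this sketch


class LatencyHistogram:
    """Hypothetical holder for latency samples with bounded memory."""

    def __init__(self, max_samples: int = MAX_SAMPLES) -> None:
        # With maxlen set, appending to a full deque discards the oldest item.
        self._samples: deque = deque(maxlen=max_samples)

    def observe(self, seconds: float) -> None:
        self._samples.append(seconds)

    def snapshot(self) -> list:
        """Return the retained samples, oldest first."""
        return list(self._samples)
```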
- Skip SSE connection in demo mode (MSW doesn't support SSE).
- Remove unused `useProjectEvents` and related real-time hooks from `Dashboard`.
- Temporarily disable activity feed SSE until a global endpoint is available.
- Add ABANDONED value to core Outcome enum in types.py
- Replace duplicate OutcomeType class in mcp/tools.py with alias to Outcome
- Simplify mcp/service.py to use outcome directly (no more silent mapping)
- Add migration 0006 to extend PostgreSQL episode_outcome enum
- Add missing constraints to migration 0005 (ix_facts_unique_triple_global)
This fixes the semantic issue where ABANDONED outcomes were silently
converted to FAILURE, losing information about task abandonment.
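The alias approach can be sketched as below. The member values and the `record_outcome` function are assumptions; only the ABANDONED and FAILURE members and the Outcome/OutcomeType names come from the description above.

```python
from enum import Enum


class Outcome(str, Enum):
    """Core outcome enum (string values are assumed for this sketch)."""
    SUCCESS = "success"
    FAILURE = "failure"
    ABANDONED = "abandoned"


# Alias instead of a second, diverging class: both names refer to one enum,
# so no mapping step exists that could turn ABANDONED into FAILURE.
OutcomeType = Outcome


def record_outcome(outcome: Outcome) -> str:
    """Hypothetical service call that stores the outcome unchanged."""
    return outcome.value
```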
Bug Fixes:
- Remove singleton pattern from consolidation/reflection services to
prevent stale database session bugs (session is now passed per-request)
- Add LRU eviction to MemoryToolService._working dict (max 1000 sessions)
to prevent unbounded memory growth
- Replace O(n) list.remove() with O(1) OrderedDict.move_to_end() in
RetrievalCache for better performance under load
- Use deque with maxlen for metrics histograms to prevent unbounded
memory growth (circular buffer with 10k max samples)
- Use full UUID for checkpoint IDs instead of 8-char prefix to avoid
collision risk at scale (birthday paradox at ~50k checkpoints)
Test Updates:
- Update checkpoint test to expect 36-char UUID
- Update reflection singleton tests to expect new factory behavior
- Add reset_memory_reflection() no-op for backwards compatibility
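The LRU eviction and move_to_end() changes listed under Bug Fixes can be illustrated with a small OrderedDict-based store. This is a sketch under assumptions: the `LRUStore` class and its method names are hypothetical and do not mirror MemoryToolService or RetrievalCache.

```python
from collections import OrderedDict


class LRUStore:
    """Hypothetical bounded store with least-recently-used eviction."""

    def __init__(self, max_entries: int = 1000) -> None:
        self._max_entries = max_entries
        self._entries: OrderedDict = OrderedDict()

    def get(self, key):
        if key not in self._entries:
            return None
        # O(1) recency update, unlike list.remove() plus append, which is O(n).
        self._entries.move_to_end(key)
        return self._entries[key]

    def put(self, key, value) -> None:
        self._entries[key] = value
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_entries:
            # popitem(last=False) removes the least recently used entry.
            self._entries.popitem(last=False)

    def keys(self) -> list:
        """Keys ordered from least to most recently used."""
        return list(self._entries)
```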
- Connect to MCP servers concurrently instead of sequentially
- Reduce retry settings in test mode (IS_TEST=True):
  - 1 attempt instead of 3
  - 100ms retry delay instead of 1s
  - 2s timeout instead of 30-120s
Reduces MCP E2E test time from ~16s to under 1s.
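Connecting concurrently rather than sequentially might look like the following sketch. The `connect` coroutine is a stand-in that only sleeps; the real connection logic, server names, and error handling are not shown and are assumptions here.

```python
import asyncio


async def connect(server: str, delay: float = 0.05) -> str:
    """Stand-in for a real MCP connection attempt (hypothetical)."""
    await asyncio.sleep(delay)
    return f"connected:{server}"


async def connect_all(servers: list) -> list:
    # gather() schedules every coroutine at once, so total time is roughly
    # the slowest connection rather than the sum of all of them.
    # Results are returned in the same order as the inputs.
    return await asyncio.gather(*(connect(server) for server in servers))
```

Calling `asyncio.run(connect_all(["a", "b", "c"]))` waits for roughly one delay instead of three.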
- Adjusted TTL durations and sleep intervals across memory and cache tests for consistent expiration behavior.
- Prevented test flakiness caused by timing discrepancies in token expiration and cache cleanup.
- Add threading.Lock with double-checked locking to ScopeManager
- Add asyncio.Lock with double-checked locking to MemoryReflection
- Make reset_memory_metrics async with proper locking
- Update test fixtures to handle async reset functions
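The threading.Lock variant of the locking pattern above can be sketched as follows. `ScopeManagerStub` and `get_scope_manager` are hypothetical names used only for illustration; the asyncio.Lock version has the same shape with `async with`.

```python
import threading


class ScopeManagerStub:
    """Hypothetical placeholder for the real manager class."""


_instance = None
_lock = threading.Lock()


def get_scope_manager() -> ScopeManagerStub:
    """Create the shared instance at most once, even under contention."""
    global _instance
    if _instance is None:          # first check: skip the lock when already set
        with _lock:
            if _instance is None:  # second check: another thread may have won
                _instance = ScopeManagerStub()
    return _instance
```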
- Change source_episode_ids from JSON to JSONB for PostgreSQL consistency
- Add unique constraint for global facts (project_id IS NULL)
- Add CHECK constraint ensuring reinforcement_count >= 1
Moved tests/unit/models/memory/ to tests/models/memory/ to avoid
Python import path conflicts when pytest collects all tests.
The conflict was caused by tests/models/ and tests/unit/models/ both
having __init__.py files, causing Python to confuse app.models.memory
imports.
- Added `memory_consolidation` to the task list and updated `__all__` in test files.
- Updated `source_episode_ids` in `Fact` model to use JSON for cross-database compatibility.
- Revised related database migrations to use JSONB instead of ARRAY.
- Adjusted test concurrency in Makefile for improved test performance.
Auto-fixed linting errors and formatting issues:
- Removed unused imports (F401): pytest, Any, AnalysisType, MemoryType, OutcomeType
- Removed unused variable (F841): hooks variable in test
- Applied consistent formatting across memory service and test files
- Add MemoryConsolidationService with Working→Episodic→Semantic/Procedural transfer
- Add Celery tasks for session and nightly consolidation
- Implement memory pruning with importance-based retention
- Add comprehensive test suite (32 tests)
Add comprehensive indexing and retrieval system for memory search:
- VectorIndex for semantic similarity search using cosine similarity
- TemporalIndex for time-based queries with range and recency support
- EntityIndex for entity-based lookups with multi-entity intersection
- OutcomeIndex for success/failure filtering on episodes
- MemoryIndexer as unified interface for all index types
- RetrievalEngine with hybrid search combining all indices
- RelevanceScorer for multi-signal relevance scoring
- RetrievalCache for LRU caching of search results
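For reference, cosine similarity ranking of the kind VectorIndex is described as using can be sketched in plain Python. The function names and the zero-vector handling are assumptions; a real index would likely use a vectorized or database-backed implementation.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: treat zero vectors as having no similarity
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], items: dict, k: int = 3) -> list:
    """Return the keys of the k stored vectors most similar to the query."""
    scored = [(key, cosine_similarity(query, vec)) for key, vec in items.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [key for key, _ in scored[:k]]
```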
Add scope management system for hierarchical memory access:
- ScopeManager with hierarchy: Global → Project → Agent Type → Agent Instance → Session
- ScopePolicy for access control (read, write, inherit permissions)
- ScopeResolver for resolving queries across scope hierarchies with inheritance
- ScopeFilter for filtering scopes by type, project, or agent
- Access control enforcement with parent scope visibility
- Deduplication support during resolution across scopes
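One way to picture resolution across the hierarchy with inheritance and deduplication is the sketch below. The level identifiers, the most-specific-first ordering, and both function names are assumptions made for illustration; they are not taken from ScopeResolver.

```python
# Hierarchy from broadest to most specific, following the list above.
HIERARCHY = ["global", "project", "agent_type", "agent_instance", "session"]


def visible_levels(level: str) -> list:
    """A scope sees itself and every ancestor (assumed inheritance rule)."""
    index = HIERARCHY.index(level)
    return list(reversed(HIERARCHY[: index + 1]))  # most specific first


def resolve(level: str, memories_by_level: dict) -> list:
    """Collect items visible from a level, keeping the first occurrence only."""
    seen = set()
    resolved = []
    for visible in visible_levels(level):
        for item in memories_by_level.get(visible, []):
            if item not in seen:  # deduplicate across scopes
                seen.add(item)
                resolved.append(item)
    return resolved
```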
Implements the episodic memory service for storing and retrieving
agent task execution experiences. This enables learning from past
successes and failures.
Components:
- EpisodicMemory: Main service class combining recording and retrieval
- EpisodeRecorder: Handles episode creation, importance scoring
- EpisodeRetriever: Multiple retrieval strategies (recency, semantic,
outcome, importance, task type)
Key features:
- Records task completions with context, actions, outcomes
- Calculates importance scores based on outcome, duration, lessons
- Semantic search with fallback to recency when embeddings unavailable
- Full CRUD operations with statistics and summarization
- Comprehensive unit tests (50 tests, all passing)
Closes #90
Implements session-scoped ephemeral memory with:
Storage Backends:
- InMemoryStorage: Thread-safe fallback with TTL support and capacity limits
- RedisStorage: Primary storage with connection pooling and JSON serialization
- Auto-fallback from Redis to in-memory when unavailable
WorkingMemory Class:
- Key-value storage with TTL and reserved key protection
- Task state tracking with progress updates
- Scratchpad for reasoning steps with timestamps
- Checkpoint/snapshot support for recovery
- Factory methods for auto-configured storage
Tests:
- 55 unit tests covering all functionality
- Tests for basic ops, TTL, capacity, concurrency
- Tests for task state, scratchpad, checkpoints
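The in-memory fallback with TTL support and capacity limits could look roughly like this sketch. `InMemoryStore`, its parameters, and the choice to raise RuntimeError at capacity are assumptions; the injectable clock exists only to make expiry testable without sleeping.

```python
import threading
import time
from typing import Any, Callable, Optional


class InMemoryStore:
    """Hypothetical thread-safe key-value store with TTL and a key limit."""

    def __init__(self, max_keys: int = 1000,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_keys = max_keys
        self._clock = clock
        self._lock = threading.Lock()
        self._data: dict = {}  # key -> (value, expires_at or None)

    def set(self, key: str, value: Any, ttl_seconds: Optional[float] = None) -> None:
        with self._lock:
            self._purge_expired()
            if key not in self._data and len(self._data) >= self._max_keys:
                raise RuntimeError("capacity limit reached")
            expires_at = None if ttl_seconds is None else self._clock() + ttl_seconds
            self._data[key] = (value, expires_at)

    def get(self, key: str, default: Any = None) -> Any:
        with self._lock:
            entry = self._data.get(key)
            if entry is None:
                return default
            value, expires_at = entry
            if expires_at is not None and self._clock() >= expires_at:
                del self._data[key]  # lazily drop expired entries on read
                return default
            return value

    def _purge_expired(self) -> None:
        # Called with the lock held; frees capacity held by expired keys.
        now = self._clock()
        expired = [k for k, (_, exp) in self._data.items()
                   if exp is not None and now >= exp]
        for k in expired:
            del self._data[k]
```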