fix(mcp-kb): add input validation, path security, and health checks

Security fixes from deep review:
- Add input validation patterns for project_id, agent_id, collection
- Add path traversal protection for source_path (reject .., null bytes)
- Add error codes (INTERNAL_ERROR) to generic exception handlers
- Handle FieldInfo objects in validation for test robustness
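
The validation items above can be sketched as follows. This is a minimal illustration, not the code from this commit: the pattern, the `ValidationError` class, and both function names are assumptions, and the segment-based `..` check is one possible reading of "reject ..".

```python
import re

# Hypothetical allow-list: letters, digits, underscore, hyphen, 1-128 chars.
# The real patterns used in this commit are not shown in the diff.
ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,128}")


class ValidationError(ValueError):
    """Raised when an input fails validation (hypothetical name)."""


def validate_identifier(name: str, value: str) -> str:
    """Accept project_id / agent_id / collection values matching the allow-list."""
    if not isinstance(value, str) or not ID_PATTERN.fullmatch(value):
        raise ValidationError(f"invalid {name}: {value!r}")
    return value


def validate_source_path(path: str) -> str:
    """Reject null bytes and any '..' path segment."""
    if "\x00" in path:
        raise ValidationError("source_path contains a null byte")
    # Split on both separators so "a\\..\\b" is caught as well as "a/../b".
    if ".." in re.split(r"[\\/]", path):
        raise ValidationError("source_path contains a '..' segment")
    return path
```

Checking segments rather than substrings lets a file name such as `notes..md` through while still rejecting `docs/../secret`; a stricter substring check would reject both.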

Performance fixes:
- Enable concurrent hybrid search with asyncio.gather
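
As a rough illustration of the change above, the sketch below awaits two independent coroutines with `asyncio.gather` instead of one after the other. The two search functions are stand-ins invented for this example, not the engine's real `_semantic_search` and `_keyword_search`.

```python
import asyncio


async def semantic_search(query: str) -> list[str]:
    # Stand-in for a vector query (hypothetical; sleeps to simulate I/O).
    await asyncio.sleep(0.05)
    return [f"semantic:{query}"]


async def keyword_search(query: str) -> list[str]:
    # Stand-in for a full-text query (hypothetical; sleeps to simulate I/O).
    await asyncio.sleep(0.05)
    return [f"keyword:{query}"]


async def hybrid_search(query: str) -> tuple[list[str], list[str]]:
    # gather schedules both coroutines concurrently and returns their
    # results in argument order, so the unpacking below is safe.
    semantic, keyword = await asyncio.gather(
        semantic_search(query),
        keyword_search(query),
    )
    return semantic, keyword


if __name__ == "__main__":
    print(asyncio.run(hybrid_search("rrf")))
```

With the default `return_exceptions=False`, the first exception raised by either coroutine propagates to the caller, so existing error handling around the two awaits keeps working.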

Health endpoint improvements:
- Check all dependencies (database, Redis, LLM Gateway)
- Return degraded/unhealthy status based on dependency health
- Update tests for new health check response structure
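
One way to derive an overall status from per-dependency checks is sketched below. The rule used here (a failed critical dependency means `unhealthy`, any other failure means `degraded`) and the choice of `database` as the only critical dependency are assumptions made for illustration; the commit's actual rule and response structure are not visible in this diff.

```python
import asyncio
from typing import Awaitable, Callable, Dict

Check = Callable[[], Awaitable[bool]]


async def run_check(check: Check) -> bool:
    """Run one dependency check; any exception counts as a failure."""
    try:
        return bool(await check())
    except Exception:
        return False


async def health_status(
    checks: Dict[str, Check],
    critical: frozenset = frozenset({"database"}),  # assumed, not from the commit
) -> dict:
    names = list(checks)
    # Probe all dependencies concurrently.
    results = await asyncio.gather(*(run_check(checks[n]) for n in names))
    deps = dict(zip(names, results))
    if all(deps.values()):
        status = "healthy"
    elif any(not ok for name, ok in deps.items() if name in critical):
        status = "unhealthy"
    else:
        status = "degraded"
    return {"status": status, "dependencies": deps}
```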

All 139 tests pass.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-04 01:18:50 +01:00
parent cd7a9ccbdf
commit 6bb376a336
3 changed files with 250 additions and 14 deletions


@@ -5,6 +5,7 @@ Provides semantic (vector), keyword (full-text), and hybrid search
capabilities over the knowledge base.
"""
+import asyncio
import logging
import time
@@ -158,6 +159,7 @@ class SearchEngine:
Execute hybrid search combining semantic and keyword.
Uses Reciprocal Rank Fusion (RRF) for result combination.
+Executes both searches concurrently for better performance.
"""
# Execute both searches with higher limits for fusion
fusion_limit = min(request.limit * 2, 100)
@@ -187,9 +189,11 @@ class SearchEngine:
include_metadata=request.include_metadata,
)
-# Execute searches
-semantic_results = await self._semantic_search(semantic_request)
-keyword_results = await self._keyword_search(keyword_request)
+# Execute searches concurrently for better performance
+semantic_results, keyword_results = await asyncio.gather(
+    self._semantic_search(semantic_request),
+    self._keyword_search(keyword_request),
+)
# Fuse results using RRF
fused = self._reciprocal_rank_fusion(