feat(context): enhance timeout handling, tenant isolation, and budget management

- Added timeout enforcement for token counting, scoring, and compression with detailed error handling.
- Introduced tenant isolation in context caching using project and agent identifiers.
- Enhanced budget management with stricter checks for critical context overspending and buffer limitations.
- Optimized per-context locking with cleanup to prevent memory leaks in concurrent environments.
- Updated default assembly timeout settings for improved performance and reliability.
- Improved XML escaping in Claude adapter for safety against injection attacks.
- Standardized token estimation using model-specific ratios.
2026-01-04 15:52:50 +01:00
parent 2bea057fb1
commit 1628eacf2b
10 changed files with 271 additions and 175 deletions
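The buffer limitation named in the commit message can be illustrated with a minimal sketch. It is not the code from `allocator.py`: `SketchBudget` and `clamp_adjustment` are hypothetical names, and the `allocations` dictionary is an assumed stand-in for however the real budget object stores per-type allocations. Only `buffer` and `get_allocation` appear in the diff itself.

```python
from dataclasses import dataclass, field


@dataclass
class SketchBudget:
    """Hypothetical stand-in for the real budget object."""

    buffer: int
    # Assumption: per-type allocations kept in a plain dict for illustration.
    allocations: dict[str, int] = field(default_factory=dict)

    def get_allocation(self, context_type: str) -> int:
        return self.allocations.get(context_type, 0)


def clamp_adjustment(budget: SketchBudget, context_type: str, adjustment: int) -> int:
    """Move tokens between the buffer and one context type, clamped both ways."""
    if adjustment > 0:
        # Increase: cannot take more than the buffer holds.
        actual = min(adjustment, budget.buffer)
    else:
        # Decrease: cannot return more than the type currently has.
        actual = max(adjustment, -budget.get_allocation(context_type))

    # Subtracting a negative value adds the returned tokens to the buffer.
    budget.buffer -= actual
    budget.allocations[context_type] = budget.get_allocation(context_type) + actual
    return actual


budget = SketchBudget(buffer=100, allocations={"system": 50})
granted = clamp_adjustment(budget, "system", 150)    # request exceeds the buffer
returned = clamp_adjustment(budget, "system", -500)  # request exceeds the allocation
```

With both clamps in place, the sum of the buffer and all allocations stays constant, which is the invariant the unclamped decrease path did not guarantee.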
--- a/backend/app/services/context/budget/allocator.py
+++ b/backend/app/services/context/budget/allocator.py
@@ -293,14 +293,18 @@ class BudgetAllocator:
        if isinstance(context_type, ContextType):
            context_type = context_type.value

-        # Calculate adjustment (limited by buffer)
+        # Calculate adjustment (limited by buffer for increases, by current allocation for decreases)
        if adjustment > 0:
-            # Taking from buffer
+            # Taking from buffer - limited by available buffer
            actual_adjustment = min(adjustment, budget.buffer)
            budget.buffer -= actual_adjustment
        else:
-            # Returning to buffer
-            actual_adjustment = adjustment
+            # Returning to buffer - limited by current allocation of target type
+            current_allocation = budget.get_allocation(context_type)
+            # Can't return more than current allocation
+            actual_adjustment = max(adjustment, -current_allocation)
+            # Add returned tokens back to buffer (adjustment is negative, so subtract)
+            budget.buffer -= actual_adjustment

        # Apply to target type
        if context_type == "system":