feat(context): Phase 2 - Token Budget Management #80

Closed
opened 2026-01-04 00:51:16 +00:00 by cardosofelipe · 0 comments

Overview

Implement accurate token counting and budget allocation for context management.

Parent Issue

  • #61: Context Management Engine

Implementation Tasks

1. Create budget/calculator.py

  • Create TokenCalculator class
  • Integrate with the LLM Gateway's count_tokens tool via MCPClientManager
  • Implement in-memory caching for token counts
  • Handle edge cases (empty text, very long text)
class TokenCalculator:
    def __init__(self, mcp_manager: MCPClientManager):
        self.mcp = mcp_manager
        self._cache: dict[str, int] = {}

    async def count_tokens(
        self,
        text: str,
        model: str | None = None
    ) -> int:
        """Count tokens using LLM Gateway."""
        ...

    async def count_batch(
        self,
        texts: list[str],
        model: str | None = None
    ) -> list[int]:
        """Count tokens for multiple texts in parallel."""
        ...
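
As a starting point, here is a hedged sketch of how this skeleton could be filled in. It assumes MCPClientManager exposes an async call_tool(name, arguments) method and that the gateway's count_tokens tool returns a payload like {"count": int}; both are assumptions to verify against the real gateway contract. The cache key includes the model, since the same text tokenizes differently across models.

# Hedged sketch only; call_tool and its response shape are assumptions.
import asyncio
import hashlib

class TokenCalculator:
    def __init__(self, mcp_manager: "MCPClientManager"):  # project-local type
        self.mcp = mcp_manager
        self._cache: dict[str, int] = {}

    def _key(self, text: str, model: str | None) -> str:
        # Hash the text so very long inputs do not bloat cache keys,
        # and include the model because tokenization is model-specific.
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        return f"{model or 'default'}:{digest}"

    async def count_tokens(self, text: str, model: str | None = None) -> int:
        """Count tokens using LLM Gateway, with in-memory caching."""
        if not text:
            return 0  # edge case: empty text needs no gateway round-trip
        key = self._key(text, model)
        if key not in self._cache:
            result = await self.mcp.call_tool(  # assumed MCPClientManager API
                "count_tokens", {"text": text, "model": model}
            )
            self._cache[key] = int(result["count"])  # assumed response shape
        return self._cache[key]

    async def count_batch(
        self, texts: list[str], model: str | None = None
    ) -> list[int]:
        """Count tokens for multiple texts in parallel.

        Note: identical texts within one batch may still race to the
        gateway before the first result lands in the cache.
        """
        return list(await asyncio.gather(
            *(self.count_tokens(t, model) for t in texts)
        ))

A size cap or LRU eviction on _cache may be worth adding if contexts churn heavily.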

2. Create budget/allocator.py

  • Create TokenBudget dataclass
  • Implement budget partitioning from percentages
  • Implement remaining() method
  • Implement can_fit() method
  • Implement allocate() method
  • Implement to_dict() for reporting
@dataclass
class TokenBudget:
    total: int
    system: int
    task: int
    knowledge: int
    conversation: int
    tools: int
    response_reserve: int
    buffer: int

    used: dict[ContextType, int] = field(default_factory=dict)

    @classmethod
    def from_settings(
        cls,
        total_tokens: int,
        settings: ContextSettings
    ) -> "TokenBudget":
        """Create budget from settings percentages."""
        ...
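
A hedged sketch of the allocation logic follows. The ContextSettings percentage field names (system_pct and friends) and the assumption that ContextType enum values match the dataclass field names are illustrative guesses, not the real Phase 1 schema.

# Hedged sketch; settings field names and the ContextType/field-name
# correspondence are assumptions to verify against Phase 1 code.
from dataclasses import asdict, dataclass, field

@dataclass
class TokenBudget:
    total: int
    system: int
    task: int
    knowledge: int
    conversation: int
    tools: int
    response_reserve: int
    buffer: int
    used: dict["ContextType", int] = field(default_factory=dict)

    @classmethod
    def from_settings(
        cls, total_tokens: int, settings: "ContextSettings"
    ) -> "TokenBudget":
        """Partition total_tokens according to settings percentages."""
        def share(pct: float) -> int:
            return int(total_tokens * pct / 100)  # truncate; slack stays unallocated
        return cls(
            total=total_tokens,
            system=share(settings.system_pct),  # field names assumed
            task=share(settings.task_pct),
            knowledge=share(settings.knowledge_pct),
            conversation=share(settings.conversation_pct),
            tools=share(settings.tools_pct),
            response_reserve=share(settings.response_reserve_pct),
            buffer=share(settings.buffer_pct),
        )

    def remaining(self, section: "ContextType") -> int:
        """Tokens still free in a section's partition."""
        # Assumes ContextType values ("system", "task", ...) name the fields above.
        return getattr(self, section.value) - self.used.get(section, 0)

    def can_fit(self, section: "ContextType", tokens: int) -> bool:
        return tokens <= self.remaining(section)

    def allocate(self, section: "ContextType", tokens: int) -> bool:
        """Record usage if it fits; leave the budget untouched otherwise."""
        if not self.can_fit(section, tokens):
            return False
        self.used[section] = self.used.get(section, 0) + tokens
        return True

    def to_dict(self) -> dict:
        """Flat snapshot for logging and reporting."""
        snapshot = asdict(self)
        snapshot["used"] = {k.value: v for k, v in self.used.items()}
        return snapshot

One design option shown here: allocate() fails closed, so the assembly loop can try an allocation and fall back to trimming when it returns False.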

3. Create budget/__init__.py

  • Export TokenCalculator and TokenBudget
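
For the package init, a plain re-export of the two public names is likely all that is needed:

from .calculator import TokenCalculator
from .allocator import TokenBudget

__all__ = ["TokenCalculator", "TokenBudget"]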

Files to Create

backend/app/services/context/budget/
├── __init__.py
├── calculator.py
└── allocator.py

Acceptance Criteria

  • Token counting via the LLM Gateway returns correct counts, including for empty and very long text
  • Budget allocation matches settings percentages
  • can_fit() and allocate() enforce per-section limits correctly
  • In-memory caching prevents duplicate API calls for repeated text
  • Unit tests with mocked LLM Gateway
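
For the last criterion, a minimal test sketch, assuming the hypothetical call_tool contract from the calculator sketch above and the pytest-asyncio plugin:

# Assumes pytest-asyncio and the hypothetical call_tool contract above.
from unittest.mock import AsyncMock

import pytest

from app.services.context.budget import TokenCalculator  # per the tree above

@pytest.mark.asyncio
async def test_count_tokens_hits_gateway_once_per_unique_text():
    mcp = AsyncMock()
    mcp.call_tool.return_value = {"count": 7}
    calc = TokenCalculator(mcp)

    assert await calc.count_tokens("hello world") == 7
    assert await calc.count_tokens("hello world") == 7  # served from cache

    mcp.call_tool.assert_awaited_once()  # caching prevented a duplicate gateway call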

Dependencies

  • #69 (Phase 1 - Foundation)

Labels

phase-2, context, backend
