Context Manager
Manage your AI context window efficiently. Estimate tokens, plan budgets, and optimize what stays in context for peak LLM performance.
Get the CLI Tool
Run the context manager locally as an MCP server, or try the interactive demo below.
npx @clinetools/context-mgr
- Token estimation — approximate counts for code and prose
- Priority-based context planning with 3 strategies
- Smart trimming — truncate, summarize, or drop blocks
- Actionable recommendations for context optimization
- Zero config — just run with npx
How to Use It
Three ways to manage your AI context window — pick the one that fits your workflow.
Try Online
Use the interactive demo below to add content blocks and plan your context budget — no install needed.
Use via CLI
Run as a local MCP server and connect any MCP-compatible client.
Add to Cline / Claude Code
Add the tool to your MCP settings so your AI assistant can manage context directly.
MCP Client Configuration (Cline)
{
  "mcpServers": {
    "context-mgr": {
      "command": "npx",
      "args": ["@clinetools/context-mgr"]
    }
  }
}
Claude Code Configuration
# In your project's .mcp.json:
{
  "mcpServers": {
    "context-mgr": {
      "command": "npx",
      "args": ["@clinetools/context-mgr"]
    }
  }
}
Example Prompt: Estimate Token Usage
// Prompt to your AI agent:
"Estimate how many tokens this file uses"
// The agent calls:
estimate_tokens({
  text: "...file contents..."
})
// Returns: { tokens: 1250, characters: 5000, words: 820 }
Example Prompt: Plan Context Window
// Prompt to your AI agent:
"I have 8K tokens left. Plan which files and
conversation history to keep in context."
// The agent calls:
plan_context({
  blocks: '[{"id":"system","content":"...","priority":"critical","category":"system"}, ...]',
  budgetTokens: 8192,
  strategy: "balanced"
})
Try It Online
Add content blocks, set a token budget, and see how the context planner decides what to keep.
Context Window Planner
Add content blocks and plan your context budget
Add content blocks and click Plan Context to see the optimization plan.
Context Plan
Understanding Context Management
Key concepts for managing LLM context windows effectively.
Context Window Limits
Every LLM has a maximum context length, ranging from 4K tokens to 1M or more. Exceeding it causes truncation or errors. Know your model's limit: GPT-4o supports 128K tokens, Claude models 200K, and Gemini 1.5 Pro up to 1M.
Token Estimation
Tokens are not words. Code produces more tokens per character than prose due to symbols and operators. A rough estimate: ~4 chars/token for English, ~3.5 chars/token for code.
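The chars-per-token heuristic above can be sketched in a few lines. This is an illustrative approximation, not the tool's actual implementation; the `estimateTokens` name and the `kind` parameter are assumptions for the example:

```javascript
// Rough token estimate from character count:
// ~4 chars/token for English prose, ~3.5 chars/token for code.
function estimateTokens(text, kind = "prose") {
  const charsPerToken = kind === "code" ? 3.5 : 4;
  return Math.ceil(text.length / charsPerToken);
}

// 4000 chars of prose and 3500 chars of code both land near 1000 tokens.
console.log(estimateTokens("a".repeat(4000)));         // 1000
console.log(estimateTokens("a".repeat(3500), "code")); // 1000
```

For exact counts you would use the model's own tokenizer; a heuristic like this is only for quick budgeting.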
Priority-Based Trimming
Not all context is equal. System prompts are critical, recent conversation is high-priority, and old reference material can often be dropped. Smart trimming keeps the most valuable content.
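One simple way to realize this idea is a greedy pass: sort blocks by priority and keep each one that still fits the budget. A minimal sketch, assuming a fixed priority ranking and pre-computed per-block token counts (not the tool's internal algorithm):

```javascript
// Lower rank = more important; critical blocks are considered first.
const RANK = { critical: 0, high: 1, medium: 2, low: 3 };

// Keep blocks in priority order until the token budget is spent.
function trimToBudget(blocks, budgetTokens) {
  const kept = [];
  let used = 0;
  for (const block of [...blocks].sort((a, b) => RANK[a.priority] - RANK[b.priority])) {
    if (used + block.tokens <= budgetTokens) {
      kept.push(block);
      used += block.tokens;
    }
  }
  return kept;
}

const plan = trimToBudget([
  { id: "system", priority: "critical", tokens: 500 },
  { id: "docs",   priority: "low",      tokens: 6000 },
  { id: "recent", priority: "high",     tokens: 3000 },
], 8192);
// Keeps "system" and "recent"; "docs" no longer fits in the budget.
```

A greedy pass like this never drops a critical block in favor of a lower-priority one, at the cost of sometimes leaving budget unused.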
Conversation Compression
Long conversations accumulate tokens fast. Strategies: summarize older turns, drop low-value exchanges, keep only the most recent N turns, or compress reference material into bullet points.
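The "keep only the most recent N turns" strategy can be sketched as follows. This is a hypothetical illustration: the `compressHistory` helper is not part of the tool, and the summary string stands in for a real LLM summarization call:

```javascript
// Replace all but the last N turns with a single placeholder summary turn.
function compressHistory(turns, keepRecent = 4) {
  if (turns.length <= keepRecent) return turns;
  const older = turns.slice(0, -keepRecent);
  const recent = turns.slice(-keepRecent);
  const summary = {
    role: "system",
    // In practice this content would come from an LLM summarization pass.
    content: `Summary of ${older.length} earlier turns.`,
  };
  return [summary, ...recent];
}

const turns = Array.from({ length: 10 }, (_, i) => ({ role: "user", content: `turn ${i}` }));
const compressed = compressHistory(turns);
// 10 turns collapse to 1 summary + the 4 most recent turns.
```

Keeping the summary as a single system-role turn preserves continuity without carrying every token of the old exchanges.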
Manage Every Context
Our Pro plan includes automated context management across all your AI pipelines.
View Plans