Introduction
The AI agent ecosystem is evolving rapidly, and with it comes a scaling challenge many developers are hitting: context window bloat. When building systems that integrate with multiple MCP (Model Context Protocol) servers, you're forced to load all tool definitions upfront, consuming thousands of tokens just to describe tools that may never be used.
Enter mcp-cli, a lightweight tool that changes how we interact with MCP servers. Before diving into it, though, it's essential to understand the foundational protocol itself and the design trade-offs between the static and dynamic approaches.
Part 1: Understanding MCP (Model Context Protocol)
What is MCP?
The Model Context Protocol (MCP) is an open standard for connecting AI agents and applications to external tools, APIs, and data sources. Think of it as a universal interface that allows:
- AI Agents (Claude, Gemini, etc.) to discover and call tools
- Tool Providers to expose capabilities in a standardized way
- Seamless Integration between diverse systems without custom adapters
- New to MCP? See https://aka.ms/mcp-for-beginners
How MCP Works
MCP operates on a simple premise: define tools with clear schemas and let clients discover and invoke them.
Basic MCP Flow:
```
Tool Provider (MCP Server)
    ↓ [Tool Definitions + Schemas]
AI Agent / Client
    ↓
[Discover Tools] → [Invoke Tools] → [Get Results]
```
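The premise can be sketched in a few lines of Python. This is an illustrative in-process registry, not the actual MCP wire protocol (which runs JSON-RPC over stdio or HTTP); the tool name and schema are hypothetical, loosely mirroring the GitHub server example below.

```python
# Minimal sketch of the MCP premise: tools registered with schemas,
# then discovered and invoked by a client. Illustrative only.
registry = {}

def register_tool(name, schema, handler):
    """Tool provider side: expose a capability with a schema."""
    registry[name] = {"schema": schema, "handler": handler}

def discover_tools():
    """Client side: enumerate available tools and their schemas."""
    return {name: tool["schema"] for name, tool in registry.items()}

def invoke_tool(name, **args):
    """Client side: call a tool by name and get a result back."""
    return registry[name]["handler"](**args)

# Hypothetical tool registration
register_tool(
    "search_repositories",
    {"query": {"type": "string", "required": True}},
    lambda query: [f"repo matching '{query}'"],
)
```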
Example: A GitHub MCP server exposes tools like:
- search_repositories - Search GitHub repos
- create_issue - Create a GitHub issue
- list_pull_requests - List open PRs
Each tool comes with a JSON schema describing its parameters, types, and requirements.
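For instance, the create_issue tool's definition might look roughly like this. This is a hypothetical sketch for illustration; the real GitHub MCP server's schema differs in detail.

```python
# Hypothetical MCP tool definition for create_issue (illustrative,
# not the actual GitHub MCP server schema). MCP tool definitions
# carry a name, description, and a JSON Schema for inputs.
create_issue_schema = {
    "name": "create_issue",
    "description": "Create a GitHub issue in a repository",
    "inputSchema": {
        "type": "object",
        "properties": {
            "owner": {"type": "string", "description": "Repository owner"},
            "repo": {"type": "string", "description": "Repository name"},
            "title": {"type": "string", "description": "Issue title"},
            "body": {"type": "string", "description": "Issue body (markdown)"},
        },
        "required": ["owner", "repo", "title"],
    },
}
```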
The Static Integration Problem
Traditionally, MCP integration works like this:
- Startup: Load ALL tool definitions from all servers
- Context Window: Send every tool schema to the AI model
- Invocation: Model chooses which tool to call
- Execution: Tool is invoked and result returned
The Problem:
When you have multiple MCP servers, the token cost becomes substantial:
| Scenario | Token Count |
|---|---|
| 6 MCP Servers, 60 tools (static loading) | ~47,000 tokens |
| After dynamic discovery | ~400 tokens |
| Token Reduction | 99% 🚀 |
For a production system with 10+ servers exposing 100+ tools, you're burning through thousands of tokens just describing capabilities, leaving less context for actual reasoning and problem-solving.
Key Issues:
- ❌ Reduced effective context length for actual work
- ❌ More frequent context compactions
- ❌ Hard limits on simultaneous MCP servers
- ❌ Higher API costs
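The arithmetic behind the table above is straightforward. The per-item token counts here are assumptions chosen to land near the article's figures, not measurements:

```python
# Back-of-the-envelope token cost: static loading sends every schema
# upfront; dynamic discovery sends names only, fetching schemas on demand.
# Per-item token counts are illustrative assumptions.
TOKENS_PER_TOOL_SCHEMA = 780  # full definition: name, description, parameters
TOKENS_PER_LISTING_ENTRY = 7  # just a tool name in a discovery listing

num_tools = 60

static_cost = num_tools * TOKENS_PER_TOOL_SCHEMA      # every schema upfront
listing_cost = num_tools * TOKENS_PER_LISTING_ENTRY   # names only

print(static_cost)   # 46800 -- the ~47k figure
print(listing_cost)  # 420   -- the ~400 figure
print(f"{(1 - listing_cost / static_cost):.1%} reduction")
```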
Part 2: Enter mcp-cli – Dynamic Context Discovery
What is mcp-cli?
mcp-cli is a lightweight CLI tool (written in Bun, compiled to a single binary) that implements dynamic context discovery for MCP servers. Instead of loading everything upfront, it pulls in information only when needed.
Static vs. Dynamic: The Paradigm Shift
Traditional MCP (Static Context):
```
AI Agent: "Load all tool definitions from all servers"
    ↓
Context Window Bloat ❌
    ↓
Limited space for reasoning
```
mcp-cli (Dynamic Discovery):
```
AI Agent: "What servers exist?"
    ↓ mcp-cli responds
AI Agent: "What are the params for tool X?"
    ↓ mcp-cli responds
AI Agent: "Execute tool X"
    ↓ mcp-cli executes and responds
```
Result: You only pay for information you actually use. ✅
Core Capabilities
mcp-cli provides three primary commands:
1. Discover - What servers and tools exist?
```shell
mcp-cli
```
Lists all configured MCP servers and their tools.
2. Inspect - What does a specific tool do?
```shell
mcp-cli info <server> <tool>
```
Returns the full JSON schema for a tool (parameters, descriptions, types).
3. Execute - Run a tool
```shell
mcp-cli call <server> <tool> '{"arg": "value"}'
```
Executes the tool and returns results.
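An agent harness can drive these three commands programmatically. A minimal Python sketch, assuming mcp-cli is on the PATH; the command names come from above, everything else (function names, the example server/tool) is illustrative:

```python
import json
import subprocess

def build_args(server=None, tool=None, payload=None):
    """Assemble an mcp-cli invocation: bare discovery, `info`, or `call`."""
    if payload is not None:
        return ["mcp-cli", "call", server, tool, json.dumps(payload)]
    if tool is not None:
        return ["mcp-cli", "info", server, tool]
    return ["mcp-cli"]

def run(args):
    """Execute the assembled command and return its stdout."""
    return subprocess.run(args, capture_output=True, text=True, check=True).stdout

# e.g. run(build_args())  -> list servers and tools
#      run(build_args("github", "create_issue", {"title": "Bug"}))
```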
Key Features of mcp-cli
| Feature | Benefit |
|---|---|
| Stdio & HTTP Support | Works with both local and remote MCP servers |
| Connection Pooling | Lazy-spawn daemon avoids repeated startup overhead |
| Tool Filtering | Control which tools are available via allowedTools/disabledTools |
| Glob Searching | Find tools matching patterns: mcp-cli grep "*mail*" |
| AI Agent Ready | Designed for use in system instructions and agent skills |
| Lightweight | Single binary, minimal dependencies |
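The filtering and glob features above amount to simple name matching. A sketch of the semantics in Python; the precedence rules here (allow-list first, then the disabled list) are an assumption, and actual mcp-cli behavior may differ:

```python
from fnmatch import fnmatch

def filter_tools(tools, allowed=None, disabled=None):
    """Apply allowedTools/disabledTools semantics (assumed precedence:
    keep only allowed names first, then drop disabled ones)."""
    if allowed is not None:
        tools = [t for t in tools if t in allowed]
    if disabled is not None:
        tools = [t for t in tools if t not in disabled]
    return tools

def grep_tools(tools, pattern):
    """Glob search over tool names, like `mcp-cli grep "*mail*"`."""
    return [t for t in tools if fnmatch(t, pattern)]

tools = ["send_mail", "list_mailboxes", "create_issue"]
print(grep_tools(tools, "*mail*"))  # ['send_mail', 'list_mailboxes']
```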
Part 3: Detailed Comparison Table
| Aspect | Traditional MCP | mcp-cli |
|---|---|---|
| Protocol | HTTP/REST or Stdio | Stdio/HTTP (via CLI) |
| Context Loading | Static (upfront) | Dynamic (on-demand) |
| Tool Discovery | All at once | Lazy enumeration |
| Schema Inspection | Pre-loaded | On-request |
| Token Usage | High (~47k for 60 tools) | Low (~400 for 60 tools) |
| Best For | Direct server integration | AI agent tool use |
| Implementation | Server-side focus | CLI-side focus |
| Complexity | Medium | Low (CLI handles it) |
| Round Trips | One upfront call | Several small calls (pooled/optimized) |
| Scaling | Limited by context | Unlimited (pay per use) |
| Integration | Custom implementation | Pre-built mcp-cli |
Part 4: When to Use Each Approach
Use Traditional MCP (HTTP Endpoints) when:
- ✅ Building a direct server integration
- ✅ You have few tools (< 10) and don't care about context waste
- ✅ You need full control over HTTP requests/responses
- ✅ You're building a specialized integration (not AI agents)
- ✅ Real-time synchronous calls are required
Use mcp-cli when:
- ✅ Integrating with AI agents (Claude, Gemini, etc.)
- ✅ You have multiple MCP servers (> 2-3)
- ✅ Token efficiency is critical
- ✅ You want a standardized, battle-tested tool
- ✅ You prefer CLI-based automation
- ✅ Connection pooling and lazy loading are beneficial
- ✅ You're building agent skills or system instructions
Conclusion
MCP (Model Context Protocol) defines the standard for tool sharing and discovery. mcp-cli is the practical tool that makes MCP efficient for AI agents by implementing dynamic context discovery.
The fundamental difference:
| | MCP | mcp-cli |
|---|---|---|
| What | The protocol standard | The CLI tool |
| Where | Both server and client | Client-side CLI |
| Problem Solved | Tool standardization | Context bloat |
| Architecture | Protocol | Implementation |
Think of it this way: MCP is the language, and mcp-cli is an interpreter that speaks it fluently.
For AI agent systems, dynamic discovery via mcp-cli is becoming the standard. For direct integrations, traditional MCP HTTP endpoints work fine. The choice depends on your use case, but increasingly, the industry is trending toward mcp-cli for its efficiency and scalability.