Context & Memory emerging medium maturing

Tool Search Lazy Loading

Dynamically load tools via search instead of preloading all available tools to reduce context usage

By Niko

Cite This Pattern
APA
Niko (2026). Tool Search Lazy Loading. In *Awesome Agentic Patterns*. Retrieved March 11, 2026, from https://agentic-patterns.com/patterns/tool-search-lazy-loading
BibTeX
@misc{agentic_patterns_tool-search-lazy-loading,
  title = {Tool Search Lazy Loading},
  author = {Niko},
  year = {2026},
  howpublished = {\url{https://agentic-patterns.com/patterns/tool-search-lazy-loading}},
  note = {Awesome Agentic Patterns}
}
01

Problem

As the Model Context Protocol (MCP) has grown, MCP servers may expose 50+ tools that consume significant context space. Setups with 7+ servers have been documented consuming 67k+ tokens just for tool descriptions. This creates a fundamental scalability issue:

  • Context bloat: Preloading all tool descriptions consumes tokens that could be used for the actual task
  • Latency: More tools means more processing overhead on every request
  • Discovery challenges: Agents must scan through many irrelevant tools to find relevant ones
  • Memory pressure: Large tool catalogs can exceed practical context limits
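
To put the 67k figure in perspective, the short calculation below works out what share of a context window it would occupy. The 200,000-token window is an assumption chosen for illustration; the text above does not state a window size.

```python
# Share of the context window consumed by tool descriptions alone.
# ASSUMPTION: a 200,000-token context window (not stated in the source).
CONTEXT_WINDOW_TOKENS = 200_000
TOOL_DESCRIPTION_TOKENS = 67_000  # figure quoted above for a 7+ server setup


def context_share(tool_tokens: int, window_tokens: int) -> float:
    """Return the fraction of the window taken up by tool descriptions."""
    return tool_tokens / window_tokens


share = context_share(TOOL_DESCRIPTION_TOKENS, CONTEXT_WINDOW_TOKENS)
remaining = CONTEXT_WINDOW_TOKENS - TOOL_DESCRIPTION_TOKENS
print(f"{share:.1%} of the window is used before the task starts")
print(f"{remaining:,} tokens are left for the actual task")
```

Under that assumption, roughly a third of the window is spent before the agent reads a single line of the task.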
02

Solution

Implement Tool Search: a lazy-loading mechanism where tools are dynamically loaded into context via search only when needed, rather than preloaded on initialization.

The pattern works by:

  1. Threshold detection: Monitor when tool descriptions would exceed a context threshold (e.g., 10% of context window)
  2. Search interface: Provide a ToolSearchTool that allows agents to search tool metadata and selectively load tools
  3. Server instructions: Leverage MCP server instruction fields to guide the agent on when to search for specific tools
  4. Agentic search: Use intelligent search rather than basic RAG to find relevant tools

graph TD
    A[Agent Request] --> B{Would tools exceed 10% context?}
    B -->|No| C[Preload All Tools]
    B -->|Yes| D[Enable Tool Search Mode]
    D --> E[Load Tool Metadata Only]
    E --> F[Agent Determines Tool Need]
    F --> G[Search via ToolSearchTool]
    G --> H[Load Specific Tool on Demand]
    H --> I[Execute Tool]
    C --> I

Implementation approach:

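// Pseudocode: the helpers called here (calculate_tool_tokens,
// load_tool_metadata_only, agentically_search, ...) are placeholders.
function initialize_mcp_servers(servers) {
    const total_tool_tokens = calculate_tool_tokens(servers)

    if (total_tool_tokens > CONTEXT_THRESHOLD) {
        // Lazy loading mode: keep only metadata in context and expose a search tool
        const tool_registry = load_tool_metadata_only(servers)
        return ToolSearchTool(tool_registry)
    } else {
        // Traditional preload mode: load every full tool definition up front
        return preload_all_tools(servers)
    }
}

function tool_search(query, tool_registry) {
    // Agentic search - not basic RAG
    const relevant_tools = agentically_search(tool_registry, query)
    // Only the matching tools' full definitions enter the context
    return load_tool_definitions(relevant_tools)
}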
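
As a runnable companion to the pseudocode above, here is a minimal Python sketch of the search-then-load step. Everything in it is hypothetical: the tool names, the `ToolMetadata` type, and the registry contents are invented for illustration. In particular, `keyword_search` is a plain keyword match standing in for the agentic search the pattern recommends, used only so the example runs on its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolMetadata:
    """Lightweight entry kept in context: a name plus a short summary."""
    name: str
    summary: str


# Hypothetical full definitions. In a real client these would come from the
# MCP servers and would be much larger than the metadata entries below.
FULL_DEFINITIONS = {
    "git_commit": {"description": "Create a git commit", "inputSchema": {"type": "object"}},
    "db_query": {"description": "Run a SQL query", "inputSchema": {"type": "object"}},
    "file_read": {"description": "Read a file from disk", "inputSchema": {"type": "object"}},
}

REGISTRY = (
    ToolMetadata("git_commit", "create git commit with message"),
    ToolMetadata("db_query", "run sql query against database"),
    ToolMetadata("file_read", "read file from disk"),
)


def keyword_search(registry, query, limit=2):
    """Rank tools by how many query words appear in their name or summary.

    ASSUMPTION: a simple keyword match, NOT the agentic search the pattern
    calls for. It is a stand-in that keeps the sketch self-contained.
    """
    words = set(query.lower().split())
    scored = []
    for entry in registry:
        haystack = set(entry.summary.lower().split()) | set(entry.name.lower().split("_"))
        score = len(words & haystack)
        if score > 0:
            scored.append((score, entry.name))
    # Highest score first; ties are broken alphabetically for stable results.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:limit]]


def tool_search(query, registry=REGISTRY, definitions=FULL_DEFINITIONS):
    """Return full definitions only for the tools that match the query."""
    return {name: definitions[name] for name in keyword_search(registry, query)}


loaded = tool_search("run sql query")
print(sorted(loaded))
```

Only the definitions returned by `tool_search` would be added to the agent's context; the other entries stay as metadata until a later search asks for them.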
03

How to use it

For MCP server creators:

  • Enhance server instructions: The "server instructions" field becomes more critical with tool search enabled. It helps the agent know when to search for your tools.
  • Descriptive metadata: Include rich descriptions and tags to improve searchability
  • Logical grouping: Organize related tools to make discovery more intuitive
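
The sketch below illustrates what search-friendly server metadata might look like. The server, its tools, and the `group` and `tags` keys are hypothetical and chosen for illustration; they are assumptions, not an official MCP schema. The `instructions` string plays the role of the server instructions field discussed above.

```python
# Hypothetical server description. Key names such as "group" and "tags" are
# illustrative assumptions, not an official schema.
SERVER = {
    "name": "acme-database",
    "instructions": (
        "Search for these tools when the task involves querying, migrating, "
        "or inspecting the Acme SQL database."
    ),
    "tools": [
        {
            "name": "db_query",
            "description": "Run a read-only SQL query and return rows as JSON.",
            "group": "database",
            "tags": ["sql", "select", "read"],
        },
        {
            "name": "db_migrate",
            "description": "Apply pending schema migrations.",
            "group": "database",
            "tags": ["sql", "schema", "migration"],
        },
    ],
}


def search_index_entry(tool: dict) -> dict:
    """Flatten one tool into the compact record a search index might keep."""
    text = " ".join([tool["name"], tool["description"], tool["group"], *tool["tags"]])
    return {"name": tool["name"], "searchable_text": text.lower()}


def tools_in_group(server: dict, group: str) -> list:
    """List the names of all tools the server files under one group."""
    return [tool["name"] for tool in server["tools"] if tool["group"] == group]


index = [search_index_entry(tool) for tool in SERVER["tools"]]
print(index[1]["searchable_text"])
```

Richer descriptions and tags give the client's search more text to match against, while the grouping lets related tools be discovered together.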

For MCP client creators:

  • Implement ToolSearchTool: Provide a search interface for tool discovery
  • Use agentic search: Implement intelligent search rather than basic vector RAG
  • Set appropriate thresholds: Choose context thresholds based on your use case (Claude Code uses 10%)
  • Provide opt-out: Allow users to disable search if they prefer preloading
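
On the client side, the threshold and opt-out could be combined in a small configuration object. The following is a sketch under stated assumptions: `ToolSearchConfig` and `use_tool_search` are hypothetical names, and the 200,000-token window in the usage lines is an illustrative value. The 10% default mirrors the figure mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolSearchConfig:
    """Hypothetical client settings; the names are illustrative only."""
    enabled: bool = True              # set to False to opt out and always preload
    threshold_fraction: float = 0.10  # the 10% threshold mentioned in the text


def use_tool_search(tool_tokens: int, context_window: int, config: ToolSearchConfig) -> bool:
    """Return True when tools should be lazily loaded instead of preloaded."""
    if not config.enabled:
        return False  # the user prefers preloading
    return tool_tokens > config.threshold_fraction * context_window


# ASSUMPTION: a 200,000-token context window, used here only as an example.
WINDOW = 200_000
large_catalog = use_tool_search(67_000, WINDOW, ToolSearchConfig())
small_catalog = use_tool_search(5_000, WINDOW, ToolSearchConfig())
opted_out = use_tool_search(67_000, WINDOW, ToolSearchConfig(enabled=False))
print(large_catalog, small_catalog, opted_out)
```

Keeping the opt-out check ahead of the threshold check means a user's explicit preference always wins, regardless of catalog size.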

Usage scenarios:

  • Development environments with many specialized tools (file operations, git, database access, API clients)
  • Multi-server setups where each server provides domain-specific capabilities
  • Agents that only need a subset of available tools for any given task
04

Trade-offs

  • Pros:

    • Dramatically reduces baseline context usage (from 67k+ tokens of full tool descriptions down to metadata only)
    • Enables scaling to 100+ tools without context issues
    • Faster cold-start times when tools aren't needed
    • Better tool discovery through intentional search
    • Allows more MCP servers to be enabled simultaneously
  • Cons:

    • Adds latency when tools need to be dynamically loaded
    • Requires search infrastructure and metadata management
    • May miss serendipitous tool discovery that happens when browsing full catalogs
    • Server instructions become more critical and require careful authoring
    • Additional complexity in client implementation
06

References