Category: Tool Use & Environment · Status: Established

Code-Over-API Pattern

Agents write and execute code that processes data in the execution environment instead of making direct API calls, dramatically reducing token consumption by keeping intermediate data out of the context window (150K → 2K tokens in reported cases).

By Nikola Balic (@nibzard)

Cite This Pattern
APA
Nikola Balic (@nibzard) (2026). Code-Over-API Pattern. In *Awesome Agentic Patterns*. Retrieved March 11, 2026, from https://agentic-patterns.com/patterns/code-over-api-pattern
BibTeX
@misc{agentic_patterns_code-over-api-pattern,
  title = {Code-Over-API Pattern},
  author = {Nikola Balic (@nibzard)},
  year = {2026},
  howpublished = {\url{https://agentic-patterns.com/patterns/code-over-api-pattern}},
  note = {Awesome Agentic Patterns}
}
01

Problem

When agents make direct API or tool calls, all intermediate data must flow through the model's context window. For data-heavy workflows (processing spreadsheets, filtering logs, transforming datasets), this creates massive token consumption and increased latency. A workflow that fetches 10,000 spreadsheet rows and filters them can easily consume 150,000+ tokens just moving data through the context.
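
The headline numbers can be sanity-checked with back-of-envelope arithmetic (the per-row token cost below is an assumption for illustration, not a measurement):

```python
# Illustrative estimate: a short spreadsheet row serialized as JSON
# easily costs on the order of 15 tokens once it enters the context.
ROWS = 10_000
TOKENS_PER_ROW = 15  # assumed average, not a measured figure

tokens_through_context = ROWS * TOKENS_PER_ROW
print(tokens_through_context)  # 150000 tokens just to move the raw data
```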

02

Solution

Instead of making direct tool calls, agents write and execute code that interacts with tools. Data processing, filtering, and transformation happens in the execution environment, with only results flowing back to the model context.

Core insight: LLMs are better at writing code to call APIs than at calling APIs directly, because their training data includes millions of open-source code repositories full of exactly such calls.

Direct API approach (high token cost):

# Agent makes a tool call; the full response lands in its context
rows = api_call("spreadsheet.getRows", sheet_id="abc123")
# All 10,000 rows flow through the context window → ~150K tokens

# Agent filters in context, re-emitting the data as it reasons
filtered = [row for row in rows if row.status == "active"]
# Every processing step costs more tokens

return filtered

Code-Over-API approach (low token cost):

# Agent writes code that executes in environment
def process_spreadsheet():
    # Tool call happens in execution environment
    rows = spreadsheet.getRows(sheet_id="abc123")

    # Filtering happens in code, not in context
    filtered = [row for row in rows if row.status == "active"]

    # Only log summary for agent visibility
    print(f"Processed {len(rows)} rows, found {len(filtered)} active")
    print(f"First 5 active rows: {filtered[:5]}")

    return filtered

result = process_spreadsheet()
# Only summary and sample flow to context → ~2K tokens

The agent sees the log output and return value, but the full dataset never enters its context window.
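
One way to enforce that boundary is a small gateway that caps whatever the executed code produces before it reaches the model. A minimal sketch with illustrative names and limits (`to_context` and `max_chars` are not from any specific framework):

```python
def to_context(stdout: str, result, max_chars: int = 2000) -> str:
    """Build the only payload the model ever sees: captured logs
    plus a truncated repr of the return value."""
    summary = repr(result)
    if len(summary) > max_chars:
        summary = summary[:max_chars] + f"... [truncated, {len(summary)} chars total]"
    return f"--- stdout ---\n{stdout}--- result ---\n{summary}"

# The full dataset stays in the execution environment; the model
# receives at most a few KB regardless of how large the data is.
payload = to_context("Processed 10000 rows, found 3334 active\n",
                     list(range(100_000)))
print(len(payload) < 3000)  # True
```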

03

How to use it

Best for:

  • Data-heavy workflows (spreadsheets, databases, logs)
  • Multi-step transformations or aggregations
  • Workflows with intermediate results that don't need model inspection
  • Cost-sensitive applications where token usage matters

Prerequisites:

  • Secure code execution environment with sandboxing
  • Access to tools/APIs from within the execution environment
  • Resource limits (CPU, memory, time) to prevent runaway execution
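
Given those prerequisites, a minimal sandbox can be sketched with the standard library alone. This is Linux-only and illustrative (`run_sandboxed` and its limits are chosen for the example; real deployments layer containers or VMs on top):

```python
import resource
import subprocess
import sys

def run_sandboxed(code: str, timeout_s: int = 10,
                  mem_bytes: int = 512 * 2**20) -> str:
    """Run agent-generated code in a child process with CPU, memory,
    and wall-clock limits (Linux-only sketch, not full isolation)."""
    def set_limits():
        # Applied in the child just before exec: cap CPU seconds and
        # address space so runaway code is killed by the kernel.
        resource.setrlimit(resource.RLIMIT_CPU, (timeout_s, timeout_s))
        resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))

    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True,
        timeout=timeout_s,       # wall-clock limit enforced by the parent
        preexec_fn=set_limits,   # CPU + memory limits in the child
    )
    return proc.stdout

print(run_sandboxed("print(sum(range(1000)))"))  # 499500
```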

Implementation pattern:

  1. Agent analyzes the task and determines data processing needs
  2. Agent writes code that:
     • Calls tools/APIs within the execution environment
     • Performs filtering, transformation, and aggregation in code
     • Logs only summaries or samples for visibility
     • Returns final results
  3. Execution environment runs the code with tool access
  4. Only logs and return values flow back to the agent context

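
The steps above can be sketched end to end. Here `get_rows` is a hypothetical tool standing in for a real spreadsheet API, and the agent-written code is a fixed string for illustration:

```python
import io
from contextlib import redirect_stdout

# Hypothetical tool exposed inside the execution environment,
# standing in for a real spreadsheet API.
def get_rows(sheet_id):
    return [{"id": i, "status": "active" if i % 3 == 0 else "archived"}
            for i in range(10_000)]

# Code the agent would write (a fixed string here for illustration):
agent_code = """
rows = get_rows("abc123")
filtered = [r for r in rows if r["status"] == "active"]
print(f"Processed {len(rows)} rows, found {len(filtered)} active")
"""

def execute(code: str) -> str:
    """Steps 3-4: run the code with tool access, return only the logs."""
    buf = io.StringIO()
    with redirect_stdout(buf):
        exec(code, {"get_rows": get_rows})  # tools injected into namespace
    return buf.getvalue()                   # only this reaches the model

print(execute(agent_code))
# → Processed 10000 rows, found 3334 active
```

The 10,000-row list exists only inside `execute`; the model's context receives a single summary line.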
04

Trade-offs

Pros:

  • Dramatic token reduction (150K → 2K in reported cases)
  • Lower latency (fewer large context API calls)
  • Natural fit for data processing tasks
  • Intermediate data stays contained in execution environment

Cons:

  • Requires secure code execution infrastructure
  • More complex setup than direct tool calls
  • Agents must be capable of writing correct code
  • Debugging can be harder (errors happen in execution, not in context)
  • Needs monitoring, resource limits, and sandboxing

Operational requirements:

  • Sandboxed execution environment (containers, VMs, V8 isolates, WebAssembly)
  • Resource limits (CPU, memory, execution time)
  • Monitoring and logging infrastructure
  • Error handling and recovery mechanisms

Execution environment options:

  • V8 isolates: Millisecond startup, minimal memory, strong isolation (Cloudflare Code Mode)
  • Containers: 2-5 second startup, full language flexibility (Modal, Docker)
  • VMs: Complete isolation for destructive operations (Cognition's Devin)
05

References

  • Anthropic Engineering: Code Execution with MCP (2025)
  • Cloudflare: Code Mode - V8 isolate-based execution (2025)
  • Beurer-Kellner et al.: Code-Then-Execute security framework (2025)
  • Related: Code-Then-Execute Pattern (focuses on security/formal verification vs token optimization)