Action Caching & Replay Pattern

Problem

LLM-based agent execution is expensive in both cost and latency, and it is non-deterministic. Running the same workflow multiple times can yield different results, and every run incurs the LLM cost again.

This creates several issues:

Cost explosion: Every workflow run burns LLM tokens even for identical tasks
Non-determinism: Same input produces different outputs across runs
No regression testing: Impossible to verify fixes don't break existing workflows
Slow iteration: Can't quickly test changes without paying LLM costs
No CI/CD integration: Automated testing of agent workflows is impractical

Solution

Record every action during execution with precise metadata (XPaths, frame indices, execution details), enabling deterministic replay without LLM calls. The cache captures enough information to replay actions even when page structure changes slightly.
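
As a rough illustration of what such a cache entry might hold, the sketch below records each action with its XPath and frame index and round-trips the cache through JSON. The `CachedAction` and `ActionCache` names and their fields are assumptions made for this example, not the schema of any particular implementation.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import List, Optional


@dataclass
class CachedAction:
    """One recorded step. Field names are illustrative, not a real schema."""
    step: int                    # position of the action in the workflow
    action: str                  # e.g. "click" or "fill"
    xpath: str                   # element locator captured at record time
    frame_index: int = 0         # frame the element lived in (0 = main frame)
    value: Optional[str] = None  # text typed or option selected, if any


@dataclass
class ActionCache:
    task: str
    actions: List[CachedAction] = field(default_factory=list)

    def record(self, action: str, xpath: str, frame_index: int = 0,
               value: Optional[str] = None) -> CachedAction:
        """Append one executed action to the cache."""
        entry = CachedAction(len(self.actions), action, xpath, frame_index, value)
        self.actions.append(entry)
        return entry

    def to_json(self) -> str:
        """Serialize the whole cache so it can be stored on disk."""
        return json.dumps(asdict(self), indent=2)

    @classmethod
    def from_json(cls, raw: str) -> "ActionCache":
        """Rebuild a cache from its serialized form."""
        data = json.loads(raw)
        return cls(task=data["task"],
                   actions=[CachedAction(**a) for a in data["actions"]])


# Record two steps, then restore them from the serialized form.
cache = ActionCache(task="log in and open settings")
cache.record("fill", "//input[@id='email']", value="user@example.com")
cache.record("click", "//button[@type='submit']")
restored = ActionCache.from_json(cache.to_json())
```

Keeping the entry a plain, serializable record means the same file can serve for replay, debugging, and diffing in code review.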

This pattern borrows the idea of experience replay from reinforcement learning, where agents store past experience and reuse it rather than exploring anew each time.

How to use it
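
A minimal sketch of the record-then-replay loop, assuming a browser layer that reports whether each action succeeded. The names `run_workflow`, `plan_with_llm`, and `execute` are hypothetical and stand in for whatever the agent framework provides; they are not HyperAgent's API.

```python
from typing import Callable, Dict, List

Action = Dict[str, str]  # e.g. {"action": "click", "xpath": "//button"}


def run_workflow(
    task: str,
    cache: Dict[str, List[Action]],
    plan_with_llm: Callable[[str], List[Action]],
    execute: Callable[[Action], bool],
) -> str:
    """Replay cached actions if present; otherwise plan with the LLM and record.

    Returns "replayed", "recorded", or "fallback" to show which path ran.
    """
    cached = cache.get(task)
    if cached is not None:
        if all(execute(a) for a in cached):
            return "replayed"  # no LLM call was needed
        # A cached action failed: re-plan and overwrite the stale entry.
        # Simplification: assumes the workflow is safe to restart from the top.
        outcome = "fallback"
    else:
        outcome = "recorded"

    actions = plan_with_llm(task)
    for a in actions:
        if not execute(a):
            raise RuntimeError(f"action failed even after LLM planning: {a}")
    cache[task] = actions
    return outcome


# Stand-ins for the LLM planner and the browser, for illustration only.
llm_calls: List[str] = []
page = {"//button[@id='buy']"}  # XPaths that currently resolve on the "page"


def fake_llm(task: str) -> List[Action]:
    llm_calls.append(task)
    return [{"action": "click", "xpath": sorted(page)[0]}]


def fake_execute(action: Action) -> bool:
    return action["xpath"] in page


cache: Dict[str, List[Action]] = {}
first = run_workflow("buy item", cache, fake_llm, fake_execute)   # plans and records
second = run_workflow("buy item", cache, fake_llm, fake_execute)  # replays from cache
page.clear()
page.add("//button[@id='purchase']")                              # simulated UI change
third = run_workflow("buy item", cache, fake_llm, fake_execute)   # falls back to the LLM
```

In this sketch the LLM is called only on the first run and after the simulated UI change; the second run is served entirely from the cache.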

Trade-offs

Pros:

Dramatic cost reduction: Replay costs are near zero (no LLM calls) when the cached XPaths still resolve. Documented cost reductions range from 43% to 97% across implementations, and cache hit rates of 85% or higher indicate the cache is working well
Deterministic regression testing: Verify fixes don't break existing workflows
Performance: Cached replays are 10-100x faster than LLM execution
Debugging: Cache provides complete execution history
Script generation: Export workflows as standalone automation scripts
Graceful degradation: LLM fallback handles page structure changes
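
The relationship between hit rate and savings can be sketched with simple arithmetic. The model below is a simplification: it assumes a failed replay costs one full LLM run and nothing more, and the dollar figures are hypothetical inputs, not measurements.

```python
def expected_cost_per_run(hit_rate: float, llm_cost: float,
                          replay_cost: float = 0.0) -> float:
    """Average cost of one run when a fraction `hit_rate` is served by replay."""
    return hit_rate * replay_cost + (1.0 - hit_rate) * llm_cost


def cost_reduction(hit_rate: float, llm_cost: float,
                   replay_cost: float = 0.0) -> float:
    """Fraction of the always-call-the-LLM cost that caching saves."""
    return 1.0 - expected_cost_per_run(hit_rate, llm_cost, replay_cost) / llm_cost


# Hypothetical inputs: $0.20 per LLM-driven run, free replays, 85% hit rate.
saving = cost_reduction(hit_rate=0.85, llm_cost=0.20)
```

With free replays the saving equals the hit rate, so real-world savings fall below it once replay overhead and failed-replay retries are counted.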

Cons:

Cache management overhead: Need to store, version, and invalidate caches
Brittle to significant UI changes: Major redesigns break XPaths
Initial LLM cost: First run still requires full LLM execution
Storage complexity: Caches accumulate and need cleanup
Not universal: Only works for deterministic workflows

Mitigation strategies:

Implement cache versioning and automatic expiration
Use LLM fallback with cache update for failed replays
Store caches alongside workflow definitions in version control
Set up automated cache validation in CI pipelines
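
Versioning and expiration could be sketched as a small validity check that runs before any replay. The constants, the `CacheHeader` fields, and the choice to fingerprint the workflow definition with SHA-256 are assumptions made for this example.

```python
import hashlib
import time
from dataclasses import dataclass
from typing import Optional

CACHE_SCHEMA_VERSION = 2         # hypothetical; bump when the entry format changes
MAX_AGE_SECONDS = 7 * 24 * 3600  # hypothetical one-week expiry


def workflow_fingerprint(definition: str) -> str:
    """Hash the workflow definition so that editing it invalidates old caches."""
    return hashlib.sha256(definition.encode("utf-8")).hexdigest()


@dataclass
class CacheHeader:
    schema_version: int
    fingerprint: str
    created_at: float  # Unix timestamp of the recording run


def is_cache_valid(header: CacheHeader, definition: str,
                   now: Optional[float] = None) -> bool:
    """A cache is usable only if its format, workflow, and age all check out."""
    now = time.time() if now is None else now
    return (
        header.schema_version == CACHE_SCHEMA_VERSION
        and header.fingerprint == workflow_fingerprint(definition)
        and now - header.created_at <= MAX_AGE_SECONDS
    )


definition = "open shop; add item to cart; check out"
header = CacheHeader(CACHE_SCHEMA_VERSION, workflow_fingerprint(definition),
                     created_at=1_000.0)
fresh = is_cache_valid(header, definition, now=1_060.0)
edited = is_cache_valid(header, definition + "; apply coupon", now=1_060.0)
expired = is_cache_valid(header, definition, now=1_000.0 + MAX_AGE_SECONDS + 1)
```

A CI job could run the same check over every stored cache and fail the build, or trigger a re-recording run, when a cache is no longer valid.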

References

HyperAgent GitHub Repository - Original implementation
HyperAgent Documentation - Usage guide
Cost-Efficient Serving of LLM Agents via Test-Time Plan Caching (Zhang et al., 2025) - Academic foundation showing 46.62% average cost reduction
Docker Cagent - Proxy-and-cassette model for deterministic agent testing
Related patterns: Structured Output Specification, Schema Validation Retry