01
Problem
Off-the-shelf coding agents (e.g., Devin, Claude Code, Cursor) are either:
- Too generic - Not deeply integrated with company-specific dev environments, tools, and workflows
- Vendor-locked - Tightly coupled to one model provider, limiting flexibility and creating dependency
- Limited context - Cannot access internal infrastructure, private repos, or company-specific tooling
Companies need coding agents that:
- Work within their specific development environment
- Can iterate with closed feedback loops (compiler, linter, tests)
- Provide real-time visibility into agent progress
- Are model-agnostic to switch providers as needed
02
Solution
Build a custom background agent that runs in a sandboxed environment identical to developers, with:
1. Sandboxed Execution Environment
- Use infrastructure like Modal for ephemeral, sandboxed dev environments
- Agent runs with same context: codebase, dependencies, dev tools
- Isolated from production but mirroring development setup
2. Real-Time Communication Layer
- WebSocket connection streams stdout/stderr to client
- User sees agent progress in real-time (not polling)
- Two-way communication for prompts and status updates
3. Closed Feedback Loop
- Agent iterates autonomously with machine-readable feedback
- Compiler errors, linter warnings, test failures guide iterations
- Not 1-shot implementation - but iterative refinement
4. Model-Agnostic Architecture
- Support multiple frontier models via pluggable interface
- Switch between providers without rewriting infrastructure
- Leverage best model for specific task type
5. Company-Specific Integration
- Deep integration with internal tooling, scripts, and workflows
- Access to private repos, internal APIs, documentation
- Customized to team's specific development practices
6. Security Best Practices
- Network isolation with egress lockdown (default-deny outbound rules)
- Tool authorization with pattern-based allow/deny lists
- Execution limits: timeouts, resource caps, privilege restrictions
- Comprehensive audit logging for all agent actions
03
How to use it
Architecture Components:
- Sandbox provider: Modal, sprites.dev, or custom container orchestration
- WebSocket server: For real-time bidirectional communication
- Model abstraction layer: Interface supporting multiple LLM providers
- Feedback loop integrations: Parser for compiler/linter/test output
Implementation Steps:
- Create a sandbox service that spins up isolated dev environments
- Build WebSocket layer for prompt submission and progress streaming
- Implement model-agnostic agent interface
- Add feedback loop integrations (test parsing, error ingestion)
- Connect to version control for branch/PR creation
Security Considerations:
- Configure egress firewall rules (allow-list required destinations only)
- Implement deny-by-default tool authorization with policy inheritance
- Set execution timeouts with SIGTERM → SIGKILL progression
- Enforce per-sandbox resource quotas (CPU, memory, concurrent executions)
Why Custom vs. Off-the-Shelf:
- Off-the-shelf agents (Devin, Cursor) work great for generic tasks
- But they can't deeply integrate with your company's specific infrastructure
- Building custom lets you optimize for your workflows, tools, and security requirements
04
Trade-offs
-
Pros:
- Deep integration with company-specific tools and workflows
- Model flexibility - not locked into one provider
- Real-time visibility into agent progress and intermediate steps
- Same context as developers - agent works in identical environment
- Custom feedback loops tailored to your stack
-
Cons:
- Engineering overhead - requires building and maintaining infrastructure
- Security considerations - agent needs access to repos, credentials
- Ongoing maintenance - unlike SaaS solutions, you own the ops burden
- Requires devops expertise - sandbox management, websocket scaling
05
Example
flowchart TD
subgraph Client["Developer's Machine"]
UI[Web UI / CLI]
end
subgraph Infrastructure["Company Infrastructure"]
Svc[Agent Service]
Sandbox[Sandboxed Dev Environment<br/>Modal / OpenCode]
VCS[Git / Version Control]
end
UI -->|"WebSocket: Send Prompt"| Svc
UI <-->|"WebSocket: Real-time Stream<br/>stdout/stderr"| Svc
Svc -->|"Spin up sandbox"| Sandbox
Sandbox <-->|"Read/Write Files"| VCS
loop Iterative Refinement
Sandbox -->|"Compiler/Linter/Test Feedback"| Sandbox
end
Sandbox -->|"Final PR/Result"| Svc
Svc -->|"Notification"| UI
06