Problem
Model Context Protocol (MCP) servers and agent frameworks often combine three capability classes in a single tool: private-data readers (email, filesystem), web fetchers (HTTP clients) that pull in untrusted content, and writers (API mutators) that can communicate externally. This creates the "lethal trifecta": malicious input can trigger a chain that reads sensitive data, exfiltrates it, and modifies systems in one operation.
Solution
Adopt capability compartmentalization at the tool layer:
- Split monolithic tools into reader, processor, and writer micro-tools.
- Require explicit, per-call user consent when composing tools across capability classes.
- Run each class in an isolated subprocess with scoped API keys and file permissions.
Treat each capability class as a separate trust zone with its own runtime identity and policy checks. Cross-zone composition should require explicit policy evaluation and short-lived delegation tokens so the agent cannot silently chain read+fetch+write into a high-risk path.
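One way to picture the short-lived delegation tokens described above is a small in-memory broker that issues a token only when a policy hook permits the hop, and accepts that token once, for that hop, before it expires. This is a minimal sketch under stated assumptions: `DelegationBroker`, `Zone`, the 30-second default TTL, and the counter-based ids are all hypothetical names and choices for illustration, not part of MCP or any library.

```typescript
// Hypothetical sketch: single-use, short-lived delegation tokens for cross-zone calls.
// Zone names mirror the capability labels used in this document's manifest.
type Zone = "private_data" | "untrusted_input" | "external_comm";

interface DelegationToken {
  id: string;
  from: Zone;
  to: Zone;
  expiresAt: number; // epoch milliseconds
}

class DelegationBroker {
  private live = new Map<string, DelegationToken>();
  private counter = 0;
  private allow: (from: Zone, to: Zone) => boolean;
  private ttlMs: number;
  private now: () => number;

  // `allow` is the policy hook; `now` is injectable so expiry can be tested.
  constructor(
    allow: (from: Zone, to: Zone) => boolean,
    ttlMs: number = 30000,
    now: () => number = () => Date.now(),
  ) {
    this.allow = allow;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns a token if policy permits the hop, otherwise null.
  issue(from: Zone, to: Zone): DelegationToken | null {
    if (!this.allow(from, to)) return null;
    // Counter-based ids keep the sketch dependency-free (assumption);
    // a real broker would need unguessable, signed tokens.
    this.counter += 1;
    const token: DelegationToken = {
      id: `dlg-${this.counter}`,
      from,
      to,
      expiresAt: this.now() + this.ttlMs,
    };
    this.live.set(token.id, token);
    return token;
  }

  // Consumes the token: valid once, only for the stated hop, only before expiry.
  redeem(id: string, from: Zone, to: Zone): boolean {
    const token = this.live.get(id);
    if (token === undefined) return false;
    this.live.delete(id);
    return token.from === from && token.to === to && this.now() < token.expiresAt;
  }
}
```

Because a token is deleted on first redemption and checked against both endpoints, an agent that obtained permission for one hop cannot reuse it to chain a second, different hop.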
# tool-manifest.yml
email_reader:
  capabilities: [private_data, untrusted_input]
  permissions:
    fs: read-only:/mail
    net: none
issue_creator:
  capabilities: [external_comm]
  permissions:
    net: allowlist:github.com
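A manifest like this can also be audited for the original problem case, a single tool that holds all three capabilities on its own. The sketch below assumes the YAML has already been parsed into a plain object. The `ToolManifest` and `ToolEntry` types and the `findMonolithicTools` helper are hypothetical names introduced for illustration.

```typescript
// Hypothetical in-memory form of tool-manifest.yml; field names follow the manifest above.
type Capability = "private_data" | "untrusted_input" | "external_comm";

interface ToolEntry {
  capabilities: Capability[];
  permissions: { fs?: string; net?: string };
}

type ToolManifest = Record<string, ToolEntry>;

const manifest: ToolManifest = {
  email_reader: {
    capabilities: ["private_data", "untrusted_input"],
    permissions: { fs: "read-only:/mail", net: "none" },
  },
  issue_creator: {
    capabilities: ["external_comm"],
    permissions: { net: "allowlist:github.com" },
  },
};

const TRIFECTA: Capability[] = ["private_data", "untrusted_input", "external_comm"];

// Returns the names of tools that hold all three capabilities by themselves.
function findMonolithicTools(m: ToolManifest): string[] {
  return Object.entries(m)
    .filter(([, entry]) => TRIFECTA.every(c => entry.capabilities.includes(c)))
    .map(([name]) => name);
}
```

Run against the two-tool manifest, the audit returns an empty list, since neither tool holds all three capabilities alone. A tool declaring all three would be reported by name and is a candidate for splitting.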
How to use it
- Generate the manifest automatically in CI.
- Have the agent runner consult the manifest before constructing action plans.
- Flag any attempt to chain tools that would recreate the lethal trifecta.
- Group tools by capability class (fs, web, runtime, memory) and assign profiles (minimal, coding, messaging) to prevent mixing.
- Validate tool chains at call time: reject if all three capability classes are present.
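// Cross-zone validation
// Assumes getCapabilityClass(tool) is defined elsewhere and returns the tool's capability class.
function validateToolChain(tools: string[]): boolean {
  const classes = new Set(tools.map(t => getCapabilityClass(t)));
  // Reject only when all three classes appear in the same chain.
  if (classes.has("PRIVATE_DATA") &&
      classes.has("UNTRUSTED_INPUT") &&
      classes.has("EXTERNAL_COMM")) {
    return false; // Lethal trifecta detected
  }
  return true;
}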
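The manifest allows a tool to carry more than one capability (`email_reader` has two), so a chain check may need to collect every class each tool contributes rather than one class per tool. The following is a self-contained sketch of that variant. The `toolCapabilities` table, the `summarizer` tool, and the `checkToolChain` and `ChainVerdict` names are hypothetical. In practice the table would presumably be loaded from the manifest.

```typescript
// Hypothetical capability table; a real runner would load this from the tool manifest.
type CapabilityClass = "PRIVATE_DATA" | "UNTRUSTED_INPUT" | "EXTERNAL_COMM";

const toolCapabilities = new Map<string, CapabilityClass[]>([
  ["email_reader", ["PRIVATE_DATA", "UNTRUSTED_INPUT"]],
  ["issue_creator", ["EXTERNAL_COMM"]],
  ["summarizer", []], // illustrative pure-processing tool with no sensitive capability
]);

interface ChainVerdict {
  allowed: boolean;
  reason: string;
}

function checkToolChain(tools: string[]): ChainVerdict {
  const classes = new Set<CapabilityClass>();
  for (const tool of tools) {
    const caps = toolCapabilities.get(tool);
    // Fail closed: a tool missing from the table cannot be reasoned about.
    if (caps === undefined) {
      return { allowed: false, reason: `unknown tool: ${tool}` };
    }
    for (const c of caps) classes.add(c);
  }
  const trifecta =
    classes.has("PRIVATE_DATA") &&
    classes.has("UNTRUSTED_INPUT") &&
    classes.has("EXTERNAL_COMM");
  return trifecta
    ? { allowed: false, reason: "chain combines all three capability classes" }
    : { allowed: true, reason: "ok" };
}
```

Under these assumptions, `checkToolChain(["email_reader", "summarizer"])` is allowed, while `checkToolChain(["email_reader", "issue_creator"])` is rejected because the two tools together cover all three classes. Returning a reason alongside the verdict gives the runner something to show the user when it asks for consent or blocks a plan.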
Trade-offs
- Pros: Fine-grained; plays well with modular architectures.
- Cons: More tooling overhead; risk of permission creep over time.
References
- Willison's warning that "one MCP mixed all three patterns in a single tool."
- Primary source: https://simonwillison.net/2025/Jun/16/lethal-trifecta/
- Clawdbot (validated-in-production reference implementation with profile-based policies): https://github.com/clawdbot/clawdbot
- Action Selector pattern (Beurer-Kellner et al., 2025): https://arxiv.org/abs/2506.08837
- NVIDIA NeMo Guardrails (policy-based enforcement): https://github.com/NVIDIA/NeMo-Guardrails