GitHub
Reliability & Eval best practice

Lethal Trifecta Threat Model

By Nikola Balic (@nibzard)
Add to Pack
or

Saved locally in this browser for now.

Cite This Pattern
APA
Nikola Balic (@nibzard) (2026). Lethal Trifecta Threat Model. In *Awesome Agentic Patterns*. Retrieved March 11, 2026, from https://agentic-patterns.com/patterns/lethal-trifecta-threat-model
BibTeX
@misc{agentic_patterns_lethal-trifecta-threat-model,
  title = {Lethal Trifecta Threat Model},
  author = {Nikola Balic (@nibzard)},
  year = {2026},
  howpublished = {\url{https://agentic-patterns.com/patterns/lethal-trifecta-threat-model}},
  note = {Awesome Agentic Patterns}
}
01

Problem

Combining three agent capabilities—

  1. Access to private data
  2. Exposure to untrusted content
  3. Ability to externally communicate

—creates a straightforward path for prompt-injection attackers to steal sensitive information.
LLMs cannot reliably distinguish "good" instructions from malicious ones once they appear in the same context window.

02

Solution

Adopt a Trifecta Threat Model:

  • Audit every tool an agent can call and classify it against the three capabilities.

  • Guarantee that at least one circle is missing in any execution path. Options include:

  • Remove external network access (no exfiltration).

    • Deny direct file/database reads (no private data).
    • Sanitize or segregate untrusted inputs (no hostile instructions).
  • Enforce this at orchestration time, not with brittle prompt guardrails.

# pseudo-policy
if tool.can_externally_communicate and
   tool.accesses_private_data and
   input_source == "untrusted":
       raise SecurityError("Lethal trifecta detected")
03

How to use it

  • Maintain a machine-readable capability matrix for every tool.
  • Add a pre-execution policy check in your agent runner.
  • Fail closed: if capability metadata is missing, treat the tool as high-risk.
04

Trade-offs

Pros: Simple mental model; eliminates entire attack class. Cons: Limits powerful "all-in-one" agents; requires disciplined capability tagging.

06

References

  • Willison, The Lethal Trifecta for AI Agents (June 16 2025).
  • Beurer-Kellner et al., Design Patterns for Securing LLM Agents against Prompt Injections (arXiv:2506.08837, June 2025).

Note on terminology: This pattern describes Simon Willison's prompt injection threat model (private data + untrusted content + external communication), distinct from the AI safety literature's "lethal trifecta" (advanced capabilities + agentic behavior + situational awareness).