GitHub
Security & Safety validated in production

Hook-Based Safety Guard Rails for Autonomous Code Agents

By yurukusa (@yurukusa)
Add to Pack
or

Saved locally in this browser for now.

Cite This Pattern
APA
yurukusa (@yurukusa) (2026). Hook-Based Safety Guard Rails for Autonomous Code Agents. In *Awesome Agentic Patterns*. Retrieved March 11, 2026, from https://agentic-patterns.com/patterns/hook-based-safety-guard-rails
BibTeX
@misc{agentic_patterns_hook-based-safety-guard-rails,
  title = {Hook-Based Safety Guard Rails for Autonomous Code Agents},
  author = {yurukusa (@yurukusa)},
  year = {2026},
  howpublished = {\url{https://agentic-patterns.com/patterns/hook-based-safety-guard-rails}},
  note = {Awesome Agentic Patterns}
}
01

Problem

Autonomous code agents running unattended can execute destructive commands (rm -rf, git reset --hard), exhaust their context window without saving state, leak secrets via git push, or silently produce syntax errors that cascade into later failures.

These failures share a pattern: the agent took an action that a simple automated check would have caught. The agent itself can't reliably self-police because it operates within the same context that produces the mistakes.

02

Solution

Use the agent framework's hook system (PreToolUse / PostToolUse events) to inject safety checks that run outside the agent's reasoning loop. Each hook is a small shell script that inspects tool inputs or outputs and either warns, blocks, or logs.

Four guard rails that cover the most common failure modes:

  1. Dangerous command blocker (PreToolUse: Bash) — Pattern-matches commands for rm -rf, git reset --hard, DROP TABLE, etc. Blocks or warns before execution.

  2. Syntax checker (PostToolUse: Edit/Write) — After every file edit, runs the appropriate linter (python -m py_compile, bash -n, jq empty). Catches errors immediately instead of 50 tool calls later.

  3. Context window monitor (PostToolUse: all) — Counts tool calls as a proxy for context consumption. Issues graduated warnings (soft → hard → critical) and auto-generates a checkpoint file when context is dangerously low.

  4. Autonomous decision enforcer (PreToolUse: AskUserQuestion) — Blocks the agent from asking "should I continue?" during unattended sessions. Forces the agent to decide and log uncertainty instead.

# Example: dangerous command blocker (simplified)
#!/bin/bash
INPUT="$(cat)"
CMD="$(echo "$INPUT" | jq -r '.tool_input.command // empty')"

if echo "$CMD" | grep -qE 'rm\s+-rf|git\s+reset\s+--hard|git\s+clean\s+-fd'; then
  echo "BLOCKED: Destructive command detected: $(echo "$CMD" | head -c 100)"
  exit 2  # non-zero = block the tool call
fi
exit 0  # 0 = allow
03

How to use it

  • Register hooks in the agent's settings file (e.g., settings.json for Claude Code).
  • Each hook is a standalone shell script — no dependencies on the agent's internal state.
  • PreToolUse hooks receive the tool name and input as JSON on stdin. Return exit code 2 to block.
  • PostToolUse hooks receive the tool name, input, and output. Return exit code 0 (hooks can only warn, not block after execution).
  • Start with the dangerous command blocker (highest impact, lowest effort).
04

Trade-offs

  • Pros:

    • Runs outside the agent's context — immune to prompt injection or reasoning failures
    • Language-agnostic (shell scripts work with any agent framework that supports hooks)
    • Each hook is independently deployable — add one at a time
    • Zero performance overhead for the agent (hooks run in milliseconds)
  • Cons/Considerations:

    • Pattern matching is inherently incomplete — creative destructive commands may slip through
    • Context monitoring via tool-call counting is a proxy, not exact measurement
    • Hooks can't prevent the agent from reasoning about dangerous actions, only from executing them
    • Requires the agent framework to support PreToolUse/PostToolUse events
06

References