Category: Feedback Loops · Status: Established

Iterative Prompt & Skill Refinement

By Nikola Balic (@nibzard)

Cite This Pattern
APA
Nikola Balic (@nibzard) (2026). Iterative Prompt & Skill Refinement. In *Awesome Agentic Patterns*. Retrieved March 11, 2026, from https://agentic-patterns.com/patterns/iterative-prompt-skill-refinement
BibTeX
@misc{agentic_patterns_iterative-prompt-skill-refinement,
  title = {Iterative Prompt \& Skill Refinement},
  author = {Nikola Balic (@nibzard)},
  year = {2026},
  howpublished = {\url{https://agentic-patterns.com/patterns/iterative-prompt-skill-refinement}},
  note = {Awesome Agentic Patterns}
}
01

Problem

Agent usage reveals gaps in prompts, skills, and tools—but how do you systematically improve them? When a workflow fails or behaves suboptimally, you need multiple mechanisms to capture feedback and iterate. No single approach is enough; you need a multi-pronged refinement strategy.

02

Solution

Implement multiple complementary refinement mechanisms that work together. No single mechanism catches all issues—you need layered approaches. This is grounded in RLHF research, which shows that human feedback is irreplaceable for alignment, while RLAIF work demonstrates that AI-assisted feedback enables scale.

Four key mechanisms:

1. Responsive Feedback (Primary)

  • Monitor internal #ai channel for issues
  • Skim workflow interactions daily
  • This is the most valuable ongoing source of improvement

2. Owner-Led Refinement (Secondary)

  • Store prompts in editable documents (Notion, Google Docs)
  • Most prompts editable by anyone at the company
  • Include prompt links in workflow outputs (Slack messages, Jira comments)
  • Prompts must be discoverable + editable
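A minimal sketch of how "discoverable + editable" can work in practice: a registry maps each workflow to its editable prompt document, and every output payload carries the link back. The workflow names and Notion URLs here are hypothetical placeholders, not part of the original pattern.

```python
# Sketch: map each workflow to its editable prompt doc so every output
# can link back to it. Names and URLs below are illustrative only.

PROMPT_REGISTRY = {
    "triage-bot": "https://notion.so/prompt-triage-bot",
    "release-notes": "https://notion.so/prompt-release-notes",
}

def with_prompt_link(workflow: str, output: str) -> dict:
    """Attach the editable-prompt link to a workflow's output payload."""
    return {
        "output": output,
        "prompt_link": PROMPT_REGISTRY.get(workflow, "(no prompt registered)"),
    }
```

Anyone who sees the output in Slack or Jira can follow the link and edit the prompt directly—no code change required.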

3. Claude-Enhanced Refinement (Specialized)

  • Use Datadog MCP to pull logs into skill repository
  • Skills are a "platform" used by many workflows
  • Often maintained by central AI team, not individual owners
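Once logs have been pulled into the skill repository (the Datadog MCP call itself is not shown here), a simple aggregation tells the central AI team which skills are failing most. The log-entry shape below is an assumption for illustration.

```python
# Sketch: group error-level log entries by skill, assuming each entry
# is a dict with "skill" and "level" keys (an illustrative log shape).

from collections import Counter

def errors_by_skill(log_entries: list[dict]) -> Counter:
    """Count error-level entries per skill across workflow runs."""
    return Counter(
        entry["skill"]
        for entry in log_entries
        if entry.get("level") == "error" and "skill" in entry
    )
```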

4. Dashboard Tracking (Quantitative)

  • Track workflow run frequency and errors
  • Track tool usage (how often each skill loads)
  • Data-driven prioritization of improvements
graph TD A[Workflow Runs] --> B[Feedback Channel: #ai] A --> C[Owner Edits Prompts] A --> D[Datadog Logs → Claude] A --> E[Dashboards: Metrics] B --> F[Identify Issues] C --> F D --> F E --> F F --> G[Update Prompts/Skills] G --> A style B fill:#e8f5e9,stroke:#388e3c,stroke-width:2px style E fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
03

How to use it

Implementation checklist:

  • [ ] Feedback channel: Internal Slack/Discord for agent issues
  • [ ] Editable prompts: Store in Notion/docs, not code
  • [ ] Prompt links: Include in every workflow output
  • [ ] Log access: Datadog/observability with MCP integration
  • [ ] Dashboards: Track workflow runs, errors, tool usage

Refinement workflow:

# After each workflow run, include a link back to the editable prompt
# so readers of the output can jump straight to it and refine it
workflow_result = {
    "output": "...",
    "prompt_link": "https://notion.so/prompt-abc123",
}

Discovery strategy:

  • Daily: Skim feedback channel, review workflow interactions
  • Weekly: Review dashboard metrics for error spikes
  • Ad-hoc: Pull logs when specific issues reported
  • Quarterly: Comprehensive prompt/skill audit

Post-run evals (next step):

Include subjective eval after each run:

  • Was this workflow effective?
  • What would have made it better?
  • Human-in-the-loop to nudge evolution
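The post-run eval can be captured as a small structured record attached to each run, so prompt owners can act on the answers later. The field names here are illustrative, not prescribed by the pattern.

```python
# Sketch: a subjective post-run eval answered by the human in the loop.
# Field names are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class PostRunEval:
    workflow: str
    effective: bool    # "Was this workflow effective?"
    improvement: str   # "What would have made it better?"

def collect_eval(workflow: str, effective: bool, improvement: str) -> PostRunEval:
    """Record the human judgment so prompt owners can act on it."""
    return PostRunEval(workflow, effective, improvement)
```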
04

Trade-offs

Pros:

  • Multi-layered: Catches issues different mechanisms miss
  • Continuous: Always improving, not episodic
  • Accessible: Anyone can contribute to improvement
  • Data-driven: Dashboards prioritize what matters
  • Skill-sharing: Central team can maintain platform-level skills

Cons:

  • No silver bullet: Can't eliminate any mechanism
  • Maintenance overhead: Multiple systems to manage
  • Permission complexity: Need balanced edit access
  • Alert fatigue: Too many signals can overwhelm

Workflow archetypes:

Type                          | Refinement Strategy
Chatbots                      | Post-run evals + human-in-the-loop
Well-understood workflows     | Code-driven (deterministic)
Not-yet-understood workflows  | The open question

Open challenge: How to scalably identify and iterate on "not-yet-well-understood" workflows without product engineers implementing each individually?

05

References