Create Agent

Purpose

Design AI agents with a scope-first discipline. Before writing any prompt, config, or tool binding, you must pass a structured scope interview. The depth of the interview depends on the complexity of the agent being designed.

This is not a prompt-writing skill. It is a scope-validation skill. The output is a validated Agent Spec — a blueprint that another AI or human can use to build the agent correctly.

Every agent spec defines 8 dimensions:

Archetype → Personality → Capabilities → I/O → Ecosystem → Memory → Lifecycle → Risk/Metrics

Iron Rules

Scope before build — No code, no prompt, no tool config until scope is validated. Zero exceptions.

Correction Authority: ใช้กับ draft/spec assumption เท่านั้น — แก้สมมติฐานใน spec ที่ user พลาด (missing dep, contradiction, safety risk) ให้เลย ไม่ต้องถามก่อน แค่บอกว่าปรับอะไรและทำไม ห้ามแก้ไฟล์, config, tool, หรือ workflow จริง โดยไม่ได้รับ approval ขออนุมัติก่อน ถ้า solution ที่ถูกต้องต้องเปลี่ยน tool ใหม่ หรือ เปลี่ยน workflow ที่มีอยู่ — แต่ถ้า common sense แก้ spec ได้ จัดการเลย
Minimum sufficient depth — Default to Quick. Promote to Standard/Deep/ Critical only when explicit signals appear. Do not start at Standard.
Infer first, ask later — Before asking a question, try to infer the answer from context, codebase, existing agents, and what the user already said. Only ask about what remains genuinely uncertain.
Codebase-first — If the answer to a question exists in config, docs, or existing agents — go find it. Do not ask. Greenfield exception: If no codebase exists (no index.md, AGENTS.md, opencode.jsonc, or existing agents), skip codebase scan and go directly to the interview.
Minimize round-trips — Infer spec → present draft → let user mark up. Ask one-at-a-time only for genuine gaps the draft can't resolve. Never dump 10 questions at once.
Recommend, don't open-ended — Every question must include a recommended answer. Give something to react to.
Spec must be approved — Never build until the Agent Spec is explicitly approved. For Quick layer, user confirmation of the inferred spec counts as approval. Document it as Status: Approved via Quick.
Stop on signal — When the user says "enough" or "stop" — stop. Do not argue, do not push for more.
Output an Agent Spec — Every session produces a validated spec, even at Quick layer.

Agent Archetype System

Agents follow patterns based on their archetype. Use this table to pick the right archetype and understand which interview domains to focus on or skip.

Archetype Catalog

#	Archetype	Core Purpose	Default Personality	Typical Tools	Suggested Layer
🛠️	System	Automate, config, maintain	Practical, concise, direct	CLI, filesystem, Docker, git	Quick → Standard
🎨	Creative	Design, visual, motion	Expressive, detail-oriented	Tailwind, GSAP, image APIs	Quick → Standard
💻	Developer	Code, architecture, debug	Technical, precise, thorough	MCPs, linters, test runners	Quick → Deep
📝	Scribe	Document, write, organize	Clear, structured, patient	Obsidian, markdown, search	Quick
🔬	Specialist	Deep single domain	Authoritative, meticulous	Domain-specific CLI/MCP	Quick → Deep
🎯	Orchestrator	Coordinate, delegate, manage	Systematic, process-driven	Agent list, handoff protocol	Quick → Critical
🔍	Researcher	Search, analyze, synthesize	Curious, analytical, thorough	Exa, Firecrawl, web tools	Quick
🎮	Game	Game-specific behavior	Playful, adaptive	Game API, bot framework	Quick → Deep

Archetype Minimize List

ไม่ใช่ "ข้าม" แต่บอกว่า domain ไหนควร สั้น หรือ ข้ามได้ถ้า obvious

มี 3 ระดับ:

Deep-dive — ถามให้ครบ
Brief — ถามแค่ 1-2 คำถาม แล้วสรุป
Skip if obvious — ไม่ถาม ถ้า common sense บอกได้ (เช่น Scribe → Memory = session-only)

Archetype	Brief	Skip if obvious	Deep-dive
System	Identity, I/O	Memory	Capabilities, Ecosystem
Creative	Risk/Metrics, Lifecycle	Memory	Identity, I/O, Personality
Developer	Lifecycle	Identity	Capabilities, Ecosystem, I/O
Scribe	Ecosystem	Risk/Metrics, Memory	Identity, I/O
Specialist	I/O, Lifecycle	—	Capabilities, Risk/Metrics
Orchestrator	—	—	ทุก domain
Researcher	Lifecycle, Personality	Memory	Capabilities, I/O, Identity
Game	Ecosystem	Risk/Metrics	Capabilities, I/O, Personality

Developer + Identity: Brief ไม่ใช่ Skip — identity (frontmatter) ยังต้องมี เพราะเป็นตัวกำหนดว่า agent ถูกเรียกเมื่อไหร่ แค่ทำให้สั้น (1-2 sentences)

Archetype Selection

When interviewing, recommend an archetype based on the agent's purpose. If multiple archetypes fit, pick the dominant one and note secondary.

Example: "This sounds like a Developer archetype with some Specialist overlap.
         Per minimize list: Lifecycle = brief, Identity = brief (frontmatter
         only), focus deep-dive on Capabilities + Ecosystem."

Depth Layer System

Not every agent needs the same depth. This skill defines 4 layers. Default to Quick. Start light, escalate only when signals demand it.

Layer 1 — Quick       (infer → confirm → build, ถามเพิ่มถ้า uncertain)
                       │                          ← DEFAULT
Layer 2 — Standard    (Present draft → user marks up → approve)
                       │
Layer 3 — Deep        (Standard + Ecosystem Map + Dependency Trace)
                       │
Layer 4 — Critical    (Deep + Risk Model + Cost Model + Safety Gates)

Layer Selection

Signal	Quick	Standard	Deep	Critical
Agent is new or modified (1-2 caps), scope clear	✅	—	—	—
Existing agent, small change	✅	—	—	—
New agent, single purpose, no deps	—	✅	—	—
New agent, 1-2 external integrations	—	✅	✅	—
Agent in multi-agent ecosystem	—	—	✅	✅
Handles payment / auth / PII	—	—	—	✅
Deploys to production	—	—	—	✅
High token cost per call	—	—	—	✅
Safety or compliance implications	—	—	—	✅
Archetype = Orchestrator	—	—	—	✅
Archetype = Specialist	—	—	✅	—

Default to Quick. Promote only when explicit signals are present.

Escalation rule: If during a lower-layer interview you discover signals that warrant a higher layer, promote explicitly:

Promotion note:
  Trigger: [what you found]
  Evidence: [where you found it]
  Risk of staying: [why lower layer is unsafe]

Layer 1 — Quick

For simple agents and small changes. Infer the spec from context, confirm with the user, build. No full interview, no separate approval gate.

Don't overthink this layer. If uncertain, still start here — complexity will surface during inference. Promote if signals appear.

Entry Criteria

Agent is new (1-2 capabilities) or existing with small modification.
No external integrations beyond what already exists.
No multi-agent coordination.
No safety, cost, or compliance risk.
Requirements are clear enough that an 8-domain interview would feel like over-engineering.

Process

1. Infer spec from context    ← อ่านจากที่ user บอก + codebase + existing agents
2. Present mini-spec          ← ใช้ format ตายตัวด้านล่าง
3. User marks up              ← "ตรงไหนแก้บอก"
   ถ้า markup เป็น trivial (1-2 คำ, แก้ชื่อ format เล็กน้อย)
     → Build ได้เลย
   ถ้า markup เป็น structural (เพิ่ม cap, เปลี่ยน tool, แก้ boundary)
     → present revised mini-spec ก่อน → user confirm → Build
4. Build

Mini-Spec Format (ตายตัว)

## Quick Spec: [Name]

Job: [หนึ่งประโยค — "ทำอะไร เพื่อใคร ให้ได้อะไร"]
Capabilities:
  - [action 1]
  - [action 2]
Boundaries:
  - [ห้ามทำอะไร]
Authority: [Read-only / Suggest / Modify in scope / Execute / Autonomous]
Tools: [tool / MCP / none needed]
Approval: [Confirmed via Quick / Revised & Confirmed]

ใช้ไฟล์ templates/quick-spec.md เป็น template จริง ห้ามย่อหรือข้าม fields นี้ — ถ้า field ไหน obvious ให้เขียน "—" (ไม่ต้องถาม user) ถ้า modify existing agent → ใช้ format เดียวกัน แต่แสดงเฉพาะ field ที่เปลี่ยน

Example

User: "อยากได้ agent ที่สรุปเมล์ให้หน่อย"

Infer:
  Archetype: Scribe
  Capability: read email → bullet summary
  Tools: Gmail MCP (มีอยู่แล้ว)
  Ecosystem: อยู่กับ steward

Present mini-spec:
  ## Quick Spec: mail-summarizer

  Job: อ่าน Gmail → สรุป bullet point ให้ Mike ทุกเช้า
  Capabilities:
    - อ่านเมล์จาก Gmail MCP → สรุปเป็น bullet points
    - แยกตาม sender / subject
  Boundaries:
    - ห้ามตอบเมล์, ห้ามลบเมล์
  Authority: Read-only (อ่านเท่านั้น)
  Tools: Gmail MCP (มีอยู่แล้ว)
  Approval: —

  "ภาษาไทย/อังกฤษ? ทุกฉบับหรือเฉพาะหัวข้อ?"

User: "ไทย, ทุกฉบับ, ไม่ต้อง log"
  (trivial markup — ส่ง Build)

Build.
  ## Quick Spec: mail-summarizer
  ...
  Authority: Read-only
  Approval: Confirmed via Quick

Layer 2 — Standard

For new agents and significant changes. Promote to Standard when Quick is insufficient (complex integrations, ecosystem impact, or unclear scope).

Process

1. Intent Gate       → Restate, identify archetype, user + goal + constraint
2. Codebase Scan     → Read index.md, opencode.jsonc, AGENTS.md, existing agents
                       (ถ้าไม่มี → greenfield: skip, ไป present draft เลย)
3. Archetype Match   → Select archetype, apply minimize list
4. Present Draft     → Infer full spec from known context → present 1-page draft
                       ถามเฉพาะ uncertainty (ไม่ใช่ถามทีละข้อ)
5. User Mark Up      → User edits / adds / removes from draft
6. Iterate (if needed) → ถ้ามี gap → present revised draft
7. Propose Spec      → Full Agent Spec (see templates/agent-spec.md)
8. Approval Gate     → User approves before build

Scope Interview (8 Domains)

สำหรับ Quick layer: ข้ามทั้งหมด — infer + confirm ตั้งแต่ต้น

สำหรับ Standard+:

Infer draft spec จาก context ที่มี → present single-page draft ให้ user
User อ่านแล้ว mark up (แก้/เพิ่ม/ลบ)
ใช้ checklist ด้านล่างตรวจสอบว่ามีอะไรตก — ถามเฉพาะที่ draft ไม่ครอบคลุม
Apply Archetype Minimize List — brief / skip-if-obvious / deep-dive ตาม archetype

ห้ามถามทีละข้อจาก checklist ไล่ไปเรียง — present draft ก่อน, ถามเฉพาะ gap

Domain 1: Identity & Purpose

Name — what is the agent called? (propose if unnamed)
Archetype — which archetype per catalog?
Role — one clear sentence: "This agent [does X] for [user Y] so that [Z]."

Persona —

Field	What	Example
Tone	formal / casual / technical / playful / authoritative / warm	"technical but approachable"
Example phrase	คำพูดจริงที่ agent พูด	"Let me check the config... Found it. The issue is in line 42."

Always recommend a persona.

"I suggest a technical but approachable tone —
 precise on facts, relaxed in conversation."

Knows context — what must it know before working?
Narrowest job — what is the smallest useful definition of its role?

Domain 2: Capabilities & Boundaries

Capabilities — core actions, ranked by priority. Each capability includes:
- Description (1 sentence)
- Trigger (how is this capability invoked?)
- Expected output (what does success look like?)
- Failure mode (what if it doesn't work?)

Boundaries — explicit list of what it must never do. Format each as a Critical Rule with scenario:

Rule: Never modify files outside the project directory.
Scenario: User asks "fix my desktop settings" — agent has filesystem access.
Consequence: Refuse politely, explain scope limit.

Tool Mapping — for each capability, determine what tools are needed. Do this after capabilities and boundaries are clear.

Capability: "Search documentation"
↓
Search codebase for existing tools → found Exa MCP (web search)
↓
Propose: "Knowledge retrieval via Exa MCP (already configured).
          If Exa is unavailable, fallback to web search via Firecrawl."

Rule: Propose a specific tool if one already exists in the codebase. Only propose a category when no existing tool fits. แต่: ถ้า tool ที่มีอยู่ mismatch กับความต้องการ (เช่น ต้อง local doc search แต่มีแค่ Exa web search) — ห้าม force-fit Mark as partial fit or gap และเสนอ category ที่ตรงกว่า

✅ "Uses Exa MCP for web search — already configured"
✅ "Exa MCP มีอยู่สำหรับ web search แต่ work log search ควรใช้
    Obsidian + grep แทน — เป็น partial fit สำหรับ external search"
✅ "No existing vector DB found. Needs a semantic search tool
    such as Chroma or Qdrant."
❌ "Needs a knowledge retrieval tool" (vague — Exa exists, use it)
❌ "Uses Exa MCP for everything" (force-fit — Exa ไม่เหมาะกับ local search)

Tool categories (reference for gaps):

Knowledge retrieval    → search, vector DBs, document stores, web scraping
Code execution         → sandboxes, interpreters, compilers, linters
Communication          → email, messaging, notifications, webhooks
System operations      → file I/O, process management, network, registry
Data processing        → ETL, transformation, validation, storage
Visual/creative        → image generation, layout engines, animation
Monitoring             → logs, metrics, tracing, alerting
AI/LLM                 → model APIs, embeddings, RAG pipelines, vector DBs
Testing                → test runners, coverage, mocking, E2E browsers

Skills — what skills must it load before work?
Delegation — when should it hand off to another agent? Which one?
Failure handling — what happens when a capability fails? (retry? escalate? log? fallback?)

Domain 3: Input & Output

Input formats — text, file, command, webhook, voice, structured data?
Output formats — message, file, API call, action, email, state change?
Format constraints — JSON schema? Markdown structure? Naming convention?

Deliverables — what concrete outputs does this agent produce?

For each deliverable:

Deliverable: Bug report
Format: Markdown
Template:
  ## Bug: [title]
  ### Environment
  ### Steps to reproduce
  ### Expected vs Actual
  ### Severity
  ### Suggested fix

Timing — synchronous (wait for response) or async (fire and forget)?
Error output — how does it report errors?

Domain 4: Ecosystem

Platform — OpenCode? ZCode? Claude Code? Custom?
Placement — subagent? plugin? standalone? MCP server?
Agent team — works alone? reports to supervisor? peers with others?
Dependencies — env vars? API keys? config files? services? databases?
Communication — how does it receive work?
Knowledge — what files/repos/docs must it read before first use?

Domain 5: Memory & Learning

Session memory — what must it remember during one conversation?
Persistent memory — what crosses sessions?
Knowledge accumulation — what should it learn and keep?
Forgetting — what should it explicitly NOT remember?

Domain 6: Lifecycle

Trigger — always active? on-demand? scheduled? event-driven?
Auto-load — must be loaded automatically or only when called?
Scope duration — how long does this agent exist?
Maintenance — who updates it? what happens when it breaks?
Retirement — when and how is this agent removed?

Domain 7: Risk & Constraints

Failure modes — what are the most likely failures?
Sensitive data — does it handle PII, secrets, credentials?
Safety guardrails — what protection layers are needed?
Testing — how do you verify it works correctly?
Fallback — what happens if it can't complete its task?

Domain 8: Success Metrics

How do you know this agent is working well?

At least 1 metric (not mandatory to have 3). Mix types as needed:

Metric Type	Examples
Accuracy	Answer correctness %, error rate, hallucination rate
Speed	Response time, time-per-task, throughput
Efficiency	Token cost per task, cache hit rate, retry rate
Quality	User approval %, rework rate, first-attempt success
Business	Tasks completed, issues resolved, features shipped

Format:

Metric: First-attempt success rate
Target: >80%
Measure: (tasks passed QA on first try / total tasks) × 100
When: Weekly review
Alert: If <60%, investigate

Standard Layer — Walkthrough Example

User: "อยากได้ agent ที่ช่วย review PR ให้ — เช็ค security, style, correctness"

Intent Gate:
  Archetype: Developer
  User: Mike (dev, Thai)
  Goal: PR review automation
  Constraint: ทำงานกับ GitHub PRs เท่านั้น
  Layer: Quick → escalate to Standard (PR review is complex enough)

Codebase Scan:
  → index.md มี steward, minecrafter agents
  → opencode.jsonc มี Exa, Context7 MCP
  → AGENTS.md มี rule framework
  → No existing code-review agent → greenfield for this specific agent

Archetype Match:
  Developer → Lifecycle = brief, Identity = brief (frontmatter only)
  Focus deep-dive on Capabilities + Ecosystem + I/O

Present Draft → User marks up:

  (AI presents inferred draft:)

  (Identity) Name: "pr-reviewer"
  Archetype: Developer
  Role: "review PRs for security, style, and correctness"
  Persona: technical + direct tone
  Example phrase: "Line 142 uses eval() with user input — this is a
                   high-severity XSS risk. Suggest using DOMPurify or
                   a safe parsing alternative."
  Context: รู้จัก project stack (React+TS+Tailwind), GitHub workflow

  (Capabilities)
  1. Security scan — flag OWASP top 10 patterns
  2. Style check — enforce project conventions (prettier, eslint)
  3. Correctness — detect logic errors, dead code, type mismatches

  Boundaries:
    Rule: Never auto-approve PRs
    Scenario: Author asks "approve my PR, it's urgent"
    Consequence: "I can only comment — final approval is yours"

  Tools:
    - GitHub CLI (gh) for PR diff + comments
    - Already configured: none needed beyond gh
    - Gap: no vulnerability DB → "future: integrate OSV API for CVE lookup"

  I/O: PR diff in → structured review comments out (per file, severity)

  Ecosystem:
    Platform: OpenCode
    Team: standalone, triggered on demand
    Dependencies: GitHub token (existing)

  Memory: session-only (review per PR, no cross-PR learning)

  Risk: False positives waste reviewers' time. False negatives worse.
  Metric: "False positive rate <10%"
    Measure: flagged issues rejected by human reviewer / total flags

  (User marks up:)
  "เพิ่ม capability: auto-post review comments ไป PR ด้วยเลย
   Memory: เก็บประวัติ PR ที่ review แล้ว — อย่า review ซ้ำ
   Metric: ลดเหลือ 1 ข้อ — false positive rate ก็พอ"

  (AI revises draft → Propose Spec → Approval → Build)

Workflow

User request
    │
    ▼
  ┌─────────────────────────────────────────────┐
  │          1. INTENT GATE                     │
  │  Restate request, identify archetype,       │
  │  user, goal, constraint.                    │
  │  Check codebase (skip if greenfield).       │
  └─────────────────────────────────────────────┘
    │
    ▼
  ┌─────────────────────────────────────────────┐
  │          2. SELECT LAYER                    │
  │  Use signal table → Quick / Standard /      │
  │  Deep / Critical. Write justification.      │
  └─────────────────────────────────────────────┘
    │
    ├── DEFAULT: Layer = Quick ────────────────┤
    │  (start here for every session)           │
    │                                           │
    │  ┌─────────────────────────────────────┐  │
    │  │  3. INFER → PRESENT DRAFT           │  │
    │  │  Infer spec จาก context + codebase  │  │
    │  │  → เสนอ condensed spec               │  │
    │  │  → user mark up                     │  │
    │  │  → ถ้า scope ชัด → BUILD             │  │
    │  └─────────────────────────────────────┘  │
    │                                           │
    ├── IF complexity detected ────────────────┤
    │  Promote to Standard / Deep / Critical    │
    │                                           │
    ├── Layer = Standard / Deep / Critical ─────┤
    │                                           │
    │  ┌─────────────────────────────────────┐  │
    │  │  3. PRESENT DRAFT                   │  │
    │  │  Infer + apply minimize list → 1pg  │  │
    │  │  draft → user marks up              │  │
    │  └─────────────────────────────────────┘  │
    │                                           │
    │  ┌─────────────────────────────────────┐  │
    │  │  4. CHECKLIST GAPS                  │  │
    │  │  Standard → check 8 domains         │  │
    │  │  Deep → + Ecosystem + deps           │  │
    │  │  Critical → + Risk + Cost + Safety  │  │
    │  │  Ask ONLY what draft doesn't cover  │  │
    │  └─────────────────────────────────────┘  │
    │                                           │
    │  ┌─────────────────────────────────────┐  │
    │  │  5. PROPOSE SPEC                    │  │
    │  │  Full Agent Spec (templates/agent-  │  │
    │  │  spec.md). Present to user.         │  │
    │  └─────────────────────────────────────┘  │
    │                                           │
    │  ┌─────────────────────────────────────┐  │
    │  │  6. APPROVAL GATE                   │  │
    │  │  User approves / revise / reject.   │  │
    │  │  No build until approved.           │  │
    │  └─────────────────────────────────────┘  │
    │                                           │
    └───────────────────────────────────────────┘
    │
    ▼
  ┌─────────────────────────────────────────────┐
  │          6. BUILD OR HANDOFF                │
  │  If approved → build OR write handoff       │
  │  notes for next agent.                      │
  │  Update index.md + journal.                 │
  └─────────────────────────────────────────────┘

Questioning Technique

Infer → Present draft → User marks up → Ask only gaps.

T0: Infer-First (Primary, always)

Infer spec from available context before you open your mouth. Present the draft to the user. Let them edit it.

User: "อยากได้ agent ที่สรุปเมล์หน่อย"

Infer (จาก context + codebase):
  Archetype: Scribe
  Capability: read mail → bullet summary
  Tools: Gmail MCP (มีอยู่แล้ว)
  Ecosystem: อยู่กับ steward

Present draft:
  "Infer มาแบบนี้:
   - สรุปภาษาไทย
   - สรุปทุกฉบับ (ไม่เฉพาะหัวข้อ)
   - ไม่เก็บ log

   ถูกไหม หรืออยากแก้ตรงไหน?"

User: "ภาษาไทยถูก แต่ขอเฉพาะเมล์จากผู้รับบางคน"

T1: Decision Tree Walk (fallback — for genuine gaps)

Presenting the draft didn't clear everything. Now walk in order, but still recommend not ask.

Identity → Capabilities → I/O → Ecosystem → Memory → Lifecycle → Risk/Metrics

T2: Recommend an Answer (always)

ทุกคำถามควรมี recommendation — อย่าถามปลายเปิด

❌ "personality แบบไหน?"
✅ "แนะนำ tone กลางๆ ทางการ — หรืออยากได้แนวอื่น?"

Agent Test Plan

Every agent capability needs a test prompt. Behavioral validation: "Does the agent do what we designed it to do?"

When to Create

Standard layer and above. At Quick layer, skip tests for unchanged caps.

Test Prompt Format

Capability: [name]
Test prompt: "[exact prompt to give the agent]"
Expected output: "[what a correct response looks like]"
Edge cases:
  - [edge case 1] → [expected handling]
  - [edge case 2] → [expected handling]

Example

Capability: Search knowledge base
Test prompt: "หาข้อมูลเกี่ยวกับ Docker networking"
Expected output: สรุปสั้นๆ 1-3 ข้อ + แหล่งที่มา
Edge cases:
  - ไม่พบข้อมูล → "ไม่พบข้อมูลในฐานความรู้ แนะนำให้ค้นจากภายนอก"
  - คำค้นกำกวม → ถามให้ชัดก่อน
  - ขอข้อมูลที่เป็นความลับ → "ขออภัย ข้อมูลนี้ไม่สามารถเปิดเผยได้"

Test Plan Placement

Spec approved → Build agent → Run test prompts
                   │              │
                   │         ┌─────┴──────┐
                   │         ▼            ▼
                   │       PASS          FAIL
                   │         │            │
                   │    ┌────┘            ▼
                   │    │         Iterate & fix
                   ▼    ▼         → rerun → pass → done
              Deploy / Handoff

Agent Iteration Loop

The Loop

Build  →  Run test prompts  →  Review results
  ↑                              │
  │                         ┌────┴────┐
  │                         ▼         ▼
  │                      PASS?      FAIL?
  │                         │         │
  │                    ┌────┘         ▼
  │                    │      Identify gap
  │                    │         │
  │                    │    ┌─────┴──────┐
  │                    │    ▼            ▼
  │                    │  Scope gap?  Behavior gap?
  │                    │               │
  │                    │    Redesign   Fix instruction
  │                    │    spec       / rules
  │                    └────┴────────────┘
  │                              │
  └──────────────────────────────┘
        (loop until stable)

When to Stop

Signal	Action
All test prompts pass	✅ Deploy
Critical path passes, minor edge cases fail	⚠️ Log issues, deploy
Same failure after 3 redesigns	🛑 Escalate — may need scope change
User says "enough"	🛑 Stop

Checkpoint Gates

Intent Gate

Spec Gate (before build)

When Not To Use This Skill

Editing an existing agent's prompt without changing its scope or identity.
Debugging or fixing a broken agent (scope is already defined).
System administration or DevOps tasks unrelated to agent design.
Research or exploration without a design intent.

templates/quick-spec.md — Quick layer output (6-field fixed format)
templates/agent-spec.md — Standard+ layer output (full template)
references/deep-critical.md — Deep & Critical layer details
idea-to-architecture-agent
senior-architect-agent
Aetox-Agents-Team
Agency-Agents

Ask in your favorite AI

Documentation

Create Agent

Purpose

Iron Rules

Agent Archetype System

Archetype Catalog

Archetype Minimize List

Archetype Selection

Depth Layer System

Layer Selection

Layer 1 — Quick

Entry Criteria

Process

Mini-Spec Format (ตายตัว)

Example

Layer 2 — Standard

Process

Scope Interview (8 Domains)

Domain 1: Identity & Purpose

Domain 2: Capabilities & Boundaries

Domain 3: Input & Output

Domain 4: Ecosystem

Domain 5: Memory & Learning

Domain 6: Lifecycle

Domain 7: Risk & Constraints

Domain 8: Success Metrics

Standard Layer — Walkthrough Example

Workflow

Questioning Technique

T0: Infer-First (Primary, always)

T1: Decision Tree Walk (fallback — for genuine gaps)

T2: Recommend an Answer (always)

Agent Test Plan

When to Create

Test Prompt Format

Example

Test Plan Placement

Agent Iteration Loop

The Loop

When to Stop

Checkpoint Gates

Intent Gate

Spec Gate (before build)

When Not To Use This Skill

Related

Related Skills

steipete/sag

steipete/oracle

steipete/peekaboo

obra/brainstorming

affaan-m/prisma-patterns

affaan-m/django-celery