CommunityArte y diseñogithub.com

Thulr/agent-skill-kit

Research-grounded Agent Skills for coding agents: DX review, test strategy, and agent-readiness heuristics. Install via npx skills add Thulr/informed-skills.

Compatible con~Claude Code~Codex CLI~Cursor
npx skills add Thulr/agent-skill-kit

Documentación


name: agent-dx description: Use for AGENT DX — the surface an AI agent consumes as a developer. DO — keep an in-progress AI/Agent SDK, tool, structured-output, error, or telemetry change minimal and agent-consumable. REVIEW — audit an agent-facing SDK/tool/error/telemetry surface for stable contracts, agent recovery, and trust-boundary safety, then score it. DESIGN — shape a new agent-facing surface (loop, tools, MCP, structured output, error envelope, SDK telemetry) or explain a principle. Triggers on 'design our Agent SDK', 'are our tool schemas agent-safe', 'review our MCP tool surface', 'shape retry-able tool errors for the model', 'is our agent telemetry leaking PII'. Do NOT use for human developer API/SDK/CLI surfaces (use dx-audit / dx-design), agent-readable docs/AGENTS.md/llms.txt (use agent-docs), repo agent-hardening gates/hooks (use harden-repo-for-coding-agents), or operating eval/observability loops (use agent-ops). license: MIT

Agent DX

The developer experience of a surface where an AI agent is the developer — an SDK, tool, structured output, error envelope, or telemetry an agent consumes to build with or act through. Provenance lives in skill.json; this file is runtime routing only.

Produces: a change-plan.md (DO), an audit-report.md plus a findings-ledger + workflow-state when tracked (REVIEW), or a design-doc.md / refactor-runbook.md / explanation.md (DESIGN).

Core principle

An agent consumes the surface alone, at machine speed, with no tooltip to fall back on. So the bar for any SDK, tool, error, or span is whether a stochastic consumer can parse it, recover from it, and not be harmed by it: typed contracts over prose; recovery designed in, not hoped for; the trust boundary closed by default. It sits atop the HTTP-client floor that dx-audit / dx-design own, never replacing it.

Activation

  • Bare invocation ("use agent-dx", "review our MCP tool surface", "start"): load references/intent-router.csv, show the intent menu, offer the mode. Wait. No file inspection, network calls, or writes.
  • Concrete invocation with intent and surface inferable: skip to step 3.
  • Ambiguous scope: ask one blocker question naming the candidate intent or surface; do not inspect private systems first.

Workflow

  1. Pick intent. Load references/intent-router.csv; match to do, review, or design. Ambiguous → ask once.
  2. Pick surface. Load references/intents/<intent>.csv; match one or more of sdk-design, tools-and-mcp, structured-output, errors-and-retry, sdk-telemetry — or all for a REVIEW fan-out. Ambiguous → ask once with the menu.
  3. Load grounded context. Load only the chosen row's playbook plus its core_refs. Do not load other playbooks (the review/all row carries none — each surface agent loads its own; see step 6).
  4. Identify the target persona from references/core/personas.md.
  5. Then calibrate to project scale (REVIEW / DESIGN) per references/calibration.md: below Load-bearing, narrow scope, collapse same-mechanism gaps into one systemic finding, and split Now vs Later (as it grows). The tier feeds emission, never severity.
  6. Dispatch lenses in parallel (REVIEW default when permitted): one surface per agent for all; otherwise the three lenses below. Sequential fallback if no delegation primitive.
  7. Apply the playbook. Use the heuristics tagged for the intent. For REVIEW, score each surface 0–10 (references/core/score-rubric.md); synthesize lens findings, preserving disagreements as open questions. Apply severity and stable AGENT-DX-<surface>-NNN IDs (references/core/severity-rubric.md): AGENT-DX-SDK, AGENT-DX-TOOL, AGENT-DX-OUT, AGENT-DX-ERR, AGENT-DX-TEL.
  8. Emit. DO → templates/change-plan.md. REVIEW → templates/audit-report.md. DESIGN → templates/design-doc.md (shape), templates/refactor-runbook.md (sequence a hardening), or templates/explanation.md (explain a principle).
  9. Create, resume, or close tracking state (REVIEW). For an audit with 7+ findings, any severity 3–4, or a save/track/closeout request, load references/trackable-findings.md. If the request names an existing ledger, workflow-state, PR, or AGENT-DX-* ID, read saved state first; update statuses only after each verification rule passes. Otherwise write the ledger at docs/audits/agent-dx-findings-ledger-<YYYY-MM-DD>-<scope-slug>.md and workflow state at docs/audits/agent-dx-workflow-state-<YYYY-MM-DD>-<scope-slug>.json (fall back to audit-artifacts/agent-dx-... if docs/audits/ is unwritable). Report both paths; keep roadmaps, issues, and non-tracking edits opt-in.

Modes

Guided Draft (default), Autopilot, Grill Me — see references/modes.md. Offer at bare invocation; default to Guided Draft on concrete invocations.

Output requirements

Every output names the intent, surface(s), persona, playbook(s) applied, and grounding sources from skill.json; REVIEW adds severity + verification per finding and the project tier. Name the agent-consumability guarantees you are deliberately NOT adding (YAGNI), not only the ones you are.

Subagent dispatch

Default for REVIEW when delegation is permitted; skip tiny or secret-bound work. Spawn three lenses in parallel — contract-and-schema (typed schemas from code, stable error envelope, validated output), agent-recovery (stop+verify, retry-shaped errors, semantic retry, bounded loop), trust-and-isolation (tool-metadata injection, guardrail checkpoints, credential walls, content redaction) — per references/subagent-dispatch.md, then synthesize, ordering by severity.

Reference map

  • references/intent-router.csv + references/intents/<intent>.csv — the two routing layers.
  • references/playbooks/<surface>.md — sdk-design, tools-and-mcp, structured-output, errors-and-retry, sdk-telemetry.
  • references/subagent-dispatch.md — three-lens prompts and synthesis.
  • references/core/{severity-rubric,score-rubric,personas,glossary}.md — scales, audience, terms.
  • references/{calibration,trackable-findings,modes}.md — shared (symlinks).
  • templates/*.md — change-plan, audit-report, design-doc, refactor-runbook, explanation, plus the shared tracking artifacts.
  • evals/, skill.json — gates and provenance.

Skills relacionados