name: swe description: Apply an opinionated software-engineering governance lens to a build/spec/PRD document — especially a "build v1.x" doc — so the plan enforces minimal scope, designs for diagnosis, accounts for blast radius, and is verifiable before anyone writes code. Use this when authoring or reviewing a v1.x build plan, implementation spec, architecture doc, RFC, or AGENTS.md/CLAUDE.md, or when the user says "review this build doc," "apply our SWE standards," "make this plan production-grade," "gate this spec," "is this plan ready to build," "bake in our engineering philosophy," or pastes a plan and asks whether it embodies good engineering discipline. Also self-trigger before drafting a v1.x build document so the standards shape it from the first draft instead of being bolted on after. This is the standard (the rubric), not the pipeline — it supplies the engineering invariants a plan is checked against.

SWE

Vibe the build; engineer the plan. This lens is the discipline that lets a fast v1.x ship without becoming a liability.

A governance overlay for build/spec documents — the "build v1.x" doc, the implementation spec, the architecture RFC. It does not debug code or pick a tradeoff in the moment; it reads the plan and asks whether the plan already embodies the engineering standards before a single line is written. Run it two ways: as an authoring gate (write the doc against it) or as a review rubric (read a doc, emit findings + a verdict). The whole bet: most plans fail not on the feature but on the four things below, smuggled past in prose.

The four pillars

Each pillar is a lens on the document. A v1.x doc that satisfies a pillar contains the thing explicitly; a doc that "implies" it fails the pillar — implied is unbuilt.

1. Minimal (Ponytail) — does the plan earn each part it adds?

The plan's default answer to "add a thing" is no. Scope, dependencies, and abstractions are all liabilities until justified in the doc.

Every new component answers "does this need to exist?" (YAGNI) — speculative scope is cut or deferred, not built.
Sourcing ladder is honored before any new dependency: stdlib → native platform feature → already-installed dep → one line of our own. A new dep names what it buys that the rung above does not.
No premature abstraction — the plugin layer / framework / generic engine is justified by ≥2 concrete present uses, not one hypothetical future one.
Bias is stated: delete > add, boring > clever, shortest diff that works. A complexity cap is named (e.g. stdlib-only, ~600-line ceiling) where it applies.

Planning translation: this is the editor pass on scope. Most v1.x bloat is decided here, in the doc, long before code.

2. Diagnosable (Mantra) — does the plan say how it will fail and be found?

A build doc that provisions zero observability is a debugging session deferred to production. Bake the diagnosis path into v1.x, not v1.next.

Instrumentation is in scope, not assumed: structured logs, a per-request/per-run correlation id, and the trace path (logs → stack → source) are part of the build, not a follow-up.
Every iterate/retry loop has a stop condition (e.g. 5-failure hard stop, 10-total cap). An unbounded "retry until it works" is a defect in the plan.
Failures are made reproducible: the plan names how a failure is repro'd, and treats intermittent failure as a signal (concurrency / ordering / env / TOCTOU), not noise to retry away.
State changes are auditable — append-only event log over in-place mutation where the history matters.
The plan names debug-mantra as its execution-time debugging protocol — "we'll figure it out when it breaks" is a Diagnosable failure.

Planning translation: the runtime debugging ritual, pulled forward. If the doc can't say how you'll see it break, you'll see it break in prod.

3. Blast — does the plan price its irreversible moves before committing?

For every wide-impact or hard-to-undo step, the doc must already carry the cost. Don't dress a one-way door as a tweak.

Each risky step names its undo class: easy / costly / one-way door. One-way doors are flagged, never silent.
Blast radius is named — the exact systems, data, and people that break if this step goes wrong (not "various downstream").
A shield is specified — flag, adapter, pilot/canary, dual-write, or an explicit "none."
A tripwire exists for anything costly or one-way: how you'll know to pull it and by when (the point of no return). A shield with no tripwire is a brake with no warning light.

For the full per-decision accounting, defer to the blast-radius skill — this pillar only enforces that the v1.x doc contains that accounting for its irreversible steps.

4. Proof (Done) — can the plan prove it's finished, separately from claiming it?

Editor and grader are different roles. The plan must define "done" in terms something other than the author can check.

Every task has a measurable done-criterion — a checkable output or metric, not "works" / "improved" / "robust."
Tests are specified and the plan requires they actually run — "tests pass" means an execution artifact, not an assertion.
No orphan tasks: every step maps to a success criterion, and every success criterion is covered by a step.
Observed vs. predicted is kept separate — the doc never launders a projection ("this will reduce load 40%") as evidence. Predictions are labeled as such.

Planning translation: PlanProof's editor/grader separation at document scale. The grader reads only what's written, not what the author meant.

House invariants

Non-negotiable conventions a v1.x doc must satisfy regardless of pillar. These are cheap to check and expensive to skip.

FSM threshold — model an explicit state machine only past ~4 states; below that a flag or enum is leaner. Past it, an ad-hoc tangle of booleans is the defect.
Single write path — one writer per piece of state. Multiple write paths to the same table/file are a race waiting to happen; name the single path.
Append-only event log for anything whose history or audit matters (JSONL or equivalent), over destructive in-place updates.
UTC-only time handling end to end; local time only at the display edge. "Nightly," "daily," "expires in 24h" all imply a timezone — pin it.
Crash-safe / idempotent jobs — cron and background work are resumable and safe to run twice. A job that corrupts state on a mid-run crash is unshipped.
Checklist standard — actionable items use the - [ ] hyphen prefix in GitHub-flavored Markdown, never bare [ ] in tables or lists.
Agent contract current — for agent-built work, AGENTS.md / CLAUDE.md exists and matches the plan (conventions, loop caps, honesty constraints). The project's AGENTS.md must name SOLID compliance as a coding standard; if it doesn't, the plan has no enforceable code-design contract.

How this differs from the sibling skills

swe — "Does this plan embody our engineering standards before we build?" The standard/rubric, applied to a whole document.
plan-adversarial-serial — the pipeline (generate → review → revise → review → judge). It is the machinery; swe is one of the standards the machinery can enforce. Compose them: run the adversarial pipeline with swe as the lens content.
blast-radius — "How big is this one decision and what breaks?" swe's Blast pillar defers per-decision accounting to it.
iron-triangle — "Which of speed/cost/quality is this choice trading?" When a plan forces fast/cheap/good tension, hand that node to it.
debug-mantra — the four-step runtime debugging discipline (reproduce → fail path → falsify → breadcrumb). Diagnosable enforces the plan names it; debug-mantra governs live execution — composing across the doc/session boundary.
take-a-step-back — "Is this the right problem/frame at all?" Runs before there is a plan to govern.
bottom-line / linear — compress or sequence output. swe evaluates a plan's substance; those reshape its presentation.

Reach for swe the moment there is a build/spec document to hold to a standard — authoring one or judging one.

How to apply

Author mode — you are writing the v1.x doc. Use the four pillars and house invariants as the doc's skeleton: each feature passes Minimal before it earns a section; each risky step ships with its Blast block; each task ships with its Proof criterion. The lens is the gate, not a later edit.

Review mode — you are handed a v1.x doc. Walk the four pillars then the invariants. For each gap, emit one finding keyed to the doc location (§section, or file:line for code-adjacent specs), tagged by severity, with the cheapest fix first. Close with a verdict. Do not rewrite the doc unless asked — surface the checkable gaps and let the author act.

Output format (Review mode)

Lead with the verdict in one line. Then findings, ordered by severity, then quick wins first within a severity. Keep it tight — clean plans get a short list, not a manufactured one.

Verdict: Ship · Ship with conditions · Block — [one sentence: the load-bearing reason].

Findings:

Loc	Pillar	Severity	What breaks	Cheapest fix
§x.y	Blast	Block	One-way migration with no rollback named	Add a dual-write window + a row-count tripwire before cutover
§x.z	Proof	Fix	"Improves sync" has no done-criterion	State the measurable signal (e.g. drift count → 0 over 3 runs)

Severity: Block (cannot build safely as written — unguarded one-way door, no done-criteria, unbounded loop) · Fix (change before v1.x ships) · Note (improvement, optional).

Quick wins: [the 1–3 findings fixable in a line of doc, listed as - [ ] so the author can tick them off.]

Principles

Implied is unbuilt. A plan gets credit only for what it states. "Obviously we'll log it" is a Diagnosable failure; write it down or it won't exist.

Editor and grader are different roles. The strongest constraint in the whole lens is Proof's separation: done is defined by something the author can't fudge. Keep them apart.

Price the one-way doors loudest. Reversible mistakes are cheap; irreversible ones are the only ones worth slowing down for. A tweak that can't be undone is not a tweak.

Cut scope before breaking a corner. When the plan is overloaded, the Minimal pillar and scope-reduction beat heroics. Defer to iron-triangle when the tension is explicitly fast/cheap/good.

Stop conditions are not optional. Any loop in the plan — retry, fix-iterate, agent relay — names its cap. Unbounded means undefined.

Separate what you saw from what you expect. Observed evidence and predicted behavior are different epistemic objects; a plan that blends them is lying to its future reader.

Don't manufacture findings. A genuinely clean v1.x doc gets "Ship" and a short list. The lens enforces a standard, not a quota of objections.

Example — reviewing a v1.x build doc excerpt

Input (abridged):

§3 Nightly order reconciler. A cron job runs each night to reconcile order state against the upstream system. For each order it pulls upstream status and updates wp_orders to match. On mismatch it retries until the write succeeds. This makes order state reliable.

Output:

Verdict: Block — a nightly DELETE/UPDATE against the orders table with an unbounded retry, no tripwire, and no measurable "reliable" can corrupt order state with no way to know or stop it.

Findings:

Loc Pillar Severity What breaks Cheapest fix
§3 Diagnosable Block "retries until the write succeeds" is an unbounded loop Add a 5-failure hard stop / 10-total cap, then dead-letter the order
§3 Blast Block Bulk overwrite of wp_orders is costly-to-one-way with no shield/tripwire Run in observe mode first; add a row-count + mismatch-rate tripwire before any write mode
§3 Proof Fix "makes order state reliable" has no done-criterion Define it: drift count → 0 across 3 consecutive runs, logged
§3 House (UTC) Fix "each night" has no timezone Pin the schedule to UTC; document the local display offset separately
§3 House (write path) Note Reconciler is a second writer to wp_orders Name the single write path, or gate the reconciler behind the same one

Quick wins:

Add the loop cap (§3) — one clause.

Pin "each night" to UTC (§3) — one word.

Replace "reliable" with the drift-count criterion (§3) — one sentence.

What success looks like

The author sees, before any code exists, exactly where the plan is a wish rather than a build: the loop with no cap, the migration with no rollback, the task that can't prove it's done. The best outcome is a v1.x doc that ships fast because its risky parts were priced up front — not one that ships fast and pays later.

Claude-AI-Tools-Ventura-County/giant-brains-claude-skills

Documentation

SWE

The four pillars

1. Minimal (Ponytail) — does the plan earn each part it adds?

2. Diagnosable (Mantra) — does the plan say how it will fail and be found?

3. Blast — does the plan price its irreversible moves before committing?

4. Proof (Done) — can the plan prove it's finished, separately from claiming it?

House invariants

How this differs from the sibling skills

How to apply

Output format (Review mode)

Principles

Example — reviewing a v1.x build doc excerpt

What success looks like

Skills associés

codexskills/codexskills

jfrog/jfrog-skills

darshjme/arsenal-omc-plugin

ayhhb/agent-dispatch-sidebar-skill

backnotprop/plannotator

toby-king/claude-product-wiki

Loc	Pillar	Severity	What breaks	Cheapest fix
§3	Diagnosable	Block	"retries until the write succeeds" is an unbounded loop	Add a 5-failure hard stop / 10-total cap, then dead-letter the order
§3	Blast	Block	Bulk overwrite of `wp_orders` is costly-to-one-way with no shield/tripwire	Run in observe mode first; add a row-count + mismatch-rate tripwire before any write mode
§3	Proof	Fix	"makes order state reliable" has no done-criterion	Define it: drift count → 0 across 3 consecutive runs, logged
§3	House (UTC)	Fix	"each night" has no timezone	Pin the schedule to UTC; document the local display offset separately
§3	House (write path)	Note	Reconciler is a second writer to `wp_orders`	Name the single write path, or gate the reconciler behind the same one