CommunityArt et designgithub.com

Claude-AI-Tools-Ventura-County/giant-brains-claude-skills

name: swe

Compatible avecClaude Code~Codex CLI~Cursor
npx skills add Claude-AI-Tools-Ventura-County/giant-brains-claude-skills

Documentation


name: swe description: Apply an opinionated software-engineering governance lens to a build/spec/PRD document — especially a "build v1.x" doc — so the plan enforces minimal scope, designs for diagnosis, accounts for blast radius, and is verifiable before anyone writes code. Use this when authoring or reviewing a v1.x build plan, implementation spec, architecture doc, RFC, or AGENTS.md/CLAUDE.md, or when the user says "review this build doc," "apply our SWE standards," "make this plan production-grade," "gate this spec," "is this plan ready to build," "bake in our engineering philosophy," or pastes a plan and asks whether it embodies good engineering discipline. Also self-trigger before drafting a v1.x build document so the standards shape it from the first draft instead of being bolted on after. This is the standard (the rubric), not the pipeline — it supplies the engineering invariants a plan is checked against.

SWE

Vibe the build; engineer the plan. This lens is the discipline that lets a fast v1.x ship without becoming a liability.

A governance overlay for build/spec documents — the "build v1.x" doc, the implementation spec, the architecture RFC. It does not debug code or pick a tradeoff in the moment; it reads the plan and asks whether the plan already embodies the engineering standards before a single line is written. Run it two ways: as an authoring gate (write the doc against it) or as a review rubric (read a doc, emit findings + a verdict). The whole bet: most plans fail not on the feature but on the four things below, smuggled past in prose.

The four pillars

Each pillar is a lens on the document. A v1.x doc that satisfies a pillar contains the thing explicitly; a doc that "implies" it fails the pillar — implied is unbuilt.

1. Minimal (Ponytail) — does the plan earn each part it adds?

The plan's default answer to "add a thing" is no. Scope, dependencies, and abstractions are all liabilities until justified in the doc.

  • Every new component answers "does this need to exist?" (YAGNI) — speculative scope is cut or deferred, not built.
  • Sourcing ladder is honored before any new dependency: stdlib → native platform feature → already-installed dep → one line of our own. A new dep names what it buys that the rung above does not.
  • No premature abstraction — the plugin layer / framework / generic engine is justified by ≥2 concrete present uses, not one hypothetical future one.
  • Bias is stated: delete > add, boring > clever, shortest diff that works. A complexity cap is named (e.g. stdlib-only, ~600-line ceiling) where it applies.

Planning translation: this is the editor pass on scope. Most v1.x bloat is decided here, in the doc, long before code.

2. Diagnosable (Mantra) — does the plan say how it will fail and be found?

A build doc that provisions zero observability is a debugging session deferred to production. Bake the diagnosis path into v1.x, not v1.next.

  • Instrumentation is in scope, not assumed: structured logs, a per-request/per-run correlation id, and the trace path (logs → stack → source) are part of the build, not a follow-up.
  • Every iterate/retry loop has a stop condition (e.g. 5-failure hard stop, 10-total cap). An unbounded "retry until it works" is a defect in the plan.
  • Failures are made reproducible: the plan names how a failure is repro'd, and treats intermittent failure as a signal (concurrency / ordering / env / TOCTOU), not noise to retry away.
  • State changes are auditable — append-only event log over in-place mutation where the history matters.
  • The plan names debug-mantra as its execution-time debugging protocol — "we'll figure it out when it breaks" is a Diagnosable failure.

Planning translation: the runtime debugging ritual, pulled forward. If the doc can't say how you'll see it break, you'll see it break in prod.

3. Blast — does the plan price its irreversible moves before committing?

For every wide-impact or hard-to-undo step, the doc must already carry the cost. Don't dress a one-way door as a tweak.

  • Each risky step names its undo class: easy / costly / one-way door. One-way doors are flagged, never silent.
  • Blast radius is named — the exact systems, data, and people that break if this step goes wrong (not "various downstream").
  • A shield is specified — flag, adapter, pilot/canary, dual-write, or an explicit "none."
  • A tripwire exists for anything costly or one-way: how you'll know to pull it and by when (the point of no return). A shield with no tripwire is a brake with no warning light.

For the full per-decision accounting, defer to the blast-radius skill — this pillar only enforces that the v1.x doc contains that accounting for its irreversible steps.

4. Proof (Done) — can the plan prove it's finished, separately from claiming it?

Editor and grader are different roles. The plan must define "done" in terms something other than the author can check.

  • Every task has a measurable done-criterion — a checkable output or metric, not "works" / "improved" / "robust."
  • Tests are specified and the plan requires they actually run — "tests pass" means an execution artifact, not an assertion.
  • No orphan tasks: every step maps to a success criterion, and every success criterion is covered by a step.
  • Observed vs. predicted is kept separate — the doc never launders a projection ("this will reduce load 40%") as evidence. Predictions are labeled as such.

Planning translation: PlanProof's editor/grader separation at document scale. The grader reads only what's written, not what the author meant.

House invariants

Non-negotiable conventions a v1.x doc must satisfy regardless of pillar. These are cheap to check and expensive to skip.

  • FSM threshold — model an explicit state machine only past ~4 states; below that a flag or enum is leaner. Past it, an ad-hoc tangle of booleans is the defect.
  • Single write path — one writer per piece of state. Multiple write paths to the same table/file are a race waiting to happen; name the single path.
  • Append-only event log for anything whose history or audit matters (JSONL or equivalent), over destructive in-place updates.
  • UTC-only time handling end to end; local time only at the display edge. "Nightly," "daily," "expires in 24h" all imply a timezone — pin it.
  • Crash-safe / idempotent jobs — cron and background work are resumable and safe to run twice. A job that corrupts state on a mid-run crash is unshipped.
  • Checklist standard — actionable items use the - [ ] hyphen prefix in GitHub-flavored Markdown, never bare [ ] in tables or lists.
  • Agent contract current — for agent-built work, AGENTS.md / CLAUDE.md exists and matches the plan (conventions, loop caps, honesty constraints). The project's AGENTS.md must name SOLID compliance as a coding standard; if it doesn't, the plan has no enforceable code-design contract.

How this differs from the sibling skills

  • swe — "Does this plan embody our engineering standards before we build?" The standard/rubric, applied to a whole document.
  • plan-adversarial-serial — the pipeline (generate → review → revise → review → judge). It is the machinery; swe is one of the standards the machinery can enforce. Compose them: run the adversarial pipeline with swe as the lens content.
  • blast-radius — "How big is this one decision and what breaks?" swe's Blast pillar defers per-decision accounting to it.
  • iron-triangle — "Which of speed/cost/quality is this choice trading?" When a plan forces fast/cheap/good tension, hand that node to it.
  • debug-mantra — the four-step runtime debugging discipline (reproduce → fail path → falsify → breadcrumb). Diagnosable enforces the plan names it; debug-mantra governs live execution — composing across the doc/session boundary.
  • take-a-step-back — "Is this the right problem/frame at all?" Runs before there is a plan to govern.
  • bottom-line / linear — compress or sequence output. swe evaluates a plan's substance; those reshape its presentation.

Reach for swe the moment there is a build/spec document to hold to a standard — authoring one or judging one.

How to apply

Author mode — you are writing the v1.x doc. Use the four pillars and house invariants as the doc's skeleton: each feature passes Minimal before it earns a section; each risky step ships with its Blast block; each task ships with its Proof criterion. The lens is the gate, not a later edit.

Review mode — you are handed a v1.x doc. Walk the four pillars then the invariants. For each gap, emit one finding keyed to the doc location (§section, or file:line for code-adjacent specs), tagged by severity, with the cheapest fix first. Close with a verdict. Do not rewrite the doc unless asked — surface the checkable gaps and let the author act.

Output format (Review mode)

Lead with the verdict in one line. Then findings, ordered by severity, then quick wins first within a severity. Keep it tight — clean plans get a short list, not a manufactured one.

Verdict: Ship · Ship with conditions · Block — [one sentence: the load-bearing reason].

Findings:

LocPillarSeverityWhat breaksCheapest fix
§x.yBlastBlockOne-way migration with no rollback namedAdd a dual-write window + a row-count tripwire before cutover
§x.zProofFix"Improves sync" has no done-criterionState the measurable signal (e.g. drift count → 0 over 3 runs)

Severity: Block (cannot build safely as written — unguarded one-way door, no done-criteria, unbounded loop) · Fix (change before v1.x ships) · Note (improvement, optional).

Quick wins: [the 1–3 findings fixable in a line of doc, listed as - [ ] so the author can tick them off.]

Principles

Implied is unbuilt. A plan gets credit only for what it states. "Obviously we'll log it" is a Diagnosable failure; write it down or it won't exist.

Editor and grader are different roles. The strongest constraint in the whole lens is Proof's separation: done is defined by something the author can't fudge. Keep them apart.

Price the one-way doors loudest. Reversible mistakes are cheap; irreversible ones are the only ones worth slowing down for. A tweak that can't be undone is not a tweak.

Cut scope before breaking a corner. When the plan is overloaded, the Minimal pillar and scope-reduction beat heroics. Defer to iron-triangle when the tension is explicitly fast/cheap/good.

Stop conditions are not optional. Any loop in the plan — retry, fix-iterate, agent relay — names its cap. Unbounded means undefined.

Separate what you saw from what you expect. Observed evidence and predicted behavior are different epistemic objects; a plan that blends them is lying to its future reader.

Don't manufacture findings. A genuinely clean v1.x doc gets "Ship" and a short list. The lens enforces a standard, not a quota of objections.

Example — reviewing a v1.x build doc excerpt

Input (abridged):

§3 Nightly order reconciler. A cron job runs each night to reconcile order state against the upstream system. For each order it pulls upstream status and updates wp_orders to match. On mismatch it retries until the write succeeds. This makes order state reliable.

Output:

Verdict: Block — a nightly DELETE/UPDATE against the orders table with an unbounded retry, no tripwire, and no measurable "reliable" can corrupt order state with no way to know or stop it.

Findings:

LocPillarSeverityWhat breaksCheapest fix
§3DiagnosableBlock"retries until the write succeeds" is an unbounded loopAdd a 5-failure hard stop / 10-total cap, then dead-letter the order
§3BlastBlockBulk overwrite of wp_orders is costly-to-one-way with no shield/tripwireRun in observe mode first; add a row-count + mismatch-rate tripwire before any write mode
§3ProofFix"makes order state reliable" has no done-criterionDefine it: drift count → 0 across 3 consecutive runs, logged
§3House (UTC)Fix"each night" has no timezonePin the schedule to UTC; document the local display offset separately
§3House (write path)NoteReconciler is a second writer to wp_ordersName the single write path, or gate the reconciler behind the same one

Quick wins:

  • Add the loop cap (§3) — one clause.
  • Pin "each night" to UTC (§3) — one word.
  • Replace "reliable" with the drift-count criterion (§3) — one sentence.

What success looks like

The author sees, before any code exists, exactly where the plan is a wish rather than a build: the loop with no cap, the migration with no rollback, the task that can't prove it's done. The best outcome is a v1.x doc that ships fast because its risky parts were priced up front — not one that ships fast and pays later.

Skills associés