claw-score

Audit or refresh OpenClaw maturity scorecard docs from root taxonomy, maturity scores, and QA evidence artifacts without using maintainer discrawl data or committed inventory reports.

Compatible with: Claude Code, Codex CLI, Cursor
npx add-skill https://github.com/clawdbot/clawdbot/tree/main/.agents/skills/claw-score

Use this skill when working on the OpenClaw maturity scorecard in this repo. This is the openclaw-local version of the maintainer claw-score workflow: it keeps the taxonomy and scorecard concepts, but excludes discrawl and the old committed inventory/ report tree.

Authority

This skill owns the operational workflow for:

  • taxonomy.yaml
  • docs/maturity-scores.yaml
  • docs/maturity-scorecard.md
  • docs/taxonomy.md
  • docs/taxonomy-outline.md
  • scripts/render-maturity-docs.mjs
  • .github/workflows/maturity-scorecard.yml

Keep person-specific, maintainer-private, Discord archive, and discrawl facts out of this repo. If a score needs private evidence, use the redacted qa-evidence.json artifact shape generated by OpenClaw QA workflows.

Source Model

  • taxonomy.yaml is the hand-edited source of truth for surfaces, levels, QA profiles, categories, feature coverage IDs, docs refs, LTS overrides, and completeness-instruction paths.
  • docs/maturity-scores.yaml is the aggregate score source committed in this repo. It is the only committed score data; do not add generated inventory directories.
  • docs/maturity-scorecard.md, docs/taxonomy.md, and docs/taxonomy-outline.md are deterministic docs generated from the root taxonomy and aggregate score source.
  • qa-evidence.json artifacts provide per-run QA scorecard evidence. They can enrich generated artifact docs, but they are not committed as inventory.
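
The roles above can be sketched as a small path classifier, useful for flagging manual edits to generated docs. This is an illustration only: `classifyPath` and its role labels are hypothetical names, not part of the repo's renderer.

```javascript
// Hypothetical helper: map a repo-relative path to the role the Source Model gives it.
const EDITABLE_SOURCES = new Set(["taxonomy.yaml", "docs/maturity-scores.yaml"]);
const GENERATED_DOCS = new Set([
  "docs/maturity-scorecard.md",
  "docs/taxonomy.md",
  "docs/taxonomy-outline.md",
]);

function classifyPath(path) {
  // Taxonomy and aggregate scores are the committed, editable sources.
  if (EDITABLE_SOURCES.has(path)) return "source";
  // Generated docs are owned by the renderer and should not be hand-edited.
  if (GENERATED_DOCS.has(path)) return "generated";
  // Per-run QA evidence enriches artifact docs but is never committed as inventory.
  if (path.split("/").pop() === "qa-evidence.json") return "evidence-artifact";
  return "other";
}
```

A pre-commit check could use the `"generated"` role to remind contributors to run the renderer instead of editing Markdown directly.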

Commands

Run from the openclaw repo root.

Render committed docs:

pnpm maturity:render

Check generated docs are current:

pnpm maturity:check

Render an evidence-enriched docs artifact from downloaded QA artifacts:

pnpm maturity:render -- --evidence-dir .artifacts/maturity-evidence --output-dir .artifacts/maturity-docs
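
If these commands are driven from a script, the argument list can be assembled as below. This is a minimal sketch: `buildRenderCommand` is a hypothetical helper, and it only uses the `--evidence-dir` and `--output-dir` flags shown above.

```javascript
// Hypothetical helper: build the argv for `pnpm maturity:render`.
function buildRenderCommand(options = {}) {
  const { evidenceDir, outputDir } = options;
  const argv = ["pnpm", "maturity:render"];
  if (evidenceDir || outputDir) {
    // The bare `--` separates pnpm's own arguments from the script's flags.
    argv.push("--");
    if (evidenceDir) argv.push("--evidence-dir", evidenceDir);
    if (outputDir) argv.push("--output-dir", outputDir);
  }
  return argv;
}

// Reproduces the evidence-enriched command shown above.
console.log(
  buildRenderCommand({
    evidenceDir: ".artifacts/maturity-evidence",
    outputDir: ".artifacts/maturity-docs",
  }).join(" "),
);
```

Called with no options, the helper returns the plain render command for committed docs.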

Scoring Workflow

When asked to score or refresh a surface:

  1. Read the surface in taxonomy.yaml.
  2. Read the surface completeness rubric under .agents/skills/claw-score/references/completeness/.
  3. Gather public repo evidence from docs, source, tests, and QA scenario metadata.
  4. Prefer existing qa-evidence.json artifacts for executed proof. Do not use discrawl or unredacted private archives.
  5. Update docs/maturity-scores.yaml only when the score change is backed by public or redacted artifact evidence.
  6. Run pnpm maturity:render.
  7. Run pnpm maturity:check.
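
The evidence rule in steps 4 and 5 can be sketched as a gate that runs before editing docs/maturity-scores.yaml. The evidence item shape and the `kind` labels here are assumptions made for illustration; the repo does not define them.

```javascript
// Hypothetical evidence kinds: only public repo evidence and redacted QA artifacts qualify.
const ALLOWED_KINDS = new Set(["public-repo", "redacted-qa-artifact"]);

function checkScoreChangeEvidence(evidenceItems) {
  const accepted = evidenceItems.filter((item) => ALLOWED_KINDS.has(item.kind));
  const rejected = evidenceItems.filter((item) => !ALLOWED_KINDS.has(item.kind));
  return {
    // Assumption: a change needs at least one allowed item and no disallowed ones.
    ok: accepted.length > 0 && rejected.length === 0,
    accepted,
    rejected,
  };
}
```

Treating any disallowed item as a failure is a deliberately strict choice: it forces private or discrawl-derived evidence out of the rationale rather than letting it ride alongside public proof.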

For subjective score changes, make the smallest defensible edit and leave the evidence path in the PR or task summary. The deterministic renderer owns Markdown structure; manual prose tweaks belong in taxonomy, score source, or the renderer rather than in generated docs.

Default Completeness Process

Completeness is scored against the intended operator-visible workflow for each category, not against test breadth or implementation quality. The completeness reference files under references/completeness/ define the category scope and any surface-specific variation from this default process.

By default, Completeness measures how fully OpenClaw exposes the intended surface capability set to the user, operator, author, or maintainer persona for that surface. Score whether each category delivers the full expected workflow, including setup, normal use, status or inspection, recovery, and important platform, provider, channel, security, or lifecycle variants where they apply.

Treat Surface-Specific Scoring Questions and Surface-Specific Guidance as higher-priority instructions for that surface. They may extend, narrow, or intentionally contradict the defaults described here; when they do, follow the surface instructions and cite the surface-specific instruction in the score rationale. If a reference file includes no surface-specific questions or guidance, apply this default process to the surface's Category Scope.
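
The precedence rule can be sketched as an ordering step. The reference object shape (`categoryScope`, `surfaceQuestions`, `surfaceGuidance`) is an assumption for this example and does not describe how the reference files are actually parsed.

```javascript
// Hypothetical helper: order instructions so surface-specific items win on conflict.
function orderInstructions(reference, defaultInstructions) {
  const surface = [
    ...(reference.surfaceQuestions ?? []),
    ...(reference.surfaceGuidance ?? []),
  ];
  return {
    scope: reference.categoryScope,
    instructions: [
      ...surface.map((text) => ({ priority: "surface", text })),
      ...defaultInstructions.map((text) => ({ priority: "default", text })),
    ],
    // The score rationale should cite surface instructions whenever any exist.
    rationaleMustCiteSurface: surface.length > 0,
  };
}
```

When a reference file has no surface-specific sections, the result contains only default instructions applied to the Category Scope.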

For each category, ask:

  • Can the intended user or operator complete the category workflow end to end?
  • Are the taxonomy features present as supported capabilities rather than isolated implementation fragments?
  • Are the important lifecycle stages represented: setup, normal operation, status/inspection, recovery, and upgrade or removal where relevant?
  • Are the important environment, provider, platform, channel, or security branches present for this surface?
  • Do the known gaps leave major user-visible capability branches missing?
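
The lifecycle question above lends itself to a simple checklist. A minimal sketch, assuming stages are tracked as plain strings; the helper name and the stage labels are hypothetical.

```javascript
// Lifecycle stages named in the questions above.
const CORE_STAGES = ["setup", "normal operation", "status/inspection", "recovery"];

// Hypothetical helper: list the lifecycle stages a category does not yet cover.
function missingLifecycleStages(presentStages, { upgradeOrRemovalRelevant = false } = {}) {
  // Upgrade or removal only counts where it is relevant to the category.
  const expected = upgradeOrRemovalRelevant
    ? [...CORE_STAGES, "upgrade or removal"]
    : CORE_STAGES;
  const present = new Set(presentStages);
  return expected.filter((stage) => !present.has(stage));
}
```

A non-empty result is a prompt to look for evidence of the missing stage, not an automatic score deduction.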

Default guidance:

  • Favor higher Completeness when the category supports the full operator-visible workflow described by taxonomy and category evidence.
  • Lower Completeness when only the happy path exists, when important variants are undocumented or unimplemented, or when recovery/status paths are missing.
  • Do not lower Completeness because tests are thin; that is Coverage.
  • Do not lower Completeness because implementation quality is fragile; that is Quality.
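
To keep the three score dimensions separate, each finding can be routed to the score it should affect. The finding labels below are hypothetical shorthand for the situations described in the guidance.

```javascript
// Hypothetical finding labels mapped to the score dimension they belong to.
const DIMENSION_FOR_FINDING = new Map([
  ["happy-path-only", "Completeness"],
  ["missing-variant", "Completeness"],
  ["missing-recovery-or-status", "Completeness"],
  ["thin-tests", "Coverage"],
  ["fragile-implementation", "Quality"],
]);

function dimensionForFinding(finding) {
  const dimension = DIMENSION_FOR_FINDING.get(finding);
  if (dimension === undefined) {
    throw new Error(`Unknown finding label: ${finding}`);
  }
  return dimension;
}
```

Throwing on unknown labels makes it harder for an unclassified finding to silently lower the wrong score.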

Default Completeness bands:

  • Lovable (95-100): complete across expected workflows, variants, and recovery branches, with only minor polish gaps.
  • Stable (80-95): the expected workflow set is broadly present, with only bounded missing branches.
  • Beta (70-80): the main workflow exists, but meaningful branches or recovery paths are still absent.
  • Alpha (50-70): only a partial capability set is present; users can complete some core tasks but not the full expected workflow.
  • Experimental (0-50): the category exposes only fragments of the intended capability.
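
The bands can be expressed as a lookup. Because adjacent ranges share a boundary value (95, 80, 70, 50), this sketch assumes the lower bound is inclusive, so a score of 95 is Lovable and 80 is Stable; that tie-break is an assumption, not something the band list states.

```javascript
// Bands ordered from highest to lowest; `min` is treated as inclusive (assumption).
const BANDS = [
  { name: "Lovable", min: 95 },
  { name: "Stable", min: 80 },
  { name: "Beta", min: 70 },
  { name: "Alpha", min: 50 },
  { name: "Experimental", min: 0 },
];

function bandForScore(score) {
  if (!Number.isFinite(score) || score < 0 || score > 100) {
    throw new RangeError(`Score must be between 0 and 100, got ${score}`);
  }
  // The first band whose minimum the score reaches is the match.
  return BANDS.find((band) => score >= band.min).name;
}
```

If the renderer resolves shared boundaries differently, only the comparison in `find` would need to change.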

Score Semantics

  • Coverage: public or redacted proof that the feature is exercised by docs, tests, QA scenarios, live lanes, or release evidence.
  • Quality: reliability, maintainability, operator safety, and regression confidence for the category.
  • Completeness: how much of the intended operator-visible workflow exists for the category. Use the default completeness process plus any surface-specific variation before changing this score.
  • LTS: derived from score thresholds and human_lts_override; do not hand-edit generated Markdown to change LTS status.
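
As a rough illustration of how LTS could be derived, the sketch below combines the three scores with an override. Everything specific here is an assumption: the threshold is a caller-supplied parameter, the same threshold is applied to every dimension, and `human_lts_override` is modeled as an optional boolean that always wins. The actual rules live in the taxonomy and renderer.

```javascript
// Sketch only: real thresholds and override semantics are defined by the repo, not here.
function deriveLtsStatus(scores, threshold, humanLtsOverride) {
  // Assumption: an explicit boolean override takes precedence over computed status.
  if (typeof humanLtsOverride === "boolean") return humanLtsOverride;
  // Assumption: every dimension must meet the same threshold.
  return [scores.coverage, scores.quality, scores.completeness].every(
    (value) => value >= threshold,
  );
}
```

Keeping the derivation in code is what makes hand-editing the generated Markdown unnecessary: changing the inputs and re-rendering produces the new status.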

Bands:

  • Lovable: 95-100
  • Stable: 80-95
  • Beta: 70-80
  • Alpha: 50-70
  • Experimental: 0-50

GitHub Action

The Maturity scorecard workflow verifies committed generated docs on PRs and pushes. Manual dispatch can also download QA artifacts from another workflow run with source_run_id and artifact_pattern, render evidence-enriched docs into .artifacts/maturity-docs, and upload them as a GitHub artifact.
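
One way to trigger the manual dispatch is the GitHub CLI's `gh workflow run` with `-f key=value` inputs. Using `gh` is an assumption of this sketch, and the run ID and pattern values in the example are placeholders; only the workflow file name and the two input names come from this document.

```javascript
// Hypothetical helper: build the argv for a manual dispatch of the scorecard workflow.
function buildDispatchCommand({ sourceRunId, artifactPattern }) {
  if (!sourceRunId || !artifactPattern) {
    throw new Error("sourceRunId and artifactPattern are both required");
  }
  return [
    "gh", "workflow", "run", "maturity-scorecard.yml",
    "-f", `source_run_id=${sourceRunId}`,
    "-f", `artifact_pattern=${artifactPattern}`,
  ];
}

// Placeholder values for illustration only.
console.log(
  buildDispatchCommand({ sourceRunId: "1234567890", artifactPattern: "qa-*" }).join(" "),
);
```

The rendered docs from such a run land in a short-lived GitHub artifact rather than in the repo.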

Do not add the maintainer repo's docs/kevinslin/maturity-scorecard/inventory/ tree to openclaw. Those generated reports are intentionally replaced here by short-lived artifact docs and the committed aggregate scorecard pages.
