
EurecaMoment/BenchClaw

BenchClaw is a Codex/OpenCode skill workflow for benchmark construction, evaluation, and maintenance. It standardizes the full pipeline—from idea drafting and data generation to evaluation, reporting, failure diagnosis, and skill refinement—so agents can build reproducible, auditable benchmarks with clear quality gates, lineage, and rollback.

Compatible platforms: ~Claude Code, Codex CLI, ~Cursor, OpenCode
npx add-skill EurecaMoment/BenchClaw
