
EurecaMoment/BenchClaw

BenchClaw is a Codex/OpenCode skill workflow for benchmark construction, evaluation, and maintenance. It standardizes the full pipeline—from idea drafting and data generation to evaluation, reporting, failure diagnosis, and skill refinement—so agents can build reproducible, auditable benchmarks with clear quality gates, lineage, and rollback.

Supported agents: ~Claude Code, Codex CLI, ~Cursor, OpenCode
npx skills add EurecaMoment/BenchClaw

