derberg/eval-bench

Benchmark Claude Code plugins, skills, agents, and MCPs by A/B-comparing versions with LLM-judged evaluation prompts.

Compatible with Claude Code, Codex CLI, Cursor
npx add-skill derberg/eval-bench

Related skills