derberg/eval-bench
Benchmark Claude Code plugins/skills/agents/MCPs by A/B comparing versions with LLM-judged evaluation prompts
npx add-skill derberg/eval-bench
Let your AI agent grind on projects 24/7, fully autonomously, with continuous task execution.
An open-source, one-shot AI coding CLI agent that turns natural language into tested, committed code, safely sandboxed in Docker.
CLI tool for managing multi-repo AI agent workspaces with plugin synchronization across multiple AI clients.
Automate Callerapi tasks via Rube MCP (Composio). Always search tools first for current schemas.
A Swift CLI that unifies a repository's AGENTS.md / .agents/skills with Claude Code's CLAUDE.md / .claude/skills via relative symlinks.
Universal agent skill and sub-agent manager with TUI.