trajectoryRL/trajrl-bench

TrajRL-Bench: AI agent skills benchmark. SSH sandbox with mock services, LLM judge scoring, split-half delta evaluation. Leaderboard at trajrl.com/bench

Compatible avec~Claude Code~Codex CLI~Cursor
npx add-skill trajectoryRL/trajrl-bench

trajectoryRL/trajrl-bench

TrajRL-Bench: AI agent skills benchmark. SSH sandbox with mock services, LLM judge scoring, split-half delta evaluation. Leaderboard at trajrl.com/bench

Skills associés