trajectoryRL/trajrl-bench
TrajRL-Bench: AI agent skills benchmark. SSH sandbox with mock services, LLM judge scoring, split-half delta evaluation. Leaderboard at trajrl.com/bench
npx skills add trajectoryRL/trajrl-bench
Run AI coding agents unattended for hours and ship PRs worth merging. Cybernetics-based multi-agent orchestration + cross-LLM peer review for Claude Code, Codex, and Gemini. Engine-enforced gates, fresh agent per checkpoint, cross-vendor review before every PR.
Agent skill for saving Chinese flomo memo drafts from AI conversations
📱 Monitor and control your Claude Code sessions remotely from your iPhone with real-time terminal access and session management.
Automate Owl Protocol tasks via Rube MCP (Composio). Always search tools first for current schemas.
Claude Skill for generating Terraform Tests
A massive, self-updating local archive of AI tools — 11,000+ agent skills, 240+ MCP servers, 2,200+ IDE rules (Cursor/Cline), and 30+ system prompt collections. One repo to rule them all.