trajectoryRL/trajrl-bench
TrajRL-Bench: AI agent skills benchmark. SSH sandbox with mock services, LLM judge scoring, split-half delta evaluation. Leaderboard at trajrl.com/bench
npx skills add trajectoryRL/trajrl-bench
Track how much time you spend using Claude Code to AI-code
12-stage gated SDLC orchestrator for Claude Code — deterministic gates, REQ-ID traceability, cost-capped background agents, brownfield adoption
The exact Claude Code skill stack I ship with as a solo founder dev — 11 original skills + curated upstream manifest
Fix Terraform hallucinations in LLMs by enforcing best practices, modular code, and security for Terraform and OpenTofu configurations.
Local-first Codex agent workspace scaffold with skills, runtime checks, IM routing, and filesystem boundaries.
Automate Aeroleads tasks via Rube MCP (Composio). Always search tools first for current schemas.