trajectoryRL/trajrl-bench
TrajRL-Bench: AI agent skills benchmark. SSH sandbox with mock services, LLM judge scoring, split-half delta evaluation. Leaderboard at trajrl.com/bench
npx add-skill trajectoryRL/trajrl-bench
Automate Wati tasks via Rube MCP (Composio). Always search tools first for current schemas.
🚀 Self-host your AI coding agent on AWS — fully serverless with ECS Fargate. Zero idle cost, per-conversation isolation, Bedrock LLM. Deploy with CDK in minutes.
One-shot 'upgrade anything' skill for Claude Code: leads with new features, writes the brief, plans, builds with subagents, reviews via an expert panel + codex, improves until world-class, and self-learns. Works on code, UI, copy, and strategy. Self-contained with built-in fallbacks.
Benchmark Claude Code plugins/skills/agents/MCPs by A/B comparing versions with LLM-judged evaluation prompts
Complete clone of an OpenCode configuration: plugins, MCPs, skills, AGENTS.md
Generate production-ready AI agent skills through a verified 7-step process with automated auditing, optimization, and packaging in a full web app.