bettyguo/agent_eval
An open-source benchmark for Claude Code skill bundles (.claude/skills/) and CLAUDE.md configs. Reports Pass@k, cost, and reliability, publishes results to a content-addressed leaderboard, and runs on Anthropic, OpenAI, and Google.
npx skills add bettyguo/agent_eval
Open source Claude skill for automated GitLab MR reviews. Configurable lint rules, semantic analysis, and inline comment posting for self-hosted GitLab instances.
A practical, no-hype workflow for AI coding agents: context, plan, implement, review, QA, ship, retro. Templates, two Claude Code skills, and a 40% context rule - every claim traced to official docs.
Build AI agent skills for paid media, direct-response copy, funnel design, and ad testing that improve conversion and scale campaigns.
AI-powered post-writing toolkit for academic papers — format validation, grammar/style polishing, de-AI editing, reference checking, and reviewer-style paper audits. 5 skills for LaTeX, Typst & PDF. Focused on enhancing existing text quality, not generating from scratch.
Portable agent skills (SKILL.md) for Claude Code, Copilot CLI, Codex & Cursor — reviewloop (clear every PR reviewer), ciloop (fix red CI), standup (daily standup from your PRs). Install per-skill or bundled.
Gemini CLI skills