linny006/agent-eval-harness

Live, open-source benchmark for comparing AI coding agents on real GitHub issues

What is agent-eval-harness?

agent-eval-harness is a Claude Code agent skill that provides a live, open-source benchmark for comparing AI coding agents on real GitHub issues.

Works with: Claude Code, Codex CLI, Cursor

npx skills add linny006/agent-eval-harness


Documentation

What does agent-eval-harness do?

Live, open-source benchmark for comparing AI coding agents on real GitHub issues

Related Skills

obra/superpowers

Agent skill repository discovered by 10x-chat research.

community

affaan-m/quarkus-verification

Verification loop for Quarkus projects: build, static analysis, tests with coverage, security scans, native compilation, and diff review before release or PR.

community

affaan-m/uspto-database

USPTO patent and trademark data workflow for official record lookup, PatentSearch queries, TSDR checks, assignment data, and reproducible IP research logs.

community

affaan-m/scholar-evaluation

Structured scholarly-work evaluation for papers, proposals, literature reviews, methods sections, evidence quality, citation support, and research-writing feedback.

community

affaan-m/literature-review

Systematic literature-review workflow for academic, biomedical, technical, and scientific topics, including search planning, source screening, synthesis, citation checks, and evidence logging.

community

affaan-m/research-ops

Evidence-first current-state research workflow for ECC. Use when the user wants fresh facts, comparisons, enrichment, or a recommendation built from current public evidence and any supplied local context.

community