linny006/agent-eval-harness

Live, open-source benchmark for comparing AI coding agents on real GitHub issues

What is agent-eval-harness?

agent-eval-harness is a Claude Code agent skill that provides a live, open-source benchmark for comparing AI coding agents on real GitHub issues.

Compatible platforms: Claude Code, Codex CLI, Cursor

npx skills add linny006/agent-eval-harness

Documentation

What does agent-eval-harness do?

Live, open-source benchmark for comparing AI coding agents on real GitHub issues

Related skills

obra/superpowers

Agent skill repository discovered by 10x-chat research.

community

affaan-m/quarkus-verification

Verification loop for Quarkus projects: build, static analysis, tests with coverage, security scans, native compilation, and diff review before release or PR.

community

affaan-m/uspto-database

USPTO patent and trademark data workflow for official record lookup, PatentSearch queries, TSDR checks, assignment data, and reproducible IP research logs.

community

affaan-m/scholar-evaluation

Structured scholarly-work evaluation for papers, proposals, literature reviews, methods sections, evidence quality, citation support, and research-writing feedback.

community

affaan-m/literature-review

Systematic literature-review workflow for academic, biomedical, technical, and scientific topics, including search planning, source screening, synthesis, citation checks, and evidence logging.

community

affaan-m/research-ops

Evidence-first current-state research workflow for ECC. Use when the user wants fresh facts, comparisons, enrichment, or a recommendation built from current public evidence and any supplied local context.

community