linny006/agent-eval-harness
Live, open-source benchmark for comparing AI coding agents on real GitHub issues
npx add-skill linny006/agent-eval-harness
Code for the paper SkillTab-Bench: Benchmarking Skill-Driven Agents for Multi-Turn Industrial Table Analysis
Beijing real-time pollen monitoring and forecast Agent Skill: lets Claude Code and Codex query live pollen data, trends, and allergy risk for all 16 Beijing districts.
Analyze Claude Code’s BUDDY pet system from leaked source, listing all 18 ASCII pets, render logic, and system details
Financial Research Analyst Agent Skills that create professional-grade research reports powered by Bigdata.com Services
🐙 Accelerating Scientific Discovery — Turn one researcher into an autonomous research army that never sleeps
Professional multi-agent deep research skill for Claude Code. 100+ sources, citation chasing, credibility scoring, evidence triangulation.