prototypebench/prototypebench
Open benchmark for AI coding agents on full-stack feature shipping (React+Vite+Tailwind/FastAPI+SQLModel). 71 PR-mined tasks · 32k tests · execution-based scoring (pytest+Playwright) · no LLM-as-judge.
npx add-skill prototypebench/prototypebench
A collection of DESIGN.md files inspired by popular brand design systems. Drop one into your project and let coding agents generate a matching UI.
Automate Farming Simulator 25 mod development using a Claude skill trained on game APIs and common coding patterns.
Real AI agent for your vault. A coworker, copilot, and thinking partner that maintains your memory and knowledge, adapts to your workflows, and uses plugins, skills, and tools with full safety controls. BYOK & MCP.
BenchClaw is a Codex/OpenCode skill workflow for benchmark construction, evaluation, and maintenance. It standardizes the full pipeline—from idea drafting and data generation to evaluation, reporting, failure diagnosis, and skill refinement—so agents can build reproducible, auditable benchmarks with clear quality gates, lineage, and rollback.
Single repo for all backend distributed systems engineering skills.
Exercise: Build applications with GitHub Copilot agent mode