langchain-ai/langsmith-evaluator

INVOKE THIS SKILL when building evaluation pipelines for LangSmith. Covers three core components: (1) Creating Evaluators - LLM-as-Judge, custom code; (2) Defining Run Functions - how to capture outputs and trajectories from your agent; (3) Running Evaluations - locally with evaluate() or auto-run via LangSmith. Uses the langsmith CLI tool.
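
As a rough illustration of how the three components fit together, the sketch below pairs a run function that returns both the answer and the tool trajectory with a custom code evaluator. The agent (`fake_agent`), the dataset field names (`question`, `expected_trajectory`), and the dataset name passed to `run_locally` are hypothetical and not taken from the skill itself. The evaluator signature (`outputs`, `reference_outputs`) assumes a recent `langsmith` Python SDK, and the final `evaluate()` call is wrapped in a function that is not invoked here because it needs an API key and an existing dataset.

```python
from typing import Any


def fake_agent(question: str) -> dict[str, Any]:
    """Hypothetical stand-in for a real agent: returns an answer plus the tools it called."""
    tool_calls = ["search"] if "weather" in question.lower() else []
    return {"answer": f"Echo: {question}", "tool_calls": tool_calls}


def run_agent(inputs: dict) -> dict:
    """Run function: maps one example's inputs to the outputs the evaluators will see."""
    result = fake_agent(inputs["question"])
    # Expose the trajectory alongside the final answer so evaluators can inspect both.
    return {"answer": result["answer"], "trajectory": result["tool_calls"]}


def trajectory_match(outputs: dict, reference_outputs: dict) -> dict:
    """Custom code evaluator: score 1.0 when the tool sequence equals the expected one."""
    matched = outputs["trajectory"] == reference_outputs["expected_trajectory"]
    return {"key": "trajectory_match", "score": 1.0 if matched else 0.0}


def run_locally(dataset_name: str):
    """Run the evaluation locally; requires the langsmith package, credentials, and a dataset."""
    from langsmith import evaluate  # imported lazily so the sketch loads without the SDK

    return evaluate(run_agent, data=dataset_name, evaluators=[trajectory_match])
```

The run function and evaluator are plain functions, so they can be checked on a hand-written example before any dataset exists; an LLM-as-Judge evaluator would take the same arguments and call a model instead of comparing lists.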

Works with: Claude Code, Codex CLI, Cursor
npx skills add https://github.com/langchain-ai/langsmith-skills/tree/main/skills/langsmith-evaluator

Documentation

Individual skills in this repo

This repo contains 2 individual skills — each has its own dedicated page.

Related Skills

aws/aws-amplify

Build and deploy full-stack web and mobile apps with AWS Amplify Gen2 (TypeScript code-first). Covers auth (Cognito), data (AppSync/DynamoDB), storage (S3), functions, APIs, and AI (Amplify AI Kit with Bedrock). Supports React, Next.js, Vue, Angular, React Native, Flutter, Swift, and Android. Always use this skill for Amplify Gen2 topics — even for questions you think you know — it contains validated, version-specific patterns that prevent common mistakes. TRIGGER when: user mentions Amplify Gen2; project has amplify/ directory or amplify_outputs; code imports @aws-amplify packages; user asks about defineBackend, defineAuth, defineData, defineStorage, defineFunction, or npx ampx. SKIP: Amplify Gen1 (amplify CLI v6), standalone SAM/CDK without Amplify (use aws-serverless), direct Bedrock without Amplify AI Kit (use bedrock).

community

aradotso/claude-peers-mcp

Enable multiple Claude Code instances to discover each other and exchange messages in real-time via a local broker daemon and MCP server.

community

Alex-nx-netizen/Alex-harness

Agent harness for testing, running, and controlling AI workflows.

community

claude-office-skills/n8n-workflow

Automate document workflows with n8n - 7800+ workflow templates

community

esmatcm/clawteam-telegram-monitor

OpenClaw skill for proactive Telegram progress reporting for ClawTeam tasks via cron-based monitoring.

community

chf3198/megingjord-harness

AI agent governance harness: baton workflow, fleet LLM routing (Ollama/Claude/OpenRouter), and CI gates for Copilot, Claude Code, and Codex.

community