promptfoo/promptfoo
Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, DeepSeek, and more. Simple declarative configs with command line and CI/CD integration. Used by OpenAI and Anthropic.
Source: https://github.com/promptfoo/promptfoo
Discovered during the daily awesomeskills.dev agent-skill hunt.