which-llm
Use this skill when model knowledge may be stale. It queries a checked-in snapshot of Artificial Analysis (AA) and OpenRouter data and can refresh it on demand.
Workflow
- Run commands from this directory with python.
- If freshness matters, run python query.py data status. If the snapshot is stale, run python query.py data refresh.
- Use the narrowest command:
  - python query.py models [pattern] [filters] for shortlists.
  - python query.py compare <model>... for side-by-side comparisons.
  - python query.py slug <model> for OpenRouter endpoint names.
  - python query.py show <model> before recommending a specific model.
- Explain cost fields correctly:
  - idx-run$ is the estimated cost to run the AA benchmark suite.
  - idx-tok is total benchmark-run token use.
  - in$/1m and out$/1m are API prices per million tokens.
- Prefer openrouter_slug for production. Mention openrouter_free_slug only as a prototype option, because :free endpoints can be rate-limited or served differently.
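The per-million-token fields convert to a per-request estimate with simple arithmetic. The sketch below is illustrative only: the function name request_cost_usd and the prices are hypothetical and do not come from the snapshot. It applies to in$/1m and out$/1m only; idx-run$ is a total for the whole AA benchmark suite and should not be read as a per-token price.

```python
def request_cost_usd(input_tokens, output_tokens, in_per_1m, out_per_1m):
    """Estimate the API cost in USD for one request.

    in_per_1m and out_per_1m correspond to the in$/1m and out$/1m fields:
    prices in USD per one million tokens.
    """
    input_cost = (input_tokens / 1_000_000) * in_per_1m
    output_cost = (output_tokens / 1_000_000) * out_per_1m
    return input_cost + output_cost


# Hypothetical prices, for illustration only (not taken from the snapshot).
cost = request_cost_usd(
    input_tokens=200_000,
    output_tokens=50_000,
    in_per_1m=3.00,
    out_per_1m=15.00,
)
print(f"${cost:.2f}")  # 0.2 * 3.00 + 0.05 * 15.00 -> $1.35
```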
Fast Recipes
python query.py models --intel-min 50 --reasoning --sort cost --top 8
python query.py models --modality text,image --max-cost 500 --sort intel --top 8
python query.py models --no-reasoning --max-latency 6 --sort intel --top 8
python query.py models --context-min 256000 --sort cost --top 8
python query.py models --open-weights --sort intel --top 8
python query.py models --free --sort cost --top 20
python query.py compare claude-opus-4-7 gpt-5 gemini-3-1-pro
python query.py slug claude-opus-4-7
Use python query.py models --help for all filters, including --json.
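When the recipes are driven from a script, one possible approach is to assemble the argument list in Python and pass it to subprocess. This is a sketch, not part of the skill: the helper models_cmd is hypothetical, and it only uses flags that appear in the recipes above. The shape of the --json output is not documented here, so the sketch prints the output rather than parsing it.

```python
import subprocess
import sys
from pathlib import Path


def models_cmd(pattern=None, *, sort=None, top=None, as_json=False, **filters):
    """Build an argv list for the `query.py models` command (hypothetical helper).

    Keyword filters map to flags: intel_min=50 becomes --intel-min 50,
    and reasoning=True becomes the bare flag --reasoning.
    """
    argv = [sys.executable, "query.py", "models"]
    if pattern:
        argv.append(pattern)
    for name, value in filters.items():
        flag = "--" + name.replace("_", "-")
        if value is True:
            argv.append(flag)  # boolean flags take no value
        elif value is not None and value is not False:
            argv += [flag, str(value)]
    if sort is not None:
        argv += ["--sort", sort]
    if top is not None:
        argv += ["--top", str(top)]
    if as_json:
        argv.append("--json")
    return argv


# Mirrors the first recipe: reasoning models with intel >= 50, cheapest first.
argv = models_cmd(intel_min=50, reasoning=True, sort="cost", top=8)

# Only run when executed from the skill directory, where query.py lives.
if Path("query.py").exists():
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    print(result.stdout)
```

Building the command as a list avoids shell quoting issues, and sys.executable keeps the call on the same interpreter that runs the script.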
Do Not Use For
- Domain evals or private benchmarks that AA does not track.
- Models so new that AA has not indexed them yet.
- Authoritative non-OpenRouter provider pricing. Verify those prices with the provider.