LLM Capability Matrix

Model	Provider	Multimodal	Vision	Audio	JSON mode	Tool calling
GPT-4o	OpenAI	●	●	●	●	●
GPT-4o mini	OpenAI	●	●	○	●	●
GPT-4 Turbo	OpenAI	●	●	○	●	●
GPT-4	OpenAI	○	○	○	○	●
GPT-3.5 Turbo	OpenAI	○	○	○	●	●
Claude 3.7 Sonnet	Anthropic	●	●	○	●	●
Claude 3.7 Opus	Anthropic	●	●	○	●	●
Claude 3.5 Haiku	Anthropic	○	○	○	●	●
Gemini 2.0 Pro	Google AI	●	●	●	●	●
Gemini 2.0 Flash	Google AI	●	●	●	●	●
Mistral Large 2	Mistral	○	○	○	●	●
DeepSeek V3	DeepSeek	○	○	○	●	●

Pick the cheapest model that has every flag you need

For most production workloads the right question isn't "which model is best" but "which model is cheapest that supports the flags I need". If you need vision + tool calling + JSON mode, GPT-4o-mini ($0.15 / $0.60) gives you all three at a fraction of GPT-4o's price.

Eval the cheapest qualifying model on your task before reaching for a flagship. The capability matrix narrows the list; an eval picks the winner.

FAQ

What does "JSON mode" mean exactly?: A guarantee that the model returns syntactically valid JSON. Doesn't guarantee it conforms to your schema — that's what tool calling + JSON Schema gives you. OpenAI calls it response_format: { type: "json_object" }; Anthropic surfaces it via tool calling.
What about reasoning / chain-of-thought modes?: Not yet a column. Reasoning capability is currently a model-level distinction (o1, o3, Claude's extended thinking) rather than a flag. We'll add a "reasoning" column when the row count of reasoning-equipped models warrants it.
Why does Claude show audio = false but it can transcribe audio in some clients?: The matrix tracks API-native capability, not what client wrappers (e.g., Claude Desktop with audio attachments) layer on top. Claude's API doesn't take raw audio yet.

LLM Capability Matrix

Pick the cheapest model that has every flag you need

FAQ

Related tools